Creating a Dummy variable for every first duplicate observation (using -Duplicates Drop- without deleting duplicate observations)

Hey all,

I was wondering if there is a way to create a dummy variable that flags only the first observation within a group.

For example, in the following dataset (dataex provided below):

1. Within each ID group, can I create a dummy variable that equals 1 for the first observation with treat == 1, but 0 for any subsequent observation with treat == 1?

2. For example, for ID 111, can I create a new dummy (Treat_First) that equals 1 for year 2000 but 0 for the rest of the observations under 111 (and so on for the rest of the IDs)?

Here is sample code showing what I have tried so far:

Code:

*To see which IDs have duplicates
duplicates tag id treat if treat==1, gen(duplicates)
browse if !missing(duplicates) & duplicates > 0


*Duplicates drop example:
duplicates drop id duplicates, force
browse if !missing(duplicates) & duplicates > 0


*How can I keep the duplicate observations instead of dropping them?

* Example generated by -dataex-. To install: ssc install dataex
clear
input int(id year income) byte treat
111 2000  10 1
111 2001  40 0
111 2002  90 0
111 2003 100 1
111 2004 120 0
111 2005 190 0
333 2000  10 1
333 2001  45 1
333 2002  90 0
333 2003 110 1
333 2004 160 0
333 2005 240 1
333 2006 290 0
333 2007 380 0
555 2000  10 0
555 2001  20 1
555 2002  85 0
555 2003 195 0
555 2004 215 0
end
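
One possible way to flag the first treated year without dropping anything is to sort within id and treat by year and mark the first row of each treated group. This is only a sketch, not a definitive answer: it assumes "first" means the earliest year with treat == 1 within each id, and the variable name treat_first is made up for illustration. The example data from the post are repeated so the block runs on its own.

```stata
* Example data from the post, repeated so this block is self-contained
clear
input int(id year income) byte treat
111 2000  10 1
111 2001  40 0
111 2002  90 0
111 2003 100 1
111 2004 120 0
111 2005 190 0
333 2000  10 1
333 2001  45 1
333 2002  90 0
333 2003 110 1
333 2004 160 0
333 2005 240 1
333 2006 290 0
333 2007 380 0
555 2000  10 0
555 2001  20 1
555 2002  85 0
555 2003 195 0
555 2004 215 0
end

* Within each id-treat group, sorted by year, _n == 1 is the earliest row.
* The condition treat == 1 keeps the flag at 0 for untreated rows.
* treat_first is a hypothetical name, not one used in the original post.
bysort id treat (year): gen byte treat_first = treat == 1 & _n == 1

* Restore the panel order and inspect the result
sort id year
list id year treat treat_first, sepby(id)
```

With the example data this sets treat_first to 1 for id 111 in 2000, id 333 in 2000, and id 555 in 2001, and to 0 everywhere else. All observations stay in the dataset, so -duplicates drop- is not needed.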

