Hi,
I have a data set in which a household is uniquely identified by two variables: conglome and vivienda.
Within a household, codperso identifies individuals.
A variable p210 tells me if a person has a spouse living in the same household.
I would like to delete all observations belonging to households in which more than 2 individuals have a spouse in the household, i.e. households with more than one married couple living in them.
I have done the following so far:
. sort conglome vivienda
. quietly by conglome vivienda: gen dup = cond(_N==1,0,_n) if p210==1
. tabulate dup
        dup |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |         15        1.13        1.13
          1 |        618       46.64       47.77
          2 |        618       46.64       94.42
          3 |         41        3.09       97.51
          4 |         33        2.49      100.00
Essentially, for any household that has dup reaching 3 or 4, I want to delete all observations in that household (not just the observations for which dup == 3 | dup == 4).
Could anyone advise on a solution?
Thank you!
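
One possible approach, offered as a sketch rather than a definitive answer: count, within each household, how many members have p210 == 1, then drop every observation of any household where that count exceeds 2. This assumes p210 == 1 is the code for "spouse lives in the household", as the if condition above suggests. The variable n_spouse and the toy data are hypothetical and only there to make the example runnable. Note also that in the by: gen command above, _n and _N refer to all observations in the household, not only those with p210 == 1, so dup may not count spouses the way it was intended to.

```stata
* Toy data (hypothetical values) using the variable names from the post.
clear
input conglome vivienda codperso p210
1 1 1 1
1 1 2 1
1 1 3 0
2 1 1 1
2 1 2 1
2 1 3 1
2 1 4 1
3 1 1 0
end

* Count, within each household, the members with a spouse present.
* (p210 == 1) evaluates to 1 or 0, so total() adds up the matches.
bysort conglome vivienda: egen n_spouse = total(p210 == 1)

* Drop all observations of households with more than 2 such members.
drop if n_spouse > 2
```

On real data, only the bysort and drop lines would be needed. Because egen total() stores the household-level count on every row of the household, the drop removes the whole household and not just the rows with p210 == 1.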