Dear all,
I am using Stata 16 on a Mac and have provided a sample of my dataset using dataex.
My dataset comes from a longitudinal survey, so the same participants were tracked from 1997 to 2009. The participant ID is stored in the variable PUBID_1997. A participant who dropped out in a particular year has age_months equal to -5 for that year. For example, the second participant has -5 for age_months in 2006 and 2007 because they dropped out of the study in those years. The problem is that when I use the Stata command drop if age_months==-5, it does not remove the second participant entirely; it only deletes that participant's rows for 2006 and 2007. Is there a way to write Stata code that deletes a participant entirely if age_months==-5 in any year? For example, since participant 2 dropped out of the study, I would like to remove them completely from my analysis.
Here is a sample of my dataset using dataex:
input int(PUBID_1997 year) byte(sex_1997 race_1997) int age_months byte(ever_smoked ever_alcohol ever_marijuana)
1 1997 2 4 190 1 1 0
1 1998 2 4 206 -4 -4 -4
1 1999 2 4 219 -4 -4 -4
1 2000 2 4 231 -4 -4 -4
1 2001 2 4 243 . . .
1 2002 2 4 256 . . -4
1 2003 2 4 266 . . .
1 2004 2 4 279 -4 -4 -4
1 2005 2 4 290 -4 -4 -4
1 2006 2 4 302 . . .
1 2007 2 4 313 . . .
1 2008 2 4 325 . . .
1 2009 2 4 337 . . .
2 1997 1 2 178 0 0 0
2 1998 1 2 196 -4 -4 -4
2 1999 1 2 209 -4 -4 -4
2 2000 1 2 221 -4 -4 -4
2 2001 1 2 232 . . .
2 2002 1 2 245 . . -4
2 2003 1 2 256 . . .
2 2004 1 2 268 -4 -4 -4
2 2005 1 2 284 -4 -4 -4
2 2006 1 2 -5 . . .
2 2007 1 2 -5 . . .
2 2008 1 2 318 . . .
2 2009 1 2 330 . . .
3 1997 2 2 163 0 1 0
3 1998 2 2 182 -4 -4 -4
3 1999 2 2 197 -4 -4 -4
3 2000 2 2 210 -4 -4 -4
3 2001 2 2 222 . . .
3 2002 2 2 232 . . -4
3 2003 2 2 249 . . .
3 2004 2 2 255 -4 -4 -4
3 2005 2 2 -5 -5 -5 -5
3 2006 2 2 -5 . . .
3 2007 2 2 -5 . . .
3 2008 2 2 -5 . . .
3 2009 2 2 317 . . .
4 1997 2 2 192 0 1 0
4 1998 2 2 213 -4 -4 -4
4 1999 2 2 228 -4 -4 -4
4 2000 2 2 238 -4 -4 -4
4 2001 2 2 251 . . .
4 2002 2 2 262 . . -4
4 2003 2 2 276 . . .
4 2004 2 2 287 -4 -4 -4
4 2005 2 2 297 -4 -4 -4
4 2006 2 2 309 . . .
4 2007 2 2 320 . . .
4 2008 2 2 336 . . .
4 2009 2 2 344 . . .
5 1997 1 2 186 1 1 1
5 1998 1 2 194 -4 -4 -4
5 1999 1 2 205 -4 -4 -4
5 2000 1 2 218 -4 -4 -4
5 2001 1 2 234 . . .
5 2002 1 2 243 . . -4
5 2003 1 2 255 . . .
5 2004 1 2 266 -4 -4 -4
5 2005 1 2 277 -4 -4 -4
5 2006 1 2 289 . . .
5 2007 1 2 300 . . .
5 2008 1 2 312 . . .
5 2009 1 2 323 . . .
6 1997 2 2 188 0 0 0
6 1998 2 2 202 -4 -4 -4
6 1999 2 2 215 -4 -4 -4
6 2000 2 2 229 -4 -4 -4
6 2001 2 2 242 . . .
6 2002 2 2 251 . . -4
6 2003 2 2 266 . . .
6 2004 2 2 278 -4 -4 -4
6 2005 2 2 290 -4 -4 -4
6 2006 2 2 302 . . .
6 2007 2 2 310 . . .
6 2008 2 2 323 . . .
6 2009 2 2 334 . . .
7 1997 1 2 173 0 0 0
7 1998 1 2 187 -4 -4 -4
7 1999 1 2 200 -4 -4 -4
7 2000 1 2 214 -4 -4 -4
7 2001 1 2 224 . . .
7 2002 1 2 237 . . -4
7 2003 1 2 -5 . . .
7 2004 1 2 -5 -5 -5 -5
7 2005 1 2 275 -4 -4 -4
7 2006 1 2 283 . . .
7 2007 1 2 -5 . . .
7 2008 1 2 -5 . . .
7 2009 1 2 319 . . .
8 1997 2 4 202 0 0 0
8 1998 2 4 210 -4 -4 -4
8 1999 2 4 221 -4 -4 -4
8 2000 2 4 234 -4 -4 -4
8 2001 2 4 246 . . .
8 2002 2 4 258 . . -4
8 2003 2 4 273 . . .
8 2004 2 4 282 -4 -4 -4
8 2005 2 4 295 -4 -4 -4
end
label values PUBID_1997 vlR0000100
label def vlR0000100 1 "1 TO 999", modify
label values sex_1997 vlR0536300
label def vlR0536300 1 "Male", modify
label def vlR0536300 2 "Female", modify
label values race_1997 vlR1482600
label def vlR1482600 2 "Hispanic", modify
label def vlR1482600 4 "Non-Black / Non-Hispanic", modify
Thank you in advance for your help
Jason Browen
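One possible approach, offered as a sketch rather than a definitive answer: flag each participant who has age_months equal to -5 in at least one year, then drop every row belonging to a flagged participant. The example below uses a few rows copied from the sample above so that it runs on its own. The variable name any_minus5 is made up for illustration, and the sketch assumes that -5 in age_months is the only marker of a dropped-out year.

```stata
* Toy data: a few rows for participants 1 and 2, copied from the sample above
clear
input int(PUBID_1997 year) int age_months
1 2005 290
1 2006 302
2 2005 284
2 2006 -5
2 2008 318
end

* Within each participant, (age_months == -5) is 1 or 0 in every row;
* taking the maximum gives 1 if any of that participant's rows has -5.
* any_minus5 is a hypothetical helper variable, not part of the dataset.
bysort PUBID_1997: egen byte any_minus5 = max(age_months == -5)

* Drop every row of each flagged participant, then remove the helper
drop if any_minus5 == 1
drop any_minus5

list, sepby(PUBID_1997)
```

Applied to the full sample posted above, the same three commands (bysort, drop if, drop) should remove participants 2, 3, and 7, since those are the IDs with at least one -5 in age_months. If other codes such as -4 should also disqualify a participant, the condition inside max() would need to be extended accordingly.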