Hi Statalist,
I am new here, but I have learned a lot from you so far, so thank you all.
I am using Stata IC 15.1 and the problem I am facing is as follows:
There are 17 variables, e.g. n_1_0 to n_3_2, and each value in these variables is either a number ranging from 1001 to 99999 or missing (.).
The coding I want is:
1044 in any variable for the cases
Missing (.) in all variables for the controls
Any other number (excluding 1044, 99999 and .) represents the other diseases group
The data look like this:
id  n_1_0  n_1_2  n_1_3  n_1_4  n_1_5
1   -      -      -      -      -
2   1022   1075   -      -      -
3   -      -      99999  -      -
4   -      1044   -      1044   -
5   1044   -      -      -      1006
6   -      -      1044   -      -
7   1010   -      -      1044   -
etc.
Now I have coded the cases just fine. The code is:
gen status = .
replace status = 1 if n_1_0==1044 | n_1_2==1044 | n_1_3==1044 | n_1_4==1044 | n_1_5==1044
(A record is a case any time 1044 is recorded, which is why I used | (OR).)
and I got 4,123 hits.
Similarly for the controls:
replace status = 2 if n_1_0==. & n_1_2==. & n_1_3==. & n_1_4==. & n_1_5==.
(It has to be missing in all variables to be a control, which is why I used & (AND).)
and I got 457,300.
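Since there are 17 variables, the two steps above could probably be written more compactly. Below is a minimal sketch on the toy rows from the table, typed in by hand with "-" entered as missing. The names anycase, nmiss, vars and k are made up for illustration, and the range n_1_0-n_1_5 assumes the variables sit next to each other in the dataset.

```stata
clear
input id n_1_0 n_1_2 n_1_3 n_1_4 n_1_5
1 .    .    .     .    .
2 1022 1075 .     .    .
3 .    .    99999 .    .
4 .    1044 .     1044 .
5 1044 .    .     .    1006
6 .    .    1044  .    .
7 1010 .    .     1044 .
end

* expand the variable range and count the variables (5 here, 17 in the real data)
unab vars : n_1_0-n_1_5
local k : word count `vars'

* 1 if any of the variables equals 1044, 0 otherwise
egen byte anycase = anymatch(`vars'), values(1044)

* number of missing values in each row
egen byte nmiss = rowmiss(`vars')

gen byte status = .
replace status = 1 if anycase == 1
* control only if every one of the k variables is missing
replace status = 2 if nmiss == `k'

tab status, missing
```

On these seven rows, ids 4 to 7 come out as cases and id 1 as a control. Ids 2 and 3 are left missing at this stage.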
Here the problem arises every time I try to code for other diseases. The code I used is:
replace status = 3 if n_1_0 >=1001 & n_1_0 <99999 & n_1_0 !=1044
and I repeat it for the other variables.
What happens after this command is that the number of cases drops from 4,123 to 3,745. I think the issue comes from records like id 4, where 1044 occurs twice, id 5, where 1044 comes first and a different number such as 1006 comes later, and id 7, where the order is the other way round.
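To make the problem reproducible, here is a small sketch on the toy rows from the table (typed in by hand, "-" entered as missing). It also shows one possible way around it, under the assumption that a record with 1044 anywhere should stay a case even when another disease code is also present: assign the other diseases group first and the cases last, so that status 1 is never overwritten. The names cases_before and cases_after are only for illustration.

```stata
clear
input id n_1_0 n_1_2 n_1_3 n_1_4 n_1_5
1 .    .    .     .    .
2 1022 1075 .     .    .
3 .    .    99999 .    .
4 .    1044 .     1044 .
5 1044 .    .     .    1006
6 .    .    1044  .    .
7 1010 .    .     1044 .
end

* --- original order: cases first, other diseases afterwards ---
gen status = .
foreach v of varlist n_1_0-n_1_5 {
    replace status = 1 if `v' == 1044
}
count if status == 1
local cases_before = r(N)

foreach v of varlist n_1_0-n_1_5 {
    * missing counts as larger than any number, so `v' < 99999 also excludes .
    replace status = 3 if `v' >= 1001 & `v' < 99999 & `v' != 1044
}
count if status == 1
local cases_after = r(N)
display "cases before: `cases_before'   cases after: `cases_after'"

* --- assumed fix: other diseases first, cases last ---
replace status = .
foreach v of varlist n_1_0-n_1_5 {
    replace status = 3 if `v' >= 1001 & `v' < 99999 & `v' != 1044
}
replace status = 2 if missing(n_1_0) & missing(n_1_2) & missing(n_1_3) ///
    & missing(n_1_4) & missing(n_1_5)
foreach v of varlist n_1_0-n_1_5 {
    replace status = 1 if `v' == 1044
}

list id status, clean
```

With the original order, ids 5 and 7 are switched from status 1 to status 3 because they also carry 1006 and 1010, while id 4 (only 1044) is not affected. With the reordered version, ids 4 to 7 all remain cases. Id 3, which only has 99999, is left missing in this sketch because it belongs to none of the three groups as described.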
I hope someone can help me with this problem and tell me the best way to solve it, as I am going to be dealing with much larger data sets like this.
Thanks!