I have a long-format dataset with data errors in the variables kg5, k68, and k912, like this:
clear
input byte (id year gr kg5 k68 k912)
1 1 0 0 0 0
1 2 1 0 0 0
1 3 2 0 0 0
1 4 3 0 0 0
1 5 4 0 0 0
1 6 5 0 0 0
1 7 6 0 0 0
1 8 7 0 0 0
1 9 8 0 0 0
1 10 9 0 0 0
1 11 . 0 0 0
1 12 9 0 0 0
1 13 10 0 0 0
2 1 0 0 0 0
2 2 1 0 0 0
2 3 2 0 0 0
2 4 3 0 0 0
2 5 4 0 0 0
2 6 5 0 0 0
2 7 6 0 0 0
2 8 7 0 0 0
2 9 8 0 0 0
2 10 9 0 0 0
2 11 10 0 0 0
2 12 . 0 0 0
2 13 9 0 0 0
3 1 0 0 0 0
3 2 . 0 0 0
3 3 . 0 0 0
3 4 . 0 0 0
3 5 . 0 0 0
3 6 . 0 0 0
3 7 . 0 0 0
3 8 . 0 0 0
3 9 . 0 0 0
3 10 9 0 0 0
3 11 . 0 0 0
3 12 . 0 0 0
3 13 9 0 0 0
4 1 0 0 0 0
4 2 . 0 0 0
4 3 . 0 0 0
4 4 . 0 0 0
4 5 . 0 0 0
4 6 . 0 0 0
4 7 . 0 0 0
4 8 . 0 0 0
4 9 8 0 0 0
4 10 . 0 0 0
4 11 10 0 0 0
4 12 9 0 0 0
4 13 10 0 0 0
5 1 0 0 0 0
5 2 1 0 0 0
5 3 2 0 0 0
5 4 3 0 0 0
5 5 4 0 0 0
5 6 . 0 0 0
5 7 4 0 0 0
6 1 0 0 0 0
6 2 1 0 0 0
6 3 2 0 0 0
6 4 3 0 0 0
6 5 4 0 0 0
6 6 5 0 0 0
6 7 6 0 0 0
6 8 . 0 0 0
6 9 . 0 0 0
6 10 . 0 0 0
6 11 6 0 0 0
end
As can be seen in the dataset, all values of the variables that start with "k" are zero. However, this is not entirely correct.
The correct rule is:
When a grade in "gr" is repeated within an id, the corresponding variable starting with "k" should equal 1.
For example, for the person with id == 1, gr repeats the value 9 in year 12, so k912 should be 1 (because the student was retained in the grade 9 to 12 range).
Similarly, for the person with id == 6, gr repeats the value 6 in year 11, so k68 should be 1 (because the student was retained in the grade 6 to 8 range).
How can I write Stata code to correct this data error?
Thank you for your help!
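One possible way to apply the rule described above is sketched here. It rests on several assumptions that are not stated in the post: a "repeat" is taken to mean that the same non-missing grade was already recorded in an earlier year for the same id; gr == 0 is treated as kindergarten, so kg5 covers grades 0 to 5; and the flag is set only on the row where the repeat is observed. The helper variable isrepeat is a hypothetical name introduced for illustration.

```stata
* Within each id and grade, order by year; any observation after the
* first means this grade was already recorded in an earlier year.
* Missing grades are never counted as repeats.
bysort id gr (year): gen byte isrepeat = _n > 1 & !missing(gr)

* Set the k-variable whose grade range contains the repeated grade.
* Assumption: gr == 0 is kindergarten and belongs to kg5.
replace kg5  = 1 if isrepeat & inrange(gr, 0, 5)
replace k68  = 1 if isrepeat & inrange(gr, 6, 8)
replace k912 = 1 if isrepeat & inrange(gr, 9, 12)

* Restore the original panel order and drop the helper.
sort id year
drop isrepeat
```

With the example data, this should set k912 to 1 for id 1 in year 12 and k68 to 1 for id 6 in year 11, matching the two cases in the question. Note that under this literal reading, a drop to a grade that was never recorded before (for example id 4 in year 12, where gr goes from 10 to 9) is not flagged. If that should also count as retention, the condition would need to compare gr with the highest grade seen so far instead.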