I have a long-format dataset with data errors in the variables kg5, k68, and k912, like this:
clear
input byte (id year gr kg5 k68 k912)
1 1 0 0 0 0
1 2 1 0 0 0
1 3 2 0 0 0
1 4 3 0 0 0
1 5 4 0 0 0
1 6 5 0 0 0
1 7 6 0 0 0
1 8 7 0 0 0
1 9 8 0 0 0
1 10 9 0 0 0
1 11 . 0 0 0
1 12 9 0 0 0
1 13 10 0 0 0
2 1 0 0 0 0
2 2 1 0 0 0
2 3 2 0 0 0
2 4 3 0 0 0
2 5 4 0 0 0
2 6 5 0 0 0
2 7 6 0 0 0
2 8 7 0 0 0
2 9 8 0 0 0
2 10 9 0 0 0
2 11 10 0 0 0
2 12 . 0 0 0
2 13 9 0 0 0
3 1 0 0 0 0
3 2 . 0 0 0
3 3 . 0 0 0
3 4 . 0 0 0
3 5 . 0 0 0
3 6 . 0 0 0
3 7 . 0 0 0
3 8 . 0 0 0
3 9 . 0 0 0
3 10 9 0 0 0
3 11 . 0 0 0
3 12 . 0 0 0
3 13 9 0 0 0
4 1 0 0 0 0
4 2 . 0 0 0
4 3 . 0 0 0
4 4 . 0 0 0
4 5 . 0 0 0
4 6 . 0 0 0
4 7 . 0 0 0
4 8 . 0 0 0
4 9 8 0 0 0
4 10 . 0 0 0
4 11 10 0 0 0
4 12 9 0 0 0
4 13 10 0 0 0
5 1 0 0 0 0
5 2 1 0 0 0
5 3 2 0 0 0
5 4 3 0 0 0
5 5 4 0 0 0
5 6 . 0 0 0
5 7 4 0 0 0
6 1 0 0 0 0
6 2 1 0 0 0
6 3 2 0 0 0
6 4 3 0 0 0
6 5 4 0 0 0
6 6 5 0 0 0
6 7 6 0 0 0
6 8 . 0 0 0
6 9 . 0 0 0
6 10 . 0 0 0
6 11 6 0 0 0
end
As the dataset shows, every variable whose name starts with "k" is zero in all observations. However, that is not entirely correct.
The correct rule is:
When a grade in the variable "gr" is repeated within an id, the corresponding variable starting with "k" should be equal to 1.
For example, for the person with id == 1, gr repeats the value 9 in year 12, so k912 should be 1 (because the student was retained within the range of grades 9 to 12).
Likewise, for the person with id == 6, gr repeats the value 6 in year 11, so k68 should be 1 (because the student was retained within the range of grades 6 to 8).
How can I use Stata code to correct these data errors?
Thank you for your help!
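One possible approach, given here only as a minimal sketch of the rule above and not as a confirmed answer: group the observations by id and gr, order each group by year, and treat every occurrence after the first as a repeated grade. The sketch rests on assumptions the post does not state outright: that gr values 0 to 5 belong to kg5, 6 to 8 to k68, and 9 to 12 to k912; that only the year in which the grade reappears should be set to 1; and that missing gr values are never counted as repeats. The helper variable `retained` is a hypothetical name introduced for illustration.

```stata
* Sketch only. Assumed mapping (not stated in the post):
*   gr 0-5 -> kg5, gr 6-8 -> k68, gr 9-12 -> k912.
* Within each id-gr group, ordered by year, any observation after the
* first is a repeat of that grade. Missing gr forms its own group, so it
* is excluded explicitly.
bysort id gr (year): gen byte retained = _n > 1 & !missing(gr)

* Set the flag for the grade band in which the repeat occurred.
replace kg5  = 1 if retained & inrange(gr, 0, 5)
replace k68  = 1 if retained & inrange(gr, 6, 8)
replace k912 = 1 if retained & inrange(gr, 9, 12)

* Restore the original panel order and inspect the flagged rows.
sort id year
list id year gr kg5 k68 k912 if retained, sepby(id)
```

Grouping by id and gr, instead of comparing gr with gr[_n-1], is what lets the repeat be found across the missing years: for id == 1 the value 9 appears in year 10 and again in year 12 with a missing value in between, and for id == 6 the value 6 appears in years 7 and 11. With the example data this flags k912 in year 12 for id == 1 and k68 in year 11 for id == 6, matching the two cases described. If every row of a retained student should carry the flag instead, one option is to follow with something like `bysort id (k912): replace k912 = k912[_N]` for each of the three variables.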