I have a long-format dataset with data errors in the variables kg5, k68, and k912, like this:
clear
input byte (id year gr kg5 k68 k912)
1 1 0 0 0 0
1 2 1 0 0 0
1 3 2 0 0 0
1 4 3 0 0 0
1 5 4 0 0 0
1 6 5 0 0 0
1 7 6 0 0 0
1 8 7 0 0 0
1 9 8 0 0 0
1 10 9 0 0 0
1 11 . 0 0 0
1 12 9 0 0 0
1 13 10 0 0 0
2 1 0 0 0 0
2 2 1 0 0 0
2 3 2 0 0 0
2 4 3 0 0 0
2 5 4 0 0 0
2 6 5 0 0 0
2 7 6 0 0 0
2 8 7 0 0 0
2 9 8 0 0 0
2 10 9 0 0 0
2 11 10 0 0 0
2 12 . 0 0 0
2 13 9 0 0 0
3 1 0 0 0 0
3 2 . 0 0 0
3 3 . 0 0 0
3 4 . 0 0 0
3 5 . 0 0 0
3 6 . 0 0 0
3 7 . 0 0 0
3 8 . 0 0 0
3 9 . 0 0 0
3 10 9 0 0 0
3 11 . 0 0 0
3 12 . 0 0 0
3 13 9 0 0 0
4 1 0 0 0 0
4 2 . 0 0 0
4 3 . 0 0 0
4 4 . 0 0 0
4 5 . 0 0 0
4 6 . 0 0 0
4 7 . 0 0 0
4 8 . 0 0 0
4 9 8 0 0 0
4 10 . 0 0 0
4 11 10 0 0 0
4 12 9 0 0 0
4 13 10 0 0 0
5 1 0 0 0 0
5 2 1 0 0 0
5 3 2 0 0 0
5 4 3 0 0 0
5 5 4 0 0 0
5 6 . 0 0 0
5 7 4 0 0 0
6 1 0 0 0 0
6 2 1 0 0 0
6 3 2 0 0 0
6 4 3 0 0 0
6 5 4 0 0 0
6 6 5 0 0 0
6 7 6 0 0 0
6 8 . 0 0 0
6 9 . 0 0 0
6 10 . 0 0 0
6 11 6 0 0 0
end
As the dataset shows, every variable whose name starts with "k" is zero in all rows. However, that is not entirely correct.
The correct rule is:
When a value of "gr" is repeated within an id, the corresponding "k" variable should equal 1.
For example, for the person with id==1, gr repeats the value 9 in year 12, so k912 should be 1 (because the student was retained within grades 9 to 12).
Likewise, for the person with id==6, gr repeats the value 6 in year 11, so k68 should be 1 (because the student was retained within grades 6 to 8).
How can I use Stata code to correct the data error?
Thank you for your help!
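One possible approach, as a minimal sketch rather than a definitive answer: flag every row whose non-missing grade already appeared in an earlier year for the same id, then switch on the indicator for the matching grade range. The sketch assumes the example data above are in memory. It also rests on two assumptions not stated in the question: that kg5 covers grades 0 to 5 (k68 and k912 follow the ranges given in the examples), and that each indicator should be 1 on every row of the affected id, not only on the row where the repeat occurs. The variable `repeated` is a helper name introduced here for illustration.

```stata
* Helper flag (hypothetical name): 1 when this non-missing grade already
* appeared in an earlier year for the same id. Missing gr is never a repeat.
bysort id gr (year): gen byte repeated = _n > 1 & !missing(gr)

* Switch on the indicator for the grade range of the repeated grade.
* Assumption: kg5 covers grades 0-5; k68 and k912 cover 6-8 and 9-12.
replace kg5  = 1 if repeated & inrange(gr, 0, 5)
replace k68  = 1 if repeated & inrange(gr, 6, 8)
replace k912 = 1 if repeated & inrange(gr, 9, 12)

* Assumption: the indicators are person-level, so copy the within-id
* maximum (the last value after sorting) to every row of that id.
bysort id (kg5):  replace kg5  = kg5[_N]
bysort id (k68):  replace k68  = k68[_N]
bysort id (k912): replace k912 = k912[_N]

drop repeated
sort id year
```

If the indicators should instead be 1 only in the year of the repeat, the three `bysort id (...)` lines can be dropped. Note that this rule is literal: any grade value seen earlier counts as a repeat, so id 4 (grade 10 in year 11, grade 9 in year 12, grade 10 again in year 13) is flagged in k912 as well.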