Hello,
I'm very new to Stata and am trying to clean a dataset with 5 variables and around 200 million observations. All of the variables are numeric, but three of them were originally categorical (string) variables, and I would like to check that they have been encoded correctly. For example, I would like to know whether each numeric code in the country variable corresponds to a distinct country (the original categories may contain typos, for instance).
The original string variables are not available. Stata displays the country names when I browse the data, but it treats the variable as numeric in the Data Editor. Is there any way to see which numeric code corresponds to which country name?
Thank you in advance for any help you might be able to give me!
Best wishes,
Clara