Hello,
I'm very new to Stata and am trying to complete some data cleaning. I have a dataset with 5 variables and around 200 million observations. The variables are all numeric, and I would like to check that three of them have been encoded correctly, as they were originally categorical (string) variables. For example, I would like to know whether each numeric code in the country variable corresponds to a distinct country (there may be typos in the original categories).
The original string variables are not available. However, Stata shows the country names when I browse the data (as for a categorical variable), while treating the variable as numeric in the data editor. Is there any way to check which numeric code corresponds to which displayed name?
Thank you in advance for any help you might be able to give me!
Best wishes,
Clara