I am working with two rounds of survey data that interview individuals across different states (variable v024) in India. I want to append the datasets, but there are a few issues with the encoded state names that I need to sort out first.
For example, in the data from 2015-16:
Code:
tab v024

                      state |      Freq.     Percent        Cum.
----------------------------+-----------------------------------
andaman and nicobar islands |      2,811        0.40        0.40
             andhra pradesh |     10,428        1.49        1.89
          arunachal pradesh |     14,294        2.04        3.94
                      assam |     28,447        4.07        8.00
                      bihar |     45,812        6.55       14.55
                 chandigarh |        746        0.11       14.65
               chhattisgarh |     25,172        3.60       18.25
--------------------------------------------------------------
In the data from 2005-06, however, label names and values change:
Code:
tab v024

                 state |      Freq.     Percent        Cum.
-----------------------+-----------------------------------
[jm] jammu and kashmir |      3,281        2.64        2.64
 [hp] himachal pradesh |      3,193        2.57        5.20
           [pj] punjab |      3,681        2.96        8.16
      [uc] uttaranchal |      2,953        2.37       10.54
          [hr] haryana |      2,790        2.24       12.78
            [dl] delhi |      3,349        2.69       15.47
        [rj] rajasthan |      3,892        3.13       18.60
    [up] uttar pradesh |     12,183        9.79       28.40
            [bh] bihar |      3,818        3.07       31.47
           [sk] sikkim |      2,127        1.71       33.18
[ar] arunachal pradesh |      1,647        1.32       34.50
         [na] nagaland |      3,896        3.13       37.63
          [mn] manipur |      4,512        3.63       41.26
          [mz] mizoram |      1,791        1.44       42.70
          [tr] tripura |      1,906        1.53       44.23
        [mg] meghalaya |      2,124        1.71       45.94
            [as] assam |      3,840        3.09       49.03
      [wb] west bengal |      6,794        5.46       54.49
        [jh] jharkhand |      2,983        2.40       56.89
           [or] orissa |      4,540        3.65       60.54
     [ch] chhattisgarh |      3,810        3.06       63.60
   [mp] madhya pradesh |      6,427        5.17       68.77
          [gj] gujarat |      3,729        3.00       71.77
      [mh] maharashtra |      9,034        7.26       79.03
   [ap] andhra pradesh |      7,128        5.73       84.76
        [ka] karnataka |      6,008        4.83       89.59
              [go] goa |      3,464        2.78       92.37
           [ke] kerala |      3,566        2.87       95.24
       [tn] tamil nadu |      5,919        4.76      100.00
-----------------------+-----------------------------------
                 Total |    124,385      100.00
To fix this, I thought I could generate a new variable called state in the 2005-06 data, replace its values and define labels to match 2015-16, create the same state variable in the 2015-16 data, and then append the two datasets.
Code:
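* state codes follow the 2015-16 file; the v024 codes are from the 2005-06 file
gen state = .
replace state = 2 if v024 == 28
replace state = 3 if v024 == 12
replace state = 4 if v024 == 18
* label define needs a label name (statelbl is an arbitrary choice) before the pairs
label define statelbl 2 "andhra pradesh" 3 "arunachal pradesh" 4 "assam"
label values state statelbl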
My question now is: given the rather large number of observations, how do I find the numeric value behind each label without scrolling through the data browser, i.e. 1 - andaman and nicobar islands, 2 - andhra pradesh, 3 - arunachal pradesh, etc.? Also, does the method above seem like the most efficient way to get a correct append?
Thanks a lot!
Best,
Lori
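For anyone landing here with the same problem, here is a minimal sketch of one possible route, not a definitive answer. It assumes the only systematic difference between the two rounds is the bracketed prefix in the 2005-06 labels. The toy data, the label names v024_05 and v024_15, and the variables state_name and survey_year are made up for illustration; only the codes 28, 12 and 18 and the label text come from the post. It first lists the values behind the labels, then matches states on their text instead of recoding by hand.

```stata
* Toy stand-in for the 2005-06 file (hypothetical data, codes from the post)
clear
input byte v024
28
12
18
end
label define v024_05 28 "[ap] andhra pradesh" 12 "[ar] arunachal pradesh" 18 "[as] assam"
label values v024 v024_05

* 1. See the number behind each label without opening the browser
local lbl : value label v024
label list `lbl'
tabulate v024, nolabel

* 2. Turn the labels into text and drop everything up to the closing bracket
decode v024, generate(state_name)
replace state_name = trim(substr(state_name, strpos(state_name, "]") + 1, .))
generate int survey_year = 2005
* keep the original codes under a round-specific name so they do not get mixed
rename v024 v024_2005
tempfile r2005
save `r2005'

* Toy stand-in for the 2015-16 file (hypothetical data)
clear
input byte v024
2
3
4
end
label define v024_15 2 "andhra pradesh" 3 "arunachal pradesh" 4 "assam"
label values v024 v024_15
decode v024, generate(state_name)
generate int survey_year = 2015
rename v024 v024_2015

* 3. Append, then build one consistently coded state variable from the text
append using `r2005'
encode state_name, generate(state)
tabulate state survey_year
```

Because encode numbers the names alphabetically over the combined data, both rounds end up with the same code for the same state. States that were renamed between rounds (orissa and uttaranchal appear in the 2005-06 list) would still need a manual replace on state_name before the encode step, using whatever spelling the 2015-16 file has.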