I need to conduct state level analysis using individual level data. In order to achieve this I need to create a state-level, continuous variable that contains the number of observations by state and year. I also need to calculate a growth variable at the state level between two time points (1970) and (1980). But, I am struggling to create a state level variable in an individual-level dataset.
I am using Stata 16.1 on a Mac OS Catalina version. 10.15.5
I am using IPUMS microdata provided by IPUMS, which are individual level data. IPUMS provides one percent samples of the U.S. Census which is collected every ten years. My current dataset includes the years 1940, 1960, 1970, 1980, 2000, and 2010. My research is about Black educators, as such my dataset is restricted to observations identified as “Black” as defined by the census year and respondents with the occ1950 occupation code for “teacher”, and I further restricted it to individuals aged 20 or above.
I have tried generating new variables for each state that contain the correct number of teachers with such code as:
Gen NY1970 = NY if year==1970.
Where NY is a dummy variable. All observations in the dataset are black teachers.
I have dummy variables for years and states, but this code seems to generate a case of the variable for each observation as opposed to a state level variable.
I also have failed to generate a growth variable with such code as:
gen growth = 100 * (teacher70 - teacher80/teacher70)
I used a dummy variable for “teacher (which is everyone in the dataset) and then created dummy variables “teacher70” and “teacher80” for teachers in appropriate years.
For instance,
gen teacher70 = teacher if Y1970==1
However, the growth variable counted every observation in the dataset as missing data. I think it didn’t work because the dummy variables canceled each other out.
I then tried to create continuous variables to measure teachers with the code:
gen teacher70 = 2455 if year==1970
gen teacher80 = 3913 if Y1980=1
But that also resulted in all observations counted as missing. How can I generate the necessary state level variables? Do I just have to generate those variables in a separate file and merge to two files? This is my first time trying to accomplish such a task and I'm unsure of the procedure. I appreciate any suggestions and guidance. I will provide any additional information that might be helpful or clarifying. Below is a sample of my data. Please let me know if I can provide a better data sample.
Thank you!
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input int year long(sample serial cbserial) int hhwt double cluster byte(region statefip metro pernum) int perwt byte(sex age race) int(raced occ1950) long incwage float(teacher teacher70 teacher80 teacher801 teacher701 growth) 1940 194001 1110 . 100 1940000011101 33 40 1 6 100 2 50 2 200 93 640 1 . . . . . 1940 194001 1316 . 100 1940000013161 32 28 1 1 100 1 60 2 200 93 190 1 . . . . . 1940 194001 2922 . 100 1940000029221 31 37 1 3 100 2 22 2 200 93 740 1 . . . . . 1940 194001 3751 . 100 1940000037511 31 13 1 2 100 2 38 2 200 93 500 1 . . . . . 1940 194001 4075 . 100 1940000040751 22 29 1 2 100 2 28 2 200 93 405 1 . . . . . 1940 194001 4430 . 100 1940000044301 31 12 1 2 100 2 34 2 200 93 560 1 . . . . . 1940 194001 5247 . 100 1940000052471 33 48 1 2 100 2 37 2 200 93 816 1 . . . . . 1940 194001 5266 . 100 1940000052661 31 45 1 1 100 1 55 2 200 93 385 1 . . . . . 1940 194001 5266 . 100 1940000052661 31 45 1 2 100 2 53 2 200 93 350 1 . . . . . 1940 194001 5266 . 100 1940000052661 31 45 1 3 100 1 24 2 200 93 110 1 . . . . . 1940 194001 5266 . 100 1940000052661 31 45 1 4 100 2 21 2 200 93 110 1 . . . . . 1940 194001 6115 . 100 1940000061151 32 1 3 5 100 2 27 2 200 93 315 1 . . . . . 1940 194001 6182 . 100 1940000061821 22 29 3 1 100 2 46 2 200 93 2000 1 . . . . . 1940 194001 7141 . 77 1940000071411 12 36 3 2 77 2 50 2 200 93 2000 1 . . . . . 1940 194001 7641 . 100 1940000076411 31 37 1 1 100 1 30 2 200 93 750 1 . . . . . 1940 194001 7808 . 100 1940000078081 31 13 1 3 100 2 20 2 200 93 385 1 . . . . . 1940 194001 8206 . 100 1940000082061 33 48 1 3 100 1 24 2 200 93 675 1 . . . . . 1940 194001 8921 . 100 1940000089211 31 13 1 1 100 2 32 2 200 93 280 1 . . . . . 1940 194001 9193 . 100 1940000091931 21 39 3 1 100 1 39 2 200 93 1280 1 . . . . . 1940 194001 10479 . 100 1940000104791 33 48 1 1 100 1 48 2 200 93 1320 1 . . . . . 1940 194001 11556 . 87 1940000115561 31 51 1 2 87 2 31 2 200 93 405 1 . . . . . 1940 194001 11847 . 53 1940000118471 32 28 1 3 53 2 23 2 200 93 120 1 . . . . . 1940 194001 12360 . 100 1940000123601 21 39 3 3 100 2 25 2 200 93 1200 1 . . . . . 1940 194001 12937 . 100 1940000129371 33 48 1 2 100 2 25 2 200 93 250 1 . . . . . 1940 194001 12938 . 100 1940000129381 33 5 1 2 100 2 26 2 200 93 390 1 . . . . . 1940 194001 13696 . 100 1940000136961 31 37 1 2 100 2 33 2 200 93 637 1 . . . . . 1940 194001 13697 . 100 1940000136971 31 37 1 1 100 2 38 2 200 93 680 1 . . . . . 1940 194001 13697 . 100 1940000136971 31 37 1 2 100 2 38 2 200 93 1008 1 . . . . . 1940 194001 13853 . 78 1940000138531 21 39 3 2 78 1 25 2 200 93 852 1 . . . . . 1940 194001 14975 . 100 1940000149751 31 45 4 3 100 2 23 2 200 93 365 1 . . . . . 1940 194001 14989 . 100 1940000149891 31 45 4 2 100 2 36 2 200 93 1200 1 . . . . . 1940 194001 15489 . 100 1940000154891 31 45 4 1 100 2 46 2 200 93 754 1 . . . . . 1940 194001 15489 . 100 1940000154891 31 45 4 2 100 2 28 2 200 93 450 1 . . . . . 1940 194001 15986 . 78 1940000159861 31 24 3 3 78 2 24 2 200 93 200 1 . . . . . 1940 194001 16297 . 100 1940000162971 33 48 3 1 100 2 22 2 200 93 1225 1 . . . . . 1940 194001 16926 . 100 1940000169261 21 39 3 1 100 1 40 2 200 93 0 1 . . . . . 1940 194001 17323 . 100 1940000173231 33 48 3 2 100 2 45 2 200 93 1400 1 . . . . . 1940 194001 18244 . 100 1940000182441 33 48 1 3 100 1 27 2 200 93 540 1 . . . . . 1940 194001 18881 . 100 1940000188811 31 24 3 2 100 2 39 2 200 93 1500 1 . . . . . 1940 194001 21284 . 100 1940000212841 31 37 1 3 100 2 29 2 200 93 800 1 . . . . . 1940 194001 21284 . 100 1940000212841 31 37 1 6 100 2 29 2 200 93 660 1 . . . . . 1940 194001 21284 . 100 1940000212841 31 37 1 7 100 2 24 2 200 93 660 1 . . . . . 1940 194001 21811 . 100 1940000218111 33 40 1 5 100 1 38 2 200 93 800 1 . . . . . 1940 194001 21811 . 100 1940000218111 33 40 1 6 100 2 33 2 200 93 900 1 . . . . . 1940 194001 21854 . 100 1940000218541 31 12 1 2 100 2 30 2 200 93 540 1 . . . . . 1940 194001 22108 . 100 1940000221081 31 51 1 2 100 2 32 2 200 93 464 1 . . . . . 1940 194001 23183 . 100 1940000231831 31 11 3 4 100 2 41 2 200 93 1512 1 . . . . . 1940 194001 23476 . 100 1940000234761 32 28 4 1 100 2 36 2 200 93 320 1 . . . . . 1940 194001 23582 . 100 1940000235821 32 1 1 1 100 1 32 2 200 93 875 1 . . . . . 1940 194001 25609 . 100 1940000256091 31 13 1 3 100 2 27 2 200 93 225 1 . . . . . 1940 194001 26085 . 100 1940000260851 31 12 1 2 100 2 43 2 200 93 600 1 . . . . . 1940 194001 27371 . 100 1940000273711 32 1 1 2 100 2 29 2 200 93 350 1 . . . . . 1940 194001 28031 . 100 1940000280311 21 26 3 2 100 2 36 2 200 93 2360 1 . . . . . 1940 194001 28211 . 100 1940000282111 31 11 3 3 100 2 30 2 200 93 1700 1 . . . . . 1940 194001 28369 . 100 1940000283691 32 21 1 2 100 2 46 2 200 93 720 1 . . . . . 1940 194001 29237 . 100 1940000292371 12 42 3 2 100 2 25 2 200 93 500 1 . . . . . 1940 194001 29779 . 100 1940000297791 21 18 3 1 100 1 56 2 200 93 3000 1 . . . . . 1940 194001 29779 . 100 1940000297791 21 18 3 2 100 2 39 2 200 93 1200 1 . . . . . 1940 194001 30337 . 100 1940000303371 33 40 1 6 100 2 24 2 200 93 900 1 . . . . . 1940 194001 30681 . 70 1940000306811 22 29 3 2 70 2 40 2 200 93 1300 1 . . . . . 1940 194001 30770 . 100 1940000307701 31 37 1 2 100 1 37 2 200 93 500 1 . . . . . 1940 194001 31141 . 100 1940000311411 31 13 1 4 100 2 63 2 200 93 190 1 . . . . . 1940 194001 31361 . 100 1940000313611 12 34 3 4 100 1 32 2 200 93 800 1 . . . . . 1940 194001 32392 . 100 1940000323921 33 48 1 1 100 2 48 2 200 93 720 1 . . . . . 1940 194001 33355 . 100 1940000333551 33 48 1 2 100 2 27 2 200 93 100 1 . . . . . 1940 194001 33638 . 100 1940000336381 12 42 3 1 100 2 35 2 200 93 2200 1 . . . . . 1940 194001 35486 . 88 1940000354861 31 45 1 3 88 2 30 2 200 93 400 1 . . . . . 1940 194001 35486 . 88 1940000354861 31 45 1 4 88 1 26 2 200 93 399 1 . . . . . 1940 194001 35486 . 88 1940000354861 31 45 1 8 88 2 20 2 200 93 400 1 . . . . . 1940 194001 35721 . 100 1940000357211 32 1 3 3 100 2 21 2 200 93 1125 1 . . . . . 1940 194001 36040 . 100 1940000360401 33 48 3 2 100 2 27 2 200 93 540 1 . . . . . 1940 194001 36040 . 100 1940000360401 33 48 3 4 100 2 25 2 200 93 540 1 . . . . . 1940 194001 36646 . 100 1940000366461 31 11 3 1 100 2 56 2 200 93 3000 1 . . . . . 1940 194001 37106 . 100 1940000371061 33 5 1 1 100 1 37 2 200 93 240 1 . . . . . 1940 194001 38383 . 100 1940000383831 31 12 3 1 100 2 25 2 200 93 600 1 . . . . . 1940 194001 38641 . 100 1940000386411 21 26 1 1 100 1 50 2 200 93 2000 1 . . . . . 1940 194001 39198 . 100 1940000391981 33 48 1 2 100 2 45 2 200 93 585 1 . . . . . 1940 194001 40822 . 100 1940000408221 32 28 1 2 100 2 35 2 200 93 150 1 . . . . . 1940 194001 40822 . 100 1940000408221 32 28 1 7 100 1 28 2 200 93 600 1 . . . . . 1940 194001 41482 . 100 1940000414821 33 48 1 3 100 1 27 2 200 93 1500 1 . . . . . 1940 194001 41482 . 100 1940000414821 33 48 1 4 100 2 23 2 200 93 1500 1 . . . . . 1940 194001 41524 . 100 1940000415241 31 24 3 3 100 2 40 2 200 93 2000 1 . . . . . 1940 194001 44711 . 100 1940000447111 31 37 1 1 100 1 36 2 200 93 1140 1 . . . . . 1940 194001 44711 . 100 1940000447111 31 37 1 2 100 2 34 2 200 93 600 1 . . . . . 1940 194001 44843 . 100 1940000448431 33 48 3 2 100 2 32 2 200 93 360 1 . . . . . 1940 194001 44952 . 100 1940000449521 33 48 1 1 100 2 51 2 200 93 118 1 . . . . . 1940 194001 45019 . 63 1940000450191 32 1 3 8 63 2 33 2 200 93 1400 1 . . . . . 1940 194001 45148 . 100 1940000451481 31 37 1 3 100 2 32 2 200 93 720 1 . . . . . 1940 194001 45987 . 100 1940000459871 33 22 1 1 100 1 35 2 200 93 400 1 . . . . . 1940 194001 45987 . 100 1940000459871 33 22 1 2 100 2 32 2 200 93 280 1 . . . . . 1940 194001 47140 . 100 1940000471401 31 37 1 2 100 2 28 2 200 93 0 1 . . . . . 1940 194001 48640 . 100 1940000486401 33 48 1 2 100 2 28 2 200 93 94 1 . . . . . 1940 194001 50986 . 100 1940000509861 33 48 1 2 100 2 23 2 200 93 540 1 . . . . . 1940 194001 51946 . 100 1940000519461 32 28 1 1 100 2 40 2 200 93 216 1 . . . . . 1940 194001 53157 . 100 1940000531571 33 22 1 6 100 2 26 2 200 93 0 1 . . . . . 1940 194001 53185 . 100 1940000531851 31 37 1 1 100 1 33 2 200 93 736 1 . . . . . 1940 194001 53834 . 100 1940000538341 33 40 1 1 100 1 56 2 200 93 900 1 . . . . . 1940 194001 54040 . 87 1940000540401 33 22 4 5 87 2 26 2 200 93 1152 1 . . . . . 1940 194001 54040 . 87 1940000540401 33 22 4 7 87 2 21 2 200 93 24 1 . . . . . 1940 194001 54097 . 100 1940000540971 33 48 3 1 100 1 45 2 200 93 1500 1 . . . . . end label values year YEAR label def YEAR 1940 "1940", modify label values sample SAMPLE label def SAMPLE 194001 "1940 1%", modify label values region REGION label def REGION 12 "middle atlantic division", modify label def REGION 21 "east north central div", modify label def REGION 22 "west north central div", modify label def REGION 31 "south atlantic division", modify label def REGION 32 "east south central div", modify label def REGION 33 "west south central div", modify label values statefip STATEFIP label def STATEFIP 1 "alabama", modify label def STATEFIP 5 "arkansas", modify label def STATEFIP 11 "district of columbia", modify label def STATEFIP 12 "florida", modify label def STATEFIP 13 "georgia", modify label def STATEFIP 18 "indiana", modify label def STATEFIP 21 "kentucky", modify label def STATEFIP 22 "louisiana", modify label def STATEFIP 24 "maryland", modify label def STATEFIP 26 "michigan", modify label def STATEFIP 28 "mississippi", modify label def STATEFIP 29 "missouri", modify label def STATEFIP 34 "new jersey", modify label def STATEFIP 36 "new york", modify label def STATEFIP 37 "north carolina", modify label def STATEFIP 39 "ohio", modify label def STATEFIP 40 "oklahoma", modify label def STATEFIP 42 "pennsylvania", modify label def STATEFIP 45 "south carolina", modify label def STATEFIP 48 "texas", modify label def STATEFIP 51 "virginia", modify label values metro METRO label def METRO 1 "not in metropolitan area", modify label def METRO 3 "in metropolitan area: not in central/principal city", modify label def METRO 4 "in metropolitan area: central/principal city status indeterminable (mixed)", modify label values sex SEX label def SEX 1 "male", modify label def SEX 2 "female", modify label values age AGE label def AGE 20 "20", modify label def AGE 21 "21", modify label def AGE 22 "22", modify label def AGE 23 "23", modify label def AGE 24 "24", modify label def AGE 25 "25", modify label def AGE 26 "26", modify label def AGE 27 "27", modify label def AGE 28 "28", modify label def AGE 29 "29", modify label def AGE 30 "30", modify label def AGE 31 "31", modify label def AGE 32 "32", modify label def AGE 33 "33", modify label def AGE 34 "34", modify label def AGE 35 "35", modify label def AGE 36 "36", modify label def AGE 37 "37", modify label def AGE 38 "38", modify label def AGE 39 "39", modify label def AGE 40 "40", modify label def AGE 41 "41", modify label def AGE 43 "43", modify label def AGE 45 "45", modify label def AGE 46 "46", modify label def AGE 48 "48", modify label def AGE 50 "50", modify label def AGE 51 "51", modify label def AGE 53 "53", modify label def AGE 55 "55", modify label def AGE 56 "56", modify label def AGE 60 "60", modify label def AGE 63 "63", modify label values race RACE label def RACE 2 "black/african american/negro", modify label values raced RACED label def RACED 200 "black/african american/negro", modify label values occ1950 OCC1950 label def OCC1950 93 "teachers (n.e.c.)", modify
Listed 100 out of 24737 observations
Use the count() option to list more
0 Response to Problem with generating state level variable in individual level dataset
Post a Comment