Hello,

I need to conduct state level analysis using individual level data. In order to achieve this I need to create a state-level, continuous variable that contains the number of observations by state and year. I also need to calculate a growth variable at the state level between two time points (1970) and (1980). But, I am struggling to create a state level variable in an individual-level dataset.

I am using Stata 16.1 on a Mac OS Catalina version. 10.15.5
I am using IPUMS microdata provided by IPUMS, which are individual level data. IPUMS provides one percent samples of the U.S. Census which is collected every ten years. My current dataset includes the years 1940, 1960, 1970, 1980, 2000, and 2010. My research is about Black educators, as such my dataset is restricted to observations identified as “Black” as defined by the census year and respondents with the occ1950 occupation code for “teacher”, and I further restricted it to individuals aged 20 or above.

I have tried generating new variables for each state that contain the correct number of teachers with such code as:

Gen NY1970 = NY if year==1970.

Where NY is a dummy variable. All observations in the dataset are black teachers.


I have dummy variables for years and states, but this code seems to generate a case of the variable for each observation as opposed to a state level variable.

I also have failed to generate a growth variable with such code as:

gen growth = 100 * (teacher70 - teacher80/teacher70)

I used a dummy variable for “teacher (which is everyone in the dataset) and then created dummy variables “teacher70” and “teacher80” for teachers in appropriate years.
For instance,
gen teacher70 = teacher if Y1970==1

However, the growth variable counted every observation in the dataset as missing data. I think it didn’t work because the dummy variables canceled each other out.

I then tried to create continuous variables to measure teachers with the code:
gen teacher70 = 2455 if year==1970
gen teacher80 = 3913 if Y1980=1

But that also resulted in all observations counted as missing. How can I generate the necessary state level variables? Do I just have to generate those variables in a separate file and merge to two files? This is my first time trying to accomplish such a task and I'm unsure of the procedure. I appreciate any suggestions and guidance. I will provide any additional information that might be helpful or clarifying. Below is a sample of my data. Please let me know if I can provide a better data sample.

Thank you!

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int year long(sample serial cbserial) int hhwt double cluster byte(region statefip metro pernum) int perwt byte(sex age race) int(raced occ1950) long incwage float(teacher teacher70 teacher80 teacher801 teacher701 growth)
1940 194001  1110 . 100 1940000011101 33 40 1 6 100 2 50 2 200 93  640 1 . . . . .
1940 194001  1316 . 100 1940000013161 32 28 1 1 100 1 60 2 200 93  190 1 . . . . .
1940 194001  2922 . 100 1940000029221 31 37 1 3 100 2 22 2 200 93  740 1 . . . . .
1940 194001  3751 . 100 1940000037511 31 13 1 2 100 2 38 2 200 93  500 1 . . . . .
1940 194001  4075 . 100 1940000040751 22 29 1 2 100 2 28 2 200 93  405 1 . . . . .
1940 194001  4430 . 100 1940000044301 31 12 1 2 100 2 34 2 200 93  560 1 . . . . .
1940 194001  5247 . 100 1940000052471 33 48 1 2 100 2 37 2 200 93  816 1 . . . . .
1940 194001  5266 . 100 1940000052661 31 45 1 1 100 1 55 2 200 93  385 1 . . . . .
1940 194001  5266 . 100 1940000052661 31 45 1 2 100 2 53 2 200 93  350 1 . . . . .
1940 194001  5266 . 100 1940000052661 31 45 1 3 100 1 24 2 200 93  110 1 . . . . .
1940 194001  5266 . 100 1940000052661 31 45 1 4 100 2 21 2 200 93  110 1 . . . . .
1940 194001  6115 . 100 1940000061151 32  1 3 5 100 2 27 2 200 93  315 1 . . . . .
1940 194001  6182 . 100 1940000061821 22 29 3 1 100 2 46 2 200 93 2000 1 . . . . .
1940 194001  7141 .  77 1940000071411 12 36 3 2  77 2 50 2 200 93 2000 1 . . . . .
1940 194001  7641 . 100 1940000076411 31 37 1 1 100 1 30 2 200 93  750 1 . . . . .
1940 194001  7808 . 100 1940000078081 31 13 1 3 100 2 20 2 200 93  385 1 . . . . .
1940 194001  8206 . 100 1940000082061 33 48 1 3 100 1 24 2 200 93  675 1 . . . . .
1940 194001  8921 . 100 1940000089211 31 13 1 1 100 2 32 2 200 93  280 1 . . . . .
1940 194001  9193 . 100 1940000091931 21 39 3 1 100 1 39 2 200 93 1280 1 . . . . .
1940 194001 10479 . 100 1940000104791 33 48 1 1 100 1 48 2 200 93 1320 1 . . . . .
1940 194001 11556 .  87 1940000115561 31 51 1 2  87 2 31 2 200 93  405 1 . . . . .
1940 194001 11847 .  53 1940000118471 32 28 1 3  53 2 23 2 200 93  120 1 . . . . .
1940 194001 12360 . 100 1940000123601 21 39 3 3 100 2 25 2 200 93 1200 1 . . . . .
1940 194001 12937 . 100 1940000129371 33 48 1 2 100 2 25 2 200 93  250 1 . . . . .
1940 194001 12938 . 100 1940000129381 33  5 1 2 100 2 26 2 200 93  390 1 . . . . .
1940 194001 13696 . 100 1940000136961 31 37 1 2 100 2 33 2 200 93  637 1 . . . . .
1940 194001 13697 . 100 1940000136971 31 37 1 1 100 2 38 2 200 93  680 1 . . . . .
1940 194001 13697 . 100 1940000136971 31 37 1 2 100 2 38 2 200 93 1008 1 . . . . .
1940 194001 13853 .  78 1940000138531 21 39 3 2  78 1 25 2 200 93  852 1 . . . . .
1940 194001 14975 . 100 1940000149751 31 45 4 3 100 2 23 2 200 93  365 1 . . . . .
1940 194001 14989 . 100 1940000149891 31 45 4 2 100 2 36 2 200 93 1200 1 . . . . .
1940 194001 15489 . 100 1940000154891 31 45 4 1 100 2 46 2 200 93  754 1 . . . . .
1940 194001 15489 . 100 1940000154891 31 45 4 2 100 2 28 2 200 93  450 1 . . . . .
1940 194001 15986 .  78 1940000159861 31 24 3 3  78 2 24 2 200 93  200 1 . . . . .
1940 194001 16297 . 100 1940000162971 33 48 3 1 100 2 22 2 200 93 1225 1 . . . . .
1940 194001 16926 . 100 1940000169261 21 39 3 1 100 1 40 2 200 93    0 1 . . . . .
1940 194001 17323 . 100 1940000173231 33 48 3 2 100 2 45 2 200 93 1400 1 . . . . .
1940 194001 18244 . 100 1940000182441 33 48 1 3 100 1 27 2 200 93  540 1 . . . . .
1940 194001 18881 . 100 1940000188811 31 24 3 2 100 2 39 2 200 93 1500 1 . . . . .
1940 194001 21284 . 100 1940000212841 31 37 1 3 100 2 29 2 200 93  800 1 . . . . .
1940 194001 21284 . 100 1940000212841 31 37 1 6 100 2 29 2 200 93  660 1 . . . . .
1940 194001 21284 . 100 1940000212841 31 37 1 7 100 2 24 2 200 93  660 1 . . . . .
1940 194001 21811 . 100 1940000218111 33 40 1 5 100 1 38 2 200 93  800 1 . . . . .
1940 194001 21811 . 100 1940000218111 33 40 1 6 100 2 33 2 200 93  900 1 . . . . .
1940 194001 21854 . 100 1940000218541 31 12 1 2 100 2 30 2 200 93  540 1 . . . . .
1940 194001 22108 . 100 1940000221081 31 51 1 2 100 2 32 2 200 93  464 1 . . . . .
1940 194001 23183 . 100 1940000231831 31 11 3 4 100 2 41 2 200 93 1512 1 . . . . .
1940 194001 23476 . 100 1940000234761 32 28 4 1 100 2 36 2 200 93  320 1 . . . . .
1940 194001 23582 . 100 1940000235821 32  1 1 1 100 1 32 2 200 93  875 1 . . . . .
1940 194001 25609 . 100 1940000256091 31 13 1 3 100 2 27 2 200 93  225 1 . . . . .
1940 194001 26085 . 100 1940000260851 31 12 1 2 100 2 43 2 200 93  600 1 . . . . .
1940 194001 27371 . 100 1940000273711 32  1 1 2 100 2 29 2 200 93  350 1 . . . . .
1940 194001 28031 . 100 1940000280311 21 26 3 2 100 2 36 2 200 93 2360 1 . . . . .
1940 194001 28211 . 100 1940000282111 31 11 3 3 100 2 30 2 200 93 1700 1 . . . . .
1940 194001 28369 . 100 1940000283691 32 21 1 2 100 2 46 2 200 93  720 1 . . . . .
1940 194001 29237 . 100 1940000292371 12 42 3 2 100 2 25 2 200 93  500 1 . . . . .
1940 194001 29779 . 100 1940000297791 21 18 3 1 100 1 56 2 200 93 3000 1 . . . . .
1940 194001 29779 . 100 1940000297791 21 18 3 2 100 2 39 2 200 93 1200 1 . . . . .
1940 194001 30337 . 100 1940000303371 33 40 1 6 100 2 24 2 200 93  900 1 . . . . .
1940 194001 30681 .  70 1940000306811 22 29 3 2  70 2 40 2 200 93 1300 1 . . . . .
1940 194001 30770 . 100 1940000307701 31 37 1 2 100 1 37 2 200 93  500 1 . . . . .
1940 194001 31141 . 100 1940000311411 31 13 1 4 100 2 63 2 200 93  190 1 . . . . .
1940 194001 31361 . 100 1940000313611 12 34 3 4 100 1 32 2 200 93  800 1 . . . . .
1940 194001 32392 . 100 1940000323921 33 48 1 1 100 2 48 2 200 93  720 1 . . . . .
1940 194001 33355 . 100 1940000333551 33 48 1 2 100 2 27 2 200 93  100 1 . . . . .
1940 194001 33638 . 100 1940000336381 12 42 3 1 100 2 35 2 200 93 2200 1 . . . . .
1940 194001 35486 .  88 1940000354861 31 45 1 3  88 2 30 2 200 93  400 1 . . . . .
1940 194001 35486 .  88 1940000354861 31 45 1 4  88 1 26 2 200 93  399 1 . . . . .
1940 194001 35486 .  88 1940000354861 31 45 1 8  88 2 20 2 200 93  400 1 . . . . .
1940 194001 35721 . 100 1940000357211 32  1 3 3 100 2 21 2 200 93 1125 1 . . . . .
1940 194001 36040 . 100 1940000360401 33 48 3 2 100 2 27 2 200 93  540 1 . . . . .
1940 194001 36040 . 100 1940000360401 33 48 3 4 100 2 25 2 200 93  540 1 . . . . .
1940 194001 36646 . 100 1940000366461 31 11 3 1 100 2 56 2 200 93 3000 1 . . . . .
1940 194001 37106 . 100 1940000371061 33  5 1 1 100 1 37 2 200 93  240 1 . . . . .
1940 194001 38383 . 100 1940000383831 31 12 3 1 100 2 25 2 200 93  600 1 . . . . .
1940 194001 38641 . 100 1940000386411 21 26 1 1 100 1 50 2 200 93 2000 1 . . . . .
1940 194001 39198 . 100 1940000391981 33 48 1 2 100 2 45 2 200 93  585 1 . . . . .
1940 194001 40822 . 100 1940000408221 32 28 1 2 100 2 35 2 200 93  150 1 . . . . .
1940 194001 40822 . 100 1940000408221 32 28 1 7 100 1 28 2 200 93  600 1 . . . . .
1940 194001 41482 . 100 1940000414821 33 48 1 3 100 1 27 2 200 93 1500 1 . . . . .
1940 194001 41482 . 100 1940000414821 33 48 1 4 100 2 23 2 200 93 1500 1 . . . . .
1940 194001 41524 . 100 1940000415241 31 24 3 3 100 2 40 2 200 93 2000 1 . . . . .
1940 194001 44711 . 100 1940000447111 31 37 1 1 100 1 36 2 200 93 1140 1 . . . . .
1940 194001 44711 . 100 1940000447111 31 37 1 2 100 2 34 2 200 93  600 1 . . . . .
1940 194001 44843 . 100 1940000448431 33 48 3 2 100 2 32 2 200 93  360 1 . . . . .
1940 194001 44952 . 100 1940000449521 33 48 1 1 100 2 51 2 200 93  118 1 . . . . .
1940 194001 45019 .  63 1940000450191 32  1 3 8  63 2 33 2 200 93 1400 1 . . . . .
1940 194001 45148 . 100 1940000451481 31 37 1 3 100 2 32 2 200 93  720 1 . . . . .
1940 194001 45987 . 100 1940000459871 33 22 1 1 100 1 35 2 200 93  400 1 . . . . .
1940 194001 45987 . 100 1940000459871 33 22 1 2 100 2 32 2 200 93  280 1 . . . . .
1940 194001 47140 . 100 1940000471401 31 37 1 2 100 2 28 2 200 93    0 1 . . . . .
1940 194001 48640 . 100 1940000486401 33 48 1 2 100 2 28 2 200 93   94 1 . . . . .
1940 194001 50986 . 100 1940000509861 33 48 1 2 100 2 23 2 200 93  540 1 . . . . .
1940 194001 51946 . 100 1940000519461 32 28 1 1 100 2 40 2 200 93  216 1 . . . . .
1940 194001 53157 . 100 1940000531571 33 22 1 6 100 2 26 2 200 93    0 1 . . . . .
1940 194001 53185 . 100 1940000531851 31 37 1 1 100 1 33 2 200 93  736 1 . . . . .
1940 194001 53834 . 100 1940000538341 33 40 1 1 100 1 56 2 200 93  900 1 . . . . .
1940 194001 54040 .  87 1940000540401 33 22 4 5  87 2 26 2 200 93 1152 1 . . . . .
1940 194001 54040 .  87 1940000540401 33 22 4 7  87 2 21 2 200 93   24 1 . . . . .
1940 194001 54097 . 100 1940000540971 33 48 3 1 100 1 45 2 200 93 1500 1 . . . . .
end
label values year YEAR
label def YEAR 1940 "1940", modify
label values sample SAMPLE
label def SAMPLE 194001 "1940 1%", modify
label values region REGION
label def REGION 12 "middle atlantic division", modify
label def REGION 21 "east north central div", modify
label def REGION 22 "west north central div", modify
label def REGION 31 "south atlantic division", modify
label def REGION 32 "east south central div", modify
label def REGION 33 "west south central div", modify
label values statefip STATEFIP
label def STATEFIP 1 "alabama", modify
label def STATEFIP 5 "arkansas", modify
label def STATEFIP 11 "district of columbia", modify
label def STATEFIP 12 "florida", modify
label def STATEFIP 13 "georgia", modify
label def STATEFIP 18 "indiana", modify
label def STATEFIP 21 "kentucky", modify
label def STATEFIP 22 "louisiana", modify
label def STATEFIP 24 "maryland", modify
label def STATEFIP 26 "michigan", modify
label def STATEFIP 28 "mississippi", modify
label def STATEFIP 29 "missouri", modify
label def STATEFIP 34 "new jersey", modify
label def STATEFIP 36 "new york", modify
label def STATEFIP 37 "north carolina", modify
label def STATEFIP 39 "ohio", modify
label def STATEFIP 40 "oklahoma", modify
label def STATEFIP 42 "pennsylvania", modify
label def STATEFIP 45 "south carolina", modify
label def STATEFIP 48 "texas", modify
label def STATEFIP 51 "virginia", modify
label values metro METRO
label def METRO 1 "not in metropolitan area", modify
label def METRO 3 "in metropolitan area: not in central/principal city", modify
label def METRO 4 "in metropolitan area: central/principal city status indeterminable (mixed)", modify
label values sex SEX
label def SEX 1 "male", modify
label def SEX 2 "female", modify
label values age AGE
label def AGE 20 "20", modify
label def AGE 21 "21", modify
label def AGE 22 "22", modify
label def AGE 23 "23", modify
label def AGE 24 "24", modify
label def AGE 25 "25", modify
label def AGE 26 "26", modify
label def AGE 27 "27", modify
label def AGE 28 "28", modify
label def AGE 29 "29", modify
label def AGE 30 "30", modify
label def AGE 31 "31", modify
label def AGE 32 "32", modify
label def AGE 33 "33", modify
label def AGE 34 "34", modify
label def AGE 35 "35", modify
label def AGE 36 "36", modify
label def AGE 37 "37", modify
label def AGE 38 "38", modify
label def AGE 39 "39", modify
label def AGE 40 "40", modify
label def AGE 41 "41", modify
label def AGE 43 "43", modify
label def AGE 45 "45", modify
label def AGE 46 "46", modify
label def AGE 48 "48", modify
label def AGE 50 "50", modify
label def AGE 51 "51", modify
label def AGE 53 "53", modify
label def AGE 55 "55", modify
label def AGE 56 "56", modify
label def AGE 60 "60", modify
label def AGE 63 "63", modify
label values race RACE
label def RACE 2 "black/african american/negro", modify
label values raced RACED
label def RACED 200 "black/african american/negro", modify
label values occ1950 OCC1950
label def OCC1950 93 "teachers (n.e.c.)", modify
------------------ copy up to and including the previous line ------------------

Listed 100 out of 24737 observations
Use the count() option to list more