Hello!

I am stuck not knowing how to create a variable I need based on my current data.

I have a long format dataset of many individuals, each with an observation for each year from 1936 - 2018. Each observation has the following variables:
  • Age (i.e. how old was the person in that given year?)
  • Kthru12 (i.e. was the person enrolled in a K-12 school in that given year?)
  • postKthru12degree (i.e. if someone was enrolled in higher education, what degree they were they working towards?)
  • postKthru12fulltime (i.e. if someone was enrolled in higher education, where they enrolled part-time or full-time?)

As an example, the data looks like below for just one person of many persons in my data (although the labels are text as seen below, the actual values are numeric):

HTML Code:
year age Kthru12 postKthru12degree postKthru12fulltime
1936
1937
1938
1939
1940
1941 0 Not in Kthru12 - -
1942 1 Not in Kthru12 - -
1943 2 Not in Kthru12 - -
1944 3 Not in Kthru12 - -
1945 4 Not in Kthru12 - -
1946 5 In a Kthru12 - -
1947 6 In a Kthru12 - -
1948 7 In a Kthru12 - -
1949 8 In a Kthru12 - -
1950 9 In a Kthru12 - -
1951 10 In a Kthru12 - -
1952 11 In a Kthru12 - -
1953 12 In a Kthru12 - -
1954 13 In a Kthru12 - -
1955 14 In a Kthru12 - -
1956 15 In a Kthru12 - -
1957 16 In a Kthru12 - -
1958 17 In a Kthru12 - -
1959 18 Not in Kthru12 BS Full-time
1960 19 Not in Kthru12 BS Full-time
1961 20 Not in Kthru12 BS Full-time
1962 21 Not in Kthru12 BS Full-time
1963 22 Not in Kthru12 >BS Full-time
1964 23 Not in Kthru12 >BS Full-time
1965 24 Not in Kthru12 >BS Full-time
1966 25 Not in Kthru12 - -
1967 26 Not in Kthru12 - -
1968 27 Not in Kthru12 - -
1969 28 Not in Kthru12 - -
1970 29 Not in Kthru12 - -
1971 30 Not in Kthru12 - -
1972 31 Not in Kthru12 - -
1973 32 Not in Kthru12 - -
1974 33 Not in Kthru12 - -
1975 34 Not in Kthru12 - -
1976 35 Not in Kthru12 - -
1977 36 Not in Kthru12 - -
1978 37 Not in Kthru12 - -
1979 38 Not in Kthru12 - -
1980 39 Not in Kthru12 - -
1981 40 Not in Kthru12 - -
1982 41 Not in Kthru12 - -
1983 42 Not in Kthru12 - -
1984 43 Not in Kthru12 - -
1985 44 Not in Kthru12 - -
1986 45 Not in Kthru12 - -
1987 46 Not in Kthru12 - -
1988 47 Not in Kthru12 - -
1989 48 Not in Kthru12 - -
1990 49 Not in Kthru12 - -
1991 50 Not in Kthru12 - -
1992 51 Not in Kthru12 - -
1993 52 Not in Kthru12 - -
1994 53 Not in Kthru12 - -
1995 54 Not in Kthru12 - -
1996 55 Not in Kthru12 - -
1997 56 Not in Kthru12 - -
1998 57 Not in Kthru12 - -
1999 58 Not in Kthru12 - -
2000 59 Not in Kthru12 - -
2001 60 Not in Kthru12 - -
2002 61 Not in Kthru12 - -
2003 62 Not in Kthru12 - -
2004 63 Not in Kthru12 - -
2005 64 Not in Kthru12 - -
2006 65 Not in Kthru12 - -
2007 66 Not in Kthru12 - -
2008 67 Not in Kthru12 - -
2009 68 Not in Kthru12 - -
2010 69 Not in Kthru12 - -
2011 70 Not in Kthru12 - -
2012 71 Not in Kthru12 - -
2013 72 Not in Kthru12 - -
2014 73 Not in Kthru12 - -
2015 74 Not in Kthru12 - -
2016 75 Not in Kthru12 - -
2017
2018
I am now stuck, not knowing how to create a final "education state" variable. I would like this new variable to describe the educational status of each person at year by using information from the various years of the person. For example, while the person is in a Kthru12 school, this variable will be equal "Enrolled in Kthru12". However, for the subsequent years after that, when the person is no longer actively enrolled in Kthru12, this states variable will either equal "finished high school" IF the person was enrolled in Kthru12 school for atleast 12 years prior to that year, or equal "finished less than high school" if the person was previously enrolled in a Kthru12 for less than 12 years. Then this value will be carried forward in the new States variable for each year until either the person dies, or they enroll in college and their education status changes again.

In summary, I have planned for this final "State" variable to have the following values for each year of the data:

0 == no previous education
1 == is currently enrolled in Kthru12 school
2 == has completed < High School
3 == has completed High School
4 == is currently enrolled in college fulltime
5 == is currently enrolled in college parttime
6 == has completed <4 years of college
7 == has obtained BS degree
8 == has obtained >BS degree

I am very much confused what would be the best approach to create this new states variable, since the value at each row will depend on the combination of values of the previous rows, specifically, if "enough" of the previous rows had a certain value to warrant a given value. Therefore a simple carryforward command may not suffice.

Can you please advise what would be the best approach to code this?

With much gratitude,
S