Hi all,

I am trying to create a variable that identifies whether a person's second-to-last work history entry was employment (as opposed to being unemployed or abroad). The problem is that I do not have a traditional panel data set: each person's work history entries are identified by different start dates over a period of 30 years. I want to identify each person's most recent and second most recent work history entries using the L. (lag) operator.

So far, I have sorted my data by worker id (fwid) and work history entry start date (start_date), and I have generated a variable that numbers each person's entries as a running sum of ones (called file_num) after sorting. My idea was to use the xtset command with the worker's id as the panel variable and file_num as the time variable. Then I was going to identify the max value of file_num and use the L. operator to identify the second-to-last entry for each person.

The weird thing is that when I run this exact same code from start to finish, my variable of interest (separated_to_employv2) winds up with different summary statistics every time, and I cannot figure out why. Any help resolving this would be greatly appreciated, as it is preventing me from replicating my regression results when I re-run the code.
use "C:\Users\Zach\Dropbox\H-2A\Generated Data Files\NAWS Workgrid with Main File Merged 1989-2018.dta", clear
rename *, lower
* Convert the string work-type variable to numeric codes
encode c06, gen(work_type)
* Indicator variables for each type of work history entry
gen abroad = work_type==1
gen farm_work = work_type==2
gen non_farm_work = work_type==3
gen non_employed = work_type==4
* Number each worker's entries in start-date order
gen file_num = .
gen ones = 1
sort fwid start_date
by fwid: replace file_num = sum(ones)
* Declare the panel so the L. operator refers to the previous entry
xtset fwid file_num
gen separated_to_employv2 = (l.farm_work==1 | l.non_farm_work==1) & l.end_date<start_date
sum separated_to_employv2, d
Here are the summary stats for the variable separated_to_employv2 from two separate runs of the code from start to finish. Note the difference in the means.
[Two images of summary statistics output, one per run; not preserved.]
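
One possible explanation, offered as a sketch rather than a confirmed diagnosis: if some workers have more than one entry with the same start_date, then sort fwid start_date leaves those ties in an arbitrary order that can change from run to run, so file_num (and every lag built on it) would change too. The code below assumes that is what is happening. It first checks for ties, then uses end_date as a tiebreaker together with the stable option of sort. The names last_entry and prev_employed are hypothetical and only for illustration.

```stata
* Assumption: fwid and start_date do not uniquely identify entries.
* This reports how many observations share the same fwid/start_date pair.
duplicates report fwid start_date

* Sketch of a reproducible ordering: end_date breaks ties, and the
* stable option keeps any remaining ties in their current order.
sort fwid start_date end_date, stable

* Entry number within worker, and a flag for the most recent entry.
by fwid: gen file_num = _n
by fwid: gen last_entry = (_n == _N)

* With the panel declared, L. on the last entry refers to the
* second-to-last entry for that worker.
xtset fwid file_num
gen prev_employed = (L.farm_work == 1 | L.non_farm_work == 1) if last_entry
```

If duplicates report shows no surplus observations, ties in the sort are not the cause and this sketch does not apply. If ties remain even after adding end_date, the stable option only gives the same result when the data are loaded in the same order each time, so a further identifying variable would be a safer tiebreaker.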