Code:
input worker_id firm_id start_date end_date
1 1 1 4
1 2 5 10
2 1 2 8
2 2 9 10
3 1 6 7
4 3 2 7
end
The following code illustrates something like what I'm after:
Code:
levelsof firm_id, local(firms)
tempfile worker_ds
save "`worker_ds'"

* empty dataset that accumulates the worker-coworker pairs
clear
gen worker_id = .
tempfile coworker_ds
save "`coworker_ds'"

foreach j of local firms {
    di ""
    di "results for firm `j'"
    di ""
    use "`worker_ds'", clear
    keep if firm_id == `j'
    * fix the row order so it matches the order after expand below
    sort worker_id start_date
    count
    local obs = r(N)
    gen total_coworkers = .

    * identify set of coworkers for each worker in firm j
    forvalues i = 1/`obs' {
        local coworkers`i' ""
        local n_coworkers = 0
        forvalues ii = 1/`obs' {
            if start_date[`ii'] < end_date[`i'] & end_date[`ii'] > start_date[`i'] & `i' != `ii' {
                local nextworker = worker_id[`ii']
                local coworkers`i' `coworkers`i'' `nextworker'
                local n_coworkers = `n_coworkers' + 1
            }
        }
        replace total_coworkers = `n_coworkers' if _n == `i'
    }

    * create a worker-coworker level dataset
    * (expand keeps a single row for a worker with no coworkers)
    expand total_coworkers, generate(expand_obs)
    gen coworker_id = .
    sort worker_id start_date
    local n = 1

    * fill in values for coworker_id variable
    forvalues i = 1/`obs' {
        di "`i''s coworkers are `coworkers`i''"
        foreach id of local coworkers`i' {
            replace coworker_id = `id' if _n == `n'
            local n = `n' + 1
        }
        * skip the single row kept for a worker without coworkers
        if "`coworkers`i''" == "" local n = `n' + 1
    }
    append using "`coworker_ds'"
    save "`coworker_ds'", replace
}
I realise this is a broad, somewhat vague question, but it would already be great if anyone has suggestions for commands or features of Stata's syntax I could exploit to radically speed this up and make it more robust, or could point me towards references that might have suggestions for how to go about solving this. Even improving individual steps would be helpful.
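One command that may replace the nested loops is joinby, which forms all pairwise combinations of observations within groups. The following is only a sketch under assumptions, not a tested solution for the real data: it uses the example data above, the names coworker_id, co_start and co_end are made up for illustration, and the overlap test is copied from the loop version.

```stata
clear
input worker_id firm_id start_date end_date
1 1 1 4
1 2 5 10
2 1 2 8
2 2 9 10
3 1 6 7
4 3 2 7
end

* save a renamed copy of the spells to act as the "coworker" side
* (coworker_id, co_start, co_end are illustrative names)
preserve
rename (worker_id start_date end_date) (coworker_id co_start co_end)
tempfile coworkers
save "`coworkers'"
restore

* form every pair of spells within the same firm
joinby firm_id using "`coworkers'"

* keep pairs of different workers whose spells overlap
* (same strict inequalities as in the loop version)
keep if worker_id != coworker_id & co_start < end_date & co_end > start_date

sort firm_id worker_id coworker_id
list, sepby(firm_id worker_id)
```

On the example data this leaves six worker-coworker rows (four in firm 1, two in firm 2); worker 4 drops out because there are no coworkers in firm 3. The intermediate dataset grows with the square of the number of spells per firm, so for very large firms it may be necessary to process firms in batches.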
There are many things I would like to subsequently do with these data. Two important ones are calculating the number of coworkers with certain characteristics a worker has during a given spell, and seeing whether a worker has former coworkers in a firm in a subsequent spell with a different firm. I am using Stata 14.1.
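Both follow-up tasks might be approached from a worker-coworker pair dataset with by-group operations. The sketch below is an assumption-laden illustration rather than a definitive method: co_trait is a hypothetical stand-in for a real coworker characteristic (here, simply whether the coworker joined the firm earlier), the variable names are invented, and the former_coworker flag assumes each worker has at most one spell per firm, so that an earlier overlap must have happened at a different firm.

```stata
clear
input worker_id firm_id start_date end_date
1 1 1 4
1 2 5 10
2 1 2 8
2 2 9 10
3 1 6 7
4 3 2 7
end

* build the worker-coworker pairs (illustrative variable names)
preserve
rename (worker_id start_date end_date) (coworker_id co_start co_end)
tempfile coworkers
save "`coworkers'"
restore
joinby firm_id using "`coworkers'"
keep if worker_id != coworker_id & co_start < end_date & co_end > start_date

* hypothetical stand-in characteristic: coworker joined the firm earlier
gen byte co_trait = co_start < start_date

* per spell: number of coworkers, and number with the characteristic
bysort worker_id firm_id start_date: gen n_coworkers = _N
egen n_trait = total(co_trait), by(worker_id firm_id start_date)

* flag pairs that had already overlapped in an earlier spell
bysort worker_id coworker_id (start_date): gen byte former_coworker = _n > 1

list worker_id firm_id coworker_id n_coworkers n_trait former_coworker, sepby(worker_id)
```

In the example data, workers 1 and 2 overlap in firm 1 and again in firm 2, so both of their firm 2 rows are flagged as former coworkers.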