I am having difficulty identifying the values of one variable that satisfy a condition; I explain further below. The following is a simple data sample that I am working with.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(id race female chain hire_period leave_period)
250531 . . 1 668   .
393424 7 0 1   . 668
354409 4 1 1   . 668
156446 7 0 1   . 668
 92826 7 0 1   . 668
332179 7 0 1   . 668
325948 4 1 1   . 668
204382 4 0 1   . 668
142105 7 0 1   . 668
 15572 7 1 1   . 668
  4940 7 1 1   . 668
418791 4 0 1 669   .
 41531 4 0 1 669   .
250794 7 0 1 669   .
379612 7 1 1 669   .
250907 4 1 1 669   .
382359 7 0 1 669   .
250531 . . 1   . 669
381092 7 0 1   . 669
101358 7 1 1   . 669
  4363 7 1 1   . 669
148209 3 1 2 648 648
 64140 3 0 2 648   .
148341 3 0 2 648   .
184754 3 0 2   . 648
 56461 7 0 2   . 648
 58953 2 1 2   . 648
395174 4 1 2   . 648
 90147 3 0 2   . 648
277274 3 1 2 653   .
 65539 2 0 2 653   .
end
label values race xethn
label def xethn 2 "ASIAN", modify
label def xethn 3 "BLACK", modify
label def xethn 4 "HISPA", modify
label def xethn 7 "WHITE", modify
label values female lbl_female
label def lbl_female 0 "Male", modify
label def lbl_female 1 "Female", modify
label var female "==1 if female"
id = person’s ID
race = a categorical variable indicating a person’s race
female = a dummy variable indicating a person’s sex
chain = the ID of the store to which a person belongs (the data sample contains two values, chain = 1 or 2)
hire_period = the period in which the person was hired by a particular store
leave_period = the period in which the person left a particular store
I am particularly interested in the last two variables, i.e. hire_period and leave_period. Within each store, I would like to identify any values of hire_period that are equal to a given leave_period value or less than it by at most 2. For example, take the observations in the sample data with leave_period equal to 648 (chain 2): I would like to know which values of hire_period fall within 2 periods below 648, that is, whether anyone was hired in periods 648, 647 or 646 in that particular store (given that there are individuals in the sample who were actually hired in those periods). If so, I would like to generate a variable that tells me which values of hire_period correspond to leave_period = 648.
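To make the condition concrete, here is a minimal sketch of one way it could be expressed, not a tested solution for the full dataset. It assumes the condition is 0 <= leave_period - hire_period <= 2 within the same chain, and it uses a small toy dataset condensed from the sample above. The tempfile names full and hires are illustrative only.

```stata
* Toy data condensed from the sample: one row per person
clear
input float(chain hire_period leave_period)
1 668   .
1   . 668
1 669   .
1   . 669
2 648 648
2   . 648
2 653   .
end

tempfile full hires
save `full'

* Distinct (chain, hire_period) pairs
keep chain hire_period
drop if missing(hire_period)
duplicates drop
save `hires'

* Distinct (chain, leave_period) pairs
use `full', clear
keep chain leave_period
drop if missing(leave_period)
duplicates drop

* Pair every leave period with every hire period of the same chain
joinby chain using `hires'

* Keep hire periods that are 0, 1 or 2 periods before the leave period
keep if inrange(leave_period - hire_period, 0, 2)
sort chain leave_period hire_period
list, sepby(chain leave_period)
```

With this toy data the result should contain four rows: in chain 1, leave_period 668 pairs with hire_period 668, and leave_period 669 pairs with 668 and 669; in chain 2, leave_period 648 pairs with hire_period 648 only. The original data can be reloaded afterwards with use `full', clear.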
I have tried the following loop over all levels of hire_period (648(1)669 covers all levels of hire_period across all stores). However, this doesn't quite work, because different stores have different levels of hire_period, and I want store-specific hire_period values that are less than each of the leave_period values by at most 2.
Code:
forvalues i = 648(1)669 {
    bysort study_id: gen v_`i' = 1 if v_month >= `i' & v_month < `i'+3 & v_month != .
}
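The loop above refers to study_id and v_month, which do not appear in the sample data, and it uses one global range of periods for every store. A store-specific variant might look like the sketch below. It assumes chain identifies the store and loops only over the leave_period levels that actually occur within each chain. The variable name n_hired_window is hypothetical, and the toy data is a reduced version of the sample.

```stata
* Toy data reduced from the sample: one row per person
clear
input float(chain hire_period leave_period)
1 668   .
1   . 668
1 669   .
1 669   .
1   . 669
2 648 648
2 648   .
2   . 648
2 653   .
end

* n_hired_window (hypothetical name): for each leaver, the number of people
* hired in the same chain in the leave period or the two periods before it
gen n_hired_window = .

levelsof chain, local(chains)
foreach c of local chains {
    * Only the leave periods observed in this chain
    levelsof leave_period if chain == `c', local(leaves)
    foreach l of local leaves {
        count if chain == `c' & inrange(hire_period, `l' - 2, `l')
        replace n_hired_window = r(N) if chain == `c' & leave_period == `l'
    }
}

list, sepby(chain)
```

Rows with a missing leave_period keep a missing n_hired_window, since levelsof skips missing values and inrange() returns 0 for a missing hire_period. A value of 0 would mean that nobody in that store was hired within the window.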