Identify All Values in One Variable that Satisfy a Condition

Hi Statalist members,

I am having difficulty identifying the values of one variable that satisfy a condition; I explain in more detail below. The following is a small sample of the data I am working with.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(id race female chain hire_period leave_period)
250531 . . 1 668   .
393424 7 0 1   . 668
354409 4 1 1   . 668
156446 7 0 1   . 668
 92826 7 0 1   . 668
332179 7 0 1   . 668
325948 4 1 1   . 668
204382 4 0 1   . 668
142105 7 0 1   . 668
 15572 7 1 1   . 668
  4940 7 1 1   . 668
418791 4 0 1 669   .
 41531 4 0 1 669   .
250794 7 0 1 669   .
379612 7 1 1 669   .
250907 4 1 1 669   .
382359 7 0 1 669   .
250531 . . 1   . 669
381092 7 0 1   . 669
101358 7 1 1   . 669
  4363 7 1 1   . 669
148209 3 1 2 648 648
 64140 3 0 2 648   .
148341 3 0 2 648   .
184754 3 0 2   . 648
 56461 7 0 2   . 648
 58953 2 1 2   . 648
395174 4 1 2   . 648
 90147 3 0 2   . 648
277274 3 1 2 653   .
 65539 2 0 2 653   .
end
label values race xethn
label def xethn 2 "ASIAN", modify
label def xethn 3 "BLACK", modify
label def xethn 4 "HISPA", modify
label def xethn 7 "WHITE", modify
label values female lbl_female
label def lbl_female 0 "Male", modify
label def lbl_female 1 "Female", modify
label var female "==1 if female"

It consists of 6 variables:
id = the person’s ID
race = a categorical variable indicating the person’s race
female = a dummy variable indicating the person’s sex (1 if female)
chain = the ID of the store the person belongs to (the data sample contains two values, chain = 1 or 2)
hire_period = the period in which the person was hired by the store
leave_period = the period in which the person left the store

I am particularly interested in the last two variables, hire_period and leave_period. Within each store, I would like to identify every value of hire_period that is less than or equal to a given leave_period value by at most 2. For example, for the observations in chain 2 whose leave_period is 648, I would like to know whether anyone in that store was hired in periods 646, 647, or 648 (provided those values actually occur in hire_period for that store). If so, I would like to generate a variable that tells me which values of hire_period correspond to leave_period 648.

I have tried the following code, which loops over all levels of hire_period (648(1)669 covers every level of hire_period across all stores). However, this doesn’t quite work, because different stores have different levels of hire_period, and I want the store-specific hire_period values that are less than each leave_period value by at most 2.

Code:

forvalues i = 648(1)669 {
    bysort study_id: gen v_`i' = 1 if v_month >= `i' & v_month < `i'+3 & v_month != .
}
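One possible direction, offered only as a minimal sketch: instead of looping over every period, pair each distinct leave_period with the distinct hire_period values of the same store and keep the pairs that are close enough. The sketch uses the variable names from the -dataex- sample above (chain, hire_period, leave_period) rather than study_id and v_month, and it assumes that "less than by at most 2" means leave_period minus hire_period is 0, 1, or 2. The tempfile name hires is just an illustrative choice.

```stata
* Run from a do-file, after loading the -dataex- sample above.

* Step 1: distinct (chain, hire_period) pairs, saved to a temporary file
preserve
keep chain hire_period
drop if missing(hire_period)
duplicates drop
tempfile hires
save `hires'
restore

* Step 2: distinct (chain, leave_period) pairs, each combined with every
* hire_period observed in the same chain
preserve
keep chain leave_period
drop if missing(leave_period)
duplicates drop
joinby chain using `hires'

* Step 3: keep only pairs where the hire happened 0, 1, or 2 periods
* before the leave (assumed reading of the condition)
keep if inrange(leave_period - hire_period, 0, 2)

sort chain leave_period hire_period
list, sepby(chain leave_period)
restore
```

Because joinby forms the combinations within chain only, the comparison is store specific by construction. On the sample data this should leave four pairs: in chain 1, leave_period 668 with hire_period 668, and leave_period 669 with hire_period 668 and 669; in chain 2, leave_period 648 with hire_period 648. If the result is needed alongside the original observations, the listed pairs could be saved before the final restore and merged back in.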

Huge thanks to anyone who can help! Since I am relatively new to the forum, I apologize in advance if I haven’t made myself clear or haven’t followed the exact protocols.
