I have a large data set. Here is a mock example:
Code:
set obs 1000
set seed 1234
gen id = _n
gen sex = runiformint(0,1)
gen age = runiformint(18, 99)
gen group = runiformint(0, 10)
replace group = 1 if group>1
gen case_id=id if group==0
order id case_id group sex age
Cases are group == 0, and reference individuals are group == 1.
Now, I would like to create a subset of the data in which every case is matched to individuals from the reference population.
For each case I would like to match 5 reference individuals who have the same sex as the case and an age within ±2 years of the case's age.
The new dataset should include the following variables:
id, case_id (so that it would be possible to know what case each reference individual is matched to), sex, age and group
The matching should be as random as possible. If 1:5 matching is not possible for a case, there should be some sort of marker to identify the cases that had fewer than 5 reference individuals.
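To make the request concrete, the logic described above could be sketched roughly as follows. This is an untested sketch, not a definitive solution. It assumes the mock data above are in memory, that matching is done without replacement (each reference individual is used at most once), and that cases are processed in id order, so earlier cases get first pick. The names u, elig, n_matched and incomplete are made up for illustration.

```stata
* Random number used to order eligible references at random
gen double u = runiform()
gen n_matched = .

quietly levelsof id if group == 0, local(cases)
foreach c of local cases {
    * Look up the sex and age of the current case
    quietly summarize sex if id == `c', meanonly
    local s = r(mean)
    quietly summarize age if id == `c', meanonly
    local a = r(mean)

    * Eligible: still-unmatched reference, same sex, age within 2 years
    gen byte elig = group == 1 & missing(case_id) & sex == `s' & abs(age - `a') <= 2

    * Eligible observations first, in random order among themselves
    gsort -elig u
    quietly count if elig
    local k = min(r(N), 5)

    * Assign up to 5 references to this case
    if `k' > 0 {
        quietly replace case_id = `c' in 1/`k'
    }
    quietly replace n_matched = `k' if id == `c'
    drop elig
}

* Keep cases and their matched references only
keep if !missing(case_id)

* Copy the number of matches from the case row to its references
bysort case_id (group): replace n_matched = n_matched[1]

* Marker for cases (and their references) with fewer than 5 matches
gen byte incomplete = n_matched < 5

keep id case_id group sex age n_matched incomplete
sort case_id group id
```

The loop sorts the whole dataset once per case, which is fine for the mock example but would probably be slow on a large dataset.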
Thank you.
Lars