matching 'cases' to a 'reference group'

Dear Listers

I have a large data set. Here is a mock example:

Code:

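clear
set obs 1000
set seed 1234

gen id = _n

* sex coded 0/1, age in whole years from 18 to 99
gen sex = runiformint(0,1)
gen age = runiformint(18, 99)

* group: 0 = case (about 1 in 11), 1 = reference individual
gen group = runiformint(0, 10)
replace group = 1 if group>1

* case_id is set for cases only; it is missing for reference individuals
gen case_id=id if group==0
order id case_id group sex age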

As you can see, there are many more individuals in the reference population than in the case population (roughly 10 to 1).
Cases are group == 0, and reference individuals are group == 1.

Now I would like to create a subset of the data in which every case is matched to individuals in the reference population.
For each case I would like to match 5 reference individuals who have the same sex as the case and an age within ±2 years of the case's age.

The new dataset should include the following variables:
id, case_id (so that it is possible to see which case each reference individual is matched to), sex, age and group.

The matching should be as random as possible. If 1:5 matching is not possible for a case, there should be some sort of flag identifying the cases that received fewer than 5 reference individuals.
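
To make the request concrete, here is a rough sketch of one possible approach using only official Stata commands. It is a starting point under stated assumptions, not a definitive solution. It assumes the mock data above are in memory and that a reference individual may be matched to more than one case (sampling with replacement), which the description above does not settle. The seed 4321, the tempfile names, and the variables ref_id, ref_age, u, n_ref and incomplete are made up for illustration.

```stata
* Sketch only. Assumes the mock data are in memory (id, case_id, group, sex, age)
* and that references may be reused across cases (with replacement).
set seed 4321                         // hypothetical seed for the random draw
tempfile full refs matched
save `full'

* 1. Reference pool, renamed so the variables survive the join
keep if group == 1
keep id sex age
rename (id age) (ref_id ref_age)
save `refs'

* 2. Pair every case with every reference individual of the same sex
use `full', clear
keep if group == 0
keep case_id sex age
joinby sex using `refs'

* 3. Keep pairs within +/- 2 years, then draw up to 5 at random per case
keep if abs(age - ref_age) <= 2
sort case_id ref_id                   // fixed order before drawing random numbers
gen double u = runiform()
bysort case_id (u): keep if _n <= 5

* 4. Put the matched reference rows into the requested layout
keep case_id ref_id sex ref_age
rename (ref_id ref_age) (id age)
gen byte group = 1
save `matched'

* 5. Stack the cases on top and flag cases with fewer than 5 references
use `full', clear
keep if group == 0
append using `matched'
bysort case_id: egen n_ref = total(group == 1)
gen byte incomplete = n_ref < 5
sort case_id group id
order id case_id group sex age n_ref incomplete
```

In this sketch a case with no eligible reference individual still appears in the result, with n_ref equal to 0 and incomplete equal to 1. Matching without replacement would need an extra step, for example processing the cases in random order and removing each selected reference individual from the pool before the next case is handled.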

Thank you.

Lars

BJ Data Tech Solution
