Dear Listers

I have a large data set. Here is a mock example:

Code:
clear
set obs 1000
set seed 1234

gen id = _n
gen sex = runiformint(0, 1)
gen age = runiformint(18, 99)

* group: 0 = case (about 1 in 11), 1 = reference
gen group = runiformint(0, 10)
replace group = 1 if group > 1

* case_id is filled in for cases only
gen case_id = id if group == 0
order id case_id group sex age
As you can see, there are many more individuals in the reference population than in the case population.
Cases are group == 0, and reference individuals are group == 1

Now, I would like to create a subset of the data in which every case is matched to individuals from the reference population.
Each case should be matched to 5 reference individuals of the same sex and a similar age (±2 years).

The new dataset should include the following variables:
id, case_id (so that it is possible to tell which case each reference individual is matched to), sex, age and group

The matching should be as random as possible. If 1:5 matching is not possible for a case, there should be some sort of flag identifying the cases that got fewer than 5 reference individuals.
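
To make the request concrete, here is a minimal sketch of one way such a match could be set up, assuming the mock data above is in memory. It pairs every case with all same-sex reference individuals via joinby, keeps pairs within the ±2 year window, and draws up to 5 per case in random order. It assumes a reference individual may be matched to more than one case (matching with replacement), which may or may not be acceptable. The names n_matched and fewer_than_5 and the seed 4321 are made up for illustration.

```stata
* Sketch only: assumes the mock data above is in memory
set seed 4321
tempfile cases refs

* Reference pool, renamed so it can sit next to the case variables
preserve
keep if group == 1
keep id sex age
rename (id age) (ref_id ref_age)
save `refs'
restore

* Cases, saved for appending back later
keep if group == 0
save `cases'

* All same-sex case/reference pairs, restricted to the age window
keep case_id sex age
joinby sex using `refs'
keep if abs(age - ref_age) <= 2

* Up to 5 references per case, picked in random order
* (assumption: the same reference may serve several cases)
gen double u = runiform()
bysort case_id (u): keep if _n <= 5

* Put the reference rows into the requested layout
keep case_id sex ref_id ref_age
rename (ref_id ref_age) (id age)
gen group = 1

* Add the cases back and flag those with fewer than 5 references
append using `cases'
bysort case_id: gen n_matched = _N - 1
gen byte fewer_than_5 = n_matched < 5
order id case_id group sex age
sort case_id group id
```

In this sketch a case with no eligible reference individuals still appears in the result, with n_matched equal to 0. Matching without replacement would need an extra step that removes each reference individual from the pool once it has been used.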

Thank you.

Lars