Hi,
I'm new to Stata and I'm having difficulty implementing a pairwise matching. Here is a simplified overview of my dataset:
ID_SCHOOL   ID_STRATUM   Age of students   Age of teachers
1           1            11                40
2           1            12                36
3           1            12                51
4           2            11                28
5           2            13                32
…           …            …                 …
1260        58           12                44
Within each stratum, I have to match each school with its nearest neighbour, that is, the other school in the same stratum that minimizes the sum of absolute differences over 20 variables (including age of students and age of teachers). For example, in stratum 1, the nearest neighbour of school 1 is school 2, because it minimizes the sum of the distances for age of students and age of teachers (|11 - 12| + |40 - 36| = 5 for school 2, versus |11 - 12| + |40 - 51| = 12 for school 3). Since two schools may have the same nearest neighbour, I have to minimize the total distance (the sum of the distances over all pairs) within each stratum in order to match all schools.
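To make the target concrete, here is a minimal sketch of the "minimize the total distance" step, written in Python rather than Stata purely as an illustration. It uses the three stratum 1 schools from the table with only the two variables shown. The names `distance` and `best_pairing` are made up for this sketch. It assumes that when a stratum has an odd number of schools, one school is left unmatched. The brute-force search is only workable for very small strata.

```python
# Stratum 1 from the simplified table: school id -> (age of students, age of teachers)
stratum_1 = {1: (11, 40), 2: (12, 36), 3: (12, 51)}


def distance(a, b):
    """Sum of absolute differences over all matching variables."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_pairing(schools):
    """Return (total distance, list of pairs) with the smallest total distance.

    Brute force: tries every way of pairing the schools. With an odd number
    of schools, every choice of the unmatched school is tried (assumption).
    """
    ids = sorted(schools)
    if len(ids) < 2:
        return 0, []
    if len(ids) % 2 == 1:
        candidates = []
        for left_out in ids:
            rest = {k: v for k, v in schools.items() if k != left_out}
            candidates.append(best_pairing(rest))
        return min(candidates, key=lambda result: result[0])
    first = ids[0]
    best = None
    for partner in ids[1:]:
        rest = {k: v for k, v in schools.items() if k not in (first, partner)}
        sub_total, sub_pairs = best_pairing(rest)
        total = distance(schools[first], schools[partner]) + sub_total
        if best is None or total < best[0]:
            best = (total, [(first, partner)] + sub_pairs)
    return best


total, pairs = best_pairing(stratum_1)
print(total, pairs)
```

With these three schools, the only pair kept is (1, 2), at a distance of 5, and school 3 stays unmatched. Strata of realistic size would need a proper minimum-weight matching algorithm instead of enumerating every pairing.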
None of the matching commands that I checked (calipmatch, gmatch, ccmatch, or teffects with options) allow this type of matching, since I don't have a treatment group and a control group and I don't want to estimate a treatment effect. I just want to pair schools within each stratum. The problem is that I don't know how to calculate the distance between each pair of observations in each stratum (I have 1260 observations distributed across 58 strata). I created a macro that holds my 20 matching variables, and I thought I would use foreach to calculate the distance for each of the 20 variables within each stratum (bysort ID_STRATUM:). Then I have to sum the 20 distances for each pair of schools in each stratum. Finally, I have to find the nearest neighbour of each school and minimize the total distance in each stratum.
Does anyone have suggestions on how to calculate the distance for each pair of observations in each stratum?
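The steps described above (split by stratum, sum the absolute differences for every pair of schools, then pick each school's nearest neighbour) can be sketched as follows. This is Python, not Stata, and is only meant to pin down the logic. It uses the five rows shown in the table with two variables instead of 20, and the function names `pairwise_distances` and `nearest_neighbours` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# (ID_SCHOOL, ID_STRATUM, age of students, age of teachers) from the simplified table
rows = [
    (1, 1, 11, 40),
    (2, 1, 12, 36),
    (3, 1, 12, 51),
    (4, 2, 11, 28),
    (5, 2, 13, 32),
]


def pairwise_distances(rows):
    """Map (stratum, school_a, school_b) to the sum of absolute differences."""
    by_stratum = defaultdict(list)
    for school, stratum, *values in rows:
        by_stratum[stratum].append((school, values))
    dist = {}
    for stratum, schools in by_stratum.items():
        # Every unordered pair of schools inside the same stratum
        for (id_a, val_a), (id_b, val_b) in combinations(schools, 2):
            dist[(stratum, id_a, id_b)] = sum(
                abs(x - y) for x, y in zip(val_a, val_b)
            )
    return dist


def nearest_neighbours(dist):
    """Map each school to (nearest school in its stratum, distance)."""
    best = {}
    for (_stratum, a, b), d in dist.items():
        for school, other in ((a, b), (b, a)):
            if school not in best or d < best[school][1]:
                best[school] = (other, d)
    return best


dist = pairwise_distances(rows)
print(dist)
print(nearest_neighbours(dist))
```

On these rows, school 1 is the nearest neighbour of both school 2 and school 3, which is exactly the conflict that makes minimizing the total distance per stratum necessary.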
Thank you so much in advance,
Paul