Hello everyone,

After hours of debugging my code, I think I have found a bug in the command nearmrg that appears only when the data contain more than one value of group_id.
I hope some of you can help by pointing out what I am doing incorrectly, so that this turns out to be my mistake rather than a bug.

Here is the code to replicate the bug:

Code:
clear
input int group_id long personal_id int tomatch
96   11702  0
96   45103  0
96   50003  1
96   62303  1
96   95003  1
96   97303  2
96  125403  2
96  126604  3
96  145404  4
96  172602  4
96  177304  5
96  217002  6
96  228603  6
96  229803  7
96  236302  7
96  236303  7
96  248503  8
96  267003  8
96  289303  8
96  295902  9
96  297403  9
96  316902 10
96  328703 10
96  328704 10
96  329003 11
96  329004 11
96  329504 11
96  348703 12
96  349304 13
96  349503 14
96  354803 15
96  365003 15
96  365004 15
96  383804 16
96  395002 17
96  425203 18
96  440403 18
96  441603 18
96  451503 19
96  486404 20
96  488404 20
96  496604 20
96  504303 20
96  504804 21
96  509403 21
96  547303 22
96  562903 23
96  566604 23
96  573904 23
96  585303 23
96  587304 24
96  588504 26
96  596102 26
96  619903 27
96  663205 28
96  676605 28
96  683403 29
96  698203 29
96  704502 30
96  710404 30
96  719003 31
96  730904 32
96  746702 32
96  788903 33
96  810003 34
96  817404 34
96  824203 35
96  831802 36
96  843304 36
96  873503 37
96  878103 37
96  878104 38
96  922603 38
96  927802 39
96  928804 40
96  956903 40
96  958002 42
96  958003 43
96  972303 44
96  977702 45
96  988902 46
96  991303 46
96  994305 47
96 1006003 48
96 1039904 48
96 1049503 48
96 1080304 49
96 1112704 49
96 1118703 50
96 1118705 50
96 1131504 50
96 1132904 51
96 1153902 51
96 1159303 51
96 1161904 52
96 1170903 52
96 1170904 52
96 1186602 53
96 1188404 54
96 1204004 54
96 1208603 54
96 1215903 55
96 1223603 55
96 1264703 55
96 1269602 56
96 1270704 56
96 1280903 57
96 1284904 57
96 1310403 58
96 1314804 58
96 1322902 58
96 1326903 59
96 1337603 59
96 1338103 59
96 1358404 60
96 1374503 60
96 1378203 61
96 1391604 61
96 1408603 62
96 1412503 63
96 1425803 63
96 1456803 64
96 1485403 65
96 1493903 65
96 1517104 66
96 1527503 66
96 1530303 67
96 1537003 67
96 1582502 68
96 1593102 68
96 1595505 68
96 1665603 69
96 1674602 69
96 1692703 69
96 1783704 70
96 1792603 71
96 1805503 72
96 1823203 72
96 1830103 72
96 1884203 73
96 1890104 74
96 1896604 74
96 1900502 75
96 1920302 75
96 1933802 76
96 1943503 76
96 1963603 77
96 1967304 77
96 1967404 78
96 1971603 78
96 2012003 78
96 2012004 79
96 2018904 79
96 2022403 79
96 2032703 80
96 2043803 80
96 2049204 80
96 2066604 81
96 2074803 81
96 2097203 82
96 2111603 83
96 2129403 83
96 2147803 85
96 2154803 85
96 2214004 86
96 2243403 87
96 2257603 87
96 2281303 88
96 2283602 88
96 2285702 89
96 2289702 89
96 2301304 91
96 2319803 91
96 2334003 92
96 2342202 92
96 2349404 92
96 2375003 92
96 2376303 93
96 2390704 93
96 2408404 93
96 2415802 93
96 2420803 94
96 2424905 94
96 2448002 95
96 2453904 96
96 2462603 96
96 2478602 97
96 2481202 97
96 2483102 97
96 2519302 97
96 2626703 98
96 2655003 98
96 2669202 99
96 2683203 99
98  103403 14
98  246303 22
98  613804 29
98  640802 32
98  985202 43
98 1169103 57
98 1207802 64
98 1328702 68
98 1362303 70
98 1551005 72
98 1880902 74
98 2031002 75
98 2193403 76
98 2450802 82
98 2700503 99
end

tempfile data1
save `data1'

clear
input int(group_id tomatch)
96 14
96 19
96 32
96 39
96 51
96 86
96 96
96 99
98 99
end

tempfile data2 
save `data2'

use `data1', clear

nearmrg group_id using `data2', near(tomatch)  genmatch(tomatch_data2) type(m:1)

order group_id tomatch tomatch_data2
sort group_id tomatch

If you scroll down the "tomatch" variable, you can see that in some instances it is clearly matched with the wrong value from data2. Another issue is that there seems to be some source of randomness: the mismatches are not always the same, and if you run the code multiple times you end up with a different mismatch every time!
This is an example of what I mean by a mismatch: tomatch_data2 should be 39 when tomatch is 39 (39 is present in data2 for group 96), but oddly enough it gets 32.
group_id tomatch tomatch_data2
96 38 39
96 39 32
96 40 39
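
For comparison, the matches I expect can be computed by brute force. This is only a minimal sketch, under my own assumptions: joinby is affordable for data of this size, ties in distance are broken toward the smaller value (nearmrg may break ties differently), and personal_id is unique within group_id. The names cand, tomatch_check and dist are introduced here just for illustration.

```stata
* Run in the same do-file as the code above, so that `data1' and `data2' still exist
use `data2', clear
rename tomatch tomatch_check
tempfile cand
save `cand'

use `data1', clear
* Form every pairing of a data1 row with the data2 rows of the same group_id
joinby group_id using `cand'
gen int dist = abs(tomatch - tomatch_check)
* Keep the closest candidate per row (assumed tie-break: smaller tomatch_check)
bysort group_id personal_id (dist tomatch_check): keep if _n == 1
drop dist

sort group_id tomatch
list group_id tomatch tomatch_check if group_id == 96 & inrange(tomatch, 38, 40)
```

By this rule the rows with tomatch equal to 38, 39 and 40 in group 96 should all be paired with 39.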

Obviously this is causing me a lot of pain. Can someone point out what I am doing wrong, or suggest another command for performing a nearest match between two datasets?

Any help, of course, would be greatly appreciated.
All the best,
D.