Hi everyone! I'm currently running Stata 16.

I'm trying to do nearest-neighbor matching with kmatch using the md subcommand. I have used the code many times with different matching variables and different versions of my data, and it has worked fine. Now, however, I'm trying to match on a single control variable that has a good amount of repetition in the data (there are about 100k unique values of the variable I'm matching on, and about 3 million observations).

Here is the line of code:

Code:
kmatch md treatment single_control_variable, nn(1) idgenerate(match) ematch(state date)
With this particular matching variable, the command inevitably runs out of memory. After a good amount of poking, I am 99% sure this is because of the way the idgenerate() option works in kmatch. When a control observation cannot be matched to a single treatment observation because of a tie in distance, kmatch creates a new variable holding the ID for each match. The code has hit my upper variable limit (5,000 variables) several times because of this.
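
A rough way to gauge how much tying there is would be something like the following. This is only a sketch under some assumptions: treatment is coded 0/1, n_same_value is a made-up variable name, and the count only covers controls that share exactly the same value within a state-date cell, so it misses ties at an equal nonzero distance.

Code:
* Sketch only: assumes treatment is coded 0/1; n_same_value is a hypothetical name
* Count controls sharing the same matching value within each exact-match cell
bysort state date single_control_variable: egen n_same_value = total(treatment == 0)
* Distribution of that count among treated observations
summarize n_same_value if treatment == 1, detail

A maximum in the thousands would be consistent with the variable limit being hit.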

Does anyone know of a way to stop the program from producing more than one ID variable in cases of a tie? All of the information would still be recoverable, since the many tied treatment observations would still point to the control observation.

Alternatively, is there a way to recover which observation is matched to which without using the idgenerate() option?

Thank you in advance!