I am trying to create a balanced sample of paired observations in Stata from a treated group and a control group. Each firm-year observation in the treated group should be matched with exactly one firm-year observation in the control group. The match on industry and year must be exact. Within the same industry and year, the match on assets should be the closest available value (so perhaps a nearest-neighbor match?).
How should I go about creating this sample without replacement? I have many more control firms than treated firms. The dataset looks something like this:
Firm | Year | Treatment | Industry | Assets |
1 | 2020 | 0 | 1 | 140 |
2 | 2019 | 0 | 2 | 50 |
3 | 2019 | 1 | 2 | 100 |
4 | 2020 | 1 | 1 | 150 |
5 | 2020 | 0 | 1 | 200 |
6 | 2019 | 0 | 2 | 90 |
7 | 2018 | 0 | 2 | 25 |
8 | 2020 | 0 | 2 | 300 |
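In this example, I would expect Firm #3 to match with Firm #6 (same industry and year, assets of 100 vs. 90) and Firm #4 to match with Firm #1 (same industry and year, assets of 150 vs. 140), giving me a sample of 4 observations (2 treated and 2 control).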
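To make the rule I have in mind concrete, here is a minimal sketch in plain Python rather than Stata. It only illustrates the logic and is not a Stata solution. The function name `match_without_replacement` and the tuple layout are my own assumptions for illustration. It is a simple greedy pass: each treated firm-year takes the closest unused control in its industry-year cell, so the result can depend on the order of the treated observations and is not guaranteed to minimize the total asset distance.

```python
# Toy data from the table above: (firm, year, treatment, industry, assets).
rows = [
    (1, 2020, 0, 1, 140),
    (2, 2019, 0, 2, 50),
    (3, 2019, 1, 2, 100),
    (4, 2020, 1, 1, 150),
    (5, 2020, 0, 1, 200),
    (6, 2019, 0, 2, 90),
    (7, 2018, 0, 2, 25),
    (8, 2020, 0, 2, 300),
]


def match_without_replacement(rows):
    """Greedy 1:1 match: exact on (industry, year), nearest on assets.

    Hypothetical helper for illustration only. Returns a list of
    (treated_firm, control_firm) pairs.
    """
    treated = [r for r in rows if r[2] == 1]
    controls = [r for r in rows if r[2] == 0]
    used = set()  # control firm-years that are already matched
    pairs = []
    for t in treated:
        # Eligible controls: same industry, same year, not yet used.
        candidates = [
            c for c in controls
            if c[3] == t[3] and c[1] == t[1] and (c[0], c[1]) not in used
        ]
        if not candidates:
            continue  # no control left in this industry-year cell
        # Nearest neighbor on assets within the cell.
        best = min(candidates, key=lambda c: abs(c[4] - t[4]))
        used.add((best[0], best[1]))
        pairs.append((t[0], best[0]))
    return pairs


pairs = match_without_replacement(rows)
print(pairs)  # [(3, 6), (4, 1)]
```

On the sample data this reproduces the pairs I expect: Firm #3 with Firm #6 and Firm #4 with Firm #1.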
Thank you in advance for the help.