Hello!

In my dataset, I'm looking to model the effect of a particular treatment on the number of patents a company applies for. Because the DV (patents) is a count variable and shows signs of overdispersion, a negative binomial regression seems to be the way to go. However, I'd also like to use coarsened exact matching (CEM) on certain variables (SIC code, state, age) to generate a control group and then perform a difference-in-differences (diff-in-diff) analysis.

I have managed to find one paper that combines all three; however, the authors do not go into detail about exactly how they do it.

Does anyone have any advice or hints on how to go about this?

I have also attached an example of my dataset:

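dyadid is the ID
sic, ventureage and venturestate are the variables for CEM
treatment and post are the treatment-group and post-period indicators for the diff-in-diff
patents is the DV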

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int dyadid str4 sic byte ventureage str14 venturestate byte treatment float post byte patents
1 "2836" 5 "California" 1 0 0
1 "2836" 5 "California" 1 0 0
1 "2836" 5 "California" 1 0 0
1 "2836" 5 "California" 1 0 0
1 "2836" 5 "California" 1 0 0
1 "2836" 5 "California" 1 0 0
1 "2836" 5 "California" 1 1 0
1 "2836" 5 "California" 1 1 0
1 "2836" 5 "California" 1 1 0
1 "2836" 5 "California" 1 1 0
1 "2836" 5 "California" 1 1 0
1 "2836" 5 "California" 1 1 0
1 "2836" 5 "California" 1 1 1
1 "2836" 5 "California" 1 1 0
1 "2836" 5 "California" 1 1 1
1 "2836" 5 "California" 1 1 0
1 "2836" 5 "California" 1 1 0
1 "2836" 5 "California" 1 1 0
1 "2836" 5 "California" 1 1 0
1 "2836" 5 "California" 1 1 0
1 "2836" 5 "California" 1 1 0
2 "2836" 4 "California" 0 0 0
2 "2836" 4 "California" 0 0 0
2 "2836" 4 "California" 0 0 0
2 "2836" 4 "California" 0 0 0
2 "2836" 4 "California" 0 0 0
2 "2836" 4 "California" 0 1 0
2 "2836" 4 "California" 0 1 0
2 "2836" 4 "California" 0 1 0
2 "2836" 4 "California" 0 1 0
2 "2836" 4 "California" 0 1 0
2 "2836" 4 "California" 0 1 0
2 "2836" 4 "California" 0 1 0
2 "2836" 4 "California" 0 1 1
2 "2836" 4 "California" 0 1 0
2 "2836" 4 "California" 0 1 1
2 "2836" 4 "California" 0 1 0
2 "2836" 4 "California" 0 1 0
2 "2836" 4 "California" 0 1 0
2 "2836" 4 "California" 0 1 0
2 "2836" 4 "California" 0 1 0
2 "2836" 4 "California" 0 1 0
end
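
To make the question more concrete, here is a rough sketch of the workflow I have in mind. It assumes the user-written -cem- command (ssc install cem) and the variable names from the example above. The new names sic_n, state_n and matched are just placeholders, and choices such as exact matching on SIC and state, automatic coarsening of age, and clustering on dyadid are my own assumptions, not necessarily what the paper does. It is meant for the full dataset, since the two-firm excerpt above is too small to produce a match.

```stata
* Sketch only: assumes -cem- is installed (ssc install cem)

* -cem- works on numeric variables, so encode the string covariates first
encode sic, generate(sic_n)
encode venturestate, generate(state_n)

* Match once per dyadid (the matching variables are constant within dyadid here)
preserve
bysort dyadid: keep if _n == 1
* (#0) requests no coarsening, i.e. exact matching on that variable;
* ventureage is left to -cem-'s automatic binning
cem sic_n (#0) state_n (#0) ventureage, treatment(treatment)
keep dyadid cem_strata cem_matched cem_weights
tempfile matched
save `matched'
restore

* Bring the strata, match flag and weights back to the full panel
merge m:1 dyadid using `matched', nogenerate

* Negative binomial diff-in-diff on the matched sample, weighted by the CEM weights;
* the coefficient on 1.treatment#1.post is the diff-in-diff term on the log scale
nbreg patents i.treatment##i.post if cem_matched == 1 [pweight = cem_weights], vce(cluster dyadid)
```

Because the model is nonlinear, I understand the interaction coefficient is a multiplicative effect (a ratio of incidence-rate ratios) rather than a difference in counts, so I am unsure whether it is the right quantity to report.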

Thanks in advance!