Dear all,

I use difference-in-differences estimation to analyze the effect of a legislative policy with staggered adoption on self-employment in US counties. The sample covers the years 2000 through 2016. Some counties remain untreated for the entire period.

Basic code: xtreg self_employment treatment i.year, fe vce(cluster county_id)

First results indicate a significant negative effect on self-employment. However, I want to make sure that this finding is not spurious (e.g., driven by serial correlation or by chance). As a robustness check, I would like to run a random implementation test/falsification check in which the treatment indicator (1 = treated; 0 = untreated) is randomly assigned within the county-year panel. The effect of the randomly assigned treatment indicator should then be estimated with the same difference-in-differences model, and the whole procedure should be replicated 1,000 times.

Unfortunately, as I am new to Stata and to difference-in-differences, I have not been able to implement a random implementation test/falsification check with the specifications described above. It would be a great help if you could provide me with a solution or code for such a test!
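
A possible starting point, offered only as an unverified sketch, might be Stata's built-in permute command, which reshuffles one variable and re-runs a command a given number of times. The sketch assumes the variables are named county_id (numeric), year, self_employment and treatment; self_employment is an assumed name, since a Stata variable name cannot contain a hyphen. The seed and the output file name placebo_draws are arbitrary placeholders. Note that shuffling individual county-year observations breaks the staggered structure in which a county stays treated once treated, so this may not be the most appropriate form of the test.

```stata
* Assumed variable names: county_id (numeric), year, self_employment, treatment
* Run from a do-file: /// is a line continuation that only works there
xtset county_id year

* Observed estimate, for comparison with the placebo distribution
xtreg self_employment treatment i.year, fe vce(cluster county_id)

* Shuffle the treatment indicator across all county-year observations
* 1000 times and re-estimate the same model on each shuffled sample.
* The seed and the file name are arbitrary choices.
* Adding strata(county_id) would instead shuffle only within each county.
permute treatment b_placebo = _b[treatment], reps(1000) seed(12345) ///
    saving(placebo_draws, replace): ///
    xtreg self_employment treatment i.year, fe vce(cluster county_id)
```

permute reports how often the placebo coefficient is at least as extreme as the observed one, and the saved file placebo_draws.dta holds the 1,000 placebo coefficients in the variable b_placebo for plotting.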

The data are structured as follows (illustrative values):

COUNTY-ID YEAR SELF-EMPLOYMENT TREATMENT
01001 2000 7500 0
01001 2001 7500 0
01001 2002 7500 0
01001 2003 7700 0
01001 2004 7700 0
01001 2005 7700 0
01001 2006 7200 0
01001 2007 7100 0
01001 2008 7100 0
01001 2009 7200 0
01001 2010 7300 0
01001 2011 7700 1
01001 2012 7800 1
01001 2013 8200 1
01001 2014 8200 1
01001 2015 8400 1
01001 2016 8200 1
01003 2000 2400 0
01003 2001 2300 0
01003 2002 2200 0
01003 2003 2300 0
01003 2004 2400 0
01003 2005 2300 0
01003 2006 2400 0
01003 2007 2400 0
01003 2008 2500 1
01003 2009 2500 1
01003 2010 2400 1
01003 2011 2600 1
01003 2012 2600 1
01003 2013 2600 1
01003 2014 2700 1
01003 2015 2800 1
01003 2016 2800 1

Thank you!