Hi,

I have a dataset of compliance actions and penalties for Clean Air Act violations by individual plants, taken from an EPA compliance database. For each violation I have the start and end dates and the overall penalty for the entire period of violation. I want to construct a variable that takes a value of 1 for plant i if another plant j in the same county was assessed a penalty in the previous year, i.e. the year before plant i's violation began. (I am interested in the effect of such a variable on the duration of a plant's violation.)

What makes it complicated is that my data do not record penalties for each year of violation/noncompliance; I only have the overall penalty for each period of violation. I don't have a clue how to construct this variable. The data are set up as follows:

Code:

ID   County   Start_Date   End_Date   Duration of Violation (years)   Penalty   Reputation
--   ------   ----------   --------   -----------------------------   -------   ----------
1    12345    2005         2006       1                               $15,000   -
2    12345    2008         2009       1                               $ 3,000   0
3    12345    2010         2013       3                               $30,000   1
4    12345    2012         2014       2                               $ 9,000   1

The Reputation variable isn't constructed yet; the column shows the values it should take. For example, Plant 4 has a value of 1 because Plant 3, in the same county, had a penalty assessed/was in violation in 2011, the year before Plant 4's violation began.
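In case it helps anyone who wants to try this, the example rows could be entered in Stata roughly as follows. This is only a sketch of the toy data, not my real file: Duration is my shorthand for the duration column, Penalty is typed as a plain number without the dollar sign, and Reputation holds the hand-coded target values from the table (with Plant 1's "-" entered as missing), not anything computed.

```stata
* Toy version of the example table; variable names are assumptions
* chosen to match the column headers as closely as Stata allows.
clear
input ID County Start_Date End_Date Duration Penalty Reputation
1 12345 2005 2006 1 15000 .
2 12345 2008 2009 1  3000 0
3 12345 2010 2013 3 30000 1
4 12345 2012 2014 2  9000 1
end

* Reputation here is the desired result, typed in by hand for reference.
list, clean noobs
```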

I know what the Reputation variable is supposed to look like; I just have no idea how to construct it in Stata. The code also needs to ensure that only plants in the same county are compared.

Any help would be highly appreciated.