Hello, I am setting up a synthetic control based on data that looks something like this:
cityid | yearquarter | year | rent
1 | 2010q1 | 2010 | 40
1 | 2010q2 | 2010 | 42
1 | 2010q3 | 2010 | 41
1 | 2010q4 | 2010 | 40
2 | 2010q1 | 2010 | 39
2 | 2010q2 | 2010 | 36
2 | 2010q3 | 2010 | 22
2 | 2010q4 | 2010 | 33
1 | 2011q1 | 2011 | 45
1 | 2011q2 | 2011 | 46
1 | 2011q3 | 2011 | 45
1 | 2011q4 | 2011 | 44
2 | 2011q1 | 2011 | 30
2 | 2011q2 | 2011 | 32
2 | 2011q3 | 2011 | 33
2 | 2011q4 | 2011 | 31
I would like to create variables that summarize "rent" by city for a specific year. For example, a new variable "avgrent_2010" would hold the average 2010 rent in each city. These variables are pre-intervention levels of my outcome variable and will serve as predictor variables in my synthetic control. Ideally, the data would look like the following:
cityid | yearquarter | year | rent | avgrent_2010
1 | 2010q1 | 2010 | 40 | 40.75
1 | 2010q2 | 2010 | 42 | 40.75
1 | 2010q3 | 2010 | 41 | 40.75
1 | 2010q4 | 2010 | 40 | 40.75
2 | 2010q1 | 2010 | 39 | 32.5
2 | 2010q2 | 2010 | 36 | 32.5
2 | 2010q3 | 2010 | 22 | 32.5
2 | 2010q4 | 2010 | 33 | 32.5
1 | 2011q1 | 2011 | 45 | 40.75
1 | 2011q2 | 2011 | 46 | 40.75
1 | 2011q3 | 2011 | 45 | 40.75
1 | 2011q4 | 2011 | 44 | 40.75
2 | 2011q1 | 2011 | 30 | 32.5
2 | 2011q2 | 2011 | 32 | 32.5
2 | 2011q3 | 2011 | 33 | 32.5
2 | 2011q4 | 2011 | 31 | 32.5
Thus far, I have used:
egen avgrent_2010 = mean(rent) if year == 2010, by(cityid)
This fills in only the 2010 observations and leaves all other observations missing:
cityid | yearquarter | year | rent | avgrent_2010
1 | 2010q1 | 2010 | 40 | 40.75
1 | 2010q2 | 2010 | 42 | 40.75
1 | 2010q3 | 2010 | 41 | 40.75
1 | 2010q4 | 2010 | 40 | 40.75
2 | 2010q1 | 2010 | 39 | 32.5
2 | 2010q2 | 2010 | 36 | 32.5
2 | 2010q3 | 2010 | 22 | 32.5
2 | 2010q4 | 2010 | 33 | 32.5
1 | 2011q1 | 2011 | 45 | .
1 | 2011q2 | 2011 | 46 | .
1 | 2011q3 | 2011 | 45 | .
1 | 2011q4 | 2011 | 44 | .
2 | 2011q1 | 2011 | 30 | .
2 | 2011q2 | 2011 | 32 | .
2 | 2011q3 | 2011 | 33 | .
2 | 2011q4 | 2011 | 31 | .
How can I generate a variable that summarizes a specific subset of the data within a group (defined by year AND cityid) but is assigned to all observations that meet only one of those criteria (cityid)?
Thanks in advance for any help!
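One possible approach, offered as a minimal sketch rather than a definitive answer: the "if" qualifier restricts which observations receive a value, so the 2011 rows are left missing. Moving the condition inside the function with cond() restricts only what goes into the mean, and egen's mean() ignores missing values, so the 2010 average is written to every row of each city. The sketch below re-enters the example data from this post with input so that it runs on its own; the variable names are the ones used above.

```stata
* Minimal sketch, using the example data from the post
clear
input cityid str6 yearquarter year rent
1 "2010q1" 2010 40
1 "2010q2" 2010 42
1 "2010q3" 2010 41
1 "2010q4" 2010 40
2 "2010q1" 2010 39
2 "2010q2" 2010 36
2 "2010q3" 2010 22
2 "2010q4" 2010 33
1 "2011q1" 2011 45
1 "2011q2" 2011 46
1 "2011q3" 2011 45
1 "2011q4" 2011 44
2 "2011q1" 2011 30
2 "2011q2" 2011 32
2 "2011q3" 2011 33
2 "2011q4" 2011 31
end

* cond() passes rent through for 2010 rows and returns missing otherwise.
* mean() skips missing values, so the result is each city's 2010 average,
* and without an "if" qualifier it is stored on all of that city's rows.
bysort cityid: egen avgrent_2010 = mean(cond(year == 2010, rent, .))

list, sepby(cityid)
```

With the example data this should give 40.75 for every row of city 1 and 32.5 for every row of city 2. The same pattern could be repeated for other pre-intervention years by changing the year in cond() and the name of the new variable.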