I have three variables: a FIRM (i.e., company) identifier, a YEAR value, and a CODE value. I want to count the number of distinct CODE values in rolling three-year windows, counting each unique value of CODE only once per window.
I have the following data:
firm year code
1 2010 12011
1 2011 12011
1 2012 12012
1 2012 12012
1 2012 120
1 2015 12015
2 2010 22010
2 2011 22011
2 2012 22012
2 2012 22013
2 2014 22014
2 2015 22015
I want to count the number of unique code values in three-year windows for each firm (a code is counted once per window even if it appears multiple times in that window). For example, my desired results would be as follows:
firm 1 2010 code count = 1 (i.e. code 12011)
firm 1 2011 code count = 1 (i.e. code 12011 occurs twice but is counted only once)
firm 1 2012 code count = 3 (i.e. code 12011, 120, 12012)
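To pin down the definition, here is a brute-force sketch of the calculation I am after. It assumes a window is the current year plus the two preceding years, and it hard-codes the 2010-2015 range of my example data; the names numcode, base and results are just placeholders I made up. I have only reasoned through it on the toy data above, so please treat it as an illustration of the definition rather than a tested solution.

```stata
clear
input firm year code
1 2010 12011
1 2011 12011
1 2012 12012
1 2012 12012
1 2012 120
1 2015 12015
2 2010 22010
2 2011 22011
2 2012 22012
2 2012 22013
2 2014 22014
2 2015 22015
end

* Temporary files: the untouched data and the accumulated counts
* (base and results are placeholder names)
tempfile base results
save `base'

* Start the results file as an empty dataset
clear
save `results', emptyok

* Assumption: the window ending in year y covers y-2, y-1 and y
forvalues y = 2010/2015 {
    * Reload the full data each pass, since collapse replaces it
    use `base', clear
    keep if inrange(year, `y' - 2, `y')
    * Keep one row per firm-code pair so repeats count once
    duplicates drop firm code, force
    * Count the remaining (distinct) codes per firm
    * Note: collapse stops with an error if a window has no observations
    collapse (count) numcode = code, by(firm)
    gen year = `y'
    append using `results'
    save `results', replace
}

use `results', clear
sort firm year
list, sepby(firm)
```

Written this way, a firm gets a row for every window that contains at least one of its observations, even in years where the firm itself has none (e.g. firm 1 in 2013). Reloading the data for every year is also slow on a large panel, which is why I was hoping for something along the lines of rangestat or asrol.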
I've tried a number of things (rangestat, asrol), but I couldn't account for duplicate CODE values within a window,
so I tried the following:
forval i=2010(1)2015 {
    keep if year ==`i' & year >`i'-4
    duplicates drop firm code , force
    collapse (count) numcode=code, by(firm)
}
and received the following output:
input firm year code
firm year code
1. 1 2010 12011
2. 1 2011 12011
3. 1 2012 12012
4. 1 2012 12012
5. 1 2012 120
6. 1 2015 12015
7. 2 2010 22010
8. 2 2011 22011
9. 2 2012 22012
10. 2 2012 22013
11. 2 2014 22014
12. 2 2015 22015
13. end
.
.
. forval i=2010(1)2015 {
2. keep if year ==`i' & year >`i'-4
3. duplicates drop firm code , force
4. collapse (count) numcode=code, by(firm)
5. }
(10 observations deleted)
Duplicates in terms of firm code
(0 observations are duplicates)
year not found
r(111);
end of do-file
r(111);
Does anyone have any suggestions to improve my code?
Thanks in advance,
Ed