I have three variables: a firm (i.e., company) identifier, a year, and a code. I want to count the distinct values of code within rolling three-year windows, counting each value only once per window.
I have the following data:
firm year code
1 2010 12011
1 2011 12011
1 2012 12012
1 2012 12012
1 2012 120
1 2015 12015
2 2010 22010
2 2011 22011
2 2012 22012
2 2012 22013
2 2014 22014
2 2015 22015
For each firm, I want the number of distinct code values in each three-year window; a code counts once per window even if it appears several times in it. For example, my desired results would be as follows:
firm 1 2010 code count = 1 (i.e. code 12011)
firm 1 2011 code count = 1 (i.e. code 12011 occurs twice but counted only once)
firm 1 2012 code count = 3 (i.e. code 12011, 120, 12012)
I've tried a number of things (rangestat, asrol), but I couldn't account for duplicate code values within a window, so I tried the following:
forval i=2010(1)2015 {
keep if year ==`i' & year >`i'-4
duplicates drop firm code , force
collapse (count) numcode=code, by(firm)
}
and received the following output:
. input firm year code
firm year code
1. 1 2010 12011
2. 1 2011 12011
3. 1 2012 12012
4. 1 2012 12012
5. 1 2012 120
6. 1 2015 12015
7. 2 2010 22010
8. 2 2011 22011
9. 2 2012 22012
10. 2 2012 22013
11. 2 2014 22014
12. 2 2015 22015
13. end
.
.
. forval i=2010(1)2015 {
2. keep if year ==`i' & year >`i'-4
3. duplicates drop firm code , force
4. collapse (count) numcode=code, by(firm)
5. }
(10 observations deleted)
Duplicates in terms of firm code
(0 observations are duplicates)
year not found
r(111);
end of do-file
r(111);
Does anyone have any suggestions to improve my coding?
Thanks in advance
Ed
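A note on the log above: in the first pass of the loop, the keep condition retains only the 2010 rows (hence "10 observations deleted"), and collapse with by(firm) then leaves only firm and numcode in memory, so the second pass stops with "year not found". What follows is a minimal sketch of one loop-free alternative, not a definitive answer. It assumes a window means the current year plus the two preceding years, and the names window_end and first are made up for illustration.

```stata
* Sketch only: assumes firm, year, and code are numeric, as in the example data
clear
input firm year code
1 2010 12011
1 2011 12011
1 2012 12012
1 2012 12012
1 2012 120
1 2015 12015
2 2010 22010
2 2011 22011
2 2012 22012
2 2012 22013
2 2014 22014
2 2015 22015
end

* keep one row per firm-year-code so repeats within a year do not matter
duplicates drop firm year code, force

* each row belongs to the windows ending in year, year + 1, and year + 2
expand 3
bysort firm year code: gen window_end = year + _n - 1

* flag one row per distinct code within each firm-window, then add up the flags
egen byte first = tag(firm window_end code)
collapse (sum) numcode = first, by(firm window_end)
rename window_end year

list, sepby(firm)
```

For firm 1 this gives 1 for 2010, 1 for 2011, and 3 for 2012, which matches the desired results stated in the question. It also produces rows for window-end years in which a firm has no observations of its own (2013 for firm 1, for example); those rows can be kept or dropped depending on what is needed.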