I have three variables: a FIRM (i.e., company) identifier, a YEAR value, and a CODE value. I want to count the distinct CODE values in rolling three-year windows, counting each CODE value only once per window even if it occurs several times.
I have the following data:
firm year code
1 2010 12011
1 2011 12011
1 2012 12012
1 2012 12012
1 2012 120
1 2015 12015
2 2010 22010
2 2011 22011
2 2012 22012
2 2012 22013
2 2014 22014
2 2015 22015
I want to count the number of unique code values in each three-year window for each firm (a code counts once per window even if it appears multiple times in that window). For example, my desired results would be as follows:
firm 1 2010 code count = 1 (i.e. code 12011)
firm 1 2011 code count = 1 (i.e. code 12011 occurs twice but counted only once)
firm 1 2012 code count = 3 (i.e. code 12011, 120, 12012)
I've tried a number of things (rangestat, asrol), but I couldn't account for duplicate CODE values within a window, so I tried the following:
forval i=2010(1)2015 {
keep if year ==`i' & year >`i'-4
duplicates drop firm code , force
collapse (count) numcode=code, by(firm)
}
and received the following output:
input firm year code
firm year code
1. 1 2010 12011
2. 1 2011 12011
3. 1 2012 12012
4. 1 2012 12012
5. 1 2012 120
6. 1 2015 12015
7. 2 2010 22010
8. 2 2011 22011
9. 2 2012 22012
10. 2 2012 22013
11. 2 2014 22014
12. 2 2015 22015
13. end
.
.
. forval i=2010(1)2015 {
2. keep if year ==`i' & year >`i'-4
3. duplicates drop firm code , force
4. collapse (count) numcode=code, by(firm)
5. }
(10 observations deleted)
Duplicates in terms of firm code
(0 observations are duplicates)
year not found
r(111);
end of do-file
r(111);
Does anyone have any suggestions to improve my coding?
Thanks in advance
Ed
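A note on the r(111) error in the log above: collapse replaces the data in memory with one row per firm holding only firm and numcode, so on the second pass of the loop the variable year no longer exists. The keep line also restricts each pass to the single year `i', so earlier years never enter the window. What follows is a minimal sketch of one alternative, not a definitive answer. It assumes a three-year window means the current year plus the two preceding years, and the names wyear, numcode and the tempfile counts are illustrative. Only built-in commands are used.

```stata
clear
input firm year code
1 2010 12011
1 2011 12011
1 2012 12012
1 2012 12012
1 2012 120
1 2015 12015
2 2010 22010
2 2011 22011
2 2012 22012
2 2012 22013
2 2014 22014
2 2015 22015
end

* Work on a copy so the original observations survive
preserve

* One row per distinct firm-year-code combination
keep firm year code
duplicates drop

* Each observation falls in the windows ending in year, year+1 and year+2
* (assumption: window = current year and the two preceding years)
expand 3
bysort firm year code: gen wyear = year + _n - 1

* Keep each code once per firm-window, then count codes per firm-window
bysort firm wyear code: keep if _n == 1
collapse (count) numcode = code, by(firm wyear)
rename wyear year

tempfile counts
save `counts'
restore

* Attach the window counts to the original firm-year rows
merge m:1 firm year using `counts', keep(master match) nogenerate

sort firm year code
list firm year code numcode, sepby(firm)
```

With the example data this should give counts of 1, 1 and 3 for firm 1 in 2010, 2011 and 2012, which matches the desired results stated above. If the window should instead cover a different span, the 3 in expand is the only number that needs to change.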