Dear all,

I have run into a problem and could not find an existing thread on it (apologies in advance if this has already been answered). Here is what I want to do: for each region and year, I want to create a variable that sums the first value of the "object" variable in each region-year group over the current year and the three preceding years. Let's call this variable sum3Y. My dataset looks like this:

region year object

"Dushanbe" 2008 0
"Dushanbe" 2008 0
"Dushanbe" 2008 0
"Dushanbe" 2008 0
"Dushanbe" 2008 0
"Dushanbe" 2008 0
"Dushanbe" 2008 0
"Dushanbe" 2008 0
"Dushanbe" 2008 0
"Dushanbe" 2008 0
"Dushanbe" 2008 0
"Dushanbe" 2008 0
"Dushanbe" 2008 0
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2009 1
"Dushanbe" 2010 1
"Dushanbe" 2011 0
"Dushanbe" 2012 0
"Dushanbe" 2013 0
"Dushanbe" 2013 0
"Dushanbe" 2013 0
"Dushanbe" 2014 0


For instance, sum3Y for the year 2011 should be equal to 2. The first value of object in Dushanbe is 0 in 2011, 1 in 2010, 1 in 2009, and 0 in 2008, so 0 + 1 + 1 + 0 = 2. By the same logic, sum3Y for the year 2013 should be equal to 1 (0 in 2013, 0 in 2012, 0 in 2011, and 1 in 2010).

My main problem so far has been that I do not have the same number of observations per year. I tried this command:

by region: gen CP3Y = object + object[_n-1] + object[_n-2] + object[_n-3]

But it did not give me the expected outcome.
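
The likely reason is that object[_n-1] refers to the previous row within the region, not to the previous year, so with an unequal number of rows per year the lags mostly stay inside the same year. One possible approach, given here only as a sketch, is to reduce the data to one row per region-year, declare it as a panel so that lag operators follow calendar years, compute the sum there, and merge it back. It assumes that region is a string variable, that year and object are numeric, that the current row order defines the "first" observation in each region-year, and that years absent from the data should contribute 0. The names obsno, rid and sums are illustrative only.

```stata
* Sketch only, under the assumptions stated above; run as one do-file.
gen long obsno = _n                      // remember the original row order

preserve
    * keep one row per region-year: the first observation in the original order
    bysort region year (obsno): keep if _n == 1
    keep region year object

    * declare a panel so that L1., L2., L3. look up calendar years, not rows
    egen rid = group(region)
    tsset rid year

    * current year plus the three preceding years;
    * years missing from the data contribute 0 (an assumption)
    gen sum3Y = object
    forvalues k = 1/3 {
        replace sum3Y = sum3Y + L`k'.object if !missing(L`k'.object)
    }

    keep region year sum3Y
    tempfile sums
    save `sums'
restore

* attach the region-year totals to every original observation
merge m:1 region year using `sums', nogenerate
sort obsno
drop obsno
```

On the example data this should give sum3Y = 2 for 2011 and sum3Y = 1 for 2013. If incomplete windows (for example 2008, which has no earlier years) should be missing rather than partial sums, the !missing() condition can be dropped.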

Many thanks in advance.