Hello,

I have the following dataset:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long gid int(country_code year) long actor_id
62356   . 1997    .
62357   . 1997    .
79600 710 2012 2664
79601 710 2015 2664
79601 710 2015 2664
79601 710 1999 2794
79601 710 2013 2664
79601 710 2013 2664
79601 710 1999 2794
79601 710 2013 2664
80317 710 2015 2794
80317 710 2012 2664
80317 710 2017 2664
80317 710 2015 2664
80317 710 2015 2664
80317 710 2012 2535
80317 710 2002 2794
80317 710 2008 2664
80317 710 2015 2664
80317 710 2013 2794
80317 710 2009 2794
80317 710 2014 2794
80317 710 2009 2794
80317 710 2008 2664
80317 710 2017 2794
80317 710 2015 2794
80317 710 2015 2794
80317 710 2015 2794
80317 710 2008 2794
80317 710 2012 2664
80317 710 2015 2774
80317 710 2010 2794
80317 710 2017 2664
80317 710 2015 2664
80317 710 2014 2794
80317 710 2016 2664
80317 710 2016 2664
80317 710 2015 2794
80317 710 2014 2664
80317 710 2015 2794
80317 710 2016 2664
80317 710 2009 2794
80317 710 2012 2664
80317 710 2017 2794
80317 710 1999 2664
80317 710 2013 2794
80317 710 2009 2664
80317 710 2015 2794
80317 710 2012 2664
80317 710 2017 2664
80317 710 2017 2794
80317 710 2016 3181
80317 710 2016 2664
80317 710 2009 2794
80317 710 2012 2535
80317 710 2012 2794
80317 710 2010 2664
80317 710 2012 2664
80317 710 2015 2794
80317 710 2010 2664
80317 710 2015 2664
80317 710 2012 2794
80317 710 2017 2794
80317 710 2015 2664
80317 710 2012 2535
80317 710 2017 2794
80317 710 2012 2794
80317 710 2014 2794
80317 710 2015 2794
80317 710 2017 2794
80317 710 2014 2794
80317 710 2016 2664
80317 710 2015 2794
80317 710 2008 2794
80317 710 2017 2664
80317 710 2015 2794
80317 710 2015 2794
80317 710 2006 2664
80317 710 2012 2664
80317 710 2017 2664
80317 710 2012 2664
80317 710 2017 2664
80317 710 2009 2794
80317 710 2017 2794
80317 710 2001 2794
80317 710 2015 2794
80317 710 2017 2794
80317 710 2012 2664
80317 710 2017 2664
80317 710 2017 2794
80317 710 2017 2664
80317 710 2009 2794
80317 710 2013 2188
80317 710 2014 2664
80317 710 2009 2794
80317 710 2015 2794
80317 710 2015 2664
80317 710 2017 2664
80317 710 2013 2664
80317 710 2012 2664
end

I would like to count the number of distinct actors (variable actor_id) by gid and year. My final outcome should be a dataset that includes a column with the total number of actors by gid and year. In total, the final dataset should have four columns: gid, year, the total number of actors, and country_code (which does not change within gid and year; it is time invariant).

I am using the following code:

Code:
bysort gid year actor_id: gen nactors = _n
keep if nactors == 1
egen nactors2 = count(nactors), by(year gid)

This works when there are no missing values. Unfortunately, my data do contain them. For example, in the sample above, actor_id is missing for gid 62356 in year 1997. I would like the total number of actors for gid 62356 in year 1997 to be 0, but I have not been able to get that result.

In my dataset, if a gid-year pair has at least one actor, actor_id is never missing for any observation of that pair.
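
To make the desired result concrete, here is a minimal sketch of one way it might be obtained. It is not a definitive solution. It assumes the data are loaded as in the example above and that country_code is constant within each gid-year pair. The variable names first and nactors are chosen only for illustration. The idea is to flag the first observation of each distinct gid-year-actor_id combination, but only when actor_id is not missing, and then to total the flags within gid and year.

```stata
* Flag one observation per distinct gid-year-actor_id combination.
* Observations with a missing actor_id are never flagged (first = 0).
bysort gid year actor_id: gen byte first = (_n == 1) & !missing(actor_id)

* The sum of the flags within gid-year is the number of distinct,
* non-missing actors; it is 0 when every actor_id in the pair is missing.
egen nactors = total(first), by(gid year)

* Reduce to one row per gid-year pair
* (assumes country_code is constant within each pair).
bysort gid year: keep if _n == 1
keep gid year nactors country_code
```

With the sample data, gid 62356 in year 1997 should end up with nactors equal to 0, because its only observation has a missing actor_id and is therefore never flagged.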

I hope the problem is clear. If not, please let me know. Any suggestions, help, or comments are more than welcome.