I need to run a synthetic difference-in-differences (SDID) regression, so I need balanced panel data. My sample covers 2000-2021, which is 22 years in total, but every county is missing the variable wanted for at least one year. When I run the following commands, Stata tells me that year and county are missing:


Code:
tsset county year
isid county year, sort
variables county and year should never be missing
r(459);
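A minimal sketch of how the offending rows could be located, assuming county and year are numeric variables with exactly these names, as in the commands above:

Code:
* Count observations where either panel identifier is missing
count if missing(county, year)
* List those observations to see where they come from
list wanted county year if missing(county, year), noobs
If the count is not zero, those rows would presumably have to be corrected or dropped before tsset and isid can accept the data.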
When I run the following command, all observations are dropped, which indicates that not a single county has wanted for all 22 years.

Code:
by county (year): keep if _N == 22
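Before dropping anything, the number of years each county actually has could be tabulated. This is only a sketch; n_years and first_obs are hypothetical helper variables that are not part of the original data:

Code:
* Number of observed years for each county
bysort county (year): generate n_years = _N
* Flag one observation per county so each county is counted once
egen byte first_obs = tag(county)
* Distribution of years per county, one row counted per county
tabulate n_years if first_obs
In the excerpt that follows, for instance, county 1001 has 21 years (2000-2020), county 1027 has 6, and county 1011 has 4, so none of them reaches 22.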
I'm attaching part of my data:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(wanted county year)
2 1011 2002
2 1011 2003
1 1011 2004
1 1011 2019
1 1027 2000
2 1027 2002
1 1027 2008
1 1027 2009
1 1027 2013
1 1027 2018
4 1001 2000
3 1001 2001
1 1001 2002
1 1001 2003
3 1001 2004
5 1001 2005
2 1001 2006
3 1001 2007
2 1001 2008
2 1001 2009
3 1001 2010
2 1001 2011
7 1001 2012
3 1001 2013
3 1001 2014
3 1001 2015
2 1001 2016
7 1001 2017
11 1001 2018
3 1001 2019
3 1001 2020

end
After using the command tsfill, full I was able to make the missing county-years show up in my data, so that the panel is strongly balanced.

Code:
tsset county year

tsfill, full
Then I replaced wanted with 0 wherever it was missing (wanted == .). This is correct, because wanted is 0 whenever a county-year does not show up in my data.
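In Stata terms, that replacement step could be written as follows. This is a sketch that covers only wanted; any other variable left missing in the observations created by tsfill, such as policy, would need its own rule:

Code:
* Recode the county-years created by tsfill from missing to zero
replace wanted = 0 if missing(wanted)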

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(wanted county year policy)
    4 1001 2000 0
    3 1001 2001 0
    1 1001 2002 0
    1 1001 2003 0
    3 1001 2004 0
    5 1001 2005 0
    2 1001 2006 0
    3 1001 2007 0
    2 1001 2008 0
    2 1001 2009 0
    3 1001 2010 0
    2 1001 2011 0
    7 1001 2012 0
    3 1001 2013 0
    3 1001 2014 1
    3 1001 2015 1
    2 1001 2016 1
    7 1001 2017 1
   11 1001 2018 1
    3 1001 2019 1
    3 1001 2020 1
0 1001 2021 0
0 1001 2022 0
0 1011 2000 0
0 1011 2001 0
    2 1011 2002 0
    2 1011 2003 0
    1 1011 2004 0
0 1011 2005 0
0 1011 2006 0
0 1011 2007 0
0 1011 2008 0
0 1011 2009 0
0 1011 2010 0
0 1011 2011 0
0 1011 2012 0
0 1011 2013 0
0 1011 2014 0
0 1011 2015 0
0 1011 2016 0
0 1011 2017 0
0 1011 2018 0
    1 1011 2019 0
0 1011 2020 0
0 1011 2021 0
0 1011 2022 0
    1 1027 2000 0
0 1027 2001 0
    2 1027 2002 0
0 1027 2003 0
0 1027 2004 0
0 1027 2005 0
0 1027 2006 0
0 1027 2007 0
    1 1027 2008 0
    1 1027 2009 0
0 1027 2010 0
0 1027 2011 0
0 1027 2012 0
    1 1027 2013 0
    3 1027 2014 0
0 1027 2015 0
    1 1027 2016 0
0 1027 2017 0
    1 1027 2018 1
    2 1027 2019 1
0 1027 2020 1
0 1027 2021 1
0 1027 2022 1
end
However, Stata still reports an unbalanced panel when I run the following command for SDID (synthetic difference-in-differences):

Code:
sdid wanted county year policy, vce(bootstrap) seed(1213)

Panel is unbalanced.
r(451);
Is there anything I can do to find the cause of this error?
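Some checks that might help narrow this down, offered as a sketch rather than a fix. Whether sdid applies the same criteria internally is an assumption; the commands themselves are standard Stata:

Code:
* Confirm there is exactly one observation per county-year
duplicates report county year
* Re-declare the panel; the output states whether it is strongly balanced
xtset county year
* Show the pattern of years observed per county
xtdescribe
* Look for missing values in any variable passed to sdid
misstable summarize wanted county year policy
* Check the range of years actually present
tabulate year
One thing these checks would make visible: the filled listing above runs through 2022, one year beyond the 2000-2021 range stated at the start.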