Hi all,

I have a problem with bootstrap. In particular, one of my variables contains a lot of zeros, so I suspect that some of the bootstrap samples contain only zeros for that variable, which prevents the estimates from converging. The bootstrap exercise is quite simple. This is what the panel data look like:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str4 atc3no float Year double Trials float(y recalls_normalized lag_recalls_norm)
"A10D" 2004   1  22.15148    0    .
"A10D" 2005   0  22.30657    0    0
"A10D" 2006  71 22.425455    0    0
"A10D" 2007   3  22.54544    0    0
"A10D" 2008  18 22.711876 6.25    0
"A10D" 2009   2  22.90793    0 6.25
"A10D" 2010   7  23.09151    0    0
"A10D" 2011   5 23.213144    0    0
"A10D" 2012  10  23.32684    0    0
"A10D" 2013   8  23.54967    0    0
"A10E" 2004   0 17.460127    0    .
"A10E" 2005   9 15.233125    0    0
"A10E" 2006  27  13.80498    0    0
"A10E" 2007   0 17.687384    0    0
"A10E" 2008   0 17.763838    0    0
"A10E" 2009   0 17.642128    0    0
"A10E" 2010   0 17.612053    0    0
"A10E" 2011   0 17.626307    0    0
"A10E" 2012   0 17.678528    0    0
"A10E" 2013   0  17.67435    0    0
"A10H" 2004   0 20.555357    0    .
"A10H" 2005   2 20.438324    0    0
"A10H" 2006   8 19.775833    0    0
"A10H" 2007  21  19.49612    0    0
"A10H" 2008  63 19.285105    0    0
"A10H" 2009  21  19.17351    0    0
"A10H" 2010  45 19.319527    0    0
"A10H" 2011  11 19.522903    0    0
"A10H" 2012  23  19.43786    0    0
"A10H" 2013  35  19.34037    0    0
"A10X" 2004   4  15.09192    0    .
"A10X" 2005  32 16.236156    0    0
"A10X" 2006  36 17.594402    0    0
"A10X" 2007  39 17.977922    0    0
"A10X" 2008 117 18.303663    0    0
"A10X" 2009  27 18.415758    0    0
"A10X" 2010  25 18.589926    0    0
"A10X" 2011  30 18.681803    0    0
"A10X" 2012  25 18.646475    0    0
"A10X" 2013  21 18.569153    0    0
"A11A" 2004  17  22.54042    0    .
"A11A" 2005   0   21.5455    0    0
"A11A" 2006   0  21.54081    0    0
"A11A" 2007   1 21.640205    0    0
"A11A" 2008   0 21.760223    0    0
"A11A" 2009   3  22.01636    0    0
"A11A" 2010   0 21.985983    0    0
"A11A" 2011   0  22.03044    0    0
"A11A" 2012   0  21.97736    0    0
"A11A" 2013   0 21.984135    0    0
"A11B" 2004   0   20.8769    0    .
"A11B" 2005   0  20.46968    0    0
"A11B" 2006   0  20.29344    0    0
"A11B" 2007   4  20.16311    0    0
"A11B" 2008  34 19.688345    0    0
"A11B" 2009   0  20.09151    0    0
"A11B" 2010   5  19.72065    0    0
"A11B" 2011   0  19.45224    0    0
"A11B" 2012   1 19.596403    0    0
"A11B" 2013   1  19.68257    0    0
"A11E" 2004   0  19.52653    0    .
"A11E" 2005   0  18.55167    0    0
"A11E" 2006   0 18.251816    0    0
"A11E" 2007   0 18.072332    0    0
"A11E" 2008   0 18.030392    0    0
"A11E" 2009   0 18.399815    0    0
"A11E" 2010   0 18.860294    0    0
"A11E" 2011   1 18.501026    0    0
"A11E" 2012   0 18.439808    0    0
"A11E" 2013   0 18.047024    0    0
"A11F" 2004   0 16.994001    0    .
"A11F" 2005   0 16.907978    0    0
"A11F" 2006   0  16.86925    0    0
"A11F" 2007   0 17.033806    0    0
"A11F" 2008   0  17.27343    0    0
"A11F" 2009  15 17.641235    0    0
"A11F" 2010   0 17.882772    0    0
"A11F" 2011   0 18.076435    0    0
"A11F" 2012   0 18.098307    0    0
"A11F" 2013   0 18.349924    0    0
"A11G" 2004   0 17.647726    0    .
"A11G" 2005   0  14.97768    0    0
"A11G" 2006   0 14.476375    0    0
"A11G" 2007   0 14.556162    0    0
"A11G" 2008   1 14.447133    0    0
"A11G" 2009   1   14.4035    0    0
"A11G" 2010   0 14.234604    0    0
"A11G" 2011   0 14.745785    0    0
"A11G" 2012   1 14.802495    0    0
"A11G" 2013   0  14.72968    0    0
"A11X" 2004   1 18.248978    0    .
"A11X" 2005   4 17.706295    0    0
"A11X" 2006   8 17.597336    0    0
"A11X" 2007   4 17.427748    0    0
"A11X" 2008   7 17.384846    0    0
"A11X" 2009   5 18.027264    0    0
"A11X" 2010   7  18.85129    0    0
"A11X" 2011  10  18.77623    0    0
"A11X" 2012   6 16.929707    0    0
"A11X" 2013  15 16.825396    0    0
end
And this is my code:


Code:
cd "/Users/federiconutarelli/Desktop/"
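capture program drop cre_cf

program cre_cf, rclass
    // First stage: linear FE regression; keep the idiosyncratic residuals
    capture drop residuals    // clear leftovers from an aborted replicate
    xtreg y recalls_normalized lag_recalls_norm i.Year, fe vce(cluster idatc3)
    predict double residuals, e

    // Second stage: FE Poisson with the first-stage residuals as an extra regressor
    xtpoisson Trials y residuals i.Year, fe

    return scalar b_y = _b[y]
    return scalar b_residuals = _b[residuals]
    return scalar se_y = _se[y]
    return scalar se_residuals = _se[residuals]

    drop residuals
end

* idatc3 is not part of the -dataex- example above; it is assumed here to be
* a numeric identifier for atc3no, created only if it does not already exist
capture confirm variable idatc3
if _rc {
    egen idatc3 = group(atc3no)
}

gen newid = idatc3
xtset newid Year
bootstrap r(b_y) r(b_residuals) r(se_y) r(se_residuals), ///
    reps(5000) saving(datiboot_5000, replace double) seed(123456789) ///
    cluster(idatc3) idcluster(newid) nodrop: cre_cf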
As you can see, the code is basic, but the estimates still do not converge. The variable that I suspect is causing the problem is recalls_norm, and I have no way of adding more observations for it. So is there a way either to keep only the estimates from samples that include at least a certain number of positive recalls, or to force the bootstrap to draw samples in which at least one recall is positive?
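
To make the first option concrete, here is a minimal sketch of what I have in mind. The program name cre_cf_guard and the returned scalars r(n_pos) and r(converged) are my own hypothetical names, and the threshold of one positive recall is only illustrative. It relies on the reject() option of bootstrap which, as far as I understand the help file, resets a replicate's results to missing when the expression is true. I am not sure that discarding replicates in this way is statistically defensible, since it changes the distribution the bootstrap samples from.

Code:
capture program drop cre_cf_guard

program cre_cf_guard, rclass
    // Hypothetical guard: count positive recalls in the current resample
    quietly count if recalls_normalized > 0 & !missing(recalls_normalized)
    local npos = r(N)

    capture drop residuals
    quietly xtreg y recalls_normalized lag_recalls_norm i.Year, fe
    predict double residuals, e

    // capture, so that a failing replicate does not abort the whole run
    capture xtpoisson Trials y residuals i.Year, fe
    if _rc == 0 {
        return scalar b_y = _b[y]
        return scalar b_residuals = _b[residuals]
        return scalar converged = e(converged)
    }
    else {
        return scalar b_y = .
        return scalar b_residuals = .
        return scalar converged = 0
    }
    return scalar n_pos = `npos'

    drop residuals
end

* Replicates with no positive recall, or without convergence, are set to missing
bootstrap b_y=r(b_y) b_residuals=r(b_residuals), ///
    reps(5000) seed(123456789) cluster(idatc3) idcluster(newid) nodrop ///
    reject(r(n_pos) < 1 | r(converged) != 1): cre_cf_guard
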

Thank you,

Federico