This is mostly fine, but I am having some trouble making sure that my histograms are in order. I cannot immediately see how many observations are in each bin. I think I have found a solution, but wanted to hear whether anyone else has experience with this.
I found that -histogram- calls another command, -twoway__histogram_gen-, and from the ado-file I found that I can force it to return the values it uses to create the bins. I wrote this little loop, which works well on this small, well-behaved dataset. But my real data have tens of millions of observations, so it is a little more difficult to make the same "sanity" check.
Does anyone else have experience counting the number of observations in bins? (And I don't understand why the -break- is not respected in this loop.)
Code:
sysuse auto, clear
twoway__histogram_gen price, return
local r_bin = r(bin)
local r_start = r(start)
local r_width = r(width)
forvalues bin_n = 1(1)`r_bin'{
    qui count if ///
        price >= (`r_start' + `r_width'*(`bin_n' - 1)) & ///
        price < (`r_start' + `r_width'*`bin_n')
    cap assert r(N) > 5
    if _rc {
        di "Not enough observations in bin `bin_n' with `r(N)' obs."
        break
    }
}
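For the large dataset, one possible variation is sketched below. It is only a sketch, not a tested solution, and it assumes that -twoway__histogram_gen- returns r(bin), r(start) and r(width) as described above. Instead of one -count- with two inequalities per bin, it assigns every observation a bin number once and then counts on that variable. The variable name bin_id and the local n_obs are made up for illustration. Two assumptions are worth flagging: the largest value sits on the upper edge of the last bin, so a strict `<` comparison can leave it out, which is why it is folded into the last bin here; and -continue, break- is the documented way to leave a -forvalues- loop, whereas -break- on its own does not appear to be a loop-exit command.

```stata
sysuse auto, clear

* Ask the histogram helper for its binning (same call as in the post).
twoway__histogram_gen price, return
local r_bin   = r(bin)
local r_start = r(start)
local r_width = r(width)

* Assign each observation a bin number in a single pass over the data.
* bin_id is a hypothetical variable name. max()/min() clamp values on the
* outer edges into the first and last bin (an assumption about how the
* histogram itself treats the maximum value).
generate long bin_id = ///
    min(max(floor((price - `r_start') / `r_width') + 1, 1), `r_bin') ///
    if !missing(price)

* Stop at the first bin with 5 or fewer observations.
forvalues bin_n = 1/`r_bin' {
    quietly count if bin_id == `bin_n'
    local n_obs = r(N)
    if `n_obs' <= 5 {
        display "Not enough observations in bin `bin_n' with `n_obs' obs."
        continue, break
    }
}
```

Once bin_id exists, -tabulate bin_id- shows all bin counts at once, which may be a quicker sanity check than looping. The count is copied into a local before it is displayed so that the message does not depend on r(N) surviving an intermediate command.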