Hello everyone,

For a dataset that covers many countries, I have generated an indicator that flags outliers according to a previously defined criterion: it takes the value 1 if an observation's value for a variable is an outlier and 0 otherwise.

I have counted the number of outliers per country (c_id) with

Code:
bysort c_id: egen n_outliers=count(Z_mean_d_occ_SD2) if Z_mean_d_occ_SD2 == 1
Now I want to drop an entire country (c_id) if its number of outliers exceeds a certain threshold, e.g. 3 or 4.

Here is an extract of my data:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float c_id str33 country float(Z_mean_d_occ_SD2 n_outliers)
7 "Bangladesh" 0 .
7 "Bangladesh" 0 .
7 "Bangladesh" 0 .
7 "Bangladesh" 1 5
7 "Bangladesh" 1 5
7 "Bangladesh" 1 5
7 "Bangladesh" 1 5
7 "Bangladesh" 0 .
7 "Bangladesh" 1 5
end
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float c_id str33 country float(Z_mean_d_occ_SD2 n_outliers)
41 "Indonesia" 0 .
41 "Indonesia" 0 .
41 "Indonesia" 0 .
41 "Indonesia" 0 .
41 "Indonesia" 0 .
41 "Indonesia" 1 2
41 "Indonesia" 0 .
41 "Indonesia" 0 .
41 "Indonesia" 1 2
end
So I would like to drop Bangladesh completely (5 outliers), but keep Indonesia (only 2 outliers).
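
What I have in mind can be sketched roughly as follows. This is only a minimal sketch, not a tested solution: the cutoff of 3 and the variable name n_out_all are placeholders I made up for illustration. The idea is that my n_outliers is missing on the non-outlier rows (because of the if qualifier), so the count would first have to be present on every row of a country before the whole country can be dropped.

```stata
* Assumption: a cutoff of 3 is only illustrative; change as needed
local max_outliers 3

* Count the outliers per country and store the count on EVERY row of that country.
* The expression (Z_mean_d_occ_SD2 == 1) is 1 for outliers and 0 otherwise,
* so its total within c_id is the number of outliers.
* n_out_all is a hypothetical new variable name.
bysort c_id: egen n_out_all = total(Z_mean_d_occ_SD2 == 1)

* Drop all observations of countries whose count exceeds the cutoff
drop if n_out_all > `max_outliers'
```

With the extract above, this should give n_out_all = 5 on all Bangladesh rows and 2 on all Indonesia rows, so a cutoff of 3 would drop Bangladesh and keep Indonesia. The local macro and the drop line need to be run together (e.g. from the same do-file) for the macro to be defined.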