Drop entire individual if number of outliers exceeds threshold level

Hello everyone,

For a dataset that contains many countries, I have generated an indicator that flags outliers according to a previously defined criterion. It takes the value 1 if the value of a variable is an outlier and 0 otherwise.

I have counted the number of outliers per country (c_id) with

Code:

bysort c_id: egen n_outliers=count(Z_mean_d_occ_SD2) if Z_mean_d_occ_SD2 == 1

Now I want to drop the entire country (c_id) if the number of outliers is above a certain threshold, e.g. above 3 or 4.

Here is an extract of my data:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float c_id str33 country float(Z_mean_d_occ_SD2 n_outliers)
7 "Bangladesh" 0 .
7 "Bangladesh" 0 .
7 "Bangladesh" 0 .
7 "Bangladesh" 1 5
7 "Bangladesh" 1 5
7 "Bangladesh" 1 5
7 "Bangladesh" 1 5
7 "Bangladesh" 0 .
7 "Bangladesh" 1 5
end

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float c_id str33 country float(Z_mean_d_occ_SD2 n_outliers)
41 "Indonesia" 0 .
41 "Indonesia" 0 .
41 "Indonesia" 0 .
41 "Indonesia" 0 .
41 "Indonesia" 0 .
41 "Indonesia" 1 2
41 "Indonesia" 0 .
41 "Indonesia" 0 .
41 "Indonesia" 1 2
end

So I would like to drop Bangladesh completely (5 outliers), but keep Indonesia (only 2 outliers).
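
One possible approach, offered as a sketch rather than a definitive answer: compute the per-country outlier count on every row of the country, not only on the flagged rows, and then drop on that count. The variable name n_out_all, the local macro threshold, and the cutoff of 3 are assumptions chosen for illustration. The data below are a shortened version of the extract above, with both countries in one dataset.

```stata
* Shortened version of the extract above (illustrative only)
clear
input float c_id str33 country float Z_mean_d_occ_SD2
7 "Bangladesh" 0
7 "Bangladesh" 1
7 "Bangladesh" 1
7 "Bangladesh" 1
7 "Bangladesh" 1
7 "Bangladesh" 1
41 "Indonesia" 0
41 "Indonesia" 1
41 "Indonesia" 1
end

* Assumed cutoff: drop countries with more than 3 flagged observations
local threshold 3

* total() sums the 0/1 result of the expression within each country,
* so every row of a country carries that country's outlier count
bysort c_id: egen n_out_all = total(Z_mean_d_occ_SD2 == 1)

* Remove all rows of any country whose count exceeds the cutoff
drop if n_out_all > `threshold'
```

Because the count is stored on every row, the drop removes the non-outlier rows of the country as well. With the existing n_outliers variable, which is missing on non-outlier rows, a condition on it would only ever match the flagged rows.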
