Drop duplicates imposing a condition

Good morning,

I have divided Africa into a grid of 50x50 kilometer cells. However, some cells on the borders span several countries. For those cells, the value of the grid-cell variable is the same, but the country code differs. I also have the area of the cell that falls within each country. I would like to drop the duplicate grid cells and keep the observation with the largest area. In other words, when a grid cell is duplicated (the same cell appears in several countries), I would like to keep only the country in which the cell has the largest area.

Another option would be to eliminate the duplicates but keep the observation in which country_GID is equal to country (when this happens, the area is greater than when country_GID differs from country). Below is an example of my data. I have more data, but I have chosen just one example (two rows) in which the problem arises.

If my question or what I would like to do is not clear enough, please let me know.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input long gid float ccode str34(country_GID country) float area
136509 123 "South Sudan" "Kenya"        321621792
136509 268 "South Sudan" "South Sudan" 2733785344
end

Thank you,

Diego.
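
One possible way to do this, sketched on the two example rows above: sort the observations by area within each gid and keep only the last one, which is the one with the largest area. This assumes that area is never missing (Stata sorts missing values last, so a missing area would otherwise be treated as the largest) and that ties in area within a cell can be broken arbitrarily.

```stata
* Example data copied from the post
clear
input long gid float ccode str34(country_GID country) float area
136509 123 "South Sudan" "Kenya"        321621792
136509 268 "South Sudan" "South Sudan" 2733785344
end

* Assumption: area is never missing; stop here if that does not hold
assert !missing(area)

* Within each gid, sort by area (ascending) and keep the last observation,
* i.e. the country holding the largest share of the cell
bysort gid (area): keep if _n == _N

* Confirm that gid now uniquely identifies the observations
isid gid
list
```

For the second option, a sketch along the same lines would be to flag the duplicated cells with `duplicates tag gid, generate(dup)` and then `drop if dup > 0 & country_GID != country`. Note that this leaves a cell with no observation at all if none of its duplicated rows has country_GID equal to country, so the area-based rule is probably the safer of the two.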
