I have divided Africa into a grid of 50x50 kilometer cells. However, some cells on the borders span several countries. For those cells, the grid-cell identifier is the same but the country code changes. I also have the area of the cell that falls within each country. I would like to drop the duplicate grid cells and keep the ones with the largest area. In other words, when a grid cell is duplicated (the same grid cell appears in several countries), I would like to keep only the country in which the cell has the largest area.
Another option would be to eliminate the duplicates but keep the observation in which country_GID is equal to country (when this happens, the area is greater than when country_GID differs from country). I attach an example of my data. I have more data, but I have chosen just one example (two rows) in which the problem arises.
If my question or what I would like to do is not clear enough, please let me know.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long gid float ccode str34(country_GID country) float area
136509 123 "South Sudan" "Kenya" 321621792
136509 268 "South Sudan" "South Sudan" 2733785344
end
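To make the first option concrete, here is a minimal sketch of what I have in mind. It assumes that gid uniquely identifies a grid cell and that area is never missing (Stata sorts missing values last, so a missing area would otherwise be treated as the largest). If two countries had exactly the same area in a cell, which one survives would be arbitrary.

```stata
* Sketch only: rebuild the two example rows, then keep one row per grid cell.
clear
input long gid float ccode str34(country_GID country) float area
136509 123 "South Sudan" "Kenya" 321621792
136509 268 "South Sudan" "South Sudan" 2733785344
end

* Assumption: area has no missing values (missing would sort last and be kept).
* Within each gid, sort by area ascending and keep the last (largest) row.
bysort gid (area): keep if _n == _N

list gid ccode country_GID country area
```

On the two example rows this should leave only the observation with ccode 268, since it has the larger area.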
Diego.