Keeping most full observation in a hierarchy

Hi,

I have a dataset that holds a hierarchy of up to 4 levels, where each row describes one unit down to its own level of the hierarchy (e.g., level 1 = fruit, level 2 = citrus, level 3 = lemon). The example below shows how the data are structured.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str9 l1 str8(l2 l3) str14 l4 float keep
"fruit"     ""         ""         ""               0
"fruit"     "citrus"   ""         ""               0
"fruit"     "citrus"   "lemon"    ""               1
"fruit"     "apple"    ""         ""               1
"fruit"     "bannana"  ""         ""               1
"vegetable" "root veg" ""         ""               0
"vegetable" "root veg" "carrot"   ""               1
"vegetable" "root veg" "parsnip"  ""               1
"vegetable" "root veg" "turnip"   ""               1
"vegetable" "root veg" "beetroot" ""               0
"vegetable" "root veg" "beetroot" "white beetroot" 1
"vegetable" "root veg" "beetroot" "red beetroot"   1
"vegetable" "tomato"   ""         ""               1
end

I want to keep only the fullest row for each unit within a level. For example, I would not keep row 2, because "citrus" also appears in row 3. Similarly, I would drop row 10, because "beetroot" appears in rows 11 and 12, but rows 11 and 12 would both be retained because they have different l4 entries. In this example I have added the "keep" variable by hand to show which observations should be kept.

Is there a way to either automate the creation of the keep variable or collapse the data so that only the fullest rows are retained?

Thank you for any help,
Bryony
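
One possible way to automate the keep variable is sketched below. It is not a definitive answer. It treats a row as redundant when another row shares all of its filled-in levels and goes at least one level deeper. It assumes the example data above are in memory and that levels are always filled from left to right (no row has l3 without l2, say). The names rowid, depth, maxdepth and keep_auto are hypothetical helpers, not variables from the original data.

```stata
* Sketch only: assumes the example data are loaded and levels have no gaps.
* rowid, depth, maxdepth and keep_auto are hypothetical helper names.

* Remember the original row order, because bysort will re-sort the data.
gen long rowid = _n

* Number of non-empty levels in each row (1 to 4).
gen byte depth = (l1 != "") + (l2 != "") + (l3 != "") + (l4 != "")

* Start by keeping everything, then flag redundant rows level by level.
gen byte keep_auto = 1

forvalues k = 1/3 {
    * Build the list of prefix variables l1 ... lk.
    local prefix
    forvalues j = 1/`k' {
        local prefix `prefix' l`j'
    }
    * Deepest row among all rows sharing the same first k levels.
    bysort `prefix': egen byte maxdepth = max(depth)
    * A row that stops at level k is redundant if a deeper row shares its prefix.
    replace keep_auto = 0 if depth == `k' & maxdepth > `k'
    drop maxdepth
}

* Restore the original order and compare with the hand-coded variable.
sort rowid
assert keep_auto == keep

* Remove the asterisk on the next line to drop the redundant rows.
* keep if keep_auto
```

On the example data the assert passes, so keep_auto reproduces the hand-coded keep variable there. Exact duplicate rows, if any exist in the full data, would both be kept by this sketch and would need separate handling (for instance with duplicates drop).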

BJ Data Tech Solution
