I have a dataset that looks like this, 2 observations per id:
id  observation  birth year  diabetes  bmi  weight  waist
1   1            1960        0         23   80      34
1   2            1960        0         22   79      33
2   1            1958        1         25   76      48
2   2            1958        0         24   74      46
3   1            1970        1         35   78      40
3   2            1970        1         38   81      42

I need to write a program that generates a new indicator variable for each of bmi, weight and waist. Each indicator should take the value 1 if the id's average measurement (the mean of its 2 observations) is in the top 20% (that is, at or above the 80th percentile) within its diabetes category, and 0 otherwise.

I imagine egen can help with generating the new 0/1 variables, but I am confused about how to base them on a top-20% cutoff calculated separately for each diabetes category.
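
To make the goal concrete, here is a rough sketch of what I have in mind, not something I am sure is right. It assumes the variables are named as in the listing above, with birthyear standing in for the "birth year" column; the names mean_*, p80_* and top20_* are made up for illustration. It takes the per-id mean with egen's mean() function, then the 80th percentile of those means within each diabetes category with egen's pctile() function, and finally compares the two.

```stata
* Sketch only: rebuilds the example data so the block runs on its own.
* birthyear is an assumed name for the "birth year" column.
clear
input id observation birthyear diabetes bmi weight waist
1 1 1960 0 23 80 34
1 2 1960 0 22 79 33
2 1 1958 1 25 76 48
2 2 1958 0 24 74 46
3 1 1970 1 35 78 40
3 2 1970 1 38 81 42
end

foreach v of varlist bmi weight waist {
    * Mean of the two observations for each id
    bysort id: egen mean_`v' = mean(`v')

    * 80th percentile of those means within each diabetes category
    * (computed over rows, so each id's mean is counted once per row)
    bysort diabetes: egen p80_`v' = pctile(mean_`v'), p(80)

    * 1 if the id's mean is at or above the cutoff, 0 otherwise;
    * left missing if the mean itself is missing
    generate byte top20_`v' = (mean_`v' >= p80_`v') if !missing(mean_`v')
}

sort id observation
list id observation diabetes mean_bmi p80_bmi top20_bmi, sepby(id)
```

One thing I am unsure about: diabetes can differ between the two observations of the same id (id 2 above), so in this sketch such an id is compared against a different cutoff in each row and could end up with different indicator values in its two rows.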

I would appreciate any suggestions.

Thanks very much.