Multiple observations per id, select id with highest value in other identifier and select corresponding other variables to that identifier

Dear statalists,

I have a question with regard to organizing the data from long to wide and removing duplicate ids.

The data that I'm working with contains ids with multiple observations. The data is besides id, identified by date of delivery (delivery_date) as well, so some ids have multiple delivery dates.
I imported the dataset from excel, variable 1 and variable 2 were originally in excel stated in different rows.
Like this:
id/ delivery_date/ var1/var2 /obs
1/ 01-05-2018/ var 1/ 5
1 /01-05-2018 /var 2 /7
2 /02-01-2018/ var 1 /4
2 /02-01-2018/ var 2 /3
2 /02-12-2019/ var 1 /2
2/ 02-12-2019/ var 2/ 10
3/ 04-05-2019/ var 1/ 6
3/ 04-05-2019/ var 2 /8

I subtracted var 1 and var 2 by sorting, selecting and copying into 2 different columns in excel and than imported it in Stata. Now, every id with an observation in variable 1 has a missing value in variable 2 and the the other way round (the same id number in the next row has a missing value in variable 1 and an observation in variable 2).
I would like to organise the data as following: one id per row, that id is selected on the highest value of var 1 (in my example, id 2 would become var 1 = 4 and var 2 = 3).

I tried a lot of commands like collapse (but than I got averages of the variables...), replace wide,
sort id date_delivery var 1
quietly by id date_delivery var 1: gen dup = cond(_N==1,0,_n)... I couldn't find the solution yet...

Thanks a lot in advance!

Regards, Anouk

BJ Data Tech Solution

0 Response to Multiple observations per id, select id with highest value in other identifier and select corresponding other variables to that identifier

Post a Comment

Related Posts with Multiple observations per id, select id with highest value in other identifier and select corresponding other variables to that identifier

0 Response to Multiple observations per id, select id with highest value in other identifier and select corresponding other variables to that identifier

Post a Comment