Dear Statalists,
I have a question about reshaping data from long to wide and removing duplicate ids.
The data I'm working with contains ids with multiple observations. Besides id, the data is also identified by date of delivery (delivery_date), so some ids have multiple delivery dates.
I imported the dataset from Excel, where variable 1 and variable 2 were originally stored in different rows.
Like this:
id / delivery_date / var1 or var2 / obs
1 / 01-05-2018 / var 1 / 5
1 / 01-05-2018 / var 2 / 7
2 / 02-01-2018 / var 1 / 4
2 / 02-01-2018 / var 2 / 3
2 / 02-12-2019 / var 1 / 2
2 / 02-12-2019 / var 2 / 10
3 / 04-05-2019 / var 1 / 6
3 / 04-05-2019 / var 2 / 8
I separated var 1 and var 2 by sorting, selecting and copying them into two different columns in Excel, and then imported the result into Stata. Now every id with an observation in variable 1 has a missing value in variable 2, and the other way round (the same id in the next row has a missing value in variable 1 and an observation in variable 2).
I would like to organise the data as follows: one row per id, where the row kept for each id is the one with the highest value of var 1 (in my example, id 2 would end up with var 1 = 4 and var 2 = 3).
I tried a lot of commands, like collapse (but then I got averages of the variables...), reshape wide, and:
sort id delivery_date var1
quietly by id delivery_date var1: gen dup = cond(_N==1,0,_n)
I couldn't find the solution yet...
Thanks a lot in advance!
Regards, Anouk
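
One possible approach, offered as a minimal sketch and not a definitive answer: first combine the two half-filled rows of each id and delivery date into one row, then keep the delivery date with the highest var1 within each id. The example data below are typed in by hand from the question, the names date_str and max_var1 are made up for illustration, and the dates are assumed to be day-month-year strings.

```stata
* Hypothetical example data in the shape described in the question:
* var1 and var2 sit on separate rows, each missing in the other column.
clear
input id str10 date_str var1 var2
1 "01-05-2018" 5 .
1 "01-05-2018" . 7
2 "02-01-2018" 4 .
2 "02-01-2018" . 3
2 "02-12-2019" 2 .
2 "02-12-2019" . 10
3 "04-05-2019" 6 .
3 "04-05-2019" . 8
end

* Assumption: dates are day-month-year strings; convert to a Stata date.
generate delivery_date = date(date_str, "DMY")
format delivery_date %td
drop date_str

* Step 1: combine the two rows of each id/date pair into one row.
* (firstnm) takes the first nonmissing value instead of the default mean.
collapse (firstnm) var1 var2, by(id delivery_date)

* Step 2: within each id, keep the delivery date with the highest var1.
* egen's max() ignores missing values; ties would keep more than one row.
bysort id: egen max_var1 = max(var1)
keep if var1 == max_var1
drop max_var1

list, sepby(id)
```

With the example data this leaves one row per id, and id 2 keeps var1 = 4 and var2 = 3, as asked. Because each id and date pair has only one nonmissing value per variable, (firstnm) simply picks that value up. If two delivery dates of the same id share the highest var1, both rows survive, so a tie-breaking rule would still be needed in that case.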