Dear statalists,
I have a question with regard to organizing the data from long to wide and removing duplicate ids.
The data that I'm working with contains ids with multiple observations. The data is besides id, identified by date of delivery (delivery_date) as well, so some ids have multiple delivery dates.
I imported the dataset from excel, variable 1 and variable 2 were originally in excel stated in different rows.
Like this:
id/ delivery_date/ var1/var2 /obs
1/ 01-05-2018/ var 1/ 5
1 /01-05-2018 /var 2 /7
2 /02-01-2018/ var 1 /4
2 /02-01-2018/ var 2 /3
2 /02-12-2019/ var 1 /2
2/ 02-12-2019/ var 2/ 10
3/ 04-05-2019/ var 1/ 6
3/ 04-05-2019/ var 2 /8
I subtracted var 1 and var 2 by sorting, selecting and copying into 2 different columns in excel and than imported it in Stata. Now, every id with an observation in variable 1 has a missing value in variable 2 and the the other way round (the same id number in the next row has a missing value in variable 1 and an observation in variable 2).
I would like to organise the data as following: one id per row, that id is selected on the highest value of var 1 (in my example, id 2 would become var 1 = 4 and var 2 = 3).
I tried a lot of commands like collapse (but than I got averages of the variables...), replace wide,
sort id date_delivery var 1
quietly by id date_delivery var 1: gen dup = cond(_N==1,0,_n)... I couldn't find the solution yet...
Thanks a lot in advance!
Regards, Anouk
Related Posts with Multiple observations per id, select id with highest value in other identifier and select corresponding other variables to that identifier
xtfmb function, unknown egen function group()Hi there, I am having some issues for xtfmb function. My command is xtfmb y x1 x2 x3, verbose. Also,…
need some help for creating a graph.How to create a graph like this in stata? Array I have a change from baseline variable , but i can't…
Marginal effects with IV estimationHello, I am trying to estimate the effect of private tutoring on learning outcomes (math ability). I…
Howto build a covariance structureDear all We are having a journal club reading Diggle et all, Analysis of Longitudinal Data. In there…
Predict,xbHi all, I am trying to run the following command to predict yhat and I am only getting missing value…
Subscribe to:
Post Comments (Atom)
0 Response to Multiple observations per id, select id with highest value in other identifier and select corresponding other variables to that identifier
Post a Comment