Dear Statalists,
I have a question about reorganizing data from long to wide and removing duplicate ids.
The data I'm working with contain ids with multiple observations. Besides id, observations are identified by date of delivery (delivery_date), so some ids have multiple delivery dates.
I imported the dataset from Excel, where variable 1 and variable 2 were originally stored in different rows.
Like this:
id / delivery_date / variable (var 1 or var 2) / obs
1 / 01-05-2018 / var 1 / 5
1 / 01-05-2018 / var 2 / 7
2 / 02-01-2018 / var 1 / 4
2 / 02-01-2018 / var 2 / 3
2 / 02-12-2019 / var 1 / 2
2 / 02-12-2019 / var 2 / 10
3 / 04-05-2019 / var 1 / 6
3 / 04-05-2019 / var 2 / 8
I separated var 1 and var 2 by sorting, selecting and copying them into two different columns in Excel, and then imported the result into Stata. Now every row with an observation in variable 1 has a missing value in variable 2, and the other way round (the next row, with the same id and delivery date, has a missing value in variable 1 and an observation in variable 2).
I would like to organise the data as follows: one row per id, where for each id the delivery date with the highest value of var 1 is kept, together with the var 2 value from that same delivery (in my example, id 2 would end up with var 1 = 4 and var 2 = 3).
I tried a lot of commands, like collapse (but then I got averages of the variables...), reshape wide, and
sort id delivery_date var1
quietly by id delivery_date var1: gen dup = cond(_N==1,0,_n)
... but I haven't found the solution yet.
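For what it's worth, here is a minimal sketch of the kind of result I am after, typed on the example data above as they now sit in Stata. The storage types, the string delivery date and the helper variable max_var1 are my assumptions for illustration only. I am not sure this is the right or most robust way to do it.

```stata
* Example data as imported: var1 and var2 sit on alternating rows
clear
input byte id str10 delivery_date byte(var1 var2)
1 "01-05-2018" 5 .
1 "01-05-2018" . 7
2 "02-01-2018" 4 .
2 "02-01-2018" . 3
2 "02-12-2019" 2 .
2 "02-12-2019" . 10
3 "04-05-2019" 6 .
3 "04-05-2019" . 8
end

* One row per id and delivery date: take the first nonmissing value
* of each variable instead of the default mean
collapse (firstnm) var1 var2, by(id delivery_date)

* Within each id, keep the delivery with the highest var1
* (max_var1 is a hypothetical helper; egen's max() ignores missings)
egen double max_var1 = max(var1), by(id)
keep if var1 == max_var1
drop max_var1

list, sepby(id)
```

If two deliveries of the same id tie on the highest var 1, this would keep both rows, so an extra rule (for example, the latest delivery date) would still be needed there.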
Thanks a lot in advance!
Regards, Anouk