I have a situation where I have 5 million records, many of them duplicates spellid's with separate primary operation codes. (This is when they have received multiple operations in one spell for example)
I need to collapse on spellid as I am looking at outcome for spells, but I want to keep all relevant operation codes for each spellid
I thought I had solved the issue with the following code:
duplicates report spellid primaryoperationcode
duplicates tag spellid, generate(duplicate)
gen primaryoperationcode2=primaryoperationcode if duplicate==1
gen primaryoperationcode3=primaryoperationcode if duplicate==2
gen primaryoperationcode4=primaryoperationcode if duplicate==3
gen primaryoperationcode5=primaryoperationcode if duplicate==4
gen primaryoperationcode6=primaryoperationcode if duplicate==5
gen primaryoperationcode7=primaryoperationcode if duplicate==6
gen primaryoperationcode8=primaryoperationcode if duplicate==7
gen primaryoperationcode9=primaryoperationcode if duplicate==8
gen primaryoperationcode10=primaryoperationcode if duplicate==9
gen primaryoperationcode11=primaryoperationcode if duplicate==10
gen primaryoperationcode12=primaryoperationcode if duplicate==11
gen primaryoperationcode13=primaryoperationcode if duplicate==12
gen primaryoperationcode14=primaryoperationcode if duplicate==13
gen primaryoperationcode15=primaryoperationcode if duplicate==14
gen primaryoperationcode16=primaryoperationcode if duplicate==15
gen primaryoperationcode17=primaryoperationcode if duplicate==16
gen primaryoperationcode18=primaryoperationcode if duplicate==17
collapse (firstnm) primaryoperationcode primaryoperationcode2 primaryoperationcode3 primaryoperationcode4 primaryoperationcode5 primaryoperationcode6 primaryoperationcode7 ///
primaryoperationcode8 primaryoperationcode9 primaryoperationcode10 primaryoperationcode11 primaryoperationcode12 primaryoperationcode13 primaryoperationcode14 primaryoperationcode15 ///
primaryoperationcode16 primaryoperationcode17 primaryoperationcode18 transfer rtt admissionage mainspecialty patientsex provider providerphinid purchasertype site sitephinid ///
charlindex patientimdscore dischargedate episodeid ethniccategory consultantcode consultant phinproceduregroup phinproceduregroupid, by (spellid)
however this has of course not solved the issue, as for duplicate==1, it codes the primaryoperationcode for both records as primaryoperationcode2, and for duplicate==2 it codes the primaryoperationcode for all 3 records as primaryoperationcode3, and so on
Does anyone have any suggestions?
Related Posts with not losing data when collapsing on spellid
How to compute margins after meologit (for each dependent variable level)Problem solved. …
Flagging negative values within an idHi. My dataset has an id variable and a total amount variable. There are multiple observations per i…
JoinPoint RegressionHello everyone, I am working on a dataset with 15 years of outcomes. I am very interested in plotti…
Spline for Panel Regression (Random Effects)Hi all, I am modeling daily deaths per 100,000 from COVID-19 by country. Given the natural rise and…
How to count number of observations per idHi everyone, I have a large data with around 20,000 observations where V1 is the household ID that …
Subscribe to:
Post Comments (Atom)
0 Response to not losing data when collapsing on spellid
Post a Comment