I have a situation where I have 5 million records, many of them duplicates spellid's with separate primary operation codes. (This is when they have received multiple operations in one spell for example)
I need to collapse on spellid as I am looking at outcome for spells, but I want to keep all relevant operation codes for each spellid
I thought I had solved the issue with the following code:
duplicates report spellid primaryoperationcode
duplicates tag spellid, generate(duplicate)
gen primaryoperationcode2=primaryoperationcode if duplicate==1
gen primaryoperationcode3=primaryoperationcode if duplicate==2
gen primaryoperationcode4=primaryoperationcode if duplicate==3
gen primaryoperationcode5=primaryoperationcode if duplicate==4
gen primaryoperationcode6=primaryoperationcode if duplicate==5
gen primaryoperationcode7=primaryoperationcode if duplicate==6
gen primaryoperationcode8=primaryoperationcode if duplicate==7
gen primaryoperationcode9=primaryoperationcode if duplicate==8
gen primaryoperationcode10=primaryoperationcode if duplicate==9
gen primaryoperationcode11=primaryoperationcode if duplicate==10
gen primaryoperationcode12=primaryoperationcode if duplicate==11
gen primaryoperationcode13=primaryoperationcode if duplicate==12
gen primaryoperationcode14=primaryoperationcode if duplicate==13
gen primaryoperationcode15=primaryoperationcode if duplicate==14
gen primaryoperationcode16=primaryoperationcode if duplicate==15
gen primaryoperationcode17=primaryoperationcode if duplicate==16
gen primaryoperationcode18=primaryoperationcode if duplicate==17
collapse (firstnm) primaryoperationcode primaryoperationcode2 primaryoperationcode3 primaryoperationcode4 primaryoperationcode5 primaryoperationcode6 primaryoperationcode7 ///
primaryoperationcode8 primaryoperationcode9 primaryoperationcode10 primaryoperationcode11 primaryoperationcode12 primaryoperationcode13 primaryoperationcode14 primaryoperationcode15 ///
primaryoperationcode16 primaryoperationcode17 primaryoperationcode18 transfer rtt admissionage mainspecialty patientsex provider providerphinid purchasertype site sitephinid ///
charlindex patientimdscore dischargedate episodeid ethniccategory consultantcode consultant phinproceduregroup phinproceduregroupid, by (spellid)
however this has of course not solved the issue, as for duplicate==1, it codes the primaryoperationcode for both records as primaryoperationcode2, and for duplicate==2 it codes the primaryoperationcode for all 3 records as primaryoperationcode3, and so on
Does anyone have any suggestions?
Related Posts with not losing data when collapsing on spellid
Grouping Together Industries- Should I use Varlist?I am analyzing the effect of Covid on unemployment and I believe a major explanatory factor is the i…
High R-sqaured, Low Variable SignificanceHi everyone! I am currently working on a project that used secondary data and struggle a bit with th…
Diff GMM Stationarity ConcernsHi all, I am conducting an analysis with T=32 (quarters) and N=52 using difference GMM. I have my r…
MergingI am using time use survey 2019 in India.I have extracted 5 datasets of different levels.In level5 e…
Long-term effective tax rate over 5 yearsDear Stata experts I am a bloody beginner when it comes to stata. I am working with unbalanced pan…
Subscribe to:
Post Comments (Atom)
0 Response to not losing data when collapsing on spellid
Post a Comment