I have two data sets on healthcare costs during the past 12 months, in which one is at individual level and the other is at household level. I want to merge the two but I encounter an issue of overlapping obs in the individual data. Specifically, the individual data consists of information on: 1) healthcare centers (public or private centers) where individuals visit for healthcare treatment; 2) services they use (whether out-patient or in-patient services); 3) costs incurred by outpatient visits; and 4) costs incurred by in-patient visits. Thus, it could be cases that an individual may use healthcare services more than one time during the last 12 months, at different healthcare centers, and different services. For example, Mr. went to hospitals three times in a given year, in which the 1st and 2nd time he used outpatient services at public healthcare centers, but the 3rd time he was hospitalized at a private healthcare center (note: this may not be the case in my example data but similar things may happen). That created challenges in generating a unique identifier for each individual.
My question is how to generate a unique identifier for each individual so that I can merge the individual data set into the household one, without dropping any variables in the both data sets. Any help is highly appreciated.
Note: prid comid hhid are uniquely identified observations in the household data
prid comid hhid invid should be uniquely identified observations in the individual data, but due to the overlapping issue, yet they are not.
Individual data
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input int prid long(comid hhid invid opcost ipcost) float hcenter byte service 101 1010103 14101 1410101 1000 0 1 0 101 1010103 14102 1410201 500 0 1 0 101 1010103 14102 1410201 700 0 0 0 101 1010103 14103 1410301 500 0 1 0 101 1010103 14103 1410301 0 1000 1 1 101 1010103 14103 1410303 0 3000 1 1 101 1010103 14103 1410303 300 0 0 0 101 1010103 14104 1410401 1500 0 1 0 101 1010109 2002 200201 90 0 1 0 101 1010109 2011 201101 200 0 0 0 101 1010109 2011 201102 210 0 0 0 101 1010109 2011 201103 500 0 1 0 101 1010109 2011 201104 300 0 0 0 101 1010109 2013 201302 150 0 1 0 101 1010109 2014 201403 20 0 1 0 101 1010109 2014 201403 480 0 0 0 101 1010109 2014 201403 800 0 0 0 101 1010109 2014 201404 . 400 1 1 101 1010109 2019 201901 . 200 1 1 101 1010109 2020 202002 . 5000 1 1 101 1010109 2021 202102 700 0 0 0 101 1010109 2021 202102 500 0 0 0 101 1010109 2021 202104 300 0 0 0 101 1010109 2022 202201 0 0 1 0 101 1010115 5006 500601 2000 0 1 0 101 1010115 5006 500603 15 0 1 0 101 1010115 5006 500604 15 0 1 0 101 1010115 5007 500701 . 300 1 1 101 1010115 5007 500701 . 400 1 1 101 1010115 5008 500801 . 315 1 1 101 1010115 5008 500801 . 210 1 1 101 1010115 5008 500801 . 315 1 1 101 1010115 5008 500801 . 315 1 1 101 1010115 5008 500802 200 0 1 0 101 1010115 5010 501001 400 0 1 0 101 1010115 5010 501001 . 500 1 1 101 1010115 5011 501101 3000 0 1 0 101 1010115 5011 501102 1500 0 1 0 101 1010115 5012 501204 195 0 1 0 101 1010115 5012 501204 220 0 1 0 101 1010115 5013 501304 . 600 1 1 101 1010115 5013 501304 . 200 1 1 101 1010115 5015 501501 . 100 1 1 101 1010115 5015 501501 . 120 1 1 101 1010115 5015 501501 . 130 1 1 101 1010115 5015 501503 280 0 1 0 101 1010115 5016 501601 335 0 1 0 101 1010115 5016 501603 335 0 1 0 101 1010115 5016 501604 335 0 1 0 101 1010115 5016 501605 335 0 1 0 101 1010115 5016 501606 440 0 1 0 101 1010115 5020 502002 . 1100 1 1 101 1010115 5020 502002 1000 0 0 0 101 1010115 5020 502003 150 0 0 0 101 1010115 5020 502003 . 1500 1 1 101 1010115 5020 502003 30 0 1 0 101 1010115 5020 502004 30 0 1 0 101 1010115 5020 502004 . 1500 1 1 101 1010115 5020 502004 150 0 0 0 101 1010115 5020 502005 30 0 1 0 101 1010115 5103 510301 0 1000 1 1 101 1010115 5106 510601 80 0 1 0 101 1010115 5107 510702 200 0 1 0 101 1010115 5108 510801 80 0 1 0 101 1010115 5108 510802 30 0 1 0 101 1010115 5109 510901 200 0 1 0 101 1010117 27101 2710101 50 0 1 0 101 1010117 27101 2710102 110 0 1 0 101 1010117 27103 2710301 300 0 0 0 101 1010117 27103 2710304 550 0 0 0 101 1010117 27104 2710401 20 0 1 0 101 1010117 27104 2710402 40 0 1 0 101 1010117 27105 2710501 300 0 1 0 101 1010117 27105 2710503 1150 0 0 0 101 1010117 27105 2710505 850 0 0 0 101 1010117 27106 2710602 500 0 1 0 101 1010123 5101 510101 0 2500 1 1 101 1010123 5102 510201 500 0 0 0 101 1010123 5102 510202 2000 0 0 0 101 1010123 5104 510401 80 0 1 0 101 1010303 5101 510101 100 0 1 0 101 1010303 5101 510102 100 0 1 0 101 1010311 1102 110205 500 0 1 0 101 1010311 1102 110206 600 0 1 0 101 1010311 1104 110401 0 1000 1 1 101 1010313 13001 1300102 400 0 1 0 101 1010313 13002 1300202 2000 0 1 0 101 1010313 13004 1300403 500 0 0 0 101 1010313 13006 1300602 . 4000 1 1 101 1010313 13009 1300901 100 0 1 0 101 1010313 13012 1301201 100 0 0 0 101 1010313 13013 1301302 200 0 1 0 101 1010313 13014 1301402 500 0 1 0 101 1010313 13016 1301603 . 35000 1 1 101 1010313 13019 1301902 . 120 1 1 101 1010313 13101 1310102 200 0 1 0 101 1010313 13103 1310301 0 3000 1 1 101 1010313 13103 1310302 0 2000 1 1 101 1010313 13105 1310504 0 500 1 1 101 1010503 4101 410103 0 200 1 1 end label values hcenter hcenter label values service service
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input int prid long(comid hhid topcost tipcost) 101 1010103 14101 1000 0 101 1010103 14102 1200 0 101 1010103 14103 800 4000 101 1010103 14104 1500 0 101 1010103 14105 0 0 101 1010109 2001 0 0 101 1010109 2002 90 0 101 1010109 2004 0 0 101 1010109 2006 0 0 101 1010109 2007 0 0 101 1010109 2008 0 0 101 1010109 2009 0 0 101 1010109 2010 0 0 101 1010109 2011 1210 0 101 1010109 2012 0 0 101 1010109 2013 150 0 101 1010109 2014 1300 400 101 1010109 2015 0 0 101 1010109 2016 0 0 101 1010109 2019 0 200 101 1010109 2020 0 5000 101 1010109 2021 1500 0 101 1010109 2022 0 0 101 1010109 2023 0 0 101 1010109 2024 0 0 101 1010115 5001 0 0 101 1010115 5002 0 0 101 1010115 5003 0 0 101 1010115 5004 0 0 101 1010115 5005 0 0 101 1010115 5006 2030 0 101 1010115 5007 0 700 101 1010115 5008 200 1155 101 1010115 5009 0 0 101 1010115 5010 400 500 101 1010115 5011 4500 0 101 1010115 5012 415 0 101 1010115 5013 0 800 101 1010115 5014 0 0 101 1010115 5015 280 350 101 1010115 5016 1780 0 101 1010115 5017 0 0 101 1010115 5018 0 0 101 1010115 5019 0 0 101 1010115 5020 1390 4100 101 1010115 5103 0 1000 101 1010115 5106 80 0 101 1010115 5107 200 0 101 1010115 5108 110 0 101 1010115 5109 200 0 101 1010117 27101 160 0 101 1010117 27103 850 0 101 1010117 27104 60 0 101 1010117 27105 2300 0 101 1010117 27106 500 0 101 1010123 5101 0 2500 101 1010123 5102 2500 0 101 1010123 5103 0 0 101 1010123 5104 80 0 101 1010123 5105 0 0 101 1010303 5101 200 0 101 1010303 5102 0 0 101 1010303 5103 0 0 101 1010303 5104 0 0 101 1010303 5105 0 0 101 1010311 1101 0 0 101 1010311 1102 1100 0 101 1010311 1103 0 0 101 1010311 1104 0 1000 101 1010311 1105 0 0 101 1010313 13001 400 0 101 1010313 13002 2000 0 101 1010313 13003 0 0 101 1010313 13004 500 0 101 1010313 13005 0 0 101 1010313 13006 0 4000 101 1010313 13007 0 0 101 1010313 13008 0 0 101 1010313 13009 100 0 101 1010313 13010 0 0 101 1010313 13011 0 0 101 1010313 13012 100 0 101 1010313 13013 200 0 101 1010313 13014 500 0 101 1010313 13015 0 0 101 1010313 13016 0 35000 101 1010313 13017 0 0 101 1010313 13018 0 0 101 1010313 13019 0 120 101 1010313 13020 0 0 101 1010313 13101 200 0 101 1010313 13102 0 0 101 1010313 13103 0 5000 101 1010313 13104 0 0 101 1010313 13105 0 500 101 1010503 4101 0 200 101 1010503 4103 0 0 101 1010503 4104 0 0 101 1010503 4106 200 0 101 1010503 4107 0 0 end
0 Response to How to deal with overlapping obs in healthcare data
Post a Comment