I am working on a somewhat intricate reshape of my data set. I have done some research and testing but have not found a logical solution yet, and as this is important to solve for me I hope some of you may have input on this issue.
I have a data set with 100 hospitals in the first column and municipality's patient activity in 400 columns. Since the municipalities go into hospital's catchment area, I want to group the municipality columns by the hospital column. Some municipalities have patient acitivity registered at multiple hospitals. I only want to assign the municipality to one hospital by its highest patient acitivity number. For example: m1 and m2 will be registered by id = 3 in the reshaped data (municipality labels are important).
Array
To provide intuition I have created the above toy example with id = hospital and m1-7 = municipality 1 to 7. The cells is total number of patient contacts within a year. This is my current data set structure:
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input float(id m1 m2 m3 m4 m5 m6 m7) 1 . 100 . 1000 . 1800 . 2 100 . . . . 200 1500 3 3000 7000 . . . . . 4 . . 1000 . 1700 . . end
Code:
clear input float id str2 m 1 "m4" 1 "m6" 2 "m7" 3 "m1" 3 "m2" 4 "m3" 4 "m5" end
0 Response to Reshape and group data by highest value of variables
Post a Comment