I'm trying to split a huge data set into about 400 files so that I can run an analysis on them in Stata using collapse and reshape. The more files I can split it into, the faster it will run.
Since I have panel data with multiple entries for each patient_id, I have to make sure that identical IDs stay together, so I am trying to form groups of IDs.
My data looks like this:
patient_id | x | y | z |
1          |   |   |   |
1          |   |   |   |
1          |   |   |   |
2          |   |   |   |
3          |   |   |   |
3          |   |   |   |
3          |   |   |   |
4          |   |   |   |
4          |   |   |   |
I'd like to group the patient_ids like this:
patient_id | group | x | y | z |
1          | 1     |   |   |   |
1          | 1     |   |   |   |
1          | 1     |   |   |   |
2          | 1     |   |   |   |
3          | 2     |   |   |   |
3          | 2     |   |   |   |
3          | 2     |   |   |   |
4          | 2     |   |   |   |
4          | 2     |   |   |   |
Written out by hand, that would be:
gen group = 1 if patient_id <= 2
replace group = 2 if patient_id > 2 & patient_id <= 4
replace group = 3 ....... and so on for 400 different groups.
I need to make sure that no patient_id is split across groups (i.e. patient_id=1 must not be cut at its 2nd observation, which would put patient 1 in both group 1 and group 2).
Any feedback or alternative methods would be much appreciated.
Thanks,
vishal
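
One possible way to avoid writing 400 replace lines, as a rough sketch rather than a tested solution: number the distinct patients consecutively with egen's group() function and then scale that number down to 400 groups. The names pid_seq, ngroups, npat and the part_ file names are made up for illustration, and the sketch assumes the data are already in memory and patient_id has no missing values.

```stata
* Number of files wanted (hypothetical local macro name).
local ngroups 400

* Number the distinct patients 1, 2, 3, ... in sorted order of patient_id.
egen long pid_seq = group(patient_id)

* The largest sequence number is the count of distinct patients.
quietly summarize pid_seq, meanonly
local npat = r(max)

* Spread the patients evenly over the groups. All rows of a patient share
* one pid_seq, so a patient can never end up in two different groups.
gen int group = ceil(pid_seq * `ngroups' / `npat')

* Write one file per group; the part_ file names are only placeholders.
forvalues g = 1/`ngroups' {
    preserve
    keep if group == `g'
    save "part_`g'.dta", replace
    restore
}
```

With the 4 patients and 2 groups from the example above, this formula gives patients 1 and 2 group 1 and patients 3 and 4 group 2. Groups are balanced by number of patients, not by number of rows, so file sizes will differ when some patients have many more entries than others. On a very large data set the repeated preserve and restore may be slow.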