1) After sorting by Drug and date, within each Drug group: if phases1 in an observation is the same as phases1 in the previous observation AND phases2 is not blank, replace phases1 with phases2. This is the case for observations 1 and 2 below.
2) After sorting by Drug and date, within each Drug group: if phases1 in an observation is NOT the same as phases1 in the previous observation AND phases2 is not blank, create a new observation with the same data (i.e. a duplicate). For example, observation 5 below.
Basically, the phases variables are in wide form, and I would have been able to reshape them and then manipulate the data further. But because the rest of the data is in long form, with multiple observations for the same Drug name, I'm unable to do so.
Code:
clear
input str64 Drug str11 phases1 str4 phases2 int date
"(+)-calanolide A" "i" "" 13772
"(+)-calanolide A" "i" "ii" 14941
"(+)-discodermolide" "i" "" 15508
"(+)-discodermolide" "i" "" 16464
"(R)-etodolac" "i" "ii" 15496
"100240" "i" "" 12645
"100240" "ii" "" 15110
"100240" "ii" "" 15561
"1018-ISS (inhaled)" "ii" "iii" 16244
"1069C85" "ii" "" 12219
"11 beta HSD inhibitors, AbbVie" "preclinical" "" 19730
"11 beta HSD inhibitors, Bristol-Myers Squibb-2" "preclinical" "" 19941
"117m-Sn-DTPA" "ii" "" 15943
"1192U90" "i" "" 13468
end
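To make the two rules concrete, here is a rough sketch of how they might be written, using only the first five observations of the example so that it runs on its own. The variable names obsno, same_prev and copy are made up for illustration. Two things are assumptions rather than part of the rules as stated: the first observation of each Drug counts as "not the same as the previous one" (which matches observation 5), and the added copy takes phases2 as its phase.

```stata
* Sketch only: first five observations of the example data
clear
input str64 Drug str11 phases1 str4 phases2 int date
"(+)-calanolide A" "i" "" 13772
"(+)-calanolide A" "i" "ii" 14941
"(+)-discodermolide" "i" "" 15508
"(+)-discodermolide" "i" "" 16464
"(R)-etodolac" "i" "ii" 15496
end

* Keep the original order so ties on date sort reproducibly
gen long obsno = _n

* 1 if phases1 equals phases1 of the previous observation of the same Drug;
* 0 otherwise, including the first observation of each Drug (assumption)
bysort Drug (date obsno): gen byte same_prev = (_n > 1) & (phases1 == phases1[_n-1])

* Rule 2: phases1 differs from the previous one and phases2 is not blank,
* so add one duplicate; copy is 1 for the added observation, 0 otherwise
expand 2 if !same_prev & phases2 != "", generate(copy)

* Rule 1: phases1 repeats the previous one and phases2 is not blank
replace phases1 = phases2 if same_prev & phases2 != ""

* Assumption: the added copy should carry phases2 as its phase;
* drop this line to keep the copy an exact duplicate
replace phases1 = phases2 if copy == 1

sort Drug date obsno copy
list Drug phases1 phases2 date copy, sepby(Drug)
```

Note that same_prev is computed once, on the original values of phases1, before anything is replaced. If the comparison should instead use the already updated values, the replace would have to run sequentially within each Drug group.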