I am working on a database of federal contractors, and I'm trying to create dummy variables for the different business types an entity can have. The problem is that a contractor in this dataset can have multiple business types, all of which are stored in a single string variable. For example:
contractor | Busn_type_str |
A | 23~2X~PI |
B | 1D~23~27~A5~A8~H2~HK~PI~QF |
In summary: I have a string of two-character codes separated by ~ (78 distinct codes in total), and I want to create 78 dummy variables, one per code, each indicating whether that code is present in the string. I am trying to avoid generating them one by one using regexm. I am considering a for loop combined with regexm, but I would like to know whether there are other methods that might save me some time.
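To make the loop idea concrete, here is a minimal sketch of what I have in mind. It uses strpos rather than regexm, and it assumes the codes are always separated by ~ and contain only letters and digits, so they can be appended to a variable name. The part_, padded, and bt_ names are just placeholders I made up.

```stata
* Toy data matching the example above
clear
input str1 contractor str40 Busn_type_str
"A" "23~2X~PI"
"B" "1D~23~27~A5~A8~H2~HK~PI~QF"
end

* Collect every distinct code that occurs anywhere in the data
local codes
split Busn_type_str, parse("~") generate(part_)
foreach v of varlist part_* {
    levelsof `v', local(lv) clean
    local codes : list codes | lv
}
drop part_*

* Pad with the delimiter so each code is matched as a whole token
generate padded = "~" + Busn_type_str + "~"

* One 0/1 indicator per code (bt_ prefix keeps names valid, e.g. bt_23)
foreach c of local codes {
    generate byte bt_`c' = strpos(padded, "~`c'~") > 0
}
drop padded
```

Padding with ~ on both sides should not matter while every code is exactly two characters, but it would guard against partial matches if the codes ever differ in length.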
Thanks for all your help!
-Enrique A Figueroa