Hi,

I have a string variable, "CaseDescription1", that holds a very long free-text description of each case. I want to create a new variable "RACF" that is coded 1 if "CaseDescription1" contains any of the following terms: RACF, ACF, HLCNH, NH, nursing home, aged care facility, and 0 if it contains none of them.
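To make the goal concrete, here is a rough sketch of the coding rule on three made-up rows (the example text is invented, not from my data). It assumes Stata 14 or later for ustrregexm(), and it assumes that case should be ignored and that short terms like NH should only match as whole words; both of those are my assumptions about what I want, not something I've confirmed works on the real variable.

```stata
* Toy data: three invented descriptions, for illustration only
clear
input str60 CaseDescription1
"Pt transferred from RACF with fall"
"Resident of nursing home, chest pain"
"Enhanced monitoring at home"
end

* Flag any of the target terms.
* \b marks a word boundary, so "NH" is not meant to match inside "Enhanced".
* The third argument (1) asks ustrregexm() to ignore case.
gen byte RACF = ustrregexm(CaseDescription1, "\b(RACF|ACF|HLCNH|NH|nursing home|aged care facility)\b", 1)

list CaseDescription1 RACF
tab RACF, missing
```

The intent is that the first two rows are flagged and the third is not.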

So far I have tried the following with RACF only (I figured that if I can't get it working with one term, it's definitely not going to work with several). As you'll see, it doesn't detect RACF in any of the cases, even though the term is definitely there in a few thousand of them:


. gen RACF = strpos(CaseDescription1, "RACF")

. gen RACF_pt = 1 if RACF > 0
(2,858,239 missing values generated)

. tab RACF_pt
no observations

. drop RACF

. drop RACF_pt

. gen byte RACF = strmatch( CaseDescription1 , "*RACF*")

. tab RACF

       RACF |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |  2,858,239      100.00      100.00
------------+-----------------------------------
      Total |  2,858,239      100.00

. drop RACF

. gen byte RACF = 1 if strmatch( CaseDescription1 , "*RACF*")
(2,858,239 missing values generated)

. drop RACF



Thank you in advance.