Hello everyone!
I'm working with string data on medical diagnoses. I have a variable "Description" that contains a phrase (e.g., "Malignant neoplasm of peripheral nerves of abdomen" or "Intraductal carcinoma in situ of left breast").
I need to identify all patients with cancer in the data. Cancer can be indicated by different terms within these phrases, such as "malignant", "neoplasm", or "carcinoma".
I would like to create a new dichotomous variable called "cancer" and set cancer=1 if the variable "Description" contains any of these "buzz words". I have approximately 6-8 buzz words that would identify a patient as having cancer.
I came up with:
replace cancer=1 if regexm(Description, `"carcinoma"')
This seems to work. Is this correct? Is there a way to add an "or" condition so that a single command checks multiple "buzz words", for example both "neoplasm" and "malignant"?
I would appreciate any help! Otherwise, I will have to review thousands of entries manually.
Thanks!
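One possible approach, offered as a sketch rather than a definitive answer: regexm() accepts the regular-expression alternation operator |, so several terms can go into one pattern. Because regexm() is case-sensitive and the example descriptions start with a capital letter ("Malignant ..."), the sketch matches against lower(Description). The three terms are the ones named in the post; the remaining buzz words would be added to the same pattern. The variable name cancer2 is hypothetical and exists only to cross-check the two forms.

```stata
* Sketch only: the terms below are the three named in the post; extend as needed.
* regexm() returns 1 on a match and 0 otherwise, so it can define the variable directly.
* lower() makes the comparison case-insensitive.
generate byte cancer = regexm(lower(Description), "malignant|neoplasm|carcinoma")

* Equivalent form with an explicit "or" (|) between conditions.
* strpos() returns the position of the substring, or 0 if it is absent.
* The /// line continuation works in a do-file, not in the Command window.
generate byte cancer2 = strpos(lower(Description), "malignant") > 0 | ///
    strpos(lower(Description), "neoplasm") > 0 | ///
    strpos(lower(Description), "carcinoma") > 0

* Cross-check: every observation should fall on the diagonal.
tabulate cancer cancer2
```

Note that this is plain substring matching, so a description such as "Benign neoplasm of ..." would also be flagged by the term "neoplasm"; it is worth listing a sample of matched descriptions to confirm the term list behaves as intended.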