Hello everyone!
I'm working with string data on medical diagnoses. I have a variable "Description" that contains a phrase (e.g., "Malignant neoplasm of peripheral nerves of abdomen" or "Intraductal carcinoma in situ of left breast").
I need to identify all patients with cancer in the data. Cancer can be indicated by different terms within these phrases, such as "malignant", "neoplasm", or "carcinoma".
I would like to create a new dichotomous variable called "cancer" and set cancer=1 if the variable "Description" contains any of these "buzz words". I have approximately 6-8 of these buzz words that would identify a patient as having cancer.
I came up with:
replace cancer=1 if regexm(Description, `"carcinoma"')
This seems to work. Is this correct? Is there a way to add an "or" condition so that the command covers multiple "buzz words", for example both "neoplasm" and "malignant"?
I would appreciate any help! Otherwise, I will have to review thousands of entries manually.
Thanks!
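For what it's worth, here is a minimal sketch of one way the "or" could be written. It assumes the variable is named Description as above and that cancer has not been created yet. Only the three keywords named in the post are used, so the list would need extending with the remaining buzz words. Since regexm() is case-sensitive, the sketch lowercases the text first so that "Malignant" and "malignant" both match.

```stata
* Sketch only: the keyword list is illustrative and should be extended.
* The | inside the pattern acts as "or" between the alternatives.
generate byte cancer = 0
replace cancer = 1 if regexm(lower(Description), "malignant|neoplasm|carcinoma")

* Equivalent alternative without regular expressions: loop over the
* keywords and test each one with strpos(), which returns 0 if not found.
generate byte cancer2 = 0
foreach word in malignant neoplasm carcinoma {
    replace cancer2 = 1 if strpos(lower(Description), "`word'") > 0
}
```

Both versions match substrings, so a bare keyword such as "neoplasm" would also flag a description like "benign neoplasm of ...", if any of those occur in the data. Running tabulate Description if cancer == 1 on the result is a quick way to check for such false positives.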