Create a new variable from existing variables where the categories are not mutually exclusive

Hi,

I am trying to create a new variable, adv_outcome. I have 5 pre-existing variables I am interested in: sad, angry, grief, disappoint, and hope. These categories are not mutually exclusive (i.e. some patients will be sad AND angry). Each variable is binary: 0=no, 1=yes.

I tried using:
gen adv_outcome= 0 if sad==0 & angry==0 & grief==0 & disappoint==0 & hope==0
replace adv_outcome=1 if sad==1
replace adv_outcome=2 if angry==1
replace adv_outcome=3 if grief==1
replace adv_outcome=4 if disappoint==1
replace adv_outcome=5 if hope==1

The issue is that each replace overwrites the value assigned by the previous one, so a patient with several outcomes "moves" to the last matching category instead of being counted in all of them. The reason I want this adv_outcome variable is so that I can tabulate it against other factors such as age, gender, and rurality and see, for example, what % of men experience each adverse outcome, rather than creating 6 separate tables for each patient-factor variable. I do have an any_outcome variable that flags whether any of the 5 (sad, angry, grief, disappoint, hope) occurred, and that seems to be working fine, but I'd like to see which outcomes were most common.
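A minimal sketch of the overwriting behaviour, using a made-up three-patient dataset (the values are hypothetical and only there to illustrate the point):

```stata
* Toy data: three hypothetical patients, each outcome coded 0=no, 1=yes
clear
input sad angry grief disappoint hope
1 1 0 0 0
0 0 0 0 0
1 0 0 0 1
end

* Same logic as the attempt above
gen adv_outcome = 0 if sad==0 & angry==0 & grief==0 & disappoint==0 & hope==0
replace adv_outcome = 1 if sad==1
replace adv_outcome = 2 if angry==1
replace adv_outcome = 3 if grief==1
replace adv_outcome = 4 if disappoint==1
replace adv_outcome = 5 if hope==1

* Patient 1 (sad and angry) ends up coded 2, patient 3 (sad and hope) ends up coded 5:
* a single variable can hold only one value per patient, so the last replace wins
list
```

Because one variable stores one value per observation, no single-valued adv_outcome can count a patient under two outcomes at once; the five binary variables already carry that information.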

This is what I would like the crosstabs to look like (the percentages are made up for the example; I obviously just want the numbers, not the words, but I put the words in to convey what I want as clearly as possible):

	Metro	Regional	Rural
none	12% experienced none	24% experienced none	5% experienced none
sad	36% experienced sad	47% experienced sad	60% experienced sad
angry	40% experienced angry	10% experienced angry	40% experienced angry
grief	65% experienced grief	60% experienced grief	30% experienced grief
disappoint	35% experienced disappointment	5% experienced disappointment	90% experienced disappointment
hope	80% experienced hope	70% experienced hope	67% experienced hope

This brings me to my next issue. The dataset is 2,000 people (1,000 men, 1,000 not-men). If patients can fall into multiple groups, a column could end up with a total of 5,000, assuming that all 1,000 men felt all 5 emotions. Would percentages then be calculated /5,000 instead of /2,000?
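One possible way to look at the denominator question is to treat each outcome as its own 0/1 variable and take its mean within each group, so every percentage is out of the number of people in that group rather than the number of outcome reports. The sketch below assumes a grouping variable named rurality (the name and its coding are assumptions) and uses made-up data; none and the pct_ variables are hypothetical helpers, not anything from the original dataset.

```stata
* Toy data: four hypothetical patients in two rurality groups (1 and 2)
clear
input rurality sad angry grief disappoint hope
1 1 1 0 0 0
1 0 0 0 0 0
2 1 0 0 0 1
2 1 1 1 1 1
end

* Flag patients with none of the five outcomes
egen n_outcomes = rowtotal(sad angry grief disappoint hope)
gen none = n_outcomes == 0

* Rescale each 0/1 indicator to 0/100 so that its group mean is a percentage
foreach v of varlist none sad angry grief disappoint hope {
    gen pct_`v' = 100 * `v'
}

* One table: mean of each indicator within each rurality group.
* Each percentage uses the group size as its denominator, so the
* percentages across outcomes can add up to more than 100.
tabstat pct_none pct_sad pct_angry pct_grief pct_disappoint pct_hope, ///
    by(rurality) statistics(mean) format(%5.1f)
```

With this toy data, group 2 shows sad at 100% and angry at 50%: both are out of the 2 people in the group, even though those 2 people report 7 outcomes between them. The table comes out transposed relative to the layout above (groups as rows, outcomes as columns).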

Is it easier to just run each table separately and manually type the numbers into a normal table? Am I missing something super obvious? I feel like I've been staring at it all day making zilch progress.

Thanks, J
