Which is the correct approach in coding a dummy variable

Hi Statalist.

I want to generate a dummy variable from a categorical variable with values ranging from 0 to 10. The range 0-2 is nil to low and 3-10 is mid to high. Note that I have two categorical variables: one holds responses by husbands and the other responses by wives (relimp1 - importance for husband, relimp2 - importance for wife):

Code:

gen byte imp2 = inrange(relimp1, 3, 10) & inrange(relimp2, 3, 10) & relimp1 < . & relimp2 < .

However, as you can see below, "0" was given when relimp1 or relimp2 was missing, so I tried:

Code:

gen byte imp4 = 1 if inrange(relimp1, 3, 10) & inrange(relimp2, 3, 10) & relimp1 < . & relimp2 < .
replace imp4 = 0 if (relimp12 == 1 & relimp22 == 1) | (relimp12 == 1 & inlist(relimp22, 2, 3)) | (inlist(relimp12, 2, 3) & relimp22 == 1)

which provided "1" when true, "0" when false, and "." when missing - which is what I thought I should get. Based on my reading of https://www.stata.com/support/faqs/d...rue-and-false/ I thought the first piece of code would have given me this outcome.

Given that the first piece of code produces considerably more "0" values than the second, I believe I should go with the second piece of code (imp4). Am I reading too much into this? Help is appreciated.
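For comparison, here is a minimal sketch of one way to get "." instead of "0" from the first approach. It relies on two points: a logical expression evaluates to 0 or 1 in every observation, and inrange() returns 0 when its first argument is missing. The variable name imp2_alt is hypothetical and is not part of my data:

```stata
* Sketch only: imp2_alt is a hypothetical name, not a variable from the post.
* The expression itself is 0 or 1 in every observation; the -if- qualifier
* leaves the new variable missing wherever either input is missing.
gen byte imp2_alt = inrange(relimp1, 3, 10) & inrange(relimp2, 3, 10) if !missing(relimp1, relimp2)

* Cross-tabulate against the original, showing missing values explicitly.
tabulate imp2 imp2_alt, missing
```

If this is right, observations where either relimp1 or relimp2 is missing should show up as "0" in imp2 and as "." in imp2_alt.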

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input long(id p_id) byte(wave relimp1 relimp2 imp2 imp4)
106 1002 10  .  . 0 .
106 1002 11  .  . 0 .
106 1002 12  .  . 0 .
106 1002 13  .  . 0 .
106 1002 14  0  0 0 0
106 1002 15  .  . 0 .
106 1002 16  .  . 0 .
106 1002 17  .  . 0 .
106 1002 18  0  0 0 0
108  109  1  .  . 0 .
108  109  2  .  . 0 .
108  109  3  .  . 0 .
108  109  4  5  6 1 1
108  109  5  .  . 0 .
108  109  6  .  . 0 .
108  109  7  .  5 0 .
103  104  1  .  . 0 .
103  104  2  .  . 0 .
103  104  3  .  . 0 .
103  104  4 10 10 1 1
103  104  5  .  . 0 .
103  104  6  .  . 0 .
103  104  7 10 10 1 1
103  104  8  .  . 0 .
103  104  9  .  . 0 .
103  104 10 10 10 1 1
103  104 11  .  . 0 .
103  104 12  .  . 0 .
103  104 13  .  . 0 .
103  104 14 10 10 1 1
103  104 15  .  . 0 .
103  104 16  .  . 0 .
103  104 17  .  . 0 .
103  104 18 10 10 1 1
end

Am I correct in my understanding that

Code:

! missing(relimp1, relimp2)

is the same as

Code:

relimp1 < . & relimp2 < .
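One way to check this on the data itself, as a sketch that assumes relimp1 and relimp2 are both numeric (as in the -dataex- example above):

```stata
* Sketch only: assumes relimp1 and relimp2 are numeric variables.
* -assert- exits with an error if the two expressions disagree in any observation.
assert (!missing(relimp1, relimp2)) == (relimp1 < . & relimp2 < .)

* Counting disagreements gives the same information without stopping on error.
count if (!missing(relimp1, relimp2)) != (relimp1 < . & relimp2 < .)
```

If the two expressions are equivalent, the assert should pass silently and the count should be 0.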

Stata 15.1

Note that this was originally posted at https://www.statalist.org/forums/for...=1601514760045 but is reposted here because the nature of the question differs from that thread.

BJ Data Tech Solution
