I am trying to create two dummy variables subject to a few conditions, but for some reason it does not work all the time. For some years it works perfectly, but for others the dummy should equal 1 and it does not.
Here are the steps: first, I compute the 25th and 75th percentiles, which serve as my thresholds. Then I create the two dummy variables using conditions on those thresholds.
Here is the code:
Code:
bysort prod year : egen p75KL=pctile(diffKL_jw_mean) if PTA_j==1, p(75)
bysort prod year : egen p25KL=pctile(diffKL_jw_mean) if PTA_j==1, p(25)
gen extKLyears = (p75KL <= diffKL_jw_mean) if PTA_j ==1 & KL_w !=. & KL_j !=.
gen intKLyears = (p25KL >= diffKL_jw_mean) if PTA_j ==1 & KL_w !=. & KL_j !=.
Here is a small data example:
Code:
input prod str3 j year diffKL_jw_mean PTA_j KL_w KL_j
1 "KEN" 2000 2.9516403 1 28022 1654
1 "UGA" 2000 3.2742886 1 28022 1029
1 "TZA" 2000 3.7610051 1 28022 668
end
After checking a bit, I think the problem comes from the number formats: in my full sample, where I do all the calculations, the display format of the variable "diffKL_jw_mean" is %10.0g, while that of "p75KL" is %9.0g (in the small dataset I created above, both variables have the same format).
Could this be the problem, and how can I fix it?
Thank you in advance!
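A minimal sketch of how this suspicion could be checked. It assumes that "diffKL_jw_mean" is stored as a double in the full sample, that egen creates "p75KL" as a float (the default unless a type is requested or set type has been changed), and that the dummy "extKLyears" has already been created as above. The names p75KL_d and extKLyears_d are hypothetical and used only for illustration.

```stata
* Show storage types: %10.0g is the default display format for double, %9.0g for float
describe diffKL_jw_mean p75KL

* Request double storage explicitly so the threshold keeps full precision
bysort prod year : egen double p75KL_d = pctile(diffKL_jw_mean) if PTA_j==1, p(75)

* Rebuild the dummy against the double-precision threshold
gen extKLyears_d = (p75KL_d <= diffKL_jw_mean) if PTA_j ==1 & KL_w !=. & KL_j !=.

* Observations where the two versions disagree would point to a precision issue
list prod j year diffKL_jw_mean p75KL p75KL_d if extKLyears != extKLyears_d
```

If the list is empty, the storage types are probably not the cause. If it is not, the same change (egen double) would apply to "p25KL" and "intKLyears".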