Hello everyone,

For my studies, I would like to estimate income-related inequality across countries using the concentration index. The outcome variable yi is a binary indicator for a doctor visit (yes=1, no=0), and the rank variable is income (in quantiles). I would like to use indirect standardization to estimate the need-adjusted health variable, which is defined as:

Code:
 yISi = yi - yxi + ymean 
where yxi is the need-predicted utilization and ymean the sample mean (0.84).
To model yi and obtain yxi, I run the following regression:

Code:
 global xvar "i.agegroup i.sah female"
global zvar "yedu employed rural_area married"
logit y income $xvar $zvar [pw=weight], robust
Here xvar contains the need variables (age group, self-rated health and gender) and zvar contains several non-need variables (years of education, employment status, living in a rural area and whether the individual is married).
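
In equation form, I read the regression above as the following logit model. The coefficient symbols (alpha, beta, gamma, delta) and the index notation are my own shorthand and do not correspond to any Stata output names:

```latex
\Pr(y_i = 1)
  = \Lambda\!\Big(\alpha + \beta\,\mathrm{income}_i
      + \sum_{k} \gamma_k\, x_{ki}
      + \sum_{j} \delta_j\, z_{ji}\Big),
\qquad
\Lambda(u) = \frac{1}{1 + e^{-u}}
```

Here the x_{ki} are the need variables in xvar and the z_{ji} are the non-need variables in zvar.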

yISi is estimated by this code:

Code:
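* Temporarily set each non-need variable to its weighted sample mean
foreach z of global zvar {
      quietly sum `z' [aw=weight]
      gen `z'_mean = r(mean)
      gen `z'_copy = `z'
      replace `z' = `z'_mean
}
* Predicted probability from the logit with the zvar held at their means
predict yhat

* Restore the original values of the non-need variables
foreach z of global zvar {
      replace `z' = `z'_copy
      drop `z'_copy `z'_mean
}

* Indirectly standardized outcome: actual - need-predicted + sample mean
gen y_is = y - yhat + ymean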
When I compute yISi this way, I get negative values as well as values greater than 1. Is this correct?
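
As a rough check of the arithmetic only, assuming yhat is a predicted probability strictly between 0 and 1 and y takes only the values 0 and 1, the formula itself seems to allow values outside the unit interval. The notation below (hats and bars) is my own:

```latex
y_i^{IS} = y_i - \hat{y}_i^{X} + \bar{y},
\qquad y_i \in \{0, 1\},
\qquad 0 < \hat{y}_i^{X} < 1

% Case y_i = 0:  \bar{y} - 1 < y_i^{IS} < \bar{y}
% Case y_i = 1:  \bar{y} < y_i^{IS} < \bar{y} + 1

\text{With } \bar{y} = 0.84:\qquad
-0.16 < y_i^{IS} < 0.84 \ \ (y_i = 0),
\qquad
0.84 < y_i^{IS} < 1.84 \ \ (y_i = 1)
```

So negative values and values above 1 appear to be possible by construction, but I am unsure whether that is acceptable for the next step.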

Afterward, I use the conindex command to estimate the concentration index. Do I have to take into account any bounds or corrections (e.g. Wagstaff)?
The code I used is:

Code:
conindex y_is [fweight=weight], rankvar(income) robust

I hope you can help me!

Thanks,
Jake