Dear all,

I am using Stata/SE 16.1 for Mac and have created a made-up dataset to illustrate my problem. The dataset includes:
  • v_cont: a continuous variable
  • v_bin_1 - v_bin_8 and outcome: binary (0/1) variables
  • v_categ: a categorical variable with the values 1, 2, and 3

I set up and ran a lasso logit regression and predicted the values using the following commands:
Code:
vl set v_bin_1-v_bin_8 v_categ v_cont, categorical(3) uncertain(0)
vl substitute ifactors = i.vlcategorical
lasso logit outcome $ifactors $vlcontinuous
predict p_predicted

I plan to turn the coefficients into a score (i.e. multiply them by 10 and round), so I would like to understand how the predict command computes its values. Below are the two Stata commands I ran, followed by the output I got:
Code:
estimates store cv
lassocoef cv, display(coef)
cv
2.v_categ 9.619903
_cons -.3400077
Legend:
b - base level
e - empty cell
o - omitted



I tried to calculate the predicted values "manually" using the following commands:
Code:
gen ln_odds_of_outcome = 9.619903 * (v_categ==2) - .3400077
gen p_manual = exp(ln_odds_of_outcome)/(1+exp(ln_odds_of_outcome))
However, the variables p_predicted and p_manual differ strongly when v_categ != 2. Can you help me find my error?
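
To narrow this down, here is a minimal sketch of the comparison I have in mind. It assumes that predict after lasso accepts the xb option and that lassocoef accepts display(coef, penalized) and display(coef, postselection). I have not verified which set of coefficients each command uses by default, so that part is a guess on my side. The variable names xb_predicted, xb_manual and p_manual2 are new and only for illustration.
Code:
* linear predictor (log odds) as computed by predict; assumes the xb option is available after lasso logit
predict xb_predicted, xb

* the same manual calculation as above, using invlogit() instead of exp()/(1+exp())
gen xb_manual = 9.619903 * (v_categ==2) - .3400077
gen p_manual2 = invlogit(xb_manual)

* show the coefficients on other scales to see which set matches what predict uses
lassocoef cv, display(coef, penalized)
lassocoef cv, display(coef, postselection)

* compare the predictions within each level of v_categ
bysort v_categ: summarize p_predicted p_manual2 xb_predicted xb_manual

If xb_predicted and xb_manual already disagree, the difference would come from the coefficients themselves and not from my conversion to probabilities.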

Thank you in advance.
Martin