Hello,
I am trying to build a mortality prediction model using lasso regression. So far I have separated the variables into continuous and categorical subsets and split the data into training and testing samples. However, when I run the lasso regression itself, an error occurs. My data set has around 400 observations and 190 variables. I have run the following code so far:
*lasso regression steps
*dividing variables into categorical and continuous subsets
vl set, categorical(6) uncertain(0) dummy
vl list vlcategorical
vl list vlother
vl move (s1 s2 s3 s4 s5) vlother
vl list vldummy
vl move (mv2 mv3 mv4 mv5 mv6 mv7 mv8 mv9 mv10 mv11 mv12 mv13 mv14) vlother
vl list vlcontinuous
vl list vldummy
vl list vlcategorical
vl create factors = vldummy + vlcategorical
vl substitute ifactors = i.factors
label data "Survey data with vl"
save survey_vl
*splitting sample into Training and Testing
set seed 1234
splitsample, generate(sample) nsplit(2)
label define svalues 1 "Training" 2 "Testing"
label values sample svalues
lasso logit mortalityd $ifactors $vlcontinuous if sample == 1, rseed(1234)
*error returned by the lasso command above: "the number of observations is less than the cross-validation folds" r(198);
The error occurred when I ran the last command (lasso logit). I do not understand what the error means, and I do not know what cross-validation is either. I would really appreciate help understanding the error and how I could fix it. I am using Stata 17 through my university portal.
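A minimal diagnostic sketch, under an unconfirmed assumption: the error may arise because observations with a missing value in any model variable are dropped from the estimation sample, leaving fewer usable training observations than cross-validation folds (10 by default). The variable name nmiss is hypothetical and purely illustrative; $factors and $vlcontinuous are the global macros created by the vl commands above.

```stata
* Count missing values per observation across the outcome and all candidate predictors
* (nmiss is a hypothetical helper variable, not part of the original code)
egen nmiss = rowmiss(mortalityd $factors $vlcontinuous)

* Number of training observations with complete data on every model variable
count if sample == 1 & nmiss == 0

* Show which variables have missing values in the training sample
misstable summarize mortalityd $factors $vlcontinuous if sample == 1
```

If the count is below 10, that would be consistent with this explanation, and the misstable output would show which variables account for most of the lost observations.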
Thank you!
Barsa