Thanks in advance for reviewing my question. I am trying to run a glm model (below) with a restricted cubic spline, but it is predicting values > 1 for my binary outcome. I am running the exact same model with other continuous independent variables entered as splines and am having no issues; the problem occurs only with this particular variable. I thought it was because the values are small (between 0 and 0.0013), so I multiplied by 10,000, but the issue remains. I attached a graph of the predicted probabilities and a sample of the data. I cannot think of a reason why this is happening, so any help is appreciated! e2sfca30km is the variable giving me difficulties, and kmspline1-kmspline6 are the restricted cubic spline terms generated from e2sfca30km. I have tried this model with 3-7 splines.
e2sfca30km is measured at the census block group level, so there are only 9k unique values for the 580k individuals in the dataset. 1.5% of its values are 0. Skewness = 0.555, kurtosis = 4.17.
Code:
glm apncu_cat2 kmspline*, fam(poisson) link(log) vce(robust)
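For anyone reproducing this, here is a minimal sketch of how the out-of-range predictions could be located after fitting the model. It assumes the variables from the sample data are in memory; the name phat is hypothetical and only used for illustration. Note that with a log link the fitted mean is exp(xb), so nothing in the model itself constrains predictions to stay below 1.

```stata
* Fit the model as posted (Poisson family, log link, robust SEs)
glm apncu_cat2 kmspline*, family(poisson) link(log) vce(robust)

* Predicted mean on the response scale; with a log link this is exp(xb),
* which has no upper bound at 1
predict double phat, mu

* How many predictions exceed 1, and where do they sit on e2sfca30km?
count if phat > 1 & !missing(phat)
summarize e2sfca30km if phat > 1 & !missing(phat), detail

* Plot predictions against the original variable, with a reference line at 1
twoway scatter phat e2sfca30km, msize(tiny) yline(1)
```

If the predictions above 1 cluster at the extremes of e2sfca30km, that would suggest looking at how sparse the data are beyond the outer knots.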
Code:
clear
input float(apncu_cat2 kmspline1 kmspline2 kmspline3 kmspline4 kmspline5 kmspline6 e2sfca30km)
0 .00028818857 .00004457869 5.113746e-06 1.444405e-07 0 0 .00028818857
0 .00026230657 .000033261917 2.703544e-06 7.405989e-09 0 0 .00026230657
0 .00023844297 .00002469184 1.291617e-06 0 0 0 .00023844297
0 .0004159302 .0001384533 .00003760846 9.969267e-06 2.1287353e-06 1.2599354e-07 .0004159302
0 .0004119772 .00013444884 .000035937806 9.285647e-06 1.888368e-06 9.170773e-08 .0004119772
1 .00028010557 .00004080611 4.250433e-06 7.49899e-08 0 0 .00028010557
0 .00027958507 .00004057076 4.198444e-06 7.15074e-08 0 0 .00027958507
0 .00025103986 .000029003764 1.9509584e-06 1.3559164e-10 0 0 .00025103986
0 .00028010557 .00004080611 4.250433e-06 7.49899e-08 0 0 .00028010557
0 .0002661811 .000034817243 3.001255e-06 1.457813e-08 0 0 .0002661811
0 .00026700884 .00003515568 3.067567e-06 1.6548881e-08 0 0 .00026700884
0 .0004536351 .0001806384 .0000561545 .00001819453 5.471376e-06 9.093207e-07 .0004536351
1 .0005572831 .0003282382 .00012796781 .00005453524 .00002325828 7.007693e-06 .0005572831
0 .0003217929 .000062755695 9.952177e-06 8.650636e-07 6.584936e-10 0 .0003217929
0 .00026771924 .000035447887 3.1252505e-06 1.8375584e-08 0 0 .00026771924
0 .00026278905 .000033453016 2.739492e-06 8.1290095e-09 0 0 .00026278905
1 .00039412 .0001173194 .0000290054 6.584881e-06 1.025755e-06 1.1155322e-08 .00039412
0 .0003506344 .00008182417 .000015973721 2.301527e-06 9.386544e-08 0 .0003506344
0 .0002377031 .00002445282 1.258366e-06 0 0 0 .0002377031
0 .00026774168 .000035457142 3.127084e-06 1.843537e-08 0 0 .00026774168
0 .0002678029 .000035482408 3.132092e-06 1.8599193e-08 0 0 .0002678029
0 .0005172913 .00026659502 .00009716836 .00003847566 .000015118503 4.0678246e-06 .0005172913
0 .0004609424 .00018962205 .00006028068 .000020139 6.338406e-06 1.1614068e-06 .0004609424
1 .0001860779 .000011311463 7.507855e-08 0 0 0 .0001860779
1 .00023370735 .00002318857 1.0886133e-06 0 0 0 .00023370735
1 .00018509098 .000011122442 6.856325e-08 0 0 0 .00018509098
1 .0004410896 .00016580876 .00004946164 .000015114076 4.144417e-06 5.508027e-07 .0004410896
1 .00040353995 .00012616147 .000032538937 7.932543e-06 1.437277e-06 4.051566e-08 .00040353995
1 .0007652474 .0006941111 .0003179769 .00015772582 .00007791635 .000027962524 .0007652474
0 .00026773266 .00003545342 3.126347e-06 1.8411317e-08 0 0 .00026773266
end
label values apncu_cat2 apncu2
label def apncu2 0 "Inadequate or Intermediate", modify
label def apncu2 1 "Adequate or Adequate Plus", modify
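The rescaling step mentioned in the question can be sketched roughly as below. This is an illustration under assumptions, not the exact code that was run: the names e2sfca_10k, ks10k, and phat10k are hypothetical, and nknots(7) is assumed because seven knots yield the six spline variables shown in the sample data.

```stata
* Rescale the exposure by 10,000 (hypothetical variable name)
generate double e2sfca_10k = e2sfca30km * 10000

* Rebuild a restricted cubic spline basis on the rescaled variable;
* seven knots produce six variables, ks10k1-ks10k6.
* displayknots lists the knot locations so they can be inspected.
mkspline ks10k = e2sfca_10k, cubic nknots(7) displayknots

* Refit and compare predictions with those from the original scale
glm apncu_cat2 ks10k*, family(poisson) link(log) vce(robust)
predict double phat10k, mu
summarize phat10k, detail
```

Because multiplying a variable by a constant only rescales the knots and the basis terms, the rescaled spline spans the same set of curves, so the fitted values would be expected to match those on the original scale apart from numerical precision. That would be consistent with the issue persisting after rescaling.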