I have a conundrum regarding a policy analysis I'm running. I have a variable, stateorder, that is coded 0 when the policy is not in effect, 1 while it is in effect, and 0 again after it is rescinded. I ran a logit on the data yesterday and received a message indicating that I might have complete or quasi-complete separation, which I'm not sure how to fix. Here's a data sample:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input byte(stateorder medicaid_expansion demgov) float(div_gov percapita_deaths ideology_diff)
1 0 0 0  .0002178176 -13.41059
1 0 0 0 .00022515976 -13.41059
1 0 0 0 .00022719926 -13.41059
1 0 0 0  .0002286269 -13.41059
1 0 0 0 .00022923875 -13.41059
1 0 0 0  .0002373967 -13.41059
1 0 0 0 .00024698232 -13.41059
1 0 0 0 .00025085735 -13.41059
1 0 0 0 .00025799556 -13.41059
1 0 0 0  .0002622785 -13.41059
1 0 0 0 .00026248244 -13.41059
1 0 0 0 .00026329825 -13.41059
1 0 0 0 .00026574562 -13.41059
1 0 0 0 .00027818652 -13.41059
1 0 0 0 .00028491684 -13.41059
1 0 0 0 .00029327875 -13.41059
1 0 0 0 .00029694985 -13.41059
1 0 0 0   .000300417 -13.41059
1 0 0 0 .00030408805 -13.41059
1 0 0 0 .00030408805 -13.41059
1 0 0 0  .0003136737 -13.41059
1 0 0 0  .0003191803 -13.41059
1 0 0 0  .0003222395 -13.41059
1 0 0 0  .0003269303 -13.41059
1 0 0 0  .0003318251 -13.41059
1 0 0 0  .0003330488 -13.41059
1 0 0 0  .0003397791 -13.41059
1 0 0 0  .0003456937 -13.41059
1 0 0 0  .0003495687 -13.41059
1 0 0 0  .0003538516 -13.41059
1 0 0 0  .0003579306 -13.41059
1 0 0 0  .0003605819 -13.41059
1 0 0 0  .0003664965 -13.41059
1 0 0 0  .0003766939 -13.41059
1 0 0 0  .0003838321 -13.41059
1 0 0 0  .0003854637 -13.41059
1 0 0 0  .0003860756 -13.41059
1 0 0 0  .0003866874 -13.41059
1 0 0 0  .0003870953 -13.41059
1 0 0 0  .0003926019 -13.41059
1 0 0 0  .0003948454 -13.41059
1 0 0 0   .000396477 -13.41059
1 0 0 0  .0004025955 -13.41059
1 0 0 0 .00040708235 -13.41059
1 0 0 0  .0004101416 -13.41059
1 0 0 0  .0004105495 -13.41059
1 0 0 0  .0004127929 -13.41059
1 0 0 0  .0004154443 -13.41059
1 0 0 0 .00041707585 -13.41059
1 0 0 0 .00042339825 -13.41059
1 0 0 0  .0004297207 -13.41059
1 0 0 0  .0004388984 -13.41059
1 0 0 0  .0004409379 -13.41059
1 0 0 0  .0004450169 -13.41059
1 0 0 0  .0004486879 -13.41059
1 0 0 0  .0004521551 -13.41059
1 0 0 0 .00045541825 -13.41059
1 0 0 0  .0004621486 -13.41059
1 0 0 0  .0004639841 -13.41059
1 0 0 0  .0004641881 -13.41059
1 0 0 0  .0004641881 -13.41059
end
------------------ copy up to and including the previous line ------------------

I begin with a linear regression model, which runs just fine (although I understand there are potential issues with linearity, etc.):

Code:
reg stateorder medicaid_expansion percapita_deaths ideology_diff prop_neighbors div_go
> v demgov

      Source |       SS           df       MS      Number of obs   =    22,908
-------------+----------------------------------   F(6, 22901)     =   2075.11
       Model |  1372.63861         6  228.773102   Prob > F        =    0.0000
    Residual |  2524.75269    22,901  .110246395   R-squared       =    0.3522
-------------+----------------------------------   Adj R-squared   =    0.3520
       Total |   3897.3913    22,907  .170139752   Root MSE        =    .33203

------------------------------------------------------------------------------------
        stateorder |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------------+----------------------------------------------------------------
medicaid_expansion |   .1776538   .0035522    50.01   0.000     .1706913    .1846162
  percapita_deaths |   20.83641   3.134149     6.65   0.000     14.69326    26.97955
     ideology_diff |  -.0006289   .0000425   -14.78   0.000    -.0007123   -.0005455
    prop_neighbors |  -.2364633   .0120137   -19.68   0.000    -.2600109   -.2129157
           div_gov |   .1939881   .0053275    36.41   0.000      .183546    .2044303
            demgov |   .2594809   .0049884    52.02   0.000     .2497034    .2692584
             _cons |    .510375    .007722    66.09   0.000     .4952394    .5255105
------------------------------------------------------------------------------------
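To gauge the linearity concern mentioned above, one rough check is to count how many fitted values from the linear model fall outside the unit interval. This is only a sketch using the same specification as the regression above; phat_lpm is a hypothetical variable name, and the code is meant to be run from a do-file (the /// line continuation does not work when typed interactively).

```stata
* Refit the linear model with the same variables as above
regress stateorder medicaid_expansion percapita_deaths ideology_diff ///
    prop_neighbors div_gov demgov

* Fitted values for the estimation sample only
* (phat_lpm is a hypothetical name, not a variable in the original data)
predict double phat_lpm if e(sample), xb

* Range of the fitted values
summarize phat_lpm

* Number of fitted values outside [0, 1]; missing values are excluded
* explicitly because Stata treats missing as larger than any number
count if (phat_lpm < 0 | phat_lpm > 1) & !missing(phat_lpm)
```

A large count here would suggest the linear model is a poor approximation for at least part of the sample.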
The logistic regression (same model) produces this:
Code:
logit stateorder medicaid_expansion percapita_deaths ideology_diff prop_neighbors div_
> gov demgov, nolog
note: div_gov != 0 predicts success perfectly
      div_gov dropped and 5976 obs not used

note: demgov != 0 predicts success perfectly
      demgov dropped and 6474 obs not used


Logistic regression                             Number of obs     =     10,458
                                                LR chi2(4)        =    1106.02
                                                Prob > chi2       =     0.0000
Log likelihood = -6684.0625                     Pseudo R2         =     0.0764

------------------------------------------------------------------------------------
        stateorder |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------------+----------------------------------------------------------------
medicaid_expansion |   .7738874   .0273959    28.25   0.000     .7201924    .8275824
  percapita_deaths |    136.772   29.96518     4.56   0.000     78.04132    195.5026
     ideology_diff |   .0088932   .0018851     4.72   0.000     .0051984     .012588
    prop_neighbors |  -2.262627   .1110186   -20.38   0.000     -2.48022   -2.045035
           div_gov |          0  (omitted)
            demgov |          0  (omitted)
             _cons |   .8999345   .0806817    11.15   0.000     .7418012    1.058068
------------------------------------------------------------------------------------
I suspect this happens because of the underlying data structure:

Code:
 tabulate stateorder div_gov

           |  Divided Government
stateorder |         0          1 |     Total
-----------+----------------------+----------
         0 |     5,478          0 |     5,478
         1 |    12,948      6,474 |    19,422
-----------+----------------------+----------
     Total |    18,426      6,474 |    24,900
Code:
tabulate stateorder demgov

           | Democratic Governor=1
stateorder |         0          1 |     Total
-----------+----------------------+----------
         0 |     5,478          0 |     5,478
         1 |     7,470     11,952 |    19,422
-----------+----------------------+----------
     Total |    12,948     11,952 |    24,900
Essentially, it would seem that having a Democratic governor means the state policies were enacted, whereas there is variation on the Republican side. So I ran additional logits restricted to demgov==0, once with div_gov==1 and once with div_gov==0. Results are presented below:
Code:
logit stateorder medicaid_expansion percapita_deaths ideology_diff prop_neighbors  if
> demgov==0 & div_gov==1, nolog
outcome does not vary; remember:
                                  0 = negative outcome,
        all other nonmissing values = positive outcome
r(2000);

. logit stateorder medicaid_expansion percapita_deaths ideology_diff prop_neighbors  if
> demgov==0 & div_gov==0, nolog

Logistic regression                             Number of obs     =     10,458
                                                LR chi2(4)        =    1106.02
                                                Prob > chi2       =     0.0000
Log likelihood = -6684.0625                     Pseudo R2         =     0.0764

------------------------------------------------------------------------------------
        stateorder |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------------+----------------------------------------------------------------
medicaid_expansion |   .7738874   .0273959    28.25   0.000     .7201924    .8275824
  percapita_deaths |    136.772   29.96518     4.56   0.000     78.04132    195.5026
     ideology_diff |   .0088932   .0018851     4.72   0.000     .0051984     .012588
    prop_neighbors |  -2.262627   .1110186   -20.38   0.000     -2.48022   -2.045035
             _cons |   .8999345   .0806817    11.15   0.000     .7418012    1.058068
------------------------------------------------------------------------------------
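The pattern in the two tabulations and the subset runs above can also be inspected in a single table that shows, for each combination of demgov and div_gov, the share of observations with stateorder equal to 1. This is a minimal sketch, assuming the model variables are named as in the commands above; insample is a hypothetical helper variable, and the code is meant to be run from a do-file.

```stata
* Flag rows with no missing values on any model variable.
* insample is a hypothetical name, not a variable in the original data.
generate byte insample = !missing(stateorder, medicaid_expansion, ///
    percapita_deaths, ideology_diff, prop_neighbors, div_gov, demgov)

* Mean of stateorder and cell frequency for each demgov x div_gov cell.
* A cell mean of exactly 0 or 1 means the outcome does not vary in that
* cell, which is what triggers the "predicts success perfectly" notes.
tabulate demgov div_gov if insample, summarize(stateorder) nostandard
```

Cells whose mean is exactly 1 contribute no information for estimating the corresponding coefficient by ordinary maximum likelihood, which is consistent with the notes logit prints when it drops div_gov and demgov.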
Is there any way to make the logit run with both categorical variables on the right-hand side of the model, or is this something better suited to event history analysis (EHA)? I believe that for EHA I would need to add a duration variable to the data, but I am not sure how to do this. Any advice or suggestions would be appreciated.
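One option sometimes suggested for this situation is a penalized-likelihood (Firth) logit, which can return finite estimates even when a predictor perfectly predicts the outcome. The following is only a sketch, assuming the community-contributed firthlogit command from SSC is acceptable; it is not part of official Stata, and whether it is appropriate for this design is a separate judgment. It is meant to be run from a do-file.

```stata
* Install once. firthlogit is community-contributed (SSC), not official Stata.
ssc install firthlogit

* Same specification as the logit above, keeping div_gov and demgov
firthlogit stateorder medicaid_expansion percapita_deaths ideology_diff ///
    prop_neighbors div_gov demgov
```

The coefficients on div_gov and demgov from such a fit depend heavily on the penalty and should be interpreted cautiously, since the data contain no observations with stateorder equal to 0 for either group.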