Hello everyone,

I have health insurance claims data, and I'm trying to estimate the effect that a remote / digital service offered to insurance customers has on the claims amount. This is tricky because the service is voluntary, so selection bias is a big problem, and I don't have many variables in the data that would explain use of the service. The health insurance itself is also voluntary: we have a large public sector, but some people opt for private health insurance.

Dependent variable:
- logCLAIMS (log of claims in euros)

Independent variables:
- age (continuous)
- three-class factor variable for where the treatment was given: 1. preferred provider organization (PPO), 2. non-PPO, 3. public hospital
- gender (MALE, FEMALE, UNBORN CHILD -> NO GENDER)
- treatment variable REMOTE, a 0/1 dummy: 0 = did not use the service, 1 = used the service

Instruments:
- area of residence (some areas of the country have bigger cities and others are mainly countryside -> significant difference in use of the service)
- three-class factor variable for where the treatment was given

My instruments are quite weak, but this is all I can get from my data at the moment.
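
A quick way to gauge how weak the instruments are is to fit the treatment equation on its own and test the AREA indicators jointly. This is only a sketch using the variable names from the eregress call below. It assumes that AREA is the only variable excluded from the outcome equation, since TREATMENTPLACE enters both equations, and the standalone probit is only an approximation to the treatment equation that eregress fits jointly.

```stata
* Treatment equation on its own, with the same covariates as in entreat()
probit REMOTE i.b2.TREATMENTPLACE i.b2.AREA, vce(robust)

* Joint Wald test of the AREA indicators (the assumed excluded instruments)
testparm i.AREA
```

A small chi2 statistic from testparm would be consistent with the instruments being weak.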

Questions:

- I would like your comments on the model. Is there a better way to do this?
- I get an ATE of -0.75 (from both eregress and etreg models), and I'm wondering how margins can give me a result of 0.17. How should I interpret this?

Code below.

Code:
eregress logCLAIMS AGE i.b2.TREATMENTPLACE i.b2.GENDER, entreat(REMOTE = i.b2.TREATMENTPLACE i.b2.AREA, nointeract) vce(robust)

Extended linear regression                      Number of obs     =     17,084
                                                Wald chi2(6)      =    2274.00
Log likelihood =  -25883.81                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------------------
                         |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------------------+----------------------------------------------------------------
logCLAIMS                |
                     AGE |   .0048115   .0002871    16.76   0.000     .0042488    .0053741
                         |
          TREATMENTPLACE |
                    PPO  |    .264882   .0170253    15.56   0.000     .2315131    .2982509
                NON PPO  |          0  (base)
        PUBLIC HOSPITAL  |  -.7805487   .0243844   -32.01   0.000    -.8283414   -.7327561
                         |
                  GENDER |
                     NG  |   .0343502   .0463053     0.74   0.458    -.0564064    .1251069
                      M  |          0  (base)
                      F  |  -.0331972   .0128694    -2.58   0.010    -.0584206   -.0079737
                         |
                  REMOTE |
                      0  |          0  (base)
                      1  |  -.7573412     .05366   -14.11   0.000    -.8625128   -.6521696
                         |
                   _cons |   4.873668   .0166892   292.03   0.000     4.840957    4.906378
-------------------------+----------------------------------------------------------------
REMOTE                   |
          TREATMENTPLACE |
                    PPO  |   1.314718   .0519358    25.31   0.000     1.212926     1.41651
                NON PPO  |          0  (base)
        PUBLIC HOSPITAL  |   -.203226   .1063506    -1.91   0.056    -.4116694    .0052174
                         |
                   AREAS |
                      1  |  -.0764202   .2209754    -0.35   0.729    -.5095241    .3566837
                      2  |          0  (base)
                      3  |  -.0710966   .0456614    -1.56   0.119    -.1605913    .0183981
                      4  |  -.1733018   .0992317    -1.75   0.081    -.3677923    .0211887
                      5  |  -.0832831   .0390727    -2.13   0.033    -.1598641   -.0067021
                      6  |   .0906437   .0350018     2.59   0.010     .0220414     .159246
                      7  |   .1551398   .0477874     3.25   0.001     .0614782    .2488014
                         |
                   _cons |  -2.261838   .0511611   -44.21   0.000    -2.362112   -2.161564
-------------------------+----------------------------------------------------------------
        var(e.logCLAIMS) |    .726331   .0109492                      .7051849    .7481112
-------------------------+----------------------------------------------------------------
          corr(e.REMOTE, |
            e.logCLAIMS) |   .5082226   .0292915    17.35   0.000     .4485854     .563354
------------------------------------------------------------------------------------------
estat teffects gives me the same ATE, -0.757...

but margins gives a different result:

Code:
margins, dydx(REMOTE)

Average marginal effects                        Number of obs     =     17,084
Model VCE    : Robust

Expression   : mean of logCLAIMS, predict()
dy/dx w.r.t. : 1.REMOTE

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      REMOTE |
          0  |          0  (base)
          1  |   .1748854   .0660175     2.65   0.008     .0454935    .3042772
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
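
One way to dig into the gap is to ask margins for the same quantity that estat teffects reports and compare it with the default prediction. The sketch below assumes it is run right after the eregress fit above, and that the installed Stata version supports the te statistic and the fix() option of predict after eregress; check help eregress postestimation before relying on it. A possible explanation, offered with caution: the default margins prediction is the mean of logCLAIMS conditional on the observed covariates, including the endogenous REMOTE, so switching REMOTE from 0 to 1 also shifts the expected error through corr(e.REMOTE, e.logCLAIMS) = .51, whereas the ATE keeps only the structural coefficient.

```stata
* ATE as reported before (about -.757)
estat teffects

* The same treatment effect, requested through margins
margins, predict(te)

* Contrast of mean logCLAIMS across REMOTE, treating REMOTE as if it were exogenous
margins r.REMOTE, predict(fix(REMOTE))
```

If the last two commands land near -.757 while margins, dydx(REMOTE) stays near .17, the difference would come from the type of prediction rather than from the model itself.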