Dear all,

I am new to the Oaxaca-Blinder decomposition, and I would like your opinion on how to set it up and interpret the results.

In my sample, some respondents answered the survey by phone and others face to face (F2F), and I would like to explore whether the mode of response leads to differences in the outcome.

As a preliminary step, I ran a simple regression to check whether there is actually something there:
Code:
.  reg  dep4 edu aage male hsize acountry  f2f

      Source |       SS       df       MS              Number of obs =    2879
-------------+------------------------------           F(  6,  2872) =   51.04
       Model |  115.128287     6  19.1880478           Prob > F      =  0.0000
    Residual |  1079.74424  2872  .375955515           R-squared     =  0.0964
-------------+------------------------------           Adj R-squared =  0.0945
       Total |  1194.87253  2878  .415174609           Root MSE      =  .61315

------------------------------------------------------------------------------
        dep4 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         edu |  -.0154029   .0175565    -0.88   0.380    -.0498275    .0190217
        aage |   -.007566   .0012684    -5.97   0.000     -.010053   -.0050789
        male |  -.2096275   .0230887    -9.08   0.000    -.2548996   -.1643554
       hsize |  -.0274292    .008292    -3.31   0.001    -.0436881   -.0111702
    acountry |   .0264477   .0022047    12.00   0.000     .0221247    .0307706
         f2f |  -.1819788   .0302105    -6.02   0.000    -.2412153   -.1227423
       _cons |   1.483471   .0734162    20.21   0.000     1.339517    1.627425
------------------------------------------------------------------------------
The coefficient on f2f (which indicates the mode of response) is significant, so I believe it is worthwhile to go on with a decomposition.
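
A simpler first look at the unadjusted gap, sketched here on the assumption that f2f is coded 0/1 as in the data example further down, would be to compare the mean of dep4 across the two modes directly. Note that this uses every observation with non-missing dep4 and f2f, so the sample need not match the one used in the regression.

Code:
* Mean, SD and N of dep4 within each response mode
tabulate f2f, summarize(dep4)
* Two-sample t test of the difference in mean dep4 between modes
ttest dep4, by(f2f)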

Code:
. nldecompose, by(f2f): reg  dep4   male aage  hsize acountry  edu

                                                   Number of obs (A) =    1293
                                                   Number of obs (B) =    1387

------------------------------------------------------------------------------
      Results |      Coef.  Percentage
--------------+---------------------------------------------------------------
 Omega = 1    |
         Char |   .3214277   188.1257%
         Coef |  -.1505698  -88.12575%
--------------+---------------------------------------------------------------
 Omega = 0    |
         Char |  -.0200212  -11.71803%
         Coef |   .1908791    111.718%
--------------+---------------------------------------------------------------
          Raw |   .1708579        100%
------------------------------------------------------------------------------
These are the results of the decomposition, which puzzle me a bit. What do the percentages over 100% mean? How should I interpret these results?
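
As a quick arithmetic check on the table above (a sketch with the numbers typed in by hand, not read from any stored results of nldecompose): under each weighting, Char and Coef should add up to Raw, and each percentage should be the component divided by Raw, times 100. This at least shows where the figures over 100% come from arithmetically: when one component is negative, the other has to exceed the raw gap.

Code:
* Omega = 1: Char + Coef should reproduce the Raw row
display .3214277 + (-.1505698)
* Omega = 0: same check with the other weighting
display -.0200212 + .1908791
* Percentage for Char under Omega = 1: component / raw gap * 100
display 100 * .3214277 / .1708579
* Percentage for Coef under Omega = 1 (negative, so Char must exceed 100%)
display 100 * (-.1505698) / .1708579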

Here is an example of my data:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(acountry aage dep4) double f2f float(edu male hsize)
14 23 1 0 2 1 2
14 47 1 0 1 1 4
14 48 1 0 3 0 4
14 30 1 1 2 0 2
14 49 1 0 3 0 2
14 29 1 0 2 1 1
14 23 1 0 3 1 2
14 49 1 0 2 0 4
14 22 4 0 3 0 1
14 30 1 0 2 0 1
end
label values acountry acountry
label def acountry 14 "Germany", modify
label values aage NoLabel
label values dep4 Tfreq3
label def Tfreq3 1 "Seldom or Never", modify
label def Tfreq3 4 "Most or all of the time", modify
label values f2f f2f_lab
label def f2f_lab 0 "0. only processed in CAWI phase", modify
label def f2f_lab 1 "1. processed by an interviewer", modify
Your help would be greatly appreciated.

Thanks in advance, best, G.