Hi all,

I am trying to interpret the results of an Oaxaca-Blinder decomposition. I am using the popular oaxaca command. For my project I am only interested in a two-way decomposition (i.e., I don't want the interaction term), so I am using the pooled option. The results are based on a logistic regression. Here is my output:

Code:
svyset [pweight = weight], str(stratum_var) psu(cluster_var)

tab race, gen(race)
tab educ, gen(educ)
tab hhinc, gen(hhinc)
tab age_g5, gen(age_g5)

oaxaca cohab_par age_g52 age_g53 age_g54 age_g55 race2 race3 race4 imm ///
>          educ2 educ3 educ4 hhinc2 hhinc3 hhinc4, ///
>          by(rural) svy logit pooled

Blinder-Oaxaca decomposition

Number of strata = 18                             Number of obs   =      3,268
Number of PSUs   = 72                             Population size = 38,244,320
Design df       =         54
Model              =     logit
Group 1: rural = 0                              N of obs 1         =      2116
Group 2: rural = 1                              N of obs 2         =       468


                          Linearized
cohab_par    Coefficient  std. err.      t    P>|t|     [95% conf. interval]

overall      
group_1    .1405385   .0141326     9.94   0.000     .1122043    .1688727
group_2    .2199063   .0242979     9.05   0.000     .1711919    .2686207
difference   -.0793678   .0276199    -2.87   0.006    -.1347424   -.0239933
explained   -.0197445   .0107814    -1.83   0.073    -.0413599    .0018708
unexplained   -.0596233   .0273862    -2.18   0.034    -.1145294   -.0047172

explained    
age_g52   -.0005991      .0012    -0.50   0.620    -.0030048    .0018067
age_g53       -.001    .001295    -0.77   0.443    -.0035962    .0015963
age_g54    .0000969   .0009533     0.10   0.919    -.0018143    .0020081
age_g55   -.0004468   .0010476    -0.43   0.671    -.0025472    .0016535
race2    -.001195   .0013982    -0.85   0.396    -.0039982    .0016081
race3    .0009479     .00266     0.36   0.723    -.0043851    .0062809
race4    .0005973   .0013599     0.44   0.662     -.002129    .0033237
imm    .0003432   .0019318     0.18   0.860    -.0035298    .0042161
educ2   -.0002301   .0009301    -0.25   0.806    -.0020949    .0016347
educ3    .0008762   .0017989     0.49   0.628    -.0027304    .0044828
educ4    -.012033    .005119    -2.35   0.022     -.022296     -.00177
hhinc2   -.0003015   .0006636    -0.45   0.651    -.0016318    .0010289
hhinc3   -.0001882   .0006291    -0.30   0.766    -.0014493     .001073
hhinc4   -.0066125   .0039993    -1.65   0.104    -.0146306    .0014057

unexplained  
age_g52    -.028689   .0165345    -1.74   0.088    -.0618386    .0044607
age_g53   -.0297615   .0290941    -1.02   0.311    -.0880916    .0285686
age_g54   -.0556364   .0411148    -1.35   0.182    -.1380666    .0267938
age_g55   -.0026016   .0261534    -0.10   0.921     -.055036    .0498328
race2    .0048568   .0077741     0.62   0.535    -.0107293    .0204428
race3    .0176802   .0161501     1.09   0.278    -.0146989    .0500593
race4    .0041202   .0058635     0.70   0.485    -.0076355    .0158759
imm   -.0093887    .012094    -0.78   0.441    -.0336357    .0148583
educ2    .0047812   .0141557     0.34   0.737    -.0235994    .0331617
educ3   -.0291882    .026426    -1.10   0.274    -.0821691    .0237926
educ4   -.0366389   .0235373    -1.56   0.125    -.0838284    .0105505
hhinc2    .0017072   .0118416     0.14   0.886    -.0220339    .0254482
hhinc3    .0011674   .0093851     0.12   0.901    -.0176486    .0199834
hhinc4   -.0020982   .0118124    -0.18   0.860    -.0257806    .0215842
_cons    .1000664    .098881     1.01   0.316    -.0981782    .2983109

I am seeking help because these results show a significant overall difference between the two groups (urban and rural in this case), with roughly 24.9% of the difference coming from differences in composition (the explained part, which is not statistically significant, p = 0.073) and 75.1% coming from differences in coefficients (the unexplained part, which is statistically significant, p = 0.034). On the surface this makes sense, but when you look at the individual variables within the explained and unexplained portions, the findings don't line up.
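
For reference, this is how I got those shares. It is only a quick sketch that retypes the rounded values from the output above, so the last digit could be slightly off:

```stata
* Explained share of the overall gap, in percent
* (values copied by hand from the "overall" block above)
display %4.1f 100 * (-.0197445 / -.0793678)

* Unexplained share of the overall gap, in percent
display %4.1f 100 * (-.0596233 / -.0793678)
```

The explained and unexplained estimates also add up to the reported difference (-.0197445 + -.0596233 = -.0793678), so the two shares sum to 100%.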

Specifically, the only significant variable is in the explained portion (educ4), and no variable in the unexplained portion is significant, even though the unexplained component is significant overall.
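
As a sanity check, I also added up the detailed unexplained terms by hand. The sketch below again just retypes the rounded values from the output (including _cons), so it is an approximation rather than something pulled from the stored estimates:

```stata
* Sum of the detailed unexplained contributions, including _cons
* (values copied by hand from the "unexplained" block above)
display -.028689 - .0297615 - .0556364 - .0026016 ///
        + .0048568 + .0176802 + .0041202 - .0093887 ///
        + .0047812 - .0291882 - .0366389 ///
        + .0017072 + .0011674 - .0020982 + .1000664
```

The total agrees with the overall unexplained estimate up to rounding, so the point estimates are consistent. It is only the significance pattern that puzzles me. Note that _cons (.1000664) is the largest single term and has the opposite sign from the total.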

Am I misinterpreting the results? Or, on a more technical level, how are standard errors and p-values calculated within an Oaxaca-Blinder decomposition? Is the significance of the overall explained/unexplained components calculated differently from the significance of the individual variables' contributions?

Please let me know if I can clarify anything.