I'm currently researching the effect of gender on the performance of microfinance institutions (MFIs).
For one part of my thesis I have formulated two hypotheses (in short):
  1. Gender does not have any effect on efficiency
  2. The effect of gender on efficiency does not differ significantly between for-profit and non-profit MFIs
I included the second hypothesis because some studies find that an MFI's profit status has a significant effect in some cases (mainly in particular regions of the world), while other articles argue that gender is the primary driver and that the effect attributed to profit status was misattributed. I therefore wanted to explore this further and decided to test H2 with an interaction term.
OER = measure of an MFI's efficiency (dependent variable)
PF = percentage of women borrowers
dumPP = 0 if non-profit, 1 if for-profit
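
To make H2 concrete, the interaction model can be written out as follows. This is only a sketch: the symbols (the betas, the control vector x, and the error components u and e) are notation assumed here for illustration, with x standing for the remaining regressors.

```latex
\[
\mathrm{OER}_{it} = \beta_0 + \beta_1\,\mathrm{PF}_{it} + \beta_2\,\mathrm{dumPP}_{it}
  + \beta_3\,(\mathrm{PF}_{it} \times \mathrm{dumPP}_{it}) + \gamma' x_{it} + u_i + e_{it}
\]
\[
\frac{\partial\,\mathrm{OER}_{it}}{\partial\,\mathrm{PF}_{it}} = \beta_1 + \beta_3\,\mathrm{dumPP}_{it},
\qquad H_2:\ \beta_3 = 0
\]
```

Under this notation, the effect of PF is \beta_1 for non-profit MFIs and \beta_1 + \beta_3 for for-profit MFIs, so H2 amounts to a test of the interaction coefficient alone.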

Code:
. xtreg OER TA1M PSK MFIage c.PF##dumPP dumRP dum2006PF, robust

Random-effects GLS regression                   Number of obs      =       143
Group variable: numMFI                          Number of groups   =        48

R-sq:  within  = 0.0921                         Obs per group: min =         1
       between = 0.1891                                        avg =       3.0
       overall = 0.0972                                        max =         9

                                                Wald chi2(8)       =     53.86
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

                                (Std. Err. adjusted for 48 clusters in numMFI)
------------------------------------------------------------------------------
             |               Robust
         OER |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        TA1M |  -.0028577   .0013633    -2.10   0.036    -.0055298   -.0001856
         PSK |   .0293177   .0266599     1.10   0.271    -.0229346    .0815701
      MFIage |  -.0022455   .0057635    -0.39   0.697    -.0135418    .0090509
          PF |   .4711311   .1959322     2.40   0.016     .0871111    .8551511
     1.dumPP |    .278675   .1908222     1.46   0.144    -.0953297    .6526797
             |
  dumPP#c.PF |
          1  |  -.3597447   .2193317    -1.64   0.101     -.789627    .0701376
             |
       dumRP |    .041475   .0602707     0.69   0.491    -.0766535    .1596034
   dum2006PF |   .0807376   .0392052     2.06   0.039     .0038969    .1575783
       _cons |   .3151538   .1802078     1.75   0.080    -.0380471    .6683546
-------------+----------------------------------------------------------------
     sigma_u |  .12275084
     sigma_e |  .11450364
         rho |  .53471907   (fraction of variance due to u_i)
------------------------------------------------------------------------------
Code:
. quietly margins, dydx(dumPP) at (PF=(0(0.1)1)) vsquish
Code:
. marginsplot, yline(0)
[Graph: -marginsplot- output for the -margins- command above, with a reference line at 0]
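
The -margins- call above gives the effect of dumPP on OER at each value of PF. As a complementary view, one could plot the predicted OER for each profit status over the same PF grid, so that the two slopes can be compared directly. This is a minimal sketch that assumes the interaction model is still the active estimation result.

```stata
* Assumes -xtreg ... c.PF##dumPP ...- is the most recent estimation command.
* Linear prediction of OER for dumPP = 0 and dumPP = 1 at PF = 0, 0.1, ..., 1
quietly margins dumPP, at(PF=(0(0.1)1)) vsquish

* One line per profit status; the gap between the lines is the dumPP effect
marginsplot
```

If the two lines are roughly parallel, that is consistent with a small interaction; overlapping confidence bands alone are not a formal test of the difference in slopes.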

I've also included the regression without the interaction term:

Code:
. xtreg OER TA1M PSK MFIage PF dumPP dumRP dum2006PF, robust

Random-effects GLS regression                   Number of obs      =       143
Group variable: numMFI                          Number of groups   =        48

R-sq:  within  = 0.0941                         Obs per group: min =         1
       between = 0.0857                                        avg =       3.0
       overall = 0.1079                                        max =         9

                                                Wald chi2(7)       =     42.03
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

                                (Std. Err. adjusted for 48 clusters in numMFI)
------------------------------------------------------------------------------
             |               Robust
         OER |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        TA1M |   -.003148   .0014683    -2.14   0.032    -.0060259   -.0002701
         PSK |   .0293754   .0255809     1.15   0.251    -.0207624    .0795131
      MFIage |  -.0016013    .006634    -0.24   0.809    -.0146036    .0114011
          PF |   .1932328   .0901735     2.14   0.032     .0164959    .3699696
       dumPP |   -.002885   .0742708    -0.04   0.969     -.148453     .142683
       dumRP |   .0547621   .0585926     0.93   0.350    -.0600773    .1696015
   dum2006PF |   .0706054   .0344676     2.05   0.041     .0030501    .1381606
       _cons |   .5389458   .1261681     4.27   0.000     .2916608    .7862307
-------------+----------------------------------------------------------------
     sigma_u |  .13503925
     sigma_e |  .11413315
         rho |  .58331564   (fraction of variance due to u_i)
------------------------------------------------------------------------------
PF stays significant in both regressions; the coefficients differ, but this was expected.
Am I right to assume the following?

1. As -marginsplot- shows, at no value of PF does dumPP have a statistically significant impact on the effect gender has on efficiency
2. Regression 1 implies that, although the effect of gender is somewhat weaker in for-profit MFIs than in non-profit MFIs, the effect gender has on efficiency is similar in both. Again, I expected the coefficients to change once the interaction term was added, because it redefines the meaning of PF: the PF coefficient now applies to non-profit MFIs (dumPP = 0) only. Given
0.4711311*PF - 0.3597447*PF*dumPP

Effect of PF in non-profit MFIs: 0.47
Effect of PF in for-profit MFIs: 0.4711311 - 0.3597447 = 0.11

Effect of PF in second regression: 0.19
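
The hand calculation above gives only point estimates. A sketch of how the group-specific effects of PF could be obtained with standard errors is shown below; it assumes the data are already -xtset- as in the regressions above and re-runs the interaction model first, since the model without the interaction was estimated afterwards.

```stata
* Re-estimate the interaction model so it is the active result
quietly xtreg OER TA1M PSK MFIage c.PF##dumPP dumRP dum2006PF, robust

* Effect of PF in for-profit MFIs: PF coefficient plus the interaction coefficient
lincom PF + 1.dumPP#c.PF

* Effect of PF for both profit statuses in one table
margins dumPP, dydx(PF)
```

The -lincom- point estimate should match the hand calculation (0.4711311 - 0.3597447 = 0.1113864); the added value is the standard error and confidence interval for the for-profit effect.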

On the one hand, the interaction term is not statistically significant (p = 0.101), -marginsplot- does not indicate a significant effect of dumPP at any value of PF, and PF stays significant in both regressions. On the other hand, the estimated effect of PF differs considerably between non-profit and for-profit MFIs (0.47 versus 0.11).
Is this a case where I could argue either way given my interpretation, or is there an obvious approach I have missed?
My interpretation would be that, although the estimated effect of PF differs between non-profit and for-profit MFIs, the difference is not statistically significant. Thus, the evidence is not sufficient to reject H2.

I have also included some summary statistics for OER for comparison.
Code:
. tabstat OER, stat(mean q min max)

    variable |      mean       p25       p50       p75       min       max
-------------+------------------------------------------------------------
         OER |  .6897555  .5999919  .6730029   .768463  .2283214  1.283038
--------------------------------------------------------------------------
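
Since the question is about differences between profit statuses, the same summary could also be split by dumPP. This is a minimal sketch using the same statistics as above; no output is shown because it was not run for this post.

```stata
* Same summary statistics for OER, reported separately for dumPP = 0 and dumPP = 1
tabstat OER, by(dumPP) stat(mean q min max)
```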
Any advice is appreciated, thank you in advance!