I am trying to understand Stata's -margins- command by recreating results by hand, and I am getting wrong answers when I consider models with interactions.

First I will give an example where I seem to obtain correct results. Consider the diabetes data from Stata:

Code:
webuse nhanes2f, clear
Suppose you have the following model:

diabetes = F(b0 + b1*age + b2*female + b3*(female*age))

where F is the standard normal CDF.

The marginal effect of age for male should be

F'(b0+b1+b2+b3)(b1)

and the marginal effect of age for female should be

F'(b0+b1+b2+b3)(b1+b3)

where F' is the standard normal PDF.

After running the regression:

Code:
qui probit diabetes i.female##c.age
I do

Code:
predict phat if e(sample), xb 
gen index = normalden(phat) 
gen dydx_age_male = index*(_b[age]) if female == 0
gen dydx_age_female = index*(_b[age] + _b[1.female#c.age]) if female == 1
and I find

Code:
. sum dydx_age_male dydx_age_female 

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
dydx_age_m~e |      4,909     .002704     .002162   .0002435   .0070914
dydx_age_f~e |      5,426    .0023238     .001393   .0005265   .0048638
which is pretty close to what Stata finds:

Code:
. margins, dydx(age) at(female = (0 1))

Average marginal effects                        Number of obs     =     10,335
Model VCE    : OIM

Expression   : Pr(diabetes), predict()
dy/dx w.r.t. : age

1._at        : female          =           0

2._at        : female          =           1

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
age          |
         _at |
          1  |    .002729   .0002578    10.59   0.000     .0022237    .0032342
          2  |   .0023102   .0002168    10.66   0.000     .0018853     .002735
------------------------------------------------------------------------------

But when I try the same approach with a similar model on different data set, I get very wrong answers:

Code:
sysuse auto, clear
gen mpg_discrete = cond(mpg > 25, 1, 0) // make up a binary variable
qui probit mpg_discrete i.foreign##c.gear_ratio
predict phat if e(sample), xb 
gen index = normalden(phat) 
gen dydx_gear_ratio_domestic = index*(_b[gear_ratio]) if foreign == 0
gen dydx_gear_ratio_foreign = index*(_b[gear_ratio] + _b[1.foreign#c.gear_ratio]) if foreign == 1  

. sum dydx_gear_ratio_domestic dydx_gear_ratio_foreign 

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
dydx_gear_~c |         52    .3923934    .2875949   .0146149   .9461553
dydx_gear_~n |         22    .3982511    .0766482   .2494647   .4690799

. margins, dydx(gear_ratio) at(foreign=(0 1))

Average marginal effects                        Number of obs     =         74
Model VCE    : OIM

Expression   : Pr(mpg_discrete), predict()
dy/dx w.r.t. : gear_ratio

1._at        : foreign         =           0

2._at        : foreign         =           1

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
gear_ratio   |
         _at |
          1  |   .4946714    .103051     4.80   0.000     .2926952    .6966477
          2  |   .2608279   .1207171     2.16   0.031     .0242268     .497429
------------------------------------------------------------------------------

I must be doing something wrong, but I can't figure out what exactly.

Furthermore, I am confused now as to what Stata is calculating when levels of the dummy are not specified to margins. For instance, going back to the first model, if you run

Code:
. margins, dydx(age)

Average marginal effects                        Number of obs     =     10,335
Model VCE    : OIM

Expression   : Pr(diabetes), predict()
dy/dx w.r.t. : age

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0025044   .0001667    15.02   0.000     .0021777    .0028311
------------------------------------------------------------------------------
it produces something close to, but not exactly, the marginal effect for age when the level of female is set to 0.