Hello everyone,

I am doing research on racial discrimination in loan approval decisions, i.e., whether minority borrowers have a lower approval probability than similar white borrowers, ceteris paribus (c.p.).

Previous research has already found evidence of discrimination against minorities using a logit regression of the following form:

P(approval)= f (minority status, loan features, borrower characteristics, ..., and some other controls)

A negative and statistically significant coefficient on minority status in this logit model indicates that minority status reduces the loan approval probability, c.p.
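
Written out, the specification above looks roughly as follows. This is only a sketch: the symbols \alpha, \beta, \gamma and the covariate vector X_i are my own notation, not necessarily that of the earlier papers.

```latex
\Pr(\text{approval}_i = 1 \mid \text{minority}_i, X_i)
  = \Lambda\!\left(\alpha + \beta\,\text{minority}_i + X_i'\gamma\right),
\qquad
\Lambda(z) = \frac{e^{z}}{1 + e^{z}}
```

Here \beta < 0 is the "negative coefficient" referred to above. It measures the change in the log-odds of approval, not the change in the approval probability itself.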

Nevertheless, this model is unable to distinguish between differential treatment discrimination and disparate impact discrimination. Under differential treatment discrimination, two borrowers who are identical except for race or ethnicity are treated differently by lenders. The second form, disparate impact discrimination, arises from a policy that is legal and race-neutral on its face but has an unintentional disparate impact on minority borrowers. For example, lenders could set a minimum income level for all borrowers. This seemingly race-blind requirement will most likely hurt minority borrowers more than white borrowers because, on average, minority borrowers have lower incomes than white borrowers.

The best, and perhaps the only, way to isolate differential treatment discrimination in loan approvals is the paired testing methodology. Specifically, two applicants with the same credit histories and in need of the same type of loan apply for a mortgage at the same lender. In this setting, any observed difference in treatment reflects only differential treatment discrimination, because the two applicants are identically qualified. But paired testing is hardly practical in real life, because pushing it into the loan approval stage might be illegal and could incur high legal costs.

I noticed that propensity score matching is used to balance the distribution of covariates. In other words, it matches observations so that they are as similar as possible in the covariates, differing only in the treatment indicator (in our case, the minority indicator). Propensity score matching therefore seems to imitate paired testing closely. The minority-status effect is just the difference between the observed outcome of an observation and the observed outcome of its match. Treating race as a treatment may seem unreasonable. But maybe we can think of a borrower as having been enrolled in a "minority program" at birth, and a borrower enrolled in this program might have a lower income or other disadvantages later in life.
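
To make the "difference between an observation and its match" concrete, here is a sketch of the nearest-neighbour matching estimator as I understand it. The notation is my own assumption: T_i is the minority indicator, Y_i the approval outcome, \hat{p}(X) the estimated propensity score, and J(i) the set of closest matches to borrower i from the opposite group.

```latex
\hat{Y}_i(t) =
\begin{cases}
  Y_i, & \text{if } T_i = t, \\[4pt]
  \dfrac{1}{|J(i)|} \sum_{j \in J(i)} Y_j, & \text{if } T_i = 1 - t,
\end{cases}
\qquad
\hat{\tau}_{\mathrm{ATE}} = \frac{1}{N} \sum_{i=1}^{N} \left[ \hat{Y}_i(1) - \hat{Y}_i(0) \right]
```

The matches in J(i) are chosen on \hat{p}(X) only, so two matched borrowers are similar in the observed covariates X but not necessarily in anything the data does not record.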

In fact, when I run the baseline logit model,

Code:
logit approval minority income_w dti20 dti20_30 dti30_36 dti36_49 dti50_60  fico680_699 fico700_719 fico720_739 ltv80 ltv80_85 ltv85_90 ltv90_95  origination_2019  refinance female age62 lender_top100 shadowbank fintech aus tract_minority_population_percen tract_owner_occupied_units tract_one_to_four_family_homes tract_median_age_of_housing_unit cra fhfa_index
I get the following result: the minority indicator has a coefficient of -.391, significant at p<0.0001.

Code:
Logistic regression                                   Number of obs =  250,000
                                                      LR chi2(28)   = 55744.90
                                                      Prob > chi2   =   0.0000
Log likelihood = -88966.138                           Pseudo R2     =   0.2386

--------------------------------------------------------------------------------------------------
                        approval | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
---------------------------------+----------------------------------------------------------------
                        minority |   -.391179   .0152328   -25.68   0.000    -.4210346   -.3613233
                        income_w |    .004729   .0001977    23.93   0.000     .0043416    .0051163
                           dti20 |   2.967021   .0540445    54.90   0.000     2.861096    3.072947
                        dti20_30 |   3.664266   .0424642    86.29   0.000     3.581038    3.747495
                        dti30_36 |   3.892662   .0413462    94.15   0.000     3.811625    3.973699
                        dti36_49 |   3.960401   .0378111   104.74   0.000     3.886293    4.034509
                        dti50_60 |   3.709353    .038279    96.90   0.000     3.634328    3.784378
                     fico680_699 |   .0205687   .0424997     0.48   0.628    -.0627291    .1038665
                     fico700_719 |     .11979   .0419051     2.86   0.004     .0376574    .2019225
                     fico720_739 |   .0570352   .0466314     1.22   0.221    -.0343607    .1484311
                           ltv80 |  -.2817957   .0251547   -11.20   0.000     -.331098   -.2324933
                        ltv80_85 |   -.043908   .0258024    -1.70   0.089    -.0944797    .0066637
                        ltv85_90 |  -.2121639   .0289899    -7.32   0.000    -.2689831   -.1553448
                        ltv90_95 |  -.3095127   .0256459   -12.07   0.000    -.3597778   -.2592476
                origination_2019 |   .2395657   .0132166    18.13   0.000     .2136617    .2654698
                       refinance |   -1.23423   .0219976   -56.11   0.000    -1.277345   -1.191116
                          female |    -.02576   .0135202    -1.91   0.057    -.0522592    .0007392
                           age62 |  -.3451483   .0198167   -17.42   0.000    -.3839883   -.3063083
                   lender_top100 |  -.4454505   .0154397   -28.85   0.000    -.4757118   -.4151892
                      shadowbank |  -.0205853   .0167077    -1.23   0.218    -.0533318    .0121612
                         fintech |  -.1228223   .0212574    -5.78   0.000    -.1644859   -.0811586
                             aus |   2.048448   .0218263    93.85   0.000     2.005669    2.091226
tract_minority_population_percen |    .003672    .000289    12.71   0.000     .0031055    .0042384
      tract_owner_occupied_units |   .0001755   .0000218     8.07   0.000     .0001329    .0002182
  tract_one_to_four_family_homes |  -.0000885   .0000165    -5.35   0.000    -.0001209   -.0000561
tract_median_age_of_housing_unit |  -.0005254   .0004295    -1.22   0.221    -.0013672    .0003164
                             cra |  -.1248721   .0165081    -7.56   0.000    -.1572274   -.0925168
                      fhfa_index |   .0417232   .0042321     9.86   0.000     .0334285    .0500179
                           _cons |  -3.884013   .0679459   -57.16   0.000    -4.017184   -3.750841
--------------------------------------------------------------------------------------------------
Next, I ran propensity score matching on the same sample using -teffects psmatch-:

Code:
teffects psmatch (approval) (minority income_w dti20 dti20_30 dti30_36 dti36_49 dti50_60 fico680_699 fico700_719 fico720_739 ltv80 ltv80_85 ltv85_90 ltv90_95 origination_2019 refinance female age62 lender_top100 shadowbank fintech aus tract_minority_population_percen tract_owner_occupied_units tract_one_to_four_family_homes tract_median_age_of_housing_unit cra fhfa_index)
This gives an average treatment effect (ATE) of only -.044, at p<0.0001:

Code:
Treatment-effects estimation                   Number of obs      =    250,000
Estimator      : propensity-score matching     Matches: requested =          1
Outcome model  : matching                                     min =          1
Treatment model: logit                                        max =          3
------------------------------------------------------------------------------
             |              AI robust
    approval | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATE          |
    minority |
   (1 vs 0)  |   -.043562   .0026974   -16.15   0.000    -.0488489   -.0382751
------------------------------------------------------------------------------
From the above results, the estimated minority effect goes from -.391 in the logit model (a log-odds coefficient) to only -.044 under matching (a difference in approval probability), both at p<0.0001. If the two estimates are comparable and propensity score matching imitates paired testing well, then we could conclude that differential treatment discrimination is not the major concern, while disparate impact discrimination plays the main role in discrimination in loan origination decisions.
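
Before comparing the two numbers, the logit estimate can be put in probability units. The following is a minimal sketch, assuming the same data set is in memory and that it is run from a do-file (because of the `///` line continuations). The global macro `controls` is a name I introduce here only for brevity; it holds the same control variables as the logit command above.

```stata
* Hypothetical global macro: the control variables from the baseline logit
global controls income_w dti20 dti20_30 dti30_36 dti36_49 dti50_60 ///
    fico680_699 fico700_719 fico720_739 ltv80 ltv80_85 ltv85_90 ltv90_95 ///
    origination_2019 refinance female age62 lender_top100 shadowbank ///
    fintech aus tract_minority_population_percen ///
    tract_owner_occupied_units tract_one_to_four_family_homes ///
    tract_median_age_of_housing_unit cra fhfa_index

* Refit the logit with minority as a factor variable, so that margins
* reports a discrete change (minority = 1 versus minority = 0)
logit approval i.minority $controls

* Average marginal effect of minority on Pr(approval), in probability units
margins, dydx(minority)

* Odds ratio implied by the reported log-odds coefficient
display exp(-.391179)
```

The `margins` result is a difference in approval probability, which is the same unit as the ATE from -teffects psmatch-, so it is probably the fairer number to set against -.044.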

Can we use propensity score matching to imitate paired testing and isolate differential treatment discrimination?
Is this method feasible?
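
In case it helps frame the question, here is a sketch of the checks I could run on the matching itself, again assuming the same data set is in memory and a do-file is used. The global macro `controls` is my own shorthand for the control variables listed above. The `atet` option is included because a paired test compares a minority applicant with a matched white applicant, which seems closer to the effect on the treated than to the ATE; that reading is my assumption.

```stata
* Hypothetical global macro: the control variables from the baseline logit
global controls income_w dti20 dti20_30 dti30_36 dti36_49 dti50_60 ///
    fico680_699 fico700_719 fico720_739 ltv80 ltv80_85 ltv85_90 ltv90_95 ///
    origination_2019 refinance female age62 lender_top100 shadowbank ///
    fintech aus tract_minority_population_percen ///
    tract_owner_occupied_units tract_one_to_four_family_homes ///
    tract_median_age_of_housing_unit cra fhfa_index

* Average treatment effect on the treated (here: on minority borrowers)
teffects psmatch (approval) (minority $controls), atet

* Covariate balance between the groups before and after matching
tebalance summarize

* Overlap of the estimated propensity scores across the two groups
teffects overlap
```

If balance or overlap turns out to be poor, the matched pairs would not really be "identically qualified", and the analogy with paired testing would be weaker.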

Thanks!