I am doing research on racial discrimination in mortgage interest rates.
Many researchers use an OLS model to investigate whether minorities pay a higher interest rate than comparable white borrowers, controlling for borrower demographic characteristics, creditworthiness, and loan features. In addition, researchers apply a "year fixed effect" and a "county/MSA fixed effect" to control for unobserved housing market conditions in the year of loan origination and for unobserved effects due to the geographic location of the property. As an example, please see table 4 on page 45 in paper https://faculty.haas.berkeley.edu/mo...rs/discrim.pdf
However, the data used in these papers are in fact cross-sectional data rather than panel data. For example, one paper uses 20,000 observations of individual-level loan data originated between 2009 and 2019. Each observation is a loan originated by a unique borrower.
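To make the specification concrete, this is roughly how I read the model in these papers. The notation is my own, not taken from the paper: $r_i$ is the interest rate on loan $i$, $X_i$ collects the borrower and loan controls, and $t(i)$ and $c(i)$ are the origination year and the county/MSA of loan $i$.

```latex
r_i = \alpha + \beta\,\mathrm{Minority}_i + X_i'\gamma + \delta_{t(i)} + \lambda_{c(i)} + \varepsilon_i
```

Here $\delta_{t(i)}$ is the "year fixed effect", $\lambda_{c(i)}$ is the "county/MSA fixed effect", and $\beta$ is the coefficient of interest.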
As an example of the data:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input int(as_of_year county_code) long msamd byte(applicant_race_1 applicant_ethnicity) str10 respondent_id byte(loan_type loan_purpose) long loan_amount_000s
2017   7     . 6 3 "0000068601" 1 1  175
2017   5 38540 5 2 "0000063194" 2 1  196
2017   1 36084 6 3 "0000451965" 1 1 1079
2017  53 33460 6 1 "41-1842999" 1 3  199
2017  47 35614 6 3 "0000613307" 1 3  800
2017 141 21340 5 1 "0000451965" 1 3  170
2017 209 28140 6 2 "0000016450" 2 1  157
2017  39 26420 6 3 "73-1577221" 1 3  183
2017  31 27260 5 2 "0000068490" 1 3   62
2017  91 33874 5 2 "0000451965" 2 1  196
end
In cross-sectional data, each unit is observed only once rather than over T time periods. As a result, the unobserved individual effects cannot be eliminated by demeaning the variables with the within transformation. Also, the explanatory variable of interest, "race of borrower", is time-constant, so the within transformation would sweep it away.
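For reference, this is the panel within transformation I have in mind, written in my own notation and assuming each borrower $i$ were observed in periods $t = 1, \dots, T$ (which is not the case in my data):

```latex
\begin{aligned}
y_{it} &= x_{it}'\beta + \theta\,\mathrm{race}_i + a_i + u_{it} \\
y_{it} - \bar{y}_i &= (x_{it} - \bar{x}_i)'\beta + \theta\,(\mathrm{race}_i - \mathrm{race}_i) + (a_i - a_i) + (u_{it} - \bar{u}_i) \\
                   &= (x_{it} - \bar{x}_i)'\beta + (u_{it} - \bar{u}_i)
\end{aligned}
```

Both the unobserved effect $a_i$ and the time-constant regressor $\mathrm{race}_i$ drop out, so $\theta$ cannot be estimated this way.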
Hence, my question is, are the "year fixed effect" and "county/MSA fixed effect" in these papers actually just two sets of dummies?
To be more specific, a set of dummies for the year of loan origination between 2009 - 2019, e.g. if a loan was originated in 2009, then the dummy for 2009 is 1.
And a set of dummies for all counties/MSAs, e.g. if a loan was originated in county 86, then the dummy for county 86 is 1.
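If my understanding is right, the two sets of dummies would amount to something like the following in Stata. This is only a sketch: interest_rate and minority are hypothetical variable names that are not in the data excerpt above, the other controls are omitted, and I am assuming that county_code on its own identifies a county.

```stata
* Sketch only. interest_rate and minority are hypothetical variables;
* as_of_year and county_code come from the dataex excerpt.
* Assumes county_code alone uniquely identifies a county.

* (1) Year and county effects entered as explicit sets of dummies
regress interest_rate i.minority i.as_of_year i.county_code, vce(robust)

* (2) Same model, with the county dummies absorbed rather than reported
areg interest_rate i.minority i.as_of_year, absorb(county_code) vce(robust)
```

As far as I understand, both commands should give the same coefficient on minority; the second one just avoids printing one coefficient per county.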
Thank you!
Lei