PROBLEM:
I am examining how mobility fell during a COVID lockdown in two types of communities: slums and non-slums. I have daily data on > 500,000 cell phones over several months, during which the government imposed a lockdown. For simplicity, assume that the data start on 1 March 2020, that the lockdown runs from 1 April to 30 April 2020, and that I am only looking at those two months. I want to study whether the effect of the lockdown differed across the two community types, and I want to measure that in two ways:
1/ Was mobility lower in non-slums than in slums during the lockdown?
2/ Did mobility decline more during the lockdown in non-slums, measured as a percentage of pre-lockdown mobility?
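To pin down the two comparisons, here is one way to write them, as a sketch only. The notation is introduced here for illustration and is not from any reference: \bar{m}^{g}_{pre} and \bar{m}^{g}_{lock} denote mean mobility in group g (S for slum, N for non-slum) before and during the lockdown.

```latex
% Question 1: level comparison during the lockdown
H_1:\quad \bar{m}^{N}_{lock} < \bar{m}^{S}_{lock}

% Question 2: proportional decline relative to the pre-lockdown level
H_2:\quad \frac{\bar{m}^{N}_{lock} - \bar{m}^{N}_{pre}}{\bar{m}^{N}_{pre}}
        < \frac{\bar{m}^{S}_{lock} - \bar{m}^{S}_{pre}}{\bar{m}^{S}_{pre}}
```

The first is a statement about levels; the second is a ratio of a change to a baseline, so it is a nonlinear function of the group means.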
In addition, I want to identify changes in mobility within devices (i.e., the phones). Importantly, each device is associated with a time-invariant community type (slum or non-slum).
To make it even more complicated, the device id variable is a string.
I am using Stata 16.
ATTEMPTED SOLUTION:
Ordinarily, I would do the following:
reghdfe mobility lockdown if slum==0, a(device_id)
eststo nonslum
reghdfe mobility lockdown if slum==1, a(device_id)
eststo slum
suest nonslum slum, vce(cl device)
test [slum_mean]lockdown = [nonslum_mean]lockdown
testnl _b[slum_mean:lockdown]/_b[slum_mean:_cons] = _b[nonslum_mean:lockdown]/_b[nonslum_mean:_cons]
But suest doesn't work with reghdfe.
So then I tried reg with factors.
egen cell = group(device_id)
reg mobility lockdown i.cell if slum==0
eststo nonslum
reg mobility lockdown i.cell if slum==1
eststo slum
suest nonslum slum, vce(cluster cell)
But I have > 500,000 devices. I can't set maxvar high enough.
So I tried using predict instead.
gen lockdown_slum = lockdown * slum
reghdfe mobility lockdown lockdown_slum, a(device_id)
predict nonslum_level if slum==0, xb
predict nonslum_level_se if slum==0, stdp
predict slum_level if slum==1, xb
predict slum_level_se if slum==1, stdp
The question is how to test the difference between these predictions in order to test hypothesis 1. Moreover, how can I test hypothesis 2?
Any advice would be much appreciated.
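One possible direction, offered as a hedged sketch rather than a settled answer: because community type is fixed within a device, the interacted regression above is equivalent to the two split-sample regressions, so the coefficients can be compared directly with lincom, test and nlcom after reghdfe. The variable name dev_num and the locals pre_nonslum and pre_slum are hypothetical names introduced here. The pre-lockdown means are treated as known constants, which is an assumption that ignores their sampling error. This covers the absolute and proportional changes only; the level comparison in hypothesis 1 is not addressed.

```stata
* Sketch only. Assumes mobility, lockdown (0/1), slum (0/1) and a string
* device_id, as described in the post. reghdfe is community-contributed.
egen long dev_num = group(device_id)      // numeric id (hypothetical name)
gen byte lockdown_slum = lockdown * slum

reghdfe mobility lockdown lockdown_slum, absorb(dev_num) vce(cluster dev_num)

* Within-device change in non-slums is _b[lockdown];
* the change in slums is the sum of the two coefficients
lincom lockdown + lockdown_slum

* Does the absolute change differ between slums and non-slums?
test lockdown_slum = 0

* Pre-lockdown means, treated as fixed constants
* (assumption: their sampling error is ignored)
summarize mobility if lockdown == 0 & slum == 0, meanonly
local pre_nonslum = r(mean)
summarize mobility if lockdown == 0 & slum == 1, meanonly
local pre_slum = r(mean)

* Difference in proportional changes: slum minus non-slum
nlcom (_b[lockdown] + _b[lockdown_slum]) / `pre_slum' - _b[lockdown] / `pre_nonslum'
```

A positive nlcom estimate would indicate a larger proportional decline in non-slums than in slums, given that both changes are negative.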