Hi everyone,

I have a dataset of 70 million observations and am running the following model.

First I estimate the probability of belonging to class A with a logit model. Then I calculate fitted probabilities. Call those values x1hat.
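To make this first step concrete, here is a minimal sketch on Stata's auto toy data rather than my actual data. The variable names are stand-ins I chose for illustration: foreign plays the role of the class A indicator, and mpg and weight are hypothetical first-stage predictors.

```stata
* Toy illustration only: foreign stands in for the class A indicator,
* mpg and weight are hypothetical first-stage predictors.
sysuse auto, clear
logit foreign mpg weight
predict double x1hat, pr     // fitted Pr(foreign = 1)
summarize x1hat
```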


Then I run an OLS regression to explain another variable y, as follows.

reg y c.x1hat##c.x2

where x2 is another explanatory variable (treated here as continuous), and c.x1hat##c.x2 expands to x1hat, x2, and their interaction.

I know that the standard errors from this second regression will not reflect the estimation uncertainty in x1hat, so I want to bootstrap the entire procedure: first the logit, then the OLS.
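What I have in mind is roughly the sketch below: both steps wrapped in one program, which is then passed to the bootstrap prefix so that the logit is re-estimated in every replication. It uses the auto toy data again, not my data. The program name twostep is made up, foreign, mpg, weight, price, and length are stand-ins for A, the first-stage predictors, y, and x2, and reps(50) is only there to keep the toy run short.

```stata
* Hypothetical wrapper: re-runs logit + OLS on whatever data are in memory.
capture program drop twostep
program define twostep, rclass
    capture drop x1hat                        // clear any previous fitted values
    quietly logit foreign mpg weight          // step 1: logit for class membership
    quietly predict double x1hat, pr          // fitted probabilities
    quietly regress price c.x1hat##c.length   // step 2: OLS with interaction
    return scalar b_x1hat = _b[x1hat]
    return scalar b_x2    = _b[length]
    return scalar b_inter = _b[c.x1hat#c.length]
end

sysuse auto, clear
set seed 12345
* Each replication resamples observations and calls twostep on the resample.
bootstrap b_x1hat=r(b_x1hat) b_x2=r(b_x2) b_inter=r(b_inter), reps(50): twostep
```

Because each replication re-estimates both stages on the resampled data, the bootstrap standard errors should pick up the first-stage uncertainty.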

But my sample is very large, so I am afraid it won't be feasible to run 1,000 replications with 70 million observations each time.

Any suggestions on how to do this?

I noticed the fast wild bootstrap command (boottest), but I think it only works after a single estimation command, so I don't know whether I can use it to repeat the joint procedure of logit followed by OLS.

Appreciate your guidance,

Laurie