Hi,

I have a large panel dataset in wide form: T=804 months, N=45464 individuals. In long form, this comes to 804*45464=36,553,056 observations. I am wondering which of the following procedures is, in general, faster:
  1. Transform the full wide dataset into one large long dataset, then run regressions on subsamples. Possibly drop the observations that are not needed before running each regression.
  2. Transform only one subsample into a long dataset and run the analysis on it. Then repeat for the next subsample.
Which of the two is, in general, faster?
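
To make the two procedures concrete, here is a minimal sketch in Python/pandas on a toy panel. Everything in it is an assumption for illustration only: the column names (`id`, `y1`..`y4`), the helper names `make_wide` and `to_long`, the subsample `sub_ids`, and the use of pandas itself, which may not be the software the real analysis runs in. It only shows that both routes produce the same long subsample; it says nothing about which is faster at the real size.

```python
import numpy as np
import pandas as pd

def make_wide(n_ids=6, n_months=4, seed=0):
    """Toy wide panel (hypothetical layout): one row per individual,
    one column per month named y1..yT."""
    rng = np.random.default_rng(seed)
    data = rng.normal(size=(n_ids, n_months))
    wide = pd.DataFrame(data, columns=[f"y{t}" for t in range(1, n_months + 1)])
    wide.insert(0, "id", np.arange(1, n_ids + 1))
    return wide

def to_long(wide):
    """Reshape wide -> long: one row per (id, month) pair."""
    long = pd.wide_to_long(wide, stubnames="y", i="id", j="month").reset_index()
    return long.sort_values(["id", "month"]).reset_index(drop=True)

wide = make_wide()
sub_ids = [2, 5]  # hypothetical subsample of individuals

# Procedure 1: reshape the full dataset once, then filter to the subsample.
long_all = to_long(wide)
sub1 = long_all[long_all["id"].isin(sub_ids)].reset_index(drop=True)

# Procedure 2: filter in wide form first, then reshape only the subsample.
sub2 = to_long(wide[wide["id"].isin(sub_ids)])

print(len(long_all), len(sub1), len(sub2))
```

Wrapping each procedure in a timer (for example `time.perf_counter()`) and scaling `n_ids` and `n_months` up would be one way to compare the two empirically.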

Thanks so much!