Good afternoon, first-time poster here. I'm an undergraduate student trying to run some regressions on data I have been compiling. I thought the way I compiled the data would make the regressions simple, but it's proving to be more complicated than I expected.
I am trying to use daily market data to calculate the annual beta for each firm, and then calculate the error term for each firm_year combination. Ultimately, I want to regress daily stock returns on the Fama-French three factors to get, for each firm and year, the annual idiosyncratic risk (the error term, which I call systemic risk in the rest of this post) and the annual systematic risk (beta).
To do this, I have ~252 daily data points per year over 11 years (2011 - 2021) for 100 firms (277,200 data points in total). In the end, I hope to have one data point for each firm-year combination (100 firms * 11 years = 1,100 data points).
I have been able to run the regression using "sort" and "by" to perform one regression per firm_year combination and get the appropriate annual beta (displayed below). However, calculating the error term for each of these points is proving to be much more complicated. It seems that running "predict y_hat" and "predict residuals" afterwards does not split the results by firm_year subset, presumably because the prediction only uses the most recent regression rather than each group's own fit.
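To make the goal concrete, the per-group logic can be sketched in plain Python. This is not the code I'm running, only an illustration of the idea: fit one regression per firm_year and compute the residuals from that group's own coefficients. The field names (firm, year, ret, mktrf, smb, hml) and the function names are hypothetical.

```python
from collections import defaultdict


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(xs, ys):
    """OLS with an intercept; returns (coefficients, residuals)."""
    rows = [[1.0] + list(x) for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    coef = solve(xtx, xty)
    resid = [y - sum(c * v for c, v in zip(coef, r)) for r, y in zip(rows, ys)]
    return coef, resid


def fit_by_firm_year(records):
    """Run one regression per (firm, year) and keep that group's residuals."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["firm"], rec["year"])].append(rec)
    out = {}
    for key, recs in groups.items():
        # Hypothetical field names: ret is the daily return, the rest are factors.
        xs = [(r["mktrf"], r["smb"], r["hml"]) for r in recs]
        ys = [r["ret"] for r in recs]
        coef, resid = ols(xs, ys)
        # coef[0] is the intercept, coef[1] the market beta.
        out[key] = {"beta": coef[1], "resid": resid}
    return out
```

The point of the sketch is that the residuals are computed inside the loop, while each group's coefficients are still available, rather than once after all the regressions have finished.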
Finally, if anybody has advice on exporting the results to an Excel or CSV file that contains the relevant Firm, Year, Systemic Risk, and Systematic Risk in one table, that would be greatly appreciated (currently I'm planning to combine the data points manually).
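As an illustration of the output I have in mind, here is a minimal Python sketch that writes one row per firm_year to CSV using only the standard library. The column names (firm, year, beta, sse) and the function name write_results are hypothetical, and the rows are assumed to have been computed already.

```python
import csv

# Hypothetical column layout: one row per firm-year combination.
FIELDS = ["firm", "year", "beta", "sse"]


def write_results(rows, fileobj):
    """Write firm-year results to an open text file as CSV, sorted by firm then year."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    for row in sorted(rows, key=lambda r: (r["firm"], r["year"])):
        # Keep only the expected columns, in a fixed order.
        writer.writerow({name: row[name] for name in FIELDS})
```

The file would be opened with open(path, "w", newline="") before being passed in, and the resulting CSV opens directly in Excel.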
Below I have listed an example of my variables, as well as the code I'm currently using:
Array
Code:
Array
At this point, the code should give me the squared difference, (predicted - actual)^2, for each daily observation. If I'm not mistaken, I would then need to sum that variable within each firm_year to get the error term (Annual Systemic Risk) for that firm.
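The summing step can be sketched in Python as follows, assuming each observation already carries its actual and predicted return. The tuple layout and the function name residual_risk are hypothetical. Alongside the raw sum of squares, the sketch also reports the residual standard deviation, on the assumption (mine, not something I have confirmed for this project) that a per-observation measure is easier to compare across years with different numbers of trading days. The default of four estimated parameters assumes an intercept plus three factors.

```python
from collections import defaultdict
from math import sqrt


def residual_risk(rows, n_params=4):
    """Aggregate squared residuals per (firm, year).

    rows: iterable of (firm, year, actual, predicted) tuples (hypothetical layout).
    n_params: number of estimated coefficients, assumed to be intercept + 3 factors.
    """
    sse = defaultdict(float)
    count = defaultdict(int)
    for firm, year, actual, predicted in rows:
        sse[(firm, year)] += (actual - predicted) ** 2
        count[(firm, year)] += 1
    out = {}
    for key in sse:
        dof = count[key] - n_params
        # Residual standard deviation is undefined without spare degrees of freedom.
        rmse = sqrt(sse[key] / dof) if dof > 0 else float("nan")
        out[key] = {"sse": sse[key], "n": count[key], "rmse": rmse}
    return out
```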
I hope I have explained my goals, and what I've done so far, clearly enough. I really appreciate your time and any support you may have to offer.
Best!