Good afternoon, first-time poster here. I'm an undergraduate student trying to run some regressions on data I have been compiling. I thought this way of compiling the data would make the regressions simple, but it's proving to be more complicated than I expected.
I am trying to use daily market data to calculate the ANNUAL betas by firm, then calculate the error term for each firm_year combination. Ultimately, I want to regress daily stock returns on the Fama-French 3 factors** to calculate the annual systemic (error term) and systematic (beta) risk for each firm.
To do this, I have ~252 daily data points per year over 11 years (2011 - 2021) for 100 firms (277,200 data points in total). In the end, I hope to have one data point for every firm & year combination (100 firms * 11 years = 1,100 data points).
I have been able to run the regression using "sort" and "by" to perform one regression per firm_year combination and get the appropriate annual beta (displayed below). However, calculating the error term for each of these points is proving to be much more complicated. It seems that running "predict y_hat" and "predict residuals" afterwards doesn't split the results by firm_year subset.
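A likely explanation is that after a "by"-prefixed regress only the last group's estimates remain in memory, so a single predict applies those coefficients to every observation. One possible workaround, sketched below, is to loop over the firm_year groups and fill in the fitted values and residuals group by group. The variable names (firm, year, exret for the daily excess return, and mktrf, smb, hml for the three factors) are assumptions for illustration and need to be renamed to match the actual data.

```stata
* Hypothetical variable names: firm, year, exret, mktrf, smb, hml
egen long fy = group(firm year)      // one integer id per firm_year
generate double y_hat = .
generate double resid = .

quietly summarize fy, meanonly
local ngroups = r(max)

forvalues g = 1/`ngroups' {
    * One regression per firm_year; fill predictions for that subset only
    quietly regress exret mktrf smb hml if fy == `g'
    quietly predict double xb_tmp if e(sample), xb
    quietly replace y_hat = xb_tmp if e(sample)
    quietly replace resid = exret - xb_tmp if e(sample)
    drop xb_tmp
}
```

With about 252 observations per group this should run without trouble. If some firm_year has too few observations, regress will stop with an error, so prefixing it with capture and skipping failed groups may be worth considering.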
Finally, if anybody has any advice on exporting the results to an Excel or CSV file that contains the relevant Firm, Year, Systemic Risk, and Systematic Risk values, that would be greatly appreciated (currently I'm planning on combining the data points manually).
Below I have listed an example of my variables, as well as the code I'm currently using:
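For the export, one option that may avoid the manual combining is statsby, which runs the regression once per group and keeps one row of stored results per firm_year. The sketch below assumes the same hypothetical variable names (firm, year, exret, mktrf, smb, hml); the output file name is also just a placeholder.

```stata
* Hypothetical variable names and file name; adjust to your data
* statsby replaces the data in memory (option clear) with one row per firm_year
statsby beta_mkt=_b[mktrf] beta_smb=_b[smb] beta_hml=_b[hml] ///
        rmse=e(rmse) nobs=e(N), by(firm year) clear:          ///
        regress exret mktrf smb hml

* Write the 1,100-row result to CSV
export delimited using "firm_year_risk.csv", replace

* Or to Excel, with variable names in the first row
export excel using "firm_year_risk.xlsx", firstrow(variables) replace
```

Here e(rmse) is the root mean squared error of each firm_year regression, which is one common summary of the error term. Because clear drops the daily data from memory, saving the dataset first (or using the saving() option instead of clear) keeps the original observations available.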
[Example data listing not preserved]
Code:
[Code listing not preserved]
At this point, it should give me the squared difference between the predicted and actual values, (predicted - actual)^2. If I'm not mistaken, I would then need to sum that variable within each firm_year to get the error term (annual systemic risk) for that firm.
I hope I was clear enough in explaining my goals and what I've been doing so far. I really appreciate your time and any support you may have to offer.
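The summing step could look roughly like the sketch below. It assumes a variable named resid (a hypothetical name) already holds each day's residual from its own firm_year regression, and that firm and year identify the groups.

```stata
* Assumes resid (hypothetical name) holds residuals computed per firm_year
preserve
generate double resid_sq = resid^2

* One row per firm_year: sum of squared residuals, residual std. dev., day count
collapse (sum) sse = resid_sq (sd) resid_sd = resid ///
         (count) n_days = resid, by(firm year)

export delimited using "firm_year_residual_risk.csv", replace
restore
```

Note that the sum of squared residuals grows with the number of trading days, so a scaled measure such as the residual standard deviation is often easier to compare across firm_years. The standard deviation from collapse divides by n - 1, while the regression's reported root MSE divides by n minus the number of estimated coefficients, so the two will differ slightly.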
Best!