Dear Statalist,
I am quiet new to Stata (Version 14 SE on Windows 10) and I am facing the following Problem: I have unbalanced quarterly panel data on 250 corporates (identified by idno) where the variables from the income statement (57 of 406 total variables) are reported in an accumulated manner, e.g. net income in the 3rd quarter contains the respective numbers from the 1st and 2nd quarter as well. In order to retrieve the correct income statement figures for the 2nd, 3rd and 4th quarter, I need to recalculate them (e.g. Q4 obs = Financial Year End obs - Q3 obs). In my understanding, howewer, there are two ways to do this: the first is to reshape the data from long to wide to calculate the Q2 to Q4 results across variables (afterwards reshape to long again); the second is to keep the data in long format and to create new observations for the respective quaters and then to calculate the results across observations.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float idno str8 periodq double(FC_CASH_DUE_BANKS_BNK FC_NET_LOANS_BNK FC_EQUITY_BNK FC_NET_INCOME_BNK FC_PERSONNEL_EXP_BNK)
1 "2007_3" 3.1836e+10 4.42465e+11 1.17704e+11 4.787e+09 6.234e+09
1 "2007_6" 3.5449e+10 4.57404e+11 1.19211e+11 4.234e+09 6.309e+09
1 "2007_9" 3.2766e+10 4.78207e+11 1.19978e+11 3.373e+09 4.677e+09
1 "2007_12" 4.0144e+10 5.100e+11 1.230e+11 1.5365e+10 2.2689e+10
1 "2008_3" 4.6888e+10 5.2531e+11 1.25627e+11 2.373e+09 4.951e+09
1 "2008_6" 3.2255e+10 5.24783e+11 1.27176e+11 2.003e+09 6.913e+09
1 "2008_9" 5.435e+10 7.42329e+11 1.37691e+11 5.270e+08 5.858e+09
1 "2008_12" 2.6895e+10 7.220e+11 1.350e+11 5.605e+09 2.2746e+10
1 "2009_3" 2.6681e+10 6.80862e+11 1.38201e+11 2.141e+09 7.588e+09
1 "2009_6" 2.5133e+10 6.51529e+11 1.46614e+11 2.721e+09 6.917e+09
1 "2009_9" 2.1068e+10 6.22511e+11 1.54101e+11 3.588e+09 7.311e+09
1 "2009_12" 2.6206e+10 6.020e+11 1.570e+11 1.1728e+10 2.6928e+10
1 "2010_3" 3.1422e+10 6.75613e+11 1.56569e+11 3.326e+09 7.276e+09
1 "2010_6" 3.2806e+10 6.63647e+11 1.62968e+11 4.795e+09 7.616e+09
1 "2010_9" 2.396e+10 6.5637e+11 1.6603e+11 4.418e+09 6.661e+09
1 "2010_12" 2.7567e+10 6.610e+11 1.680e+11 1.737e+10 2.8124e+10
1 "2011_3" 2.3469e+10 6.56246e+11 1.72798e+11 5.555e+09 8.263e+09
1 "2011_6" 3.0466e+10 6.61216e+11 1.75079e+11 5.431e+09 7.569e+09
1 "2011_9" 5.6766e+10 6.68503e+11 1.74487e+11 4.262e+09 6.908e+09
1 "2011_12" 5.9602e+10 6.960e+11 1.760e+11 1.8976e+10 2.9037e+10
end
I' ve already begun with the fist approach and figured that this involves a lot of typing, as I have 57 variables from the income statement and 36 quaterly observations. After reshaping from long to wide I would type something like:
Code:
gen incloan2007_q4= incloan2007_12 - incloan2007_9
gen incloan2007_q3= incloan2007_9 - incloan2007_6
gen incloan2007_q2= incloan2007_6 - incloan2007_3
and so on for the following years (up to 2018) in order to calculate the quarterly results for each idno.
My questions now are: (a) would the second approach with calculations across observations (or another one) be more suitable / time-saving compared to the first approach and (b) if yes, how could that be done with proper stata commands / functions (maybe a loop with foreach)? And (c) is there a simple way to shorten the calculations for the first reshapig-approach, maybe with foreach?
Any help / advise would be appriciated.
Kind regards,
Julian
Related Posts with Calculating across observations or across variables in panel data
Poisson regression - noconsDear all I am conducting a mirror image study on days of admission to hospital before and after a c…
Adjusting the window of a Moving AverageDear all, I am currently constructing a 5-window and 7-window moving average to smooth out a time-s…
Adjust tests for size in MataHi Stata/Mata users, I have the following issue. I have to adjust t-tests for size so that their tr…
Help Granger Causality testHi gyus! I would like to perform a Granger causality test after a VECM. Does anyone have an idea? S…
Merging two datasets based on first and last namesHi! I'm working on an analysis of student performance. I have 2 lists- list A contains the names of …
Subscribe to:
Post Comments (Atom)
0 Response to Calculating across observations or across variables in panel data
Post a Comment