I am using Stata 16 on Windows 10 and I'm working on a quarterly dataset of over 10,000 companies.
Code:
xtset
       panel variable:  gvkey (unbalanced)
        time variable:  fyearq_, 1996q2 to 2008q2, but with gaps
                delta:  1 quarter
I want to compute 'Average assets = ((Total assets) + (lagged Total assets)) / 2'. The strange thing is that the resulting variable differs depending on whether I use l1.[Total assets] directly or a previously generated variable holding the lagged total assets. I provide sample data and the code I used. I will explain at the end why I didn't create new, more straightforward variable names.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double gvkey float fyearq_ double atq2 float(t2_atq_L1 t2_avg_assets t3_avg_assets)
1004 146 449.645       .        .        .
1004 147  468.55 449.645 459.0975 459.0975
1004 148 523.852  468.55  496.201  496.201
1004 149 529.584 523.852  526.718  526.718
1004 150 542.819 529.584 536.2015 536.2015
1004 151 587.136 542.819 564.9775 564.9775
1004 152 662.345 587.136 624.7405 624.7405
1004 153 670.559 662.345  666.452  666.452
1004 154 707.695 670.559  689.127  689.127
1004 155 737.416 707.695 722.5555 722.5555
1004 156 708.218 737.416  722.817  722.817
1004 157  726.63 708.218  717.424  717.424
1004 158 718.913  726.63 722.7715 722.7715
1004 159 747.043 718.913  732.978  732.978
1004 160 753.755 747.043  750.399  750.399
1004 161 740.998 753.755 747.3765 747.3765
1004 162 747.543 740.998 744.2705 744.2705
1004 163 772.941 747.543  760.242  760.242
1004 164 754.718 772.941 763.8295 763.8295
1004 165 701.854 754.718  728.286  728.286
1004 166 758.503 701.854 730.1785 730.1785
1004 167 714.208 758.503 736.3555 736.3555
1004 168 690.681 714.208 702.4445 702.4445
1004 169 710.199 690.681   700.44   700.44
1004 170 722.944 710.199 716.5715 716.5715
1004 171 727.776 722.944   725.36   725.36
1004 172 723.019 727.776 725.3975 725.3975
1004 173 686.621 723.019   704.82   704.82
1004 174 676.345 686.621  681.483  681.483
1004 175 666.178 676.345 671.2615 671.2615
end
format %tq fyearq_
Code:
gen t2_avg_assets=((atq2)+(l1.atq2))/2
(15,545 missing values generated)

. gen t2_atq_L1 = l1.atq2
(14,933 missing values generated)

. gen t3_avg_assets=((atq2)+(t2_atq_L1))/2
(15,545 missing values generated)

. * t2_avg_assets and t3_avg_assets should be same, but they aren't:
. compare t2_avg_assets t3_avg_assets

                                        ---------- difference ----------
                            count       minimum      average     maximum
------------------------------------------------------------------------
t2_avg_~s<t3_avg_~s         14814     -.0078125    -.0000578   -2.33e-10
t2_avg_~s=t3_avg_~s        217381
t2_avg_~s>t3_avg_~s         14735      2.33e-10     .0000563    .0039063
                       ----------
jointly defined            246930     -.0078125    -1.06e-07    .0039063
jointly missing             15545
                       ----------
total                      262475
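One possible explanation, offered as an assumption rather than a confirmed diagnosis: atq2 is stored as double, but -generate- without an explicit type creates a float (unless the default was changed with -set type-). The stored lag would then be rounded to float precision before it enters the average, whereas l1.atq2 inside an expression is evaluated in double precision. The sketch below uses the first three sample observations; lag_float, lag_double, and the avg_* names are hypothetical, and the averages are stored as double only to make the rounding visible.

```stata
clear
input double gvkey float fyearq_ double atq2
1004 146 449.645
1004 147 468.55
1004 148 523.852
end
format %tq fyearq_
xtset gvkey fyearq_

* Lag stored with the default type (float, assuming -set type- is unchanged)
gen lag_float = L1.atq2
* Lag stored explicitly as double, so no rounding takes place
gen double lag_double = L1.atq2

* Averages stored as double to expose the effect of the rounded lag
gen double avg_direct = (atq2 + L1.atq2)/2
gen double avg_float  = (atq2 + lag_float)/2
gen double avg_double = (atq2 + lag_double)/2

* The float lag shifts the average; the double lag reproduces it
compare avg_direct avg_float
compare avg_direct avg_double
```

If this is indeed the cause, declaring the lag with -gen double- should make both routes agree.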
In preparation for this post I created variables with easier-to-understand names. But in doing so, another question emerged.
Code:
gen assetstotalqtly = atq2
(741 missing values generated)

. gen assetstotalqtly_L1 = l1.assetstotalqtly
(14,933 missing values generated)

. gen averageassets = ((assetstotalqtly)+(assetstotalqtly_L1))/2
(15,545 missing values generated)

. gen test_averageassets = ((assetstotalqtly)+(l1.assetstotalqtly))/2
(15,545 missing values generated)

. compare averageassets test_averageassets

                                        ---------- difference ----------
                            count       minimum      average     maximum
------------------------------------------------------------------------
average~s=test_av~s        246930
                       ----------
jointly defined            246930             0            0           0
jointly missing             15545
                       ----------
total                      262475

compare assetstotalqtly atq2

                                        ---------- difference ----------
                            count       minimum      average     maximum
------------------------------------------------------------------------
assetst~y<atq2             123741       -.00625    -.0000155   -5.96e-11
assetst~y=atq2              12729
assetst~y>atq2             125264      2.61e-11     .0000156      .00625
                       ----------
jointly defined            261734       -.00625     1.46e-07      .00625
jointly missing               741
                       ----------
total                      262475
How can assetstotalqtly and atq2 not be identical when I created the former by telling Stata it is equal to the latter? And why doesn't the issue described above occur with the renamed variables?
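A minimal sketch of what might be going on, assuming the default storage type is float: copying a double into a new variable without declaring its type rounds it to float, so the copy no longer equals atq2. Once the base variable is itself a float, its lag fits into a float without further rounding, which would explain why the two averages agree. The names copy_float, copy_double, t, lag_of_float, avg_stored, and avg_direct are hypothetical.

```stata
clear
input double atq2
449.645
468.55
523.852
end

* Default type (float, assuming -set type- is unchanged) versus explicit double
gen copy_float = atq2
gen double copy_double = atq2
compare copy_float atq2
compare copy_double atq2

* A float lag of a float variable is an exact copy, so both routes match
gen t = _n
tsset t
gen lag_of_float = L1.copy_float
gen avg_stored = (copy_float + lag_of_float)/2
gen avg_direct = (copy_float + L1.copy_float)/2
compare avg_stored avg_direct
```

Running -describe atq2 assetstotalqtly- on the real data would show whether the storage types actually differ.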
I hope I described everything well enough; if not, feel free to let me know. Thank you in advance!