Saturday, December 31, 2022

Contract year reshape

Dear Statalists,

Hope all is well.

I am dealing with contract data containing specific contract start and end dates, as shown below:


Contract ID Contract Start Year Contract End Year
5 05aug1987 05nov1987
11 01sep1987 01sep1994
13 01aug1987 01aug1988
14 01aug1987 03feb1988
16 12sep1987 01jan1989
17 01sep1987 01mar1988
18 01sep1987 01sep1990
19 12sep1987 01oct1997
21 01oct1987 01oct1992
22 01oct1987 01oct1992
24 02sep1987 01oct1988
25 02sep1987 01oct1988
26 01sep1987 01apr1997
28 12sep1987 01nov1990
29 12sep1987 01nov1992
30 01sep1987 01sep1992
31 12aug1987 01mar1995
32 12aug1987 01mar1995
33 12aug1987 01mar1990
34 12aug1987 01mar1989
35 12aug1987 01mar1990
36 12aug1987 01mar1995

Could you please kindly let me know how to reshape this data into long format in Stata?

For example, for Contract ID "18", it should be presented as follows:


Contract ID Year
18 1987
18 1988
18 1989
18 1990

Many thanks in advance.

Best,
Cong
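
For what it's worth, a minimal sketch of one approach, assuming the two date columns are Stata daily dates with the hypothetical names start_date and end_date, and that contract_id identifies contracts:

Code:
gen int y0 = year(start_date)
gen int y1 = year(end_date)
expand y1 - y0 + 1                              // one row per calendar year spanned
bysort contract_id: gen int year = y0 + _n - 1
keep contract_id year

For contract 18 (01sep1987 to 01sep1990) this yields the four rows 1987-1990 shown above.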

Removing duplicates

Hi,

I have a dataset that I want to be unique on the variables gvkey, fyear, and loc.
But there are duplicates. In case of duplicates, I want to keep the observation that has data; if there is data in both observations, I want to keep the one with the max value.
Any suggestions, please?

Below is the data extracted with dataex.
input str6 gvkey str4 curcd double(fyear at capx ceq dltt ebitda oiadp oibdp revt sale txdb xopr xrd) str4 loc byte(check checkcurcd temp)
"114881" "USD" 2010 3.165 .005 3.027 0 -1.16 -1.195 -1.16 1.817 1.817 0 2.977 . "AIA" 0 0 0
"114881" "USD" 2011 2.439 .005 2.343 0 -.66 -.668 -.66 1.417 1.417 0 2.077 . "AIA" 0 0 0
"114881" "USD" 2012 3.362 0 3.256 0 -.093 -.098 -.093 1.766 1.766 0 1.859 . "AIA" 0 0 0
"114881" "USD" 2013 3.607 .002 3.369 0 -.785 -.79 -.785 1.939 1.939 0 2.724 . "AIA" 0 0 0
"114881" "USD" 2014 3.997 .005 3.84 0 -2.624 -2.627 -2.624 .032 .032 0 2.656 1.181 "AIA" 0 0 0
"114881" "USD" 2015 1.129 .001 .951 0 -2.946 -2.949 -2.946 .112 .112 0 3.058 1.23 "AIA" 0 0 0
"114881" "USD" 2016 .129 .007 -.316 .348 -2.619 -2.623 -2.619 .279 .279 0 2.898 1.078 "AIA" 0 0 0
"114881" "USD" 2017 .558 .004 -.147 .502 -1.752 -1.756 -1.752 .093 .093 0 1.845 .833 "AIA" 0 0 0
"114881" "USD" 2018 .769 .012 .678 0 -2.684 -2.69 -2.684 .107 .107 0 2.791 .948 "AIA" 0 0 0
"114881" "USD" 2019 9.787 .007 8.407 .103 -.993 -1.467 -.993 4.517 4.517 0 5.51 1.014 "AIA" 0 0 0
"114881" "USD" 2020 10.969 .028 8.67 .121 .58 .015 .58 7.148 7.148 0 6.568 1.149 "AIA" 0 0 0
"114881" "USD" 2021 13.926 .008 9.351 .09 1.005 .439 1.005 12.475 12.475 .21 11.47 1.679 "AIA" 0 0 0
"024368" "USD" 2010 279.359 2.533 90.672 16.323 29.34 26.353 29.34 264.13 264.13 2.908 234.79 . "ANT" 0 0 0
"024368" "USD" 2011 303.735 4.748 87.603 16.815 39.314 35.904 39.314 331.297 331.297 1.993 291.983 . "ANT" 0 0 0
"024368" "USD" 2012 332.274 7.047 85.868 17.64 43.933 40.158 43.933 343.927 343.927 2.842 299.994 . "ANT" 0 0 0
"024368" "USD" 2013 437.151 8.149 86.793 63.971 51.206 46.901 51.206 452.149 452.149 2.205 400.943 . "ANT" 0 0 0
"024368" "USD" 2014 464.919 5.405 88.88 60.18 46.058 41.423 46.058 486.993 486.993 1.968 440.935 . "ANT" 0 0 0
"024368" "USD" 2015 391.394 4.15 74.644 61.82 40.559 36.005 40.559 419.303 419.303 .781 378.744 . "ANT" 0 0 0
"024368" "USD" 2016 386.822 6.389 49.174 58.936 49.71 45.527 49.71 489.085 489.085 .968 439.375 . "ANT" 0 0 0
"024368" "USD" 2017 179.811 1.927 75.202 .038 14.854 13.163 14.854 196.93 196.93 .072 182.076 . "ANT" 0 0 0
"024368" "USD" 2018 140.93 1.926 29.216 .469 21 19.291 21 231.272 231.272 .228 210.272 . "ANT" 0 0 0
"024368" "USD" 2019 41.827 1.139 12.458 .377 1.054 .161 1.054 68.814 68.814 0 67.76 . "ANT" 0 0 0
"024368" "USD" 2020 17.037 .215 9.378 0 -.944 -1.34 -.944 .327 .327 0 1.271 . "ANT" 0 0 0
"024368" "USD" 2021 7.984 0 4.216 0 -.541 -.541 -.541 .487 .487 0 1.028 . "ANT" 0 0 0
"014033" "USD" 2010 255.126 1.743 39.265 10.748 28.077 26.161 28.077 257.056 257.056 4.174 228.979 . "ARE" 0 0 0
"014033" "USD" 2011 232.053 .859 45.684 7.345 38.597 36.507 38.597 329.592 329.592 4.822 290.995 . "ARE" 0 0 0
"014033" "USD" 2012 326.532 1.526 131.305 4.831 52.715 50.778 52.715 413.683 413.683 8.528 360.968 . "ARE" 0 0 0
"014033" "USD" 2013 422.45 3.325 153.836 2.739 75.33 73.266 75.33 547.344 547.344 6.666 472.014 . "ARE" 0 0 0
"014033" "USD" 2014 497.072 2.375 193.36 1.466 92.432 90.035 92.432 700.045 700.045 6.217 607.613 . "ARE" 0 0 0
"014033" "USD" 2015 502.986 .568 216.209 .518 69.13 67.068 69.13 563.451 563.451 5.201 494.321 . "ARE" 0 0 0
"014033" "USD" 2016 574.605 .072 262.483 .049 70.684 68.905 70.684 551.831 551.831 4.491 481.147 . "ARE" 0 0 0
"014033" "USD" 2017 512.827 .443 218.715 .027 47.92 46.303 47.92 413.901 413.901 .802 365.981 . "ARE" 0 0 0
"014033" "USD" 2018 26.157 -1.18 -1.395 27.582 -324.812 -325.58 -324.812 64.437 64.437 .291 389.249 . "ARE" 0 0 0
"035993" "USD" 2017 . . . . . . . . . . . . "ARE" 0 0 0
"035993" "USD" 2018 199.937 .271 60.978 28.109 29.919 24.203 29.919 35.839 35.839 0 5.92 . "ARE" 0 0 0
"035993" "USD" 2019 307.252 38.69 109.416 102.785 37.06 31.274 37.06 44.085 44.085 0 7.025 . "ARE" 0 0 0
"035993" "USD" 2020 432.624 97.215 128.619 259.304 29.789 23.91 29.789 41.832 41.832 0 12.043 . "ARE" 0 0 0
"104579" "AED" 2010 1199.604 . 623.683 134.422 -52.169 -54.378 -52.169 190.923 . . . . "ARE" 0 0 0
"104579" "AED" 2011 1060.163 . 614.172 106.608 7.427 5.189 7.427 105.506 . . . . "ARE" 0 0 0
"104579" "AED" 2012 1048.919 . 627.802 82.8 37.7 35.911 37.7 139.564 . . . . "ARE" 0 0 0
"104579" "AED" 2013 882.36 . 478.636 48.578 -16.096 -17.521 -16.096 53.856 . . . . "ARE" 0 0 0
"104579" "AED" 2014 878.82 . 417.819 22.578 -51.284 -52.017 -51.284 16.524 . . . . "ARE" 0 0 0
"104579" "AED" 2015 904.093 . 349.421 .155 -38.331 -39.048 -38.331 92.429 . . . . "ARE" 0 0 0
"104579" "AED" 2016 871.008 . 260.852 185.014 -68.597 -69.278 -68.597 166.776 . . . . "ARE" 0 0 0
"104579" "AED" 2017 636.412 . 115.481 154.94 -109.639 -110.313 -109.639 76.582 . . . . "ARE" 0 0 0
"104579" "AED" 2018 548.268 . 105.235 117.84 -14.336 -14.902 -14.336 54.04 . . . . "ARE" 0 0 0
"104579" "AED" 2019 463.148 . 74.433 105.967 4.166 3.705 4.166 34.373 . . . . "ARE" 0 0 0
"104579" "AED" 2020 386.018 . 8.795 60.439 -12.457 -12.783 -12.457 49.366 . . . . "ARE" 0 0 0
"104580" "AED" 2010 931.925 . 301.066 . 47.112 46.725 47.112 187.838 . . . . "ARE" 0 0 0
"104580" "AED" 2011 972.166 . 322.866 . 43.53 43.274 43.53 172.435 . . . . "ARE" 0 0 0
"104580" "AED" 2012 1066.548 . 355.572 . 41.99 41.762 41.99 198.232 . . . . "ARE" 0 0 0
"104580" "AED" 2013 1166.439 . 393.02 . 44.514 44.26 44.514 204.143 . . . . "ARE" 0 0 0
"104580" "AED" 2014 1214.164 . 421.529 . 48.423 48.133 48.423 214.017 . . . . "ARE" 0 0 0
"104580" "AED" 2015 1266.055 . 439.163 . 40.759 40.406 40.759 204.493 . . . . "ARE" 0 0 0
"104580" "AED" 2016 1334.176 . 459.247 . 41.909 41.307 41.909 204.992 . . . . "ARE" 0 0 0
"104580" "AED" 2017 1325.57 . 481.631 . 44.411 43.763 44.411 212.141 . . . . "ARE" 0 0 0
"104580" "AED" 2018 1331.312 . 497.355 . 49.11 48.028 49.11 220.195 . . . . "ARE" 0 0 0
"104580" "AED" 2019 1370.119 . 515.372 . 45.762 44.713 45.762 215.756 . . . . "ARE" 0 0 0
"104580" "AED" 2020 1446.262 . 526.899 . 48.432 47.486 48.432 208.518 . . . . "ARE" 0 0 0
"104580" "AED" 2021 1374.202 . 538.617 . 34.75 34.023 34.75 203.616 . . . . "ARE" 0 0 0
"179660" "USD" 2010 .06 0 -.055 0 -.061 -.061 -.061 .025 .025 0 .086 . "ARE" 0 0 0
"179660" "USD" 2011 .257 0 .201 0 -.594 -.624 -.594 .008 .008 0 .602 . "ARE" 0 0 0
"179766" "AED" 2010 3037.836 24.188 1599.889 157.485 -123.286 -191.494 -123.286 1814.393 1814.393 .337 1937.679 . "ARE" 0 0 0
"179766" "AED" 2011 3029.986 70.709 1663.037 66.438 161.691 77.101 161.691 1736.108 1736.108 .309 1574.417 . "ARE" 0 0 0
"179766" "AED" 2012 3144.126 46.514 1598.924 83.534 -113.808 -189.351 -113.808 1946.845 1946.845 2.671 2060.653 . "ARE" 0 0 0
"179766" "AED" 2013 3209.399 26.197 1448.843 126.755 -60.001 -122.082 -60.001 2318.403 2318.403 1.928 2378.404 . "ARE" 0 0 0
"179766" "AED" 2014 2981.469 20.623 1461.355 95.424 93.036 33.961 93.036 1940.249 1940.249 .283 1847.213 . "ARE" 0 0 0
"179766" "AED" 2015 2489.921 9.346 1167.822 43.94 -132.953 -190.739 -132.953 1640.627 1640.627 .248 1773.58 . "ARE" 0 0 0
"179766" "AED" 2016 2625.1 10.7 1215.7 45.6 113.8 65.2 113.8 1730.3 1730.3 0 1616.5 . "ARE" 0 0 0
"179766" "AED" 2017 2620.1 22.5 1397 25.9 231.2 192.6 231.2 1800.3 1800.3 .2 1569.1 . "ARE" 0 0 0
"179766" "AED" 2018 2392.1 25 1140.9 22.8 85.3 46.9 85.3 1802.3 1802.3 .1 1717 . "ARE" 0 0 0
"179766" "AED" 2019 1944.8 22.6 717.1 52.2 -332.3 -372.2 -332.3 1375.8 1375.8 .1 1708.1 . "ARE" 0 0 0
"179766" "AED" 2020 1266.4 5.7 225.5 18.2 -328.1 -365.7 -328.1 599.7 599.7 .1 927.8 . "ARE" 0 0 0
"179766" "AED" 2021 1053.5 2.7 269.4 47.3 20.8 -6.3 20.8 801.6 801.6 .1 780.8 . "ARE" 0 0 0
"187366" "USD" 2010 0 0 -3.26 0 -.733 -.733 -.733 0 0 0 .733 . "ARE" 0 0 0
"187366" "USD" 2011 .049 0 -6.418 0 -6.892 -6.893 -6.892 .021 .021 0 6.913 . "ARE" 0 0 0
"187366" "USD" 2012 .017 0 -8.319 0 -2.143 -2.143 -2.143 0 0 0 2.143 . "ARE" 0 0 0
"187366" "USD" 2013 .009 0 -6.417 0 -.954 -.954 -.954 0 0 0 .954 . "ARE" 0 0 0
"187366" "USD" 2014 .01 0 -6.235 0 -1.094 -1.094 -1.094 0 0 0 1.094 . "ARE" 0 0 0
"187366" "USD" 2015 .058 .042 -7.058 0 -.747 -.751 -.747 0 0 0 .747 . "ARE" 0 0 0
"187366" "USD" 2016 .063 0 -6.532 0 -.589 -.593 -.589 0 0 0 .589 . "ARE" 0 0 0
"201676" "USD" 2010 2658.274 424.103 2092.946 0 676.199 487.723 676.199 780.409 780.409 83.231 104.21 . "ARE" 0 0 0
"201676" "USD" 2011 3351.317 406.782 2588.545 0 1061.624 856.205 1061.624 1150.513 1150.513 115.815 88.889 . "ARE" 0 0 0
"201676" "USD" 2012 3843.093 366.749 2859.316 0 1002.542 790.908 1002.542 1155.143 1155.143 141.789 152.601 . "ARE" 0 0 0
"201676" "USD" 2013 4396.2 281.105 3239.499 0 902.862 687.732 902.862 1047.89 1047.89 175.633 145.028 . "ARE" 0 0 0
"201676" "USD" 2014 4960.056 546.51 3708.204 0 895.023 578.616 895.023 1093.057 1093.057 146.38 198.034 . "ARE" 0 0 0
"251138" "AED" 2010 178271.194 . 19564.787 23460.695 4326.384 4217.589 4326.384 9003.58 . . . . "ARE" 0 0 0
"251138" "AED" 2011 183725.63 . 22072.006 22386.414 6358.269 6177.394 6358.269 10527.417 . . . . "ARE" 0 0 0
"251138" "AED" 2012 180795.723 . 24269.789 18346.172 5639.246 5476.433 5639.246 9234.079 . . . . "ARE" 0 0 0
"251138" "AED" 2013 183142.536 . 24176.598 15523.791 5450.712 5293.059 5450.712 8985.543 . . . . "ARE" 0 0 0
"251138" "AED" 2014 204019.463 . 26408.436 21853.912 5740.638 5582.64 5740.638 8907.947 . . . . "ARE" 0 0 0
"251138" "AED" 2015 228267.101 . 28727.762 26744.179 6675.859 6524.423 6675.859 8866.834 . . . . "ARE" 0 0 0
"251138" "AED" 2016 258289.257 . 30350.885 26328.845 6881.809 6736.996 6881.809 11053.376 . . . . "ARE" 0 0 0
"251138" "AED" 2017 265003.295 . 32444.94 33254.821 7602.257 7437.143 7602.257 12058.344 . . . . "ARE" 0 0 0
"251138" "AED" 2018 279830.338 . 32805.543 31895.319 9390.717 9214.79 9390.717 13564.078 . . . . "ARE" 0 0 0
"251138" "AED" 2019 405134.677 . 55638.387 44351.405 12433.922 12002.688 12433.922 18872.421 . . . . "ARE" 0 0 0
"251138" "AED" 2020 411156.301 . 56597.369 52636.977 8751.692 8252.472 8751.692 16771.725 . . . . "ARE" 0 0 0
"251138" "AED" 2021 440278.369 . 59359.8 51806.629 7951.733 7454.896 7951.733 14358.324 . . . . "ARE" 0 0 0
"251139" "AED" 2010 75257.518 . 6107.577 6708.931 2163.868 2086.653 2163.868 4089.952 . . . . "ARE" 0 0 0
"251139" "AED" 2011 74335.066 . 8568.458 6798.033 2134.597 2043.207 2134.597 4282.217 . . . . "ARE" 0 0 0
end

Listed 100 out of 572444 observations
Use the count() option to list more
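
For what it's worth, a minimal sketch, assuming "max value" is judged on a single variable such as at (swap in whichever variable defines it). gsort places missing values last in a descending sort, so within each gvkey-fyear-loc group the observation with data, and among those the one with the largest value, sorts first:

Code:
gsort gvkey fyear loc -at
by gvkey fyear loc: keep if _n == 1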


dynamic looped regressions

Dear Stata colleagues:

I am using LASSO for inference. I want to independently assess the causal effect of 40 policy variables on a rate. Some of these variables are likely collinear. I would like to correct the loop below to: 1) automatically remove each x variable from the control-variable pool when it is used as the variable of interest (I currently get an error message); and 2) test for collinearity between the variable of interest and each control x1-x40, removing all collinear x1-x40 variables from the control pool for each regression. Note that I have a handful of other control variables in that pool that I do not want to subject to this restriction, as I have a few quadratic terms.

Code:
foreach v of varlist x1-x40 {
    xpopoisson Mortality_rate `v', controls(rural gee cce fsi GDP_cap GDP_cap_sqr GHE_cap GHE_cap_sqr x1-x40) selection(cv) vce(cluster country_code)
}
Thank you in advance for any guidance you may be able to provide.

Robert
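
For what it's worth, a minimal sketch of point 1, dropping the variable of interest from the control pool on each pass via the macro list operators; the collinearity screen of point 2 is left out, though something like _rmcoll on `v' and the remaining pool could be grafted on:

Code:
unab pool : x1-x40
foreach v of local pool {
    local controls : list pool - v    // the pool without `v' itself
    xpopoisson Mortality_rate `v',                                         ///
        controls(rural gee cce fsi GDP_cap GDP_cap_sqr GHE_cap GHE_cap_sqr ///
        `controls') selection(cv) vce(cluster country_code)
}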

Friday, December 30, 2022

How to use xtcointtest with runby? Or: how to run cointegration tests for each individual separately in panel data?

Dear all:
I'm trying to use xtcointtest to run cointegration tests for each individual separately in panel data and store the results, so I need to run xtcointtest by groups. However, I failed when using the command runby.
My question is: why does this error occur, and how can I solve it?

the example data:
Code:
input float(pair ym mcprice2) double mcprice3
1 683      3395            3712.5
1 682      3058              3160
1 681    2847.5              2800
1 680      2800            2762.5
1 679      2782              2870
1 678      2560              2725
1 677      2414              2580
1 676    2562.5              2700
1 675      2610              2500
1 674      2282              2550
1 673    1962.5              2250
1 672      1940              2300
1 671      1932              2230
1 670      2085            2262.5
1 669      2160              2400
1 668      2174              2520
1 667 2251.6667              2600
1 666      2260              2670
1 665      2395            2837.5
1 664   2453.75              2875
1 663      2476              2930
1 662      2480              2925
1 661      2510              2900
1 660      2725            2962.5
1 659      2903              3000
1 658   2888.75              3325
1 657      2910              3325
1 656   3043.75              3325
1 655    3132.5              3325
1 624      4400              4600
1 653      3200              3325
1 652    3287.5              3650
1 651      3398              3650
1 650      3410            3687.5
1 649   3526.25              3700
1 648      3567              3700
1 647      3620              3695
1 646   3603.75            3672.5
1 645      3593              3690
1 644      3660              3715
1 643      3630              3705
1 642      3665              3696
1 641    3657.5              3825
1 640      3830              3940
1 639      3920            4102.5
1 638      3935              4120
1 637  3922.727            4117.5
1 636  3832.667              4108
1 635  3776.667              3950
1 634 3793.3335              4000
1 633  3737.333              4000
1 632  3648.333              3925
1 631  3684.667              4130
1 630 4075.8335            4337.5
1 629 4248.3335              4425
1 628 4396.6665              4480
1 627  4453.333              4550
1 626 4384.1665              4525
1 625 4359.3335              4520
1 654      3166              3325
3 654      3899              3166
3 683      2550              3395
2 683      2550            3712.5
3 682      2550              3058
2 682      2550              3160
3 681      2550            2847.5
2 681      2550              2800
3 680      2550              2800
2 680      2550            2762.5
3 679      2550              2782
2 679      2550              2870
3 678      2550              2560
2 678      2550              2725
3 677      2550              2414
2 677      2550              2580
3 676      2550            2562.5
2 676      2550              2700
3 675      2550              2610
2 675      2550              2500
3 674      2550              2282
3 673      2550            1962.5
2 673      2550              2250
3 672      2550              1940
2 672      2550              2300
3 671      2550              1932
2 671      2550              2230
3 670      2550              2085
2 670      2550            2262.5
3 669  2751.667              2160
2 669  2751.667              2400
3 668  2842.857              2174
2 668  2842.857              2520
3 667 2841.4285 2251.666748046875
2 667 2841.4285              2600
3 666  2888.889              2260
2 666  2888.889              2670
3 665   2966.25              2395
2 665   2966.25            2837.5
3 664      3045           2453.75
2 664      3045              2875
end
format %tm ym

my code:
capture program drop xtby
program define xtby
    xtcointtest pedroni mcprice2 mcprice3
    matrix b = r(stats)
    svmat b, names(c)
    matrix c = r(p)
    svmat c, names(d)
end

runby xtby, by(pair)


the error returns:
[error message attached as a screenshot]

After this, the whole dataset disappeared.
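
For what it's worth, runby drops every by-group in which the program exits with an error, which is why the data can end up empty; its verbose option displays the error each group hits. A sketch of one possible repair, on the assumption that the panel must be declared inside the program and that each group should be reduced to one results row (untested):

Code:
capture program drop xtby
program define xtby
    xtset pair ym                 // declare panel/time inside each group
    xtcointtest pedroni mcprice2 mcprice3
    matrix b = r(stats)
    keep in 1                     // shrink the group to a single results row
    keep pair
    svmat double b, names(col)
end

runby xtby, by(pair) verbose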



Renaming Matrix Rows With Spaces

Suppose I wish to rename the rows of a matrix.
Code:
clear *
cls
matrix A = (2,1\3,2)


cls
loc rn
foreach i in "City God Temple" Songjiang { //
    local rn `rn' `:display "`i''
    
    di "`rn'"
}


matrix rownames A = "`rn'"

mat l A
In this case, I want row 1 to be named "City God Temple" and row 2 to be "Songjiang". How would I do this? Instead, at present, it concatenates them together.

EDIT: Okay so I have the answer (dug up from some code in my program), but I don't get why it's correct.
Code:
clear *
cls
u "http://fmwww.bc.edu/repec/bocode/s/scul_Reunification.dta", clear

matrix A = (2,1\3,2)


levelsof country if inlist(code,20,21), l(labs)

mat rownames A = `labs'
Why does this work? Is it the backticks that levelsof defines?
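
For what it's worth: yes, in effect. levelsof wraps any value that contains spaces in compound double quotes, which is exactly what matrix rownames needs to keep a multi-word name as one token. A minimal sketch doing the same by hand:

Code:
matrix A = (2,1\3,2)
local rn
foreach i in `"City God Temple"' "Songjiang" {
    local rn `"`rn' `"`i'"'"'     // accumulate, re-wrapping each name
}
matrix rownames A = `rn'
mat l A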

Extracting All Strings After a Word within a Variable Name

I have a dataset representing a treated unit (gdp), a series of estimated counterfactuals (cf*), and pointwise treatment effects (diff_*). Let's inspect what I've got.
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input long gdp double cf int year float(diff_ diff_19) double cf_19 float diff_4 double cf_4 float diff_1 double cf_1 float relative
 2284 2382.1523483077835 1960  -98.15235 -239.61543  2523.615430234012 -128.92287  2412.922868887859  -156.5139 2440.5139206906247 -30
 2388 2446.6506511192324 1961  -58.65065  -189.2806 2577.2805809334377 -178.53693   2566.53692273034   -142.385   2530.38499218418 -29
 2527 2604.8115880199935 1962  -77.81158  -188.0124 2715.0123993333445 -186.70563 2713.7056316563535 -115.07682 2642.0768231463126 -28
 2610  2724.349430948411 1963 -114.34943 -210.40553  2820.405534499806 -233.68675  2843.686754138453 -157.42615 2767.4261537616703 -27
 2806 2889.9580019727537 1964    -83.958 -178.27428 2984.2742718124796  -115.0309  2921.030893136066 -140.29448  2946.294474600189 -26
 3005 3099.1157042307163 1965  -94.11571 -175.01315  3180.013158727885 -36.343998  3041.343998243464 -104.68647  3109.686466545085 -25
 3168  3359.960375695906 1966 -191.96037  -266.1019  3434.101900463562  -82.81771  3250.817708028666 -132.79723 3300.7972267528758 -24
 3241 3520.2771709761573 1967 -279.27716  -344.3777 3585.3776979131167 -250.44392 3491.4439182434617 -239.84328 3480.8432762798575 -23
 3571 3818.1429155125434 1968  -247.1429 -298.81952 3869.8195290516924 -227.67203 3798.6720259284243 -198.46587 3769.4658603651965 -22
 3998  4115.411017936789 1969 -117.41102   -151.486  4149.485996602814 -141.20108   4139.20108234847 -103.13166  4101.131661014288 -21
 4367  4332.029471182796 1970  34.970528  -7.026773  4374.026773141374 -175.03514  4542.035139627704  -47.44807  4414.448068668981 -20
 4686  4657.291964804378 1971  28.708035 -11.247523  4697.247522840271 -105.25514  4791.255143064456  -61.78067  4747.780672611365 -19
 5055  5086.462070494702 1972  -31.46207  -53.92624  5108.926233933297  -23.07299  5078.072991847272  -54.10266  5109.102663884323 -18
 5553  5646.499101450986 1973   -93.4991  -82.10171  5635.101705764349   -54.6655  5607.665499150371 -142.69818  5695.698185910818 -17
 6074  6130.730898105595 1974   -56.7309 -22.579327  6096.579327380564  -227.6143  6301.614301823399  -153.5402  6227.540215496327 -16
 6603  6635.678217863695 1975  -32.67822   9.549131  6593.450868564118  -86.40922  6689.409220964208  -160.1954  6763.195415566906 -15
 7367  7310.191521860769 1976   56.80848  100.64253  7266.357469068064  -60.40121  7427.401213899765  -8.467811 7375.4678110799205 -14
 8090   8024.42535038122 1977  65.574646   84.27509 8005.7249110951425   90.03732   7999.96268731265   49.71434  8040.285657605269 -13
 8928  8926.148259121721 1978   1.851741   57.40613  8870.593869525446   153.5217  8774.478301441524   64.27234  8863.727662437957 -12
10067  9836.536517859578 1979   230.4635   223.5902  9843.409790618043   162.4379  9904.562110128869  216.52855  9850.471452859647 -11
11083 10666.661733228293 1980   416.3383   432.9714  10650.02859426265   51.57676  11031.42324635798   401.3458 10681.654217847243 -10
12115 11806.043361600614 1981   308.9566   341.7199 11773.280100739687   63.06381 12051.936191465373   469.9802 11645.019818629977  -9
12761 12235.306759606703 1982  525.69324   539.0721 12221.927891952286 -23.557064 12784.557063637207  143.00787 12617.992126276697  -8
13519 13117.384658666579 1983   401.6154   378.0477 13140.952282833236  157.58456  13361.41543366107   83.95309 13435.046902355898  -7
14481 14362.574193051578 1984   118.4258   156.2245 14324.775496315307   334.3267  14146.67328931739  294.18005 14186.819958095495  -6
15291 15243.436752898206 1985   47.56325   83.27309 15207.726906842216   347.2524 14943.747610653902   325.0587 14965.941300613735  -5
15998 15972.494991886613 1986   25.50501   82.64354 15915.356458415343   397.9014 15600.098567980867   276.6529 15721.347114126615  -4
16679 16859.815540052205 1987 -180.81554  -36.51328 16715.513278612565   228.1159  16450.88409695461   42.18883 16636.811167643467  -3
17786 18036.168024604918 1988 -250.16803  -42.37479 17828.374794129002  157.76624  17628.23376836272  -155.0739 17941.073893747514  -2
18994 19248.291406558565 1989  -254.2914   7.740776  18986.25922439489  193.78767 18800.212319007103  -92.95484 19086.954839201564  -1
20465 20224.150759576136 1990  240.84924  525.81714 19939.182854950133  514.29364 19950.706386596594   393.9321 20071.067943179816   0
21602 20738.400966532055 1991   863.5991  1154.2319  20447.76805299464   685.5661 20916.433899914013   941.9903 20660.009702585576   1
22154 21580.174302957952 1992   573.8257   853.8572 21300.142823316975   676.8211 21477.178907646707   866.7733 21287.226699563194   2
21878 22271.115165831437 1993  -393.1152   -109.473  21987.47300196588  -40.68504 21918.685034424747   8.941148 21869.058852596623   3
22371 23360.161454596157 1994  -989.1614  -676.4664 23047.466398700126  -405.9901   22776.9901324677  -620.2714    22991.271386514   4
23035  24195.70462781342 1995 -1160.7046  -803.5131 23838.513081502944  -767.8742 23802.874198338817  -963.2782 23998.278215525206   5
23742   25245.9743769779 1996 -1503.9744 -1161.2825 24903.282474010928  -816.0538  24558.05377771829 -1319.3065 25061.306472021763   6
24156 26466.868505360024 1997 -2310.8684  -1861.107  26017.10710408016 -1019.7327  25175.73266554645 -2332.7148  26488.71495195701   7
24931 27612.192320207047 1998 -2681.1924 -2129.7876 27060.787551848938 -1351.1836 26282.183542873412 -2691.8235 27622.823511233928   8
25755 28983.168196963157 1999  -3228.168  -2599.467 28354.466950375485 -1130.8975  26885.89751671589  -2711.866 28466.865748668202   9
26943 30458.162643152937 2000 -3515.1626  -2849.585  29792.58513357836 -1400.6198 28343.619691990345  -3381.886 30324.886080181062  10
27449 31282.417577475535 2001 -3833.4175  -2956.631 30405.631330885048  -1783.003  29232.00306631015   -4500.46  31949.46011041869  11
28348  32233.17794185473 2002  -3885.178  -2846.022 31194.021767452297 -1672.4836 30020.483594424702  -5242.954 33590.953480535565  12
28855   33459.1985655981 2003 -4604.1987  -3463.965 32318.965141239547 -1398.5902  30253.59023557029  -5341.003  34196.00272261752  13
end
format %ty year
label var gdp "Real West Germany" 
label var cf "Sans All" 
label var diff_19 "Effect Sans Spain" 
label var cf_19 "Sans Spain" 
label var diff_4 "Effect Sans Belgium" 
label var cf_4 "Sans Belgium" 
label var diff_1 "Effect Sans USA" 
label var cf_1 "Sans USA" 
label var relative "Relative Time to German Reunification" 

cls

// Our Fit


mata A=J(0,1,.)
tempname e123

foreach x of var cf* {
    
local cf_lab: variable lab `x'

loc outlab "`cf_lab'" // Grabs the label of our outcome variable


tempvar Yco Ytr

matrix rmse_`x'=J(1, 2, .)

mkmat gdp if rel < 0, mat(`Ytr')

mkmat `x' if rel < 0, mat(`Yco')

mat loss = (`Ytr' - `Yco')' * ( `Ytr' - `Yco')
 
mat loss = loss / rowsof(`Ytr')
 
mata: roottaker("loss")
mat rowname loss = "`outlab'"
mat colname loss = "RMSPE"
matrix `e123'=loss

mata: A=A\st_matrix("`e123'")

}
mata: st_matrix("B",A)

mata C="RMSE"
mata C=C,J(1,1," ")
mata st_matrix("B",A)
mata st_matrixcolstripe("B",C)
mat l B


// Our ATTS


cls
mata T=J(0,1,.)
tempname e123
foreach x of var diff_* {
tempvar difftr

local cf_lab: variable lab `x'

loc outlab "`cf_lab'" // Grabs the label of our outcome variable


mkmat `x' if rel >= 0, mat(`difftr')

mat U = J(rowsof(`difftr'),1,1)

mat sum = U'*`difftr'

/* create vector of column (variable) means */

mat meanvec = sum/rowsof(`difftr')

mat rowname meanvec = "`outlab'"
mat colname meanvec = "ATT"
matrix `e123'=meanvec

mata: T=T\st_matrix("`e123'")

}
mata: st_matrix("B",T)

mata C="RMSE"
mata C=C,J(1,1," ")
mata st_matrix("B",T)
mata st_matrixcolstripe("B",C)
mat l B
I calculate the Root Mean Squared Errors and their ATTs. But I need each row to reflect the country/unit that was left out. How might I do this? Could I, perhaps, extract the string from the variable label? Every string after "Sans", so we'd have "All", "Spain", and so on as the row names. Oh, and as usual, I'm open to any Mata implementation that accomplishes this, as well as any simplification one might offer. Thanks so much!!
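
For what it's worth, a minimal sketch that pulls everything after "Sans " out of each variable label and uses it to name the rows of B, assuming every relevant label contains that word (run the diff_* version analogously):

Code:
local rnames
foreach x of varlist cf* {
    local lab : variable label `x'
    local tail = substr("`lab'", strpos("`lab'", "Sans ") + 5, .)
    local rnames `rnames' `"`tail'"'
}
matrix rownames B = `rnames'
mat l B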

Assigning value from one variable to another if condition is met

I have a data manipulation question:
I have a dataset with the variables "id_1", "year", and "id_2". I would like to create a new variable "year_new" that takes the value of "year" from the row where "id_1" equals the current row's "id_2". Essentially, year_new[_n] is assigned the value of year[_n-x] if id_2[_n] == id_1[_n-x], where x is some number.

Code:
id_1 year id_2 year_new
909 2019   2  .
  .    . 909   2019
Thank you for your help!
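
For what it's worth, a minimal sketch via a self-merge, on the assumption that id_1 is unique where nonmissing:

Code:
preserve
keep id_1 year
drop if missing(id_1)
rename (id_1 year) (id_2 year_new)
tempfile lookup
save `lookup'
restore
merge m:1 id_2 using `lookup', keep(master match) nogenerate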

Is it possible that log() miscalculates or creates a weird ".b" when the workload is too heavy?

I use Stata 17 (24 cores).

I am sorry that I cannot provide a reproducible example, because the problem is precisely that Stata generates different results at different times. But I can provide this screenshot.
[screenshot attached]


lnFirmEmp is created by "gen lnFirmEmp = log(FirmEmp)"

There are more than 10M observations. In just two of them, I got weird results.

1st observation in the screenshot: It should be 10.854624 if I take the log, but Stata gave me 10.854639. After the data were generated and saved, I checked the accuracy and it was wrong. So I created the log variable once again ("lnFirmEmp2"), and now it's correct.

2nd observation in the screenshot: After taking the log, ".b" was created. I have more than 10M observations and many of them are missing FirmEmp, but this was the only case where Stata thought the log should be ".b". It's very strange.

3rd observation in the screenshot: It's correct.


But these problems disappeared after I re-ran the code (that is, log() produced 10.854624 for the 1st observation and "." for the 2nd).

How can I explain this situation?

I ran the same code. Stata made two "mistakes" in the first try among lots of calculations. Stata didn't make these mistakes in the second try.

Can Stata really just make mistakes when it somehow doesn't interact well with my CPU?


Determining level of significance between two medians and between two categorical variables, pre- and post-treatment

Dear Stata community,

I was wondering if anybody could help me with the following question. I am trying to figure out how to determine two things:

- Level of significance between pre-op and post-op CKD stage (kidney function)
- Level of significance between pre-op and post-op medians of eGFR and creatinine (also kidney function)

To help you understand: people undergo either treatment RN or treatment PN.

I hope I made myself clear. I have been struggling with this for days, so I was hoping somebody could give me a clue. Thanks in advance!
                                         RN                                     PN
                           Pre-op        1 yr post-op    p-value   Pre-op        1 yr post-op   p-value
CKD stage, n (%)                                           ???                                    ???
  1 (<15)                  0             0                         0             0
  2 (15 – 29)              0             1                         0             0
  3a (30 – 44)             3             20                        8             8
  3b (45 – 59)             11            30                        5             11
  4 (60 – 89)              50            27                        51            45
  5 (>90)                  36            3                         36            12
eGFR (ml/min/1.73 m2),
  median [IQR]             83 (69 – 95)  57 (44 – 65)     ???      79 (71 – 94)  73 (60 – 87)    ???
Creatinine, median [IQR]   81 (69 – 92)  107 (92 – 127)   ???      84 (72 – 95)  88 (73 – 104)   ???

                                         RN              PN              p-value
New-onset CKD, n (%)                     42              11              <0.05
Negative delta-eGFR (%)                  33 (23 – 40)    11 (3 – 19)     <0.05
Negative delta-eGFR, median [IQR]        28 (18 – 34)    8 (2 – 16)      <0.05
Delta serum creatinine (%)               27 (4 – 46)     10 (-34 – 17)   <0.05
Delta serum creatinine, median [IQR]     32 (22 – 41)    8 (1 – 16)      <0.05
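
For what it's worth, a minimal sketch of tests that match the paired pre/post structure, under the assumption of one row per patient with hypothetical variable names (egfr_pre, egfr_post, creat_pre, creat_post, ckd_pre, ckd_post, treat):

Code:
* paired medians: Wilcoxon matched-pairs signed-rank test, per arm
signrank egfr_pre = egfr_post if treat == "RN"
signrank creat_pre = creat_post if treat == "RN"

* paired ordered categories: test of symmetry / marginal homogeneity
symmetry ckd_pre ckd_post if treat == "RN"

Repeating each line with if treat == "PN" fills the other column of ??? cells.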

Create graphs

Dear all,

I have a dataset that contains the following string variables: "country" and "countrydestination". The first variable includes all African countries as individual origin-country observations in the year 2000. Each origin country then has many rows pairing it with a "countrydestination" (the country to which African refugees emigrate); many destinations are listed for each country of origin.

I would like to create a graph (preferably a pie chart or a bar graph) that shows which countries Africans are fleeing to the most.

My thought (before starting with the graph) is: I would like to combine all the African countries so that there is only "Africa" --> not individual countries, but only the entire continent. And then I would like to show the top 5 or top 10 destination countries.

Unfortunately, I don't have a solution for any of these three problems (A: combine the countries into Africa, B: show only the top 10 destinations, and C: create the graph).


Every help is very much appreciated!
Thank you to everyone in advance!

Kind regards,
Marah
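
For what it's worth, a minimal sketch, assuming one row per origin-destination record so that row counts measure flows (with a refugee-count variable, use collapse (sum) instead of contract). Pooling the origins into "Africa" happens implicitly, since the counts aggregate over every origin country at once:

Code:
contract countrydestination, freq(flows)   // one row per destination, with counts
gsort -flows
keep in 1/10
graph hbar (asis) flows, over(countrydestination, sort(1) descending)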

having issues plotting stock returns in panel data

Dear Statalist, I am trying to plot the stock returns for a panel of companies. However, each time I plot the returns using the xtline command, I get a vertical line that shows no dynamics in the variable's movement; if I plot the closing price of the stock instead, the plot looks good.

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(Date ret4) str6 ticker long bcal_date
22280          . "ABF.L"  0
22284 -1.6926985 "ABF.L"  1
22285 -1.0840229 "ABF.L"  2
22286   2.688337 "ABF.L"  3
22287 -1.4247924 "ABF.L"  4
22288 -.49449205 "ABF.L"  5
22291  .04506954 "ABF.L"  6
22292    .045015 "ABF.L"  7
22293          0 "ABF.L"  8
22294   1.519262 "ABF.L"  9
22295 -.26643598 "ABF.L" 10
22298   .9294199 "ABF.L" 11
22299  -.1763671 "ABF.L" 12
22300  1.6630546 "ABF.L" 13
22301   .6489297 "ABF.L" 14
22302  -2.842971 "ABF.L" 15
22305  -2.424892 "ABF.L" 16
22306  .09086218 "ABF.L" 17
22307  -.7750133 "ABF.L" 18
22308  .18288766 "ABF.L" 19
22309  -3.344475 "ABF.L" 20
22312   3.891174 "ABF.L" 21
22313  1.6670744 "ABF.L" 22
22314   .4013417 "ABF.L" 23
22315   .8420265 "ABF.L" 24
22316  1.0535668 "ABF.L" 25
22319 -1.0977212 "ABF.L" 26
22320   .3085941 "ABF.L" 27
22321 -1.4184653 "ABF.L" 28
22322  -.4474456 "ABF.L" 29
22323 -.13461708 "ABF.L" 30
22326   5.290861 "ABF.L" 31
22327 -1.7615863 "ABF.L" 32
22328   .0433178 "ABF.L" 33
22329   .5185954 "ABF.L" 34
22330  2.1744623 "ABF.L" 35
22333   1.506937 "ABF.L" 36
22334  1.9338083 "ABF.L" 37
22335  -.6951607 "ABF.L" 38
22336  -.6587002 "ABF.L" 39
22337  -2.213491 "ABF.L" 40
22340  1.2172174 "ABF.L" 41
22341 -1.2594588 "ABF.L" 42
22342   .6736876 "ABF.L" 43
22343 -.04197542 "ABF.L" 44
22344 -1.0975165 "ABF.L" 45
22347  .25434014 "ABF.L" 46
22348  2.0947537 "ABF.L" 47
22349  -.7072873 "ABF.L" 48
22350  -.5023145 "ABF.L" 49
22351   1.664616 "ABF.L" 50
22354  1.3527613 "ABF.L" 51
22355   1.535391 "ABF.L" 52
22356 -1.3320118 "ABF.L" 53
22357 -1.8454542 "ABF.L" 54
22358 -.08282112 "ABF.L" 55
22361    -.74844 "ABF.L" 56
22362 -2.3649566 "ABF.L" 57
22363   -.428283 "ABF.L" 58
22364     .12867 "ABF.L" 59
22365  2.5810676 "ABF.L" 60
22368  -.7126455 "ABF.L" 61
22369  1.8341514 "ABF.L" 62
22370 -.24813415 "ABF.L" 63
22371  1.4796733 "ABF.L" 64
22376   .4884022 "ABF.L" 65
22377   .7684655 "ABF.L" 66
22378   .3217926 "ABF.L" 67
22379  -.2412499 "ABF.L" 68
22382  -.8490082 "ABF.L" 69
22383 -.20321368 "ABF.L" 70
22384  -.6530541 "ABF.L" 71
22385   .6937422 "ABF.L" 72
22386  -1.268688 "ABF.L" 73
22389   1.309339 "ABF.L" 74
22390  -6.118375 "ABF.L" 75
22391 -.25963387 "ABF.L" 76
22392  .21640652 "ABF.L" 77
22393  -.8248349 "ABF.L" 78
22396   .9976213 "ABF.L" 79
22397  -.6060578 "ABF.L" 80
22398 -.17382672 "ABF.L" 81
22399   .9092869 "ABF.L" 82
22400  -.5185954 "ABF.L" 83
22404  -.7829376 "ABF.L" 84
22405   .8695608 "ABF.L" 85
22406  1.2903584 "ABF.L" 86
22407   .4264242 "ABF.L" 87
22410   .8052581 "ABF.L" 88
22411 -3.3916016 "ABF.L" 89
22412  -.4376436 "ABF.L" 90
22413 -1.8592914 "ABF.L" 91
22414  2.7326605 "ABF.L" 92
22417 -1.0050433 "ABF.L" 93
22418     .43823 "ABF.L" 94
22419 -1.0549673 "ABF.L" 95
22420   2.616812 "ABF.L" 96
22421 -.56119645 "ABF.L" 97
22424   .8620755 "ABF.L" 98
22425 -1.7316363 "ABF.L" 99
end
format %td Date
format %tbcalendar bcal_date
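
For what it's worth, a minimal sketch on the assumption that the vertical line comes from the panel/time declaration rather than from ret4 itself (xtset needs a numeric panel variable, so the string ticker is encoded first):

Code:
encode ticker, gen(tickid)
xtset tickid bcal_date
xtline ret4, overlay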

Thursday, December 29, 2022

Average marginal effects of continuous variables by levels of a dichotomous variable

Hi,

I run this four-way interaction model in Stata 14:

xtreg Y c.X1##c.X2##c.X3##X4 Controls i.country i.year,r

Then, I want to get the average marginal effects (AMEs) of continuous X1, X2 and X3 when dichotomous X4 equals 0 and 1 respectively. My options are:

(i) margins X4, dydx(X1 X2 X3)

(ii) margins if X4==0, dydx(X1 X2 X3) & margins if X4==1, dydx(X1 X2 X3)

Is there any reason I'd prefer (i) to (ii)?

Since (i) treats all observations as if X4==0 and then as if X4==1, is (ii) more accurate/preferable because it accounts for the actual X4==0 and X4==1 observations separately?

Best,

Lütfi

general time trend

Hi, we have a problem incorporating a general time trend in our panel data with year fixed and country fixed effects. The dependent variable is economic growth. We have tried the code c.timevar, but it doesn't work. I can send an image from our do-file.


[do-file excerpt attached as an image]
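
For what it's worth, a general linear time trend is perfectly collinear with a full set of year dummies, so one of the two has to be dropped; a minimal sketch of the trend-plus-country-FE version, with hypothetical variable names:

Code:
xtset country year
gen trend = year - 2000                       // any base year will do
xtreg growth x1 x2 c.trend, fe vce(cluster country)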

Wednesday, December 28, 2022

Meta-analysis help: how to convert hazard ratio to risk ratio

I am working on a meta-analysis using Stata. All the primary studies report risk ratio except one study that reported hazard ratio. Is there a formula to convert a hazard ratio to a risk ratio?

Thanks,
Al Bothwell

Downloading data from Public API from json format to csv file or txt file so that I can read data in stata on macOS

I have to download data from Business Dynamics Statistics (public data provided by the Census Bureau) via the following API, which returns the data in JSON format.


Does anyone have any idea how I can download the data in a Stata-readable format from the following link?

Code:
https://api.census.gov/data/timeseries/bds?get=NAME,YEAR,FIRM,ESTAB,EAGE,EMPSZES,EMPSZFI,ESTABS_ENTRY,ESTABS_EXIT,FAGE,FIRMDEATH_EMP,FIRMDEATH_ESTABS,FIRMDEATH_FIRMS,JOB_CREATION,JOB_CREATION_BIRTHS,JOB_CREATION_RATE,JOB_DESTRUCTION,JOB_DESTRUCTION_DEATHS,NET_JOB_CREATION,REALLOCATION_RATE,NET_JOB_CREATION_RATE,EMP&for=county:*&in=state:*&time=from+2000+to+2008&for=NAICS:*
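
For what it's worth, a minimal sketch that sidesteps a JSON parser: the API returns a JSON array of arrays, which is nearly CSV once brackets and quotes are stripped. The query is shortened here for illustration, and the cleanup may need adjusting (the first row carries the variable names):

Code:
copy "https://api.census.gov/data/timeseries/bds?get=NAME,YEAR,ESTAB&for=state:*" ///
    "bds.txt", replace
import delimited using "bds.txt", clear stripquote(yes)
ds, has(type string)
foreach v of varlist `r(varlist)' {
    replace `v' = subinstr(subinstr(`v', "[", "", .), "]", "", .)
}

Community-contributed JSON readers (e.g., jsonio from SSC) are another route.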

kappaetc / Krippendorff's alpha with missing values and across multiple variables

My research team coded/classified a large set of documents. I took a random subset of 1,000 of the coded docs to code a second time and check the reliability. So, each document was coded by one of eight team members and a second time by me. However, because of the nature of the documents, some documents should be coded more than once because they meet multiple classification qualifications. But not all were necessarily coded the same number of times by me and the other coder (sometimes I coded for 2+ classifications and the other did not, or vice versa, depending on how thorough each was). About 6% of documents were coded more than once.

If the coder answered that the document was type 1, then they would code the T1 variables and the T2 variables would be missing. If it was type 2, then they'd code the T2 variables. If it is both type 1 and type 2, the coder should have coded it twice, once for each type. As the data are arranged now, each variable has four columns: the RA's first coding, my first coding, the RA's second coding (if applicable), and my second coding (if applicable).

doc_id type_ra1 type_me1 type_ra2 type_me2 T1_ra1 T1_me1 T1_ra2 T1_me2 T2_ra1 T2_me1 T2_ra2 T2_me2
1 1 1 . . 2 2 . . . . . .
2 2 2 . . . . . . 1 2 . .
3 1 1 2 1 1 . . . . 2
4 1 2 . 1 2 . . 2 . 2 . .
(Not all the numbers are necessarily 0 or 1; some vars go to 6. All the vars are categorical, not ordinal.)

So, in the above example, the first row should have perfect reliability between me and the RA.
The second row would be reliable for the type variable, but not the T2 variable.
The third row both coders coded it as type 1 the same, but I coded it a second time.
In the fourth row, my second coding matched the RA first coding, which should still count towards higher reliability, but it would have been better if the RA had coded a second time in the same way as my first.

For the ICR I’ve been grouping all attempts for comparable variables like this:
kappaetc type_ra1 type_ra2 type_me1 type_me2
kappaetc T1_ra1 T1_ra2 T1_me1 T1_me2



So, I have 2 questions:

1) Is there a better way to account for missing values here? When I recode missing values to 0, the reliability comes out around .7 depending on the variable, but without that recoding it is around .85.

2) Is there a good way to measure total reliability? I want to be able to see if the document as a whole was reliably coded, not just the individual variables. Because there are a lot of variables coded for each document, it could be that most documents have something off when looking at the variables collectively, even though the individual variables tend to have high reliability.

Contrast tables

Hi all,

New Statalist user here with a simple question (I'm a long-time Stata user who is trying to learn to use tables). I am trying to create a table with contrast estimates after running a regression model. MWE below:

Code:
clear all
sysuse auto.dta
reg price i.foreign##i.rep78 mpg
collect get, tag(model[]): margins r.foreign , at(rep78=(1(1)5)) asobserved
collect levelsof model
collect style header result, level(hide)
collect layout (colname#result[_r_b]) (model)

The problem: I want to replace the row headers (e.g., (Foreign vs Domestic) @ _at=1, (Foreign vs Domestic) @ _at=2, ...) with custom strings of my choosing such as "Row 1", "Row 2", etc.

Is there a simple way of doing this?

Any help would be welcome.

Thanks!
Matthew
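
For what it's worth, the relabeling mechanism is collect label levels applied to the dimension that forms the rows; a sketch, where the two level names below are placeholders that should first be read off from collect levelsof colname:

Code:
collect levelsof colname
collect label levels colname           ///
    "1._at" "Row 1"                    ///
    "2._at" "Row 2", modify
collect layout (colname#result[_r_b]) (model)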


Panel data: Creating a bilateral dataset with countries and years (replicating the dataset for each panel)

Hi everyone,

I have a panel dataset of exchange rate with 200 countries and 16 years, please find below an example:
country_origin year variable1
Albania 2006 98.1
Albania 2007 90.43
Albania 2008 83.89
Argentine 2006 3.05
Argentine 2007 3.1
Argentine 2008 3.14
Egypt 2006 5.73
Egypt 2007 5.64
Egypt 2008 5.43

I want to create a bilateral exchange rate dataset. It should look like this for example (as if I am replicating the same dataset but for each panel and I end up creating two variables: country_destination and variable1_replicated):

country_origin year variable1 country_destination variable1_replicated
Albania 2006 98.1 Albania 98.1
Albania 2007 90.43 Albania 90.43
Albania 2008 83.89 Albania 83.89
Albania 2006 98.1 Argentine 3.05
Albania 2007 90.43 Argentine 3.1
Albania 2008 83.89 Argentine 3.14
Albania 2006 98.1 Egypt 5.73
Albania 2007 90.43 Egypt 5.64
Albania 2008 83.89 Egypt 5.43
Argentine 2006 3.05 Albania 98.1
Argentine 2007 3.1 Albania 90.43
Argentine 2008 3.14 Albania 83.89
Argentine 2006 3.05 Argentine 3.05
Argentine 2007 3.1 Argentine 3.1
Argentine 2008 3.14 Argentine 3.14
Argentine 2006 3.05 Egypt 5.73
Argentine 2007 3.1 Egypt 5.64
Argentine 2008 3.14 Egypt 5.43
Egypt 2006 5.73 Albania 98.1
Egypt 2007 5.64 Albania 90.43
Egypt 2008 5.43 Albania 83.89
Egypt 2006 5.73 Argentine 3.05
Egypt 2007 5.64 Argentine 3.1
Egypt 2008 5.43 Argentine 3.14
Egypt 2006 5.73 Egypt 5.73
Egypt 2007 5.64 Egypt 5.64
Egypt 2008 5.43 Egypt 5.43

I was wondering if there could be any command that would help execute this format. I would be grateful if you could advise on this. I am sorry in advance if this has been asked before and I couldn't locate the answer.

Many thanks in advance.

Jala Youssef
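
For what it's worth, joinby forms all pairwise combinations of observations that share the match variable, which is exactly this cross product within year; a minimal sketch:

Code:
preserve
rename (country_origin variable1) (country_destination variable1_replicated)
tempfile dest
save `dest'
restore
joinby year using `dest'
sort country_origin country_destination year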

Estimating risk difference for binary outcome in cluster RCT

I am analysing data from a cluster RCT looking at the effect of an intervention (int) in health centres (centre) to reduce inappropriate antibiotic prescribing (abu). We initially planned to present results as ORs using mixed models with random effects for health centre. This is the code:

Code:
meglm abu int || centre: , family(binomial) link(logit) eform
However, antibiotic prescribing is over 90% and the OR showed a very strong effect (OR=0.25) while the absolute difference was small (5%). This seems misleading, and we thought maybe this was because the outcome is not rare, so we decided to present the results as RRs rather than ORs using this code:

Code:
meglm abu int || centre: , family(poisson) link(log) eform
This gives an RR of 0.93, which seems more reasonable.

We also wanted to present Risk Difference (RD) and have tried a few different ways to get this. Binomial with identity or log link as suggested in this post
HTML Code:
https://www.statalist.org/forums/forum/general-stata-discussion/general/1693010-adjusted-risk-ratios-and-risk-differences-after-logistic-regression
gives error messages saying that identity and log links are not allowed with binomial distribution. We also tried:

Code:
meglm abu int || centre: , family(gaussian) link(identity)
But this gives an error saying that initial values are not feasible.

Can anyone suggest another way of doing this? Or is the best way to use margins?
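
For what it's worth, one common route to an adjusted risk difference is margins on the logit model that already converged; with the intervention entered as a factor variable, dydx() averages the discrete change in predicted probability:

Code:
meglm abu i.int || centre: , family(binomial) link(logit)
margins, dydx(int)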

Balance an unbalanced dataset

Hi all,

I have a strongly unbalanced dataset of countries observed by year.
I would like to balance it while retaining as many observations as possible. For instance, say the maximum time span is 46 periods, but only 10 out of 100 countries have all 46 periods, whereas if I drop just 2 periods (going from 46 to 44) I can add 20 more countries, ending up with 30 countries observed over 44 periods. Then I would prefer the second choice. Say the minimum acceptable number of time periods is 28. Is there a way to make the process automatic, ideally with preserve and restore: dropping all observations having fewer than k time periods, counting the number of countries with k time periods, and then repeating the process? Of course, the time periods should coincide: if country A has 1980-1982 and country B has 2020-2022, that is not a matching of time periods.

This is a snapshot of my code:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str4 person_ctrycode int earliest_publn_year
"AD" 1983
"AD" 1990
"AD" 1998
"AD" 2005
"AD" 2006
"AD" 2009
"AD" 2013
"AD" 2014
"AD" 2015
"AD" 2017
"AD" 2018
"AD" 2019
"AD" 2020
"AD" 2021
"AE" 1978
"AE" 1984
"AE" 1989
"AE" 1990
"AE" 1991
"AE" 1992
"AE" 1993
"AE" 1994
"AE" 1996
"AE" 1997
"AE" 1998
"AE" 2001
"AE" 2003
"AE" 2004
"AE" 2005
"AE" 2006
"AE" 2007
"AE" 2008
"AE" 2009
"AE" 2010
"AE" 2011
"AE" 2012
"AE" 2013
"AE" 2014
"AE" 2015
"AE" 2016
"AE" 2017
"AE" 2018
"AE" 2019
"AE" 2020
"AE" 2021
"AF" 2015
"AF" 2018
"AF" 2020
"AG" 2010
"AG" 2013
"AI" 1982
"AI" 1984
"AI" 1986
"AI" 1988
"AI" 1990
"AI" 1995
"AI" 1996
"AI" 2002
"AI" 2008
"AI" 2009
"AI" 2010
"AI" 2019
"AI" 2020
"AL" 1988
"AL" 1993
"AL" 2003
"AL" 2009
"AL" 2010
"AL" 2011
"AL" 2015
"AL" 2017
"AL" 2018
"AM" 1995
"AM" 2000
"AM" 2004
"AM" 2005
"AM" 2006
"AM" 2007
"AM" 2008
"AM" 2010
"AM" 2011
"AM" 2012
"AM" 2013
"AM" 2014
"AM" 2015
"AM" 2016
"AM" 2017
"AM" 2018
"AM" 2019
"AM" 2020
"AN" 1979
"AN" 1980
"AN" 1981
"AN" 1982
"AN" 1983
"AN" 1984
"AN" 1985
"AN" 1986
"AN" 1987
"AN" 1988
end
However, I think that Stata provides a default unbalanced dataset which might be more useful for an MWE.

Thank you
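
For what it's worth, a minimal brute-force sketch over window starts and lengths, assuming one row per country-year as in the excerpt (adjust the year range and the minimum k to taste); each window is scored by countries kept times years kept:

Code:
tempfile raw
save `raw'
local best = 0
forvalues k = 28/46 {
    forvalues s = 1976/`= 2021 - `k' + 1' {
        quietly {
            use `raw', clear
            keep if inrange(earliest_publn_year, `s', `s' + `k' - 1)
            bysort person_ctrycode: keep if _N == `k'   // full coverage only
            egen g = group(person_ctrycode)
            summarize g
        }
        local nc = cond(r(N) > 0, r(max), 0)
        if `nc' * `k' > `best' {
            local best  = `nc' * `k'
            local bestk = `k'
            local bests = `s'
        }
    }
}
use `raw', clear
di "best window: `bests' to `= `bests' + `bestk' - 1' (`= `best'/`bestk'' countries)"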

Weekly means

Hello Everyone!!
I need to obtain weekly means for some measurements: a mean by subject and week for each of them. I'm not able to do it.


+----------------------------------------+
| CODICE Settim~a GLI1 GLI2 |
|----------------------------------------|
1. | 20008 1 101 98 |
2. | 20008 2 95 100 |
3. | 20008 3 102 113 |
4. | 20008 3 110 90 |
5. | 20008 3 113 91 |
|----------------------------------------|
6. | 20008 3 113 91 |
7. | 20008 4 101 . 104 |
8. | 20008 5 92 91 |
9. | 20008 5 104 101 |
10. | 20008 6 101 85 |
+----------------------------------------+

Where CODICE stands for the subject id, Settimana stands for the week, and GLI1, GLI2, and GLI3 are the measurements.
Thank you all in advance
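
For what it's worth, a minimal sketch; collapse replaces the data with one row per subject-week, while the egen alternative keeps the raw rows and adds the mean alongside them:

Code:
collapse (mean) GLI1 GLI2, by(CODICE Settimana)

* or, keeping the original observations:
bysort CODICE Settimana: egen GLI1_wm = mean(GLI1)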

Replace missing values with values from other observations

Hello,

I'm stuck with my data cleaning. Below, an excerpt of my dataset is shown. I have only included data from one gvkey code, but in total there are approx. 7,500 unique gvkey codes in my dataset.

As you can see below, I only have some values for my co_per_rol and becameceo variables. In case of missing values, I want them to take the same values as the observations with nonmissing values.

I tried to solve this problem using the total command; however, this does not work if I have more than one nonmissing value for the observations.

Thanks in advance!

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input long gvkey int fyearqtr_ec long co_per_rol int becameceo
1045 217     .     .
1045 223     .     .
1045 220     .     .
1045 221     .     .
1045 222     .     .
1045 227     .     .
1045 224 49256 19701
1045 225     .     .
1045 226     .     .
1045 231     .     .
1045 228     .     .
1045 229     .     .
1045 230     .     .
1045 235     .     .
1045 232 49256 19701
1045 233     .     .
1045 234     .     .
1045 236     .     .
1045 237     .     .
1045 238     .     .
1045 243     .     .
1045 240     .     .
1045 241     .     .
1045 242     .     .
1045 247     .     .
1045 244     .     .
1045 245     .     .
1045 246     .     .
end
format %tq fyearqtr_ec
format %td becameceo
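
For what it's worth, a minimal sketch, assuming each gvkey carries a single underlying value per variable; sorting on the variable itself puts its nonmissing values first within each gvkey:

Code:
foreach v in co_per_rol becameceo {
    bysort gvkey (`v'): replace `v' = `v'[1] if missing(`v')
}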

Tuesday, December 27, 2022

Using matching with DID for repeated cross sectional data

I am new to Stata and working on repeated cross-sectional data for a difference-in-differences analysis. I have treatment and control groups before and after the intervention. My treatment and control groups are, however, unbalanced (the covariates are significantly different across the two groups both pre- and post-intervention). I therefore want to apply a matching technique to match the three groups (pre-treatment, pre-control, post-control) to my post-treatment group and then use DID. Any guidance about the Stata commands and references to papers that have applied this technique would be helpful.

Add on consecutively by group

Hi all,


I have the first observation of experience for each id, and I would like to fill in the following years by adding 1 per year.
So for 2002 it would be experience+1, for 2003 experience+2, etc.
I tried sorting by id and adding, but it did not work.

Any help would be appreciated, thank you!


Code:
clear
input str7 id double year float experience
"0100003" 2001 26
"0100003" 2002  .
"0100003" 2003  .
"0100003" 2004  .
"0100003" 2005  .
"0100003" 2006  .
"0100003" 2007  .
"0100003" 2008  .
"0100003" 2009  .
"0100003" 2010  .
"0100003" 2011  .
"0100003" 2012  .
"0100003" 2013  .
"0100003" 2014  .
"0100003" 2015  .
"0100003" 2016  .
"0100003" 2017  .
"0100003" 2018  .
"0100003" 2019  .
"0100003" 2020  .
"0100005" 2001  0
"0100005" 2002  .
"0100005" 2003  .
"0100005" 2004  .
"0100005" 2005  .
"0100005" 2007  .
"0100005" 2008  .
"0100005" 2009  .
"0100005" 2010  .
"0100005" 2011  .
"0100005" 2012  .
"0100005" 2013  .
"0100005" 2014  .
"0100005" 2015  .
"0100005" 2016  .
"0100005" 2017  .
"0100005" 2018  .
"0100005" 2019  .
"0100005" 2020  .
"0100005" 2021  .
"0100006" 2002  0
"0100006" 2003  .
"0100006" 2004  .
"0100006" 2005  .
"0100006" 2008  .
"0100006" 2009  .
"0100006" 2010  .
"0100006" 2011  .
"0100006" 2012  .
"0100006" 2013  .
"0100006" 2014  .
"0100006" 2015  .
"0100006" 2016  .
"0100006" 2017  .
"0100006" 2018  .
"0100006" 2019  .
"0100006" 2020  .
"0100006" 2021  .
"0100010" 2001 14
"0100010" 2002  .
"0100010" 2003  .
"0100010" 2004  .
"0100010" 2005  .
"0100010" 2006  .
"0100010" 2007  .
"0100010" 2008  .
"0100010" 2009  .
"0100010" 2010  .
"0100010" 2011  .
"0100010" 2012  .
"0100010" 2013  .
"0100010" 2014  .
"0100010" 2015  .
"0100010" 2016  .
"0100010" 2017  .
"0100010" 2018  .
"0100010" 2019  .
"0100010" 2020  .
"0100010" 2021  .
"0100011" 2001  9

end
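
For what it's worth, a minimal sketch, assuming the first observed year of each id always carries the starting value; working from the year gap rather than the row number keeps skipped years (e.g., 2006 for id 0100005) counted correctly:

Code:
bysort id (year): replace experience = experience[1] + (year - year[1]) ///
    if missing(experience)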

Example of GMM moment-evaluator program with many equations

Hi, Statalist,

Does anybody know of a simple example of how to implement the moment-evaluator program version of GMM with more than one equation? In the help files, I could only find examples with the 'nequations(1)' option.

Thanks in advance,

Vinicius
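
For what it's worth, a minimal sketch of a two-equation moment evaluator on simulated data; the parameter ordering in the at() vector is my own convention, not anything gmm imposes:

Code:
* simulate a simple two-equation linear system
clear
set seed 2023
set obs 500
gen double x  = rnormal()
gen double y1 =  1 + 2*x   + rnormal()
gen double y2 = -1 + 0.5*x + rnormal()

capture program drop gmm_two
program define gmm_two
    version 14
    syntax varlist if, at(name)
    tokenize `varlist'            // `1' and `2' are the residual variables
    * at = (b1, a1, b2, a2): slope and constant of each equation
    quietly replace `1' = y1 - `at'[1,1]*x - `at'[1,2] `if'
    quietly replace `2' = y2 - `at'[1,3]*x - `at'[1,4] `if'
end

gmm gmm_two, nequations(2) nparameters(4)      ///
    instruments(1: x) instruments(2: x)        ///
    winitial(unadjusted, independent)

The key differences from the one-equation case: varlist arrives with one residual variable per equation (hence the tokenize), and instruments are tagged by equation number.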

testing the validity of the instrument in xtivreg

Hi,

I am using xtivreg, as in: xtivreg y (x=z)


The issue is: how can I tell that the instrument z is a valid instrument? I don't see an indicator among the generated results that tests the instrument.
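
For what it's worth: with one endogenous regressor and a single instrument the model is just-identified, so no overidentification test exists; validity is then an argument, not a statistic. Instrument strength can still be read off the first stage, and with surplus instruments the community-contributed xtoverid (ssc install xtoverid) reports an overidentification test. A sketch:

Code:
xtivreg y (x = z1 z2), fe first   // -first- displays the first-stage regression
xtoverid                          // overidentification test (SSC)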

how to know whether a placebo test is passed

Hello!

I'm conducting a placebo test on a staggered DID. I randomly assign treatment firms and policy years to my sample firms, repeat that 500 times, and obtain the graph below. I have read some related literature; however, it seems vague about the threshold for passing/not passing a placebo test. I wonder whether the graph below could count as passing the placebo test, and why or why not. Thanks for any advice!

[placebo-test graph attached as an image]

Monday, December 26, 2022

Something wrong with -colorpalette-

Dear Stata users,

I have installed the -colorpalette- written by Ben Jann, and I remember that it ran successfully before. I ran it today, but it fails. I reinstalled -colorpalette- and -colrspace-, but the problem still exists. Below is the information:
Code:
. version
version 16.0

. colorpalette
function pexists() not declared in class ColrSpace
(176 lines skipped)
(error occurred while loading colorpalette.ado)
r(3000);
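
For what it's worth, this error usually indicates that colorpalette and colrspace are out of step, or that an old copy of the ColrSpace class is still cached in the current session; a sketch of the usual remedy (colorpalette ships in the palettes package on SSC):

Code:
ssc install palettes, replace
ssc install colrspace, replace
mata: mata clear      // drop cached Mata objects; if that fails, restart Stata
colorpalette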

Averaging values of variables to reduce survey observations

Hi,

I am looking for a solution to average the survey responses.
Below, S007 is the survey respondent ID, loc is the country name, and fyear is the year. The variables B010 through B031 are the answers by each survey respondent; they are stored as strings, but there is a number behind each, varying from -4 to 4.

I need the responses consolidated across all respondents, like an average response for every loc and fyear.
Could someone help me achieve this, please?

Much appreciated and many thanks in advance.

[data excerpt attached as an image]
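
For what it's worth, a minimal sketch, assuming the answers convert cleanly to numbers; if B010-B031 are instead numeric with value labels, skip the destring line:

Code:
destring B010-B031, replace force   // force drops any stray non-numeric text
collapse (mean) B010-B031, by(loc fyear)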

Stata for loop error

Hi,

I am trying to run the loop below in Stata, but I get an error.

Can someone help me? I am a beginner with Stata.

The variables fyear, B031, and B031_av are already defined and available in the dataset.


forvalues fyear = 2005/2022 {
    local count = 0
    foreach l of local levels {
        count = count + 1
        local B031_sum = sum(B031) if loc == 'l'
        B031_av = B031_sum/count
    }
}

The error returned is: =exp not allowed

Many thanks in advance.
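
For what it's worth, the per-group averages the loop reaches for need no loop at all; a minimal sketch, assuming B031 is numeric:

Code:
bysort loc fyear: egen B031_av = mean(B031)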

Interpreting significant interaction effect while main effect is insignificant

Hi!

I am investigating the effect of balanced ambidexterity and combined ambidexterity on firm performance, with environmental dynamism and environmental munificence as moderating variables.
While my main relation is insignificant, the interaction effect of my moderating variable is significant.
How do I need to interpret this, and can I claim support for the proposed hypothesis of a positive moderating effect if the main effect is insignificant?

Note: for the moderating variables, variable * BD or variable * CD stands for interaction term with m

Code:
 
**Dependent: Sales Growth 3 year Model 0 Model 1 Model 2 Model 3 Model 4 Model 5 Model 6 Model 7 Model 8
Independent variable
Balanced Ambidexterity -.348 -.358 0.044 .049
Combined Ambidexterity .019 .024 .016 .014
Moderator variables
Environmental Dynamism Environmental Dynamism * balanced ambidexterity Environmental Dynamism * combined ambidexterity -0.862 -.195 -1.090 .053 -1.244 .065 -1.223 0.032
Environmental Munificence Environmental Munificence * Balanced Ambidexterity Environmental Munificence * Combined ambidexterity 30.155** 3.951** 34.137** 1.135** 30.145** 4.054** 34.193** 1.179**
Control variables
Exploration -.287 .052 -.469 -.063 -.483 -.383 -.512 -.395 -.521
Exploitation .036 .388 .023 .450 -.024 -.042 .014 -.551 .013
Industry type -.424** -.424** -.420** -.398** -0.394** -.212* -.229** -.186** -.202**
Firm size -.342** -.342** -.340** -.339** -.338** -.296** -.302** -.295** -.300**
Firm age -.035** -.035** -.036** -.035** -.035** -.315** -.035** -.032** -.032**
Year dummies Yes Yes Yes Yes Yes Yes Yes Yes Yes
Observations 2412 2412 2412 2412 2412 2412 2412 2412 4212
Number of firms 402 402 402 402 402 402 402 402 402
Wald Chi-square 38.21 32.83 32.04 32.89 32.90 89.25 130.12 88.71 128.16
Log-likelihood
P > Chi-square 0.000** 0.000** 0.000** 0.000** 0.000** 0.000** 0.000** 0.000** 0.000**
**. is significant at the 0.05 level. *. is significant at the 0.10 level. Year dummies are not reported.
    

Thanks in advance!

Eline

Dealing with missing data after using business date for daily stock return

Dear distinguished Prof,
I have daily data on the stock prices of 15 companies. I want to calculate the daily stock return for each company in line with a business calendar that removes all holidays. In addition, I would also love to omit any missing values that could be generated in the process of calculating the daily returns, because I want to plot the return variable.
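
For what it's worth, a minimal sketch with a business calendar built from the trading dates actually observed, using hypothetical variable names (date, price, company):

Code:
bcal create trading, from(date) replace   // calendar of observed trading days
gen bdate = bofd("trading", date)
format bdate %tbtrading
encode company, gen(firmid)
xtset firmid bdate
gen ret = 100*(price/L.price - 1)         // L. now skips holidays, not rows
drop if missing(ret)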

Year and Industry dummies with GEE

Hello everyone,

I have an urgent problem. I am investigating the implications of CEOs' political ideology for corporate strategic change using panel data. I use a GEE to do so, which is common practice for this type of analysis and is also approved by my supervisor. The problem I have is that as soon as I include year and industry dummies in my GEE regression, no Wald chi2 is calculated. I use the following code:

Code:
xtset co_per_rol fyear
xtgee leading_strategic_change ceo_conservatism industry_control pre_entry ceo_tenure firm_age performance firm_size employees ceo_duality ceo_age incentive_compensation democratic_president  liberal_state i.fyear i.sic_industry, vce(robust)
The results that I get are attached as a PDF file.

Could any of you please help me and tell me if either my model or the code I am using is wrong?

Thank you very much in advance,

Richard

Sunday, December 25, 2022

How to run a regression excluding certain country IDs

Dear all
I am trying to run a regression excluding certain countries that don’t require disclosure.
I have used this code:
Code:
reg xy if country's Id != 39, 34, 32, 37
but it does not work.
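For reference, Stata's -if- qualifier takes a single logical expression. A minimal sketch, assuming (hypothetically) that the identifier variable is called countryid and that x and y are the model variables:

Code:
* exclude the four listed country IDs from the estimation sample
regress y x if !inlist(countryid, 39, 34, 32, 37)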

thanks in advance for your help.

Manage dates with different digits

Dear Stata users,

I'm using Stata 17 MP and want to manipulate dates with different numbers of digits (see below).

In the first row, for instance, I have "9012002" (7 digits), meaning 09jan2002. In the third row, I have "29052000" (8 digits), which is 29may2000.

I tried to transform the dt_admissao variable in many ways, but unfortunately I could not obtain usable dates. Could you help me solve my problem?

* Example generated by -dataex-. For more info, type help dataex
clear
input str8 dt_admissao
"9012002"
"1022002"
"29052000"
"17122001"
"1022002"
"1032000"
"1082001"
"1092002"
"1081995"
"1032001"
end
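A minimal sketch: the values appear to be day-month-year digits whose leading zero was lost, so padding to eight digits and parsing with a "DMY" mask recovers them:

Code:
* restore the leading zero by padding to 8 digits, then parse as day-month-year
gen dt = date(string(real(dt_admissao), "%08.0f"), "DMY")
format dt %td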

Percentage problem regarding loss and networth

Hello all

I am using a definition to create a dummy variable.

If the accumulated loss of the year is more than 50% of the previous four years' average networth, then the dummy has a value of 1.

Loss is always negative, but networth can be both positive and negative.

When networth is positive, the definition is clear; but when networth is also negative, how should I interpret the percentage of accumulated loss relative to the previous four years' average networth?
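For the unambiguous case, a minimal sketch with hypothetical variable names (accloss for the accumulated loss, stored as a negative number, and avg_nw4 for the previous four years' average networth); rows with negative networth are left missing until a rule is chosen:

Code:
* dummy = 1 when the accumulated loss exceeds 50% of average networth;
* only defined where average networth is positive
gen byte d_loss = abs(accloss) > 0.5*avg_nw4 if avg_nw4 > 0 & !missing(accloss, avg_nw4)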

Thanks and Regards

Merge Datasets with multiple observations for one year

Hello everybody,

I am currently trying to merge two datasets. One dataset (Platforms) contains the annual traffic of a company, and the other (test1) contains the executives. In the Platforms dataset I have different years and companies, with one traffic figure per company-year. In the test1 dataset I have multiple executives for each company-year. I want to use the Platforms dataset as my base and merge test1 onto it, so that for each year I get multiple executives but the same traffic figure in each row for that year. I need to do it this way because I am going to merge a different executives dataset for each year.

My idea for the code was the following:
The variable name for year is "year" and for company is "dp"

Code:
use "Platform"
merge 1:m dp year using "test1.dta"
Here you can find an example of one company and one year for each of the two datasets.

Platforms:
Code:
year    dp    totalplatact
2013    airbnb    5
2014    airbnb    66
2015    airbnb    54
2016    airbnb    34
2017    airbnb    21
2018    airbnb    46
2019    airbnb    12
2020    airbnb    8
test1
Code:
person_name    dp    year
Erin Coffman    airbnb    2020
Harrison Shoff    airbnb    2020
Brian Chesky    airbnb    2020
Scott Raymond    airbnb    2020
Damir Duskan    airbnb    2020
Justin K Chen    airbnb    2020
Matías Vives    airbnb    2020
Jackson Wang    airbnb    2020
David Leach    airbnb    2020
Chip Conley    airbnb    2020
Justin K Chen    airbnb    2020
Peter Coles    airbnb    2020
Josh Horowitz    airbnb    2020
Christopher Lukezic    airbnb    2020
Peter Coles    airbnb    2020
Justin K Chen    airbnb    2020
Joe Gebbia    airbnb    2020
Dzmitry Bazhko    airbnb    2020
If I use my code, I get just one executive for the year 2020 in the merged data, not all of the executives.

Can anyone help me with the code?
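As a cross-check, a minimal sketch that builds the same link from the other direction, assuming the two files are exactly as shown above:

Code:
* load the executives (many rows per dp-year) and attach the single traffic figure
use "test1.dta", clear
merge m:1 dp year using "Platform.dta"
If only one executive per year survives your own merge, it may be worth running -duplicates report dp year- in each file first to confirm which one is unique by dp and year.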

Thanks in advance and best regards Jana

Saturday, December 24, 2022

Command like -fbar- that simultaneously shows frequencies or percents of categorical variables

Dear Stata users,
The bar plot is frequently used in daily practice. Suppose that I have several categorical variables in my dataset and I want to show the frequencies or percents of these variables using a bar plot. -graph bar- fails in this case. The user-written command -fbar- (by Nick Cox, 19jun2001) can do some of the work, i.e. it can show the frequencies of one categorical variable at a time. Is there a command that can show the frequencies or percents of several categorical variables simultaneously? Thank you.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(q1 q2 q3 q4 q5)
 3 3 2 2 2
 1 1 1 1 1
 2 3 2 1 1
 1 1 1 1 1
 2 2 2 2 2
 2 2 1 1 1
.a 2 2 2 2
 1 1 1 1 1
 2 2 2 2 1
 2 2 2 2 2
 1 1 1 1 1
 2 2 1 1 1
 2 2 2 2 2
 1 1 1 1 1
 1 1 1 1 1
 4 4 2 3 2
 2 2 2 2 2
 1 1 1 1 1
 2 2 2 2 2
 1 1 1 1 1
 2 2 2 2 2
 2 2 1 1 1
 2 2 2 1 2
 1 1 1 1 1
 3 2 1 1 1
 2 2 2 2 2
 1 1 1 1 1
 2 3 1 2 2
 2 2 2 2 1
 2 2 2 2 2
end
label values q1 q
label values q2 q
label values q3 q
label values q4 q
label values q5 q
label def q 1 "strongly agree", modify
label def q 2 "slightly agree", modify
label def q 3 "slightly disagree", modify
label def q 4 "strongly disagree", modify
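One possible route, a minimal sketch using only official commands: stack the five items into long form, then count within item and answer (names as in the example above):

Code:
* stack q1-q5 into one variable, then plot frequencies by item and answer
gen id = _n
reshape long q, i(id) j(item)
label values q q
graph bar (count), over(q) over(item) asyvars
* use (percent) instead of (count) for percentages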

Comparing two groups of variables with each other

Hi all, first of all, I would like to wish you happy holidays.

I have a question about comparing two groups with each other. Just to outline the situation: I have two variables, 'version' and 'risktolerance'. The variable version can take two values, 0 or 1. The variable risktolerance can take four values (0, 1, 2, or 3).

These variables come from a questionnaire where there was a default option for risk tolerance, namely value '2'.

Now I would like to test whether the value '2' is chosen more often when the variable 'version' equals 0.

If I use 'tabulate risktolerance', I nicely get all the absolute frequencies of this variable. That is the kind of output I want to obtain, but split by version 0 and version 1.

It seems simple and obvious, but I just can't find it... can someone help me out?
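A minimal sketch, assuming the variables are named version and risktolerance as described:

Code:
* frequencies of risktolerance split by version, with column percentages
tabulate risktolerance version, column
* chi-squared test of whether the distribution differs across versions
tabulate risktolerance version, chi2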

Huge thanks in advance!

Omitted variable (log) in a panel data with FE in Stata 16.1

Dear Statalisters,

My analysis is about finding to what extent government expenditures determine income inequality in an unbalanced panel data of European countries (1995-2020) while using time and country FE.
My dependent variable is the Gini index, and as a robustness check I run the same regressions with the population's top 10% income share as the dependent variable. I focus on government expenditures as the independent variable (total, education, health, and defence), and I add a set of covariates that is standard in the literature.

One of these covariates is GDPpc, together with its squared value, to test the Kuznets hypothesis. In the literature, some authors take the log of both values. To my understanding, that is mainly helpful because it allows for easier interpretation, and also because GDPpc is in € while most of my other variables are in % of GDP.

My issue is that running the FE regression leads to log(GDPpc)^2 being omitted. I checked the correlation matrix, which indicated that log(GDPpc) and log(GDPpc^2) are indeed perfectly correlated, and I don't understand why. This issue does not arise when running the regressions with GDPpc and GDPpc^2 (no logs).

To construct both variables, I started off with GDPpc as raw data.
I then used the following commands to generate GDPpc^2, and then log(GDPpc) and log(GDPpc^2):
Code:
gen GDPpc2 = GDPpc^2
Code:
gen logGDPpc = log(GDPpc)
Code:
gen logGDPpc2 = log(GDPpc2)
I also tried generating the squared value with:
Code:
gen GDPpc2bis = c.GDPpc#c.GDPpc
but the outcome was the same.
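The omission follows from a log identity rather than from the data: log(GDPpc^2) = 2*log(GDPpc), so logGDPpc2 is an exact linear function of logGDPpc and one of the two must be dropped. The Kuznets-style specification normally squares the log instead; a minimal sketch:

Code:
* log(x^2) = 2*log(x): perfectly collinear with log(x) by construction
* the square of the log is a genuinely distinct regressor
gen logGDPpc_sq = logGDPpc^2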

As an indication: I run the following command for my regression, which is when I get the omitted message. I run the same one with top10sharePost as the dependent variable as a robustness check.
Code:
xtreg gini_std_WIID lag_gini_std_WIID total_GE health_GE educ_GE defence_GE social_prot_GE logGDPpc logGDPpc2 hc Unempl TaxP TaxIn Trade CPI i.Year, fe robust
Here is a snippet of my data (I excluded some covariates, as the full dataset exceeded the linesize limit).
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float code str125 Country double Year float(gini_std_WIID lag_gini_std_WIID) double top10sharePost float lag_top10sharePost double(total_GE health_GE educ_GE defence_GE social_prot_GE GDPpc) float(GDPpc2 logGDPpc logGDPpc2 GDPpc2bis logGDPpc2bis)
1 "Austria"  1995 .3146     . .2654     . 55.8 6.7 5.5  .9 21.9 33790.4849857504 1141796864 10.427935  20.85587 1141796864  20.85587
1 "Austria"  1996     . .3146 .2644 .2654 55.6 6.7 5.5  .9 21.8 34537.7395746518 1192855424 10.449808 20.899616 1192855424 20.899616
1 "Austria"  1997     .     . .2606 .2644 52.3   7 5.4  .9 21.2 35220.8877426259 1240510976 10.469395  20.93879 1240510976  20.93879
1 "Austria"  1998 .3027     . .2698 .2606 52.2 7.1 5.3  .9 20.9 36442.2882333384 1328040320 10.503485  21.00697 1328040320  21.00697
1 "Austria"  1999     . .3027 .2622 .2698 52.1 7.3 5.4  .9 21.1 37664.9432792116 1418647936 10.536485  21.07297 1418647936  21.07297
1 "Austria"  2000     .     . .2737 .2622   51 7.1 5.2  .9 20.8 38842.8905195831 1508770176  10.56728  21.13456 1508770176  21.13456
1 "Austria"  2001 .2896     . .2607 .2737 51.4   7 5.2  .8 20.6 39184.8085972636 1535449216 10.576044  21.15209 1535449216  21.15209
1 "Austria"  2002     . .2896 .2749 .2607 51.1 7.3 5.2  .8 20.9 39636.4826114521 1571050752 10.587505  21.17501 1571050752  21.17501
1 "Austria"  2003     .     . .2917 .2749 51.3 7.4 5.3  .8   21 39815.2219533064 1585251840 10.592005  21.18401 1585251840  21.18401
1 "Austria"  2004  .296     . .2729 .2917 53.7 7.5 4.9  .8 20.6 40651.2266155811 1652522240 10.612784  21.22557 1652522240  21.22557
1 "Austria"  2005 .2991  .296 .2877 .2729 51.2 7.4 4.8  .8 20.2 41281.2707559522 1704143360 10.628164  21.25633 1704143360  21.25633
1 "Austria"  2006 .2876 .2991  .308 .2877 50.4 7.3 4.8  .7 19.9 42496.3510516831 1805939840 10.657173 21.314346 1805939840 21.314346
1 "Austria"  2007 .2968 .2876 .2877  .308 49.2 7.4 4.7  .8 19.4 43937.7128905744 1930522624 10.690528 21.381056 1930522624 21.381056
1 "Austria"  2008  .306 .2968 .2889 .2877 49.9 7.5 4.8  .9 19.6  44440.055889928 1974918528 10.701897 21.403793 1974918528 21.403793
1 "Austria"  2009 .3041  .306 .2716 .2889 54.1 7.8 5.1  .7 21.2 42655.1921304813 1819465472 10.660904  21.32181 1819465472  21.32181
1 "Austria"  2010 .3145 .3041 .2784 .2716 52.8 7.9 5.1  .6 21.4 43334.5089644691 1877879680 10.676704  21.35341 1877879680  21.35341
1 "Austria"  2011 .3038 .3145 .2722 .2784 50.9 7.7   5  .6 20.7 44451.0001918473 1975891456 10.702143 21.404285 1975891456 21.404285
1 "Austria"  2012 .3097 .3038  .254 .2722 51.2 7.7   5  .6 20.9  44549.881698231 1984691968 10.704365  21.40873 1984691968  21.40873
1 "Austria"  2013 .3063 .3097 .2545  .254 51.6 7.8   5  .6 21.3 44299.3781845446 1962434944 10.698726  21.39745 1962434944  21.39745
1 "Austria"  2014  .311 .3063   .27 .2545 52.4 7.9 4.9  .6 21.5 44245.1687398338 1957634944   10.6975    21.395 1957634944    21.395
1 "Austria"  2015 .3069  .311  .256   .27 51.1 8.2 4.9  .6 21.2 44195.8175947748 1953270272 10.696385  21.39277 1953270272  21.39277
1 "Austria"  2016 .3078 .3069 .2718  .256 50.1 8.2 4.9  .6   21 44590.2516278164 1988290560  10.70527  21.41054 1988290560  21.41054
1 "Austria"  2017 .3102 .3078 .2641 .2718 49.3 8.2 4.8  .6 20.6  45281.723399938 2050434432 10.720658 21.441317 2050434432 21.441317
1 "Austria"  2018 .3001 .3102 .2704 .2641 48.8 8.2 4.8  .6 20.2 46188.9665119872 2133420672 10.740497 21.480993 2133420672 21.480993
1 "Austria"  2019 .3101 .3001 .2573 .2704 48.6 8.3 4.8  .6 20.2 46669.7512148406 2178065664 10.750852 21.501703 2178065664 21.501703
1 "Austria"  2020  .304 .3101 .2795 .2573 56.7 9.1 5.1  .6 22.8 43346.4318485062 1878913152  10.67698  21.35396 1878913152  21.35396
2 "Belgium"  1995     .     . .2542     . 52.6   6 5.8 1.5 17.7  31329.892072751  981562112 10.352328 20.704657  981562112 20.704657
2 "Belgium"  1996 .2841     . .2549 .2542 53.1 6.4 5.9 1.4 17.9 31681.9363634502 1003745088 10.363502 20.727003 1003745088 20.727003
2 "Belgium"  1997 .2689 .2841 .2586 .2549 51.6   6 5.8 1.4 17.7 32804.3605402361 1076126080 10.398316  20.79663 1076126080  20.79663
2 "Belgium"  1998     . .2689 .2635 .2586   51 6.1 5.9 1.3 17.3 33376.5748398965 1113995776  10.41561  20.83122 1113995776  20.83122
2 "Belgium"  1999     .     . .2593 .2635 50.5 6.3 5.8 1.3 16.9 34479.9064484695 1188864000 10.448132 20.896263 1188864000 20.896263
2 "Belgium"  2000     .     . .2609 .2593 49.4 6.3 5.5 1.2 16.4 35674.7912530713 1272690688   10.4822   20.9644 1272690688   20.9644
2 "Belgium"  2001 .3315     . .2557 .2609 49.4 6.3 5.7 1.2 16.7 35943.2379602838 1291916416 10.489697 20.979393 1291916416 20.979393
2 "Belgium"  2002     . .3315 .2593 .2557 49.9 6.3 5.8 1.2 17.1 36393.2417665704 1324468096 10.502138 21.004276 1324468096 21.004276
2 "Belgium"  2003     .     . .2514 .2593   51 6.9 5.9 1.1 17.4 36617.3803840227 1340832512 10.508278 21.016556 1340832512 21.016556
2 "Belgium"  2004 .2834     . .2624 .2514 49.3 6.9 5.6 1.1 17.3 37761.2813189981 1425914368  10.53904  21.07808 1425914368  21.07808
2 "Belgium"  2005  .309 .2834 .2623 .2624 51.9 6.8 5.7 1.1 17.1 38426.0519083731 1476561408  10.55649  21.11298 1476561408  21.11298
2 "Belgium"  2006 .2938  .309 .2674 .2623 48.8 6.7 5.6   1 17.1 39147.7635597291 1532547328   10.5751   21.1502 1532547328   21.1502
2 "Belgium"  2007  .291 .2938 .2775 .2674 48.6 6.8 5.5   1 16.8 40290.2277563956 1623302400 10.603865  21.20773 1623302400  21.20773
2 "Belgium"  2008 .2953  .291 .2562 .2775 50.8 7.3 5.7 1.1 17.5 40151.8498740394 1612171008 10.600424  21.20085 1612171008  21.20085
2 "Belgium"  2009 .2844 .2953 .2628 .2562 54.7 7.8 6.1   1 19.2 39025.2207908291 1522967808 10.571963 21.143927 1522967808 21.143927
2 "Belgium"  2010 .2843 .2844 .2518 .2628 53.9 7.8   6   1 18.9 39777.9252767366 1582283392 10.591067 21.182135 1582283392 21.182135
2 "Belgium"  2011 .2849 .2843 .2524 .2518 55.3 7.7 6.2  .9 19.1 39929.0951439359 1594332672  10.59486  21.18972 1594332672  21.18972
2 "Belgium"  2012 .2833 .2849 .2517 .2524 56.5 7.9 6.3  .9 19.6 39975.5736402458 1598046464 10.596024 21.192047 1598046464 21.192047
2 "Belgium"  2013 .2793 .2833   .25 .2517 56.1   8 6.4  .9 20.1 39970.3174974794 1597626240 10.595892 21.191784 1597626240 21.191784
2 "Belgium"  2014 .2787 .2793  .252   .25 55.6   8 6.3  .9 19.8 40421.4207919096 1633891200 10.607115  21.21423 1633891200  21.21423
2 "Belgium"  2015 .2793 .2787 .2566  .252 53.7 7.8 6.3  .8 19.4  41008.296719472 1681680384  10.62153  21.24306 1681680384  21.24306
2 "Belgium"  2016 .2783 .2793 .2563 .2566 53.1 7.6 6.2  .8 19.4 41318.0196388476 1707178752 10.629054  21.25811 1707178752  21.25811
2 "Belgium"  2017 .2772 .2783 .2513 .2563   52 7.6 6.2  .8 19.3 41825.7628316432 1749394432 10.641268 21.282536 1749394432 21.282536
2 "Belgium"  2018 .2731 .2772  .247 .2513 52.2 7.6 6.2  .8 19.3  42403.533592766 1798059648 10.654987 21.309975 1798059648 21.309975
2 "Belgium"  2019 .2714 .2731 .2507  .247 51.8 7.5 6.1  .8 19.3 43065.5150654148 1854638592 10.670478 21.340956 1854638592 21.340956
2 "Belgium"  2020 .2715 .2714 .2489 .2507 59.2 8.8 6.6  .9 22.7 40424.6388411871 1634151424 10.607195  21.21439 1634151424  21.21439
3 "Bulgaria" 1995  .394     . .3513     . 31.7 1.9 3.6 1.3  8.5 4021.72281730795   16174254  8.299465  16.59893   16174254  16.59893
3 "Bulgaria" 1996  .367  .394 .3298 .3513 36.8 1.8 3.3 1.1  8.9 4252.94649834109   18087554  8.355368 16.710735   18087554 16.710735
3 "Bulgaria" 1997  .376  .367 .3201 .3298 31.8 2.1 3.3 1.9  8.6 3674.93181669676   13505124   8.20929  16.41858   13505124  16.41858
3 "Bulgaria" 1998  .355  .376 .3042 .3201 33.9 2.5 3.3 2.9  9.6 3839.71670227527   14743424  8.253154 16.506308   14743424 16.506308
3 "Bulgaria" 1999  .336  .355 .2959 .3042 41.4   5 4.1 2.9   12 3537.10582207864   12511118  8.171064 16.342129   12511118 16.342129
3 "Bulgaria" 2000  .342  .336 .2917 .2959   43 3.8 4.2 3.2 12.7 3717.67704162788   13821123  8.220855  16.44171   13821123  16.44171
3 "Bulgaria" 2001  .343  .342 .3071 .2917 40.6   4 3.6 2.6 12.6 3937.43486493199   15503393  8.278285  16.55657   15503393  16.55657
3 "Bulgaria" 2002   .38  .343 .3309 .3071 39.1 4.8 3.7 2.6 12.7 4260.11613781855   18148590  8.357052 16.714104   18148590 16.714104
3 "Bulgaria" 2003  .361   .38 .3116 .3309 38.6 5.5 4.2 2.5 12.2 4518.87824967162   20420260  8.416019 16.832039   20420260 16.832039
3 "Bulgaria" 2004  .368  .361 .3131 .3116 37.8 5.8   4 1.7   11 4849.54289640561   23518066   8.48664  16.97328   23518066  16.97328
3 "Bulgaria" 2005  .348  .368 .3182 .3131 36.7 4.9 4.1   2 10.6 5230.98370373178   27363190  8.562355  17.12471   27363190  17.12471
3 "Bulgaria" 2006   .32  .348 .3173 .3182 33.7 3.8 3.6 1.6 10.8 5629.41986656582   31690368  8.635761 17.271523   31690368 17.271523
3 "Bulgaria" 2007  .361   .32 .2975 .3173 37.7   4 3.6 1.4 10.1 6044.75941177202   36539116  8.706947 17.413895   36539116 17.413895
3 "Bulgaria" 2008 .3615  .361  .294 .2975 37.1 4.4   4 1.2 10.7 6459.64091191502   41726960  8.773329 17.546658   41726960 17.546658
3 "Bulgaria" 2009 .3361 .3615 .3139  .294 39.3   4 4.1 1.2 12.9 6288.67740963612   39547464  8.746506 17.493011   39547464 17.493011
3 "Bulgaria" 2010 .3384 .3361 .3158 .3139 36.1 4.4 3.6 1.7 12.9 6427.81006212601   41316744  8.768389 17.536777   41316744 17.536777
3 "Bulgaria" 2011 .3565 .3384   .32 .3158 33.7 4.1 3.4 1.2 12.2 6605.08982045823   43627212  8.795596 17.591192   43627212 17.591192
3 "Bulgaria" 2012 .3432 .3565 .3409   .32 34.3 4.4 3.3   1 12.3 6693.61236429071   44804448  8.808909 17.617819   44804448 17.617819
3 "Bulgaria" 2013 .3601 .3432 .3238 .3409 37.8 4.5 3.7 1.2 13.5 6693.45015260161   44802276  8.808885  17.61777   44802276  17.61777
3 "Bulgaria" 2014 .3657 .3601 .3398 .3238 43.2 5.5 4.1 1.3 13.3 6796.68911922266   46194984  8.824191 17.648382   46194984 17.648382
3 "Bulgaria" 2015 .3742 .3657 .3458 .3398 40.4 5.3 3.9 1.3   13 7074.68102325059   50051112  8.864278 17.728556   50051112 17.728556
3 "Bulgaria" 2016 .3857 .3742 .3827 .3458 34.8   5 3.4 1.1 12.3 7341.04761379451   53890980  8.901237 17.802473   53890980 17.802473
3 "Bulgaria" 2017 .4059 .3857 .3945 .3827 34.8 4.8 3.5 1.1 12.3 7599.12495683519   57746700  8.935788 17.871576   57746700 17.871576
3 "Bulgaria" 2018 .4038 .4059 .3999 .3945 36.9   5 3.5 1.2   12 7859.67803704994   61774540  8.969501 17.939001   61774540 17.939001
3 "Bulgaria" 2019 .4128 .4038 .3759 .3999 35.5 4.6 3.8 1.2 11.4 8234.78133356177   67811624  9.016122 18.032244   67811624 18.032244
3 "Bulgaria" 2020 .4027 .4128 .3765 .3759 41.8 5.9   4 1.5 13.1 7920.91133027298   62740836  8.977262 17.954523   62740836 17.954523
4 "Croatia"  1995     .     . .2682     .   52 7.8 6.3 2.6 13.1 7263.57088408122   52759460  8.890627 17.781254   52759460 17.781254
4 "Croatia"  1996     .     . .2776 .2682 50.1 7.2 5.9 2.3 13.2 7803.92196833084   60901200  8.962381 17.924763   60901200 17.924763
4 "Croatia"  1997     .     . .2806 .2776 48.8 6.9 5.7 2.3 12.7  8322.9639118574   69271728  9.026773 18.053547   69271728 18.053547
4 "Croatia"  1998 .2875     . .2762 .2806 50.8 7.2 6.2 2.2 14.4  8511.7818579482   72450432  9.049207 18.098413   72450432 18.098413
4 "Croatia"  1999 .2771 .2875 .2911 .2762 52.7 7.4 6.4 2.3 14.9 8473.71054411562   71803768 9.0447235 18.089447   71803768 18.089447
4 "Croatia"  2000 .3133 .2771 .2781 .2911 49.3 6.7 5.7   2 13.4 8805.50039515219   77536840  9.083132 18.166264   77536840 18.166264
4 "Croatia"  2001  .311 .3133 .2756 .2781 49.4 6.2 5.5 1.8 14.8  9428.4107203318   88894928  9.151483 18.302965   88894928 18.302965
4 "Croatia"  2002     .  .311 .2631 .2756 50.3 6.2   6 1.6 17.3 9962.04486657184   99242336  9.206537 18.413074   99242336 18.413074
4 "Croatia"  2003   .29     .   .26 .2631 49.6 6.1 5.7 1.4 14.8  10509.549672237  110450632  9.260039 18.520079  110450632 18.520079
4 "Croatia"  2004 .2968   .29 .2664   .26 48.6 6.4 5.4 1.6 14.8  10942.572413249  119739888  9.300416 18.600832  119739888 18.600832
4 "Croatia"  2005     . .2968  .279 .2664 46.5 6.2 5.1 1.4 13.7 11399.7753370191  129954880  9.341349 18.682697  129954880 18.682697
4 "Croatia"  2006     .     . .2784  .279 46.2 5.9 4.8 1.3 14.1 11959.9232084277  143039760  9.389317 18.778633  143039760 18.778633
4 "Croatia"  2007     .     . .2828 .2784 45.8 5.9 4.6 1.4 13.5 12550.2339482454  157508368  9.437494 18.874989  157508368 18.874989
4 "Croatia"  2008 .3371     . .2781 .2828 45.9 5.6 4.3 1.6 13.5 12789.6357975779  163574784   9.45639  18.91278  163574784  18.91278
4 "Croatia"  2009     . .3371 .2675 .2781 48.9 6.2 4.5 1.5   15 11870.7940008389  140915744  9.381836 18.763672  140915744 18.763672
4 "Croatia"  2010 .3256     . .2704 .2675 48.5 6.2 4.6 1.5   15 11748.9448312757  138037712  9.371519 18.743038  138037712 18.743038
4 "Croatia"  2011  .324 .3256 .2797 .2704 48.9 6.2 4.6 1.5 15.3 11779.5299874137  138757328  9.374119 18.748238  138757328 18.748238
4 "Croatia"  2012 .3231  .324 .2715 .2797 48.4 6.4 4.7 1.5   15 11546.7017379524  133326320  9.354156 18.708311  133326320 18.708311
4 "Croatia"  2013  .325 .3231 .2798 .2715 48.3 6.3 4.8 1.3 14.9 11536.8601564677  133099144  9.353302 18.706604  133099144 18.706604
4 "Croatia"  2014 .3202  .325 .2842 .2798 48.7 6.4 4.9 1.2 15.3   11543.87627967  133261080   9.35391  18.70782  133261080  18.70782
4 "Croatia"  2015 .3212 .3202 .2869 .2842 48.2 6.6 4.9 1.3 15.4 11933.3773788281  142405488 9.3870945 18.774189  142405488 18.774189
4 "Croatia"  2016 .3115 .3212 .2715 .2869 46.9 6.4 4.7 1.1 14.3 12441.5025378073  154790992  9.428793 18.857586  154790992 18.857586
end



Would you have an idea for my problem?

Thanks a lot for your time!
Melanie