Monday, February 28, 2022

Graph of event study for multiple regressions using eventdd package

Using the following command and the data, I can produce an event-study graph for a single set of regressions, but I want to create the same graph for multiple regressions, so I can show how the treatment behaves differently for different groups (for example, the treatment will have a different impact on low-education and high-education people). With the code below there is one set of regressions for either group: low-education or high-education. I want to edit the code so that one graph shows how both groups behave before and after treatment. Right now my graph looks like the attached image.

But I want to edit my current code so that I can use the same regression for multiple groups and produce the graph I'm going to attach in the comment.

Code:
*** ssc install eventdd

use "http://www.damianclarke.net/stata/bacon_example.dta", clear
gen timeToTreat = year - _nfd
eventdd asmrs pcinc asmrh cases i.year i.stfips, timevar(timeToTreat) method(ols, cluster(stfips))

*Then storing the estimates

estimates store leads_lags

#delimit ;

coefplot leads_lags, keep(lead5 lead4 lead3 lead2 lead1 lag_0 lag1 lag2 lag3 lag4 lag5plus)
vertical title( "{stSerif:{bf:Figure 1. {it:Trends in On-Premises Alcohol Licenses}}}", color(black) size(large))
         xtitle("{stSerif:Years Since Law Came into Effect}") xscale(titlegap(2)) xline(6, lcolor(black))
yline(-.2 0 .2 .4 .6, lwidth(vvvthin) lpattern(dash) lcolor(black))
note("{stSerif:{it:Notes}. OLS coefficient estimates (and their 95% confidence intervals) are reported. The dependent}"
     "{stSerif:variable is equal to the number of on-premises liquor licences per 1,000 population in county {it:c}}"
     "{stSerif: and year {it:t}. The controls include year fixed effects and the data cover the period 1977-2011.}", margin(small))
graphregion(fcolor(white) lcolor(white) lwidth(vvvthin) ifcolor(white) ilcolor(white)  
ilwidth(vvvthin)) ciopts(lwidth(*3) lcolor(black)) mcolor(black) ;

#delimit cr

graph export "Anderson_Fig1_v1.png", as(png) replace
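One possible approach (a sketch, not a tested solution): run eventdd separately for each group, store each set of estimates, and let coefplot overlay the stored models in one graph. The group indicator `lowedu` below is hypothetical; replace it with your own education dummy.

```stata
* Sketch: one eventdd per group, then overlay both with coefplot.
* "lowedu" is a hypothetical 0/1 group indicator -- substitute your own.
use "http://www.damianclarke.net/stata/bacon_example.dta", clear
gen timeToTreat = year - _nfd

eventdd asmrs pcinc asmrh cases i.year i.stfips if lowedu==1, ///
    timevar(timeToTreat) method(ols, cluster(stfips))
estimates store low_edu

eventdd asmrs pcinc asmrh cases i.year i.stfips if lowedu==0, ///
    timevar(timeToTreat) method(ols, cluster(stfips))
estimates store high_edu

* coefplot accepts several stored models; offset() separates the CIs
coefplot (low_edu, label("Low education") offset(-0.1)) ///
         (high_edu, label("High education") offset(0.1)), ///
    keep(lead5 lead4 lead3 lead2 lead1 lag_0 lag1 lag2 lag3 lag4 lag5plus) ///
    vertical xline(6, lcolor(black)) yline(0, lpattern(dash))
```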

Correlation between variables

Hello Statalists,
Please, could anyone help with the correct command?
I have two datasets measuring the weight of a product using two different methods. I want to know the correlation between the two variables, called 'grams1' and 'grams2'.
The two datasets share the same id variable and the same time variable; only the grams variable differs between them. Samples of the two datasets and my code are below. Is this the correct way to assess the correlation between these two variables? I expect the correlation to be very high, indicating that the two weighing methods are similar and yield close weights in grams.
Thank you!

Grams1.dta

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(id grams1 time)
11 13 1
11 12 2
11 19 3
22 21 1
22 10 2
22 15 3
33 35 1
33 29 2
33 20 3
end

Grams2.dta

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(id grams2 time)
11   21 1
11   12 2
11   18 3
22 21.5 1
22   11 2
22 14.7 3
33   36 1
33   29 2
33 20.6 3
end

I then tried to do the following:
Code:
use grams1,clear
sort id time
merge 1:1 id time using grams2
I get the following merged dataset:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(id grams1 time grams2) byte _merge
11 13 1   21 3
11 12 2   12 3
11 19 3   18 3
22 21 1 21.5 3
22 10 2   11 3
22 15 3 14.7 3
33 35 1   36 3
33 29 2   29 3
33 20 3 20.6 3
end
label values _merge _merge
label def _merge 3 "matched (3)", modify
And finally I did:
Code:
pwcorr grams1 grams2, star(0.01)
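The merge-then-pwcorr approach measures how strongly the two series move together. If the goal is to show that the two methods *agree* (close in value, not just correlated), a common complement is Lin's concordance correlation coefficient, available from SSC as the community-contributed concord command. A sketch, assuming the two example files are saved as grams1.dta and grams2.dta:

```stata
* Sketch: correlation plus Lin's concordance for method agreement
ssc install concord                      // community-contributed; once only
use grams1, clear
merge 1:1 id time using grams2, assert(match) nogenerate
pwcorr grams1 grams2, star(0.01)         // Pearson correlation
concord grams1 grams2                    // Lin's concordance correlation
```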

Finding sum per row based on conditions

Members,

I have a dataset with multiple columns (var1, var2, var3, etc.) with missing observations. Per row, I want to find the sum of the values for var1, var2, var3, etc. that come after a missing value. For instance, if var1 is missing, I would only take the sum of var2, var3, etc. If var2 is missing, I would only take the sum of var3, var4, etc.

I would appreciate any assistance with this!

Thanks,
A
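One way to sketch this (assuming the variables are named var1, var2, var3, ...; the treatment of rows with no missing value at all is an assumption, flagged in a comment):

```stata
* Sketch: row-wise sum of the variables that come AFTER the first missing
clear
input float(var1 var2 var3)
. 2 3
1 . 4
1 2 .
end

gen sum_after = 0
gen byte seen_miss = 0
foreach v of varlist var1-var3 {
    * add `v' only if a missing value was already encountered in this row
    replace sum_after = sum_after + `v' if seen_miss & !missing(`v')
    replace seen_miss = 1 if missing(`v')
}
* assumption: rows with no missing value get a missing sum (adjust as needed)
replace sum_after = . if !seen_miss
drop seen_miss
list
```

In the example data this yields 5 (= 2 + 3) for the first row and 4 for the second; the third row, whose missing value comes last, sums nothing and gets 0.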

extract letters from words

Dear All, Suppose that I have this data set (with variables "Journal_e", "wanted1", and "wanted"):
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str63 Journal_e str15 wanted1 str20 wanted
"Shanghai Insurance Monthly"                              "SIM"     "ShInMo"        
"Journal of Shanghai University of Finance and Economics" "JoSUoFE" "JoofShUnofFiEc"
"World Agriculture"                                       "WA"      "WoAg"          
"The Journal of World Economy"                            "TJoWE"   "ThJoofWoEc"    
"Forum of World Economics & Politics"                     "FoWE&P"  "FoofWoEc&Po"   
end
I wish to extract the first letter and the first two letters of each word to construct `wanted1' and `wanted'. Any suggestions? Thanks.
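A sketch using word() and substr() in a loop. Note that in the wanted columns the word "and" appears to be skipped (while "of" and "&" are kept), so the sketch strips " and " first; `base`, `w1` and `w2` are names I chose:

```stata
* Sketch: build initialisms from the first 1-2 letters of each word.
* "and" seems to be excluded in the wanted results, so drop it first.
gen str63 base = subinstr(Journal_e, " and ", " ", .)
gen str63 w1 = ""
gen str63 w2 = ""
forvalues j = 1/15 {
    * word() returns "" once `j' exceeds the word count, so extra
    * iterations are harmless
    replace w1 = w1 + substr(word(base, `j'), 1, 1)
    replace w2 = w2 + substr(word(base, `j'), 1, 2)
}
list Journal_e w1 w2
```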

Heteroskedasticity - Breusch–Pagan test

Dear Stata community!


I am running the three versions of the Breusch–Pagan test most common in Stata (estat hettest; estat hettest, iid; estat hettest, fstat). While estat hettest suggests heteroskedasticity, the other two commands (which, to my understanding, are more generalised forms of the BP test) show no evidence of it. To be honest, I cannot understand why, or what to do about it. Which of the three tests should I go for? (I have a stock market event study and am analysing the cross-section of returns.)


Thank you for your explanations!
Gabriele

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity
Assumption: Normal error terms
Variable: Fitted values of CAR1

H0: Constant variance

chi2(1) = 9.71
Prob > chi2 = 0.0018

. estat hettest, iid

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity
Assumption: i.i.d. error terms
Variable: Fitted values of CAR1

H0: Constant variance

chi2(1) = 1.81
Prob > chi2 = 0.1781


using collect to place frequency distributions from two variables next to each other

Dear Stata Forum,

I would like to use the collect command to make a table that has two frequency distributions (from two different variables) next to each other.
Below I have started an example that would place the actual repair records of cars in the auto data set next to their expected repair records (a new variable I made) so the two distributions can be easily compared.

I have used the table command to show and collect the percentages I want in each column. However, I have been unable to place the two sets of percentages next to each other in one table. I think the "collect layout" command and the dimension called "across" might be useful here, but I can't seem to get it right.

Any suggestions would be appreciated.

Thanks,

Jeremy


Code:
sysuse auto, clear

*actual repair record
tab rep78

*expected repair record
gen rep78_exp = rep78+(round( runiform(0,1)))
    replace rep78_exp = 5 if rep78_exp==6
label var rep78_exp "expected repair record"
tab1 rep78 rep78_exp

*separate tables for actual and expected repair record
collect clear
table rep78, stat(percent)
table rep78_exp, stat(percent) append

*one table: col1 = expected repair record; col2=actual repair record
collect dims
collect label list across
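One workaround sketch that sidesteps collect entirely: stack the two variables into long form and let tabulate print the two column-percentage distributions side by side. This is not a collect solution, just an alternative; `id` and `kind` are names I introduce here:

```stata
* Workaround sketch: stack actual and expected records, then tabulate
sysuse auto, clear
gen rep78_exp = rep78 + round(runiform(0,1))
replace rep78_exp = 5 if rep78_exp==6
gen long id = _n
rename (rep78 rep78_exp) (rec1 rec2)
reshape long rec, i(id) j(kind)
label define kind 1 "actual" 2 "expected"
label values kind kind
tab rec kind, col nofreq    // column percents: actual beside expected
```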

Working with dates

Hello,
I am attempting to turn a string value "2020-2021" into a date value. Nothing I am doing actually works; it only returns 4.490e+21 as the output. I have attached the code below:

gen date2=date("2020-2021", "YY")

format date2 %ty

display date2


I understand that the issue may lie in Stata using a 1960 reference date for the date() function. I have no idea how to remedy this.
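If "2020-2021" denotes a two-year range (e.g. an academic year), date() cannot parse it as a single date; and for yearly data, %ty values are simply the year numbers themselves, with no 1960 offset. A sketch that pulls both years out as numerics:

```stata
* Sketch: split "2020-2021" into numeric start and end years
clear
input str9 period
"2020-2021"
end
gen int year_start = real(substr(period, 1, 4))
gen int year_end   = real(substr(period, -4, 4))
format year_start year_end %ty   // %ty values are just the year numbers
list
```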

Stock variable problem

Dear Statalist, I am trying to apply the perpetual inventory method to build a stock variable. Basically, it first generates an initial stock (first year), and then adds the subsequent values while accounting for a depreciation rate. The problem arises when the average growth rate is negative (and larger in magnitude than the depreciation rate, which I set to 0.05), which ends in a negative stock variable. It is advised to use the geometric mean instead of the arithmetic one. However, I am not sure whether I am calculating the growth rate correctly as it is, or whether I should multiply it by 100 (in the first line of the first loop). When multiplying by 100, the stock is positive, but the geometric average is higher than the arithmetic one (which should not happen).

Does anybody have any advice?

Code:
foreach x in new_sumx1   {  
        bys sic (year): gen g_`x' = (( `x' - L.`x')/L.`x')
        egen ameang_`x' = mean( g_`x' ), by(sic)
        egen gmeang_`x' = mean(log(1 + g_`x')), by(sic) 
        replace gmeang_`x' = exp(gmeang_`x') - 1    
        gen stock_`x'g = ( `x' /(0.05 + gmeang_`x' )) if year==2004
        gen stock_`x'a = ( `x' /(0.05 + ameang_`x' )) if year==2004
        }

sort sic year
brow sic year new_sumx1 g_new_sumx1 gmeang_new_sumx1 stock_new_sumx1g ameang_new_sumx1 stock_new_sumx1a

foreach x in new_sumx1  {  
           forvalues i = 2005/2016 {
           bys sic: replace stock_`x'a = (1-0.05)*L.stock_`x'a + `x' if year==`i'
           bys sic: replace stock_`x'g = (1-0.05)*L.stock_`x'g + `x' if year==`i'
           }
           }
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte sic int year float new_sumx1
 5 2004   2001276
 5 2005  43234.54
 5 2006 76543.516
 5 2007 144225.97
 5 2008  374271.3
 5 2009  518562.8
 5 2010  657756.1
 5 2011 28355.887
 5 2012 113218.86
 5 2013  3358.796
 5 2014         .
 5 2015         .
 5 2016 1659679.3
10 2004  27806592
10 2005  20727666
10 2006  23789176
10 2007  18034548
10 2008  25033572
10 2009  28634690
10 2010  28286054
10 2011  19459068
10 2012  17952214
10 2013  18554806
10 2014   9926886
10 2015  11262449
10 2016  10461789
end
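For what it's worth, the growth rate in the geometric-mean formula must stay a proportion: exp(mean(ln(1+g)))-1 assumes g is a fraction, so feeding it g*100 computes ln(1+100g) and inflates the "geometric" mean past the arithmetic one, which matches the symptom described above. A self-contained check on made-up growth rates:

```stata
* Sketch: geometric vs arithmetic mean growth; g must be a proportion
clear
input float g
 .10
-.05
 .20
end
quietly summarize g
scalar amean = r(mean)                  // arithmetic mean growth
gen lng = ln(1 + g)
quietly summarize lng
scalar gmean = exp(r(mean)) - 1         // geometric mean growth
* by the AM-GM inequality the geometric mean is the smaller of the two
display "arithmetic = " amean "   geometric = " gmean
```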

Panel data OLS time and ID restrictions.

Hello

I want to make an OLS regression in Stata my regression will be

Code:
regress interestrate lag_interestrate inf_diff_quarterly GDP4 if ID==5
My data is xtset in panel format by

Code:
xtset ID DATE
My date variable is in quarterly format from 1997q1 to 2019q4, and I am trying to restrict the sample in time (I want to run the regression for a specified time period). Is there a way to run the regression with both the ID restriction and a time restriction, for example from 1997q1 to 2005q4?
Thanks in advance
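With a quarterly DATE variable, one way (a sketch) is to combine the ID condition with inrange() and the tq() pseudofunction:

```stata
* Sketch: restrict both the panel ID and the quarterly sample window
regress interestrate lag_interestrate inf_diff_quarterly GDP4 ///
    if ID==5 & inrange(DATE, tq(1997q1), tq(2005q4))
```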

Error (3200) in meglm, when using X##X, but no error with X and X_squared

Dear Statalist members,

I am using Stata 17 MP (duly updated).

The models and the data

I am fitting negative binomial (NB2) 3-levels growth curve models of weekly counts, using meglm.

The nesting structure is: 30 occasions (weeks) < 1,710 settlements < 96 districts. An extract (1 settlement, 30 weeks) of the data is provided below.

A) Time-varying predictors :
at level-1
-weeknum (the ordinal number of a week in the series of 30)
-weektp (week type: four categories entered as 3 dummies)
-tclass (eight ordered categories entered as 7 dummies)

at level-3
-L1weekstock
-L1weekmob

B) Time-invariant predictors: all other.

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input double weeknumber float distrid long settlid float(countinf std2_weeknum std2_weeknumsqr weektp tclass std2_L1weekstock std2_L1weekmob std2_pcrural std2_pc65yover std2_pcforeign std2_medinc milp17)
 1 74 1   0  -.8376152   .7015992 1 2  -.3326102   -.8563091 -.39327615 -.6413893 .4492839 .4750123 27.122
 2 74 1   0  -.7798486   .6081639 1 1  -.3356403   -.1969749 -.39327615 -.6413893 .4492839 .4750123 27.122
 3 74 1   0  -.7220821  .52140254 1 1  -.3348828    -.335782 -.39327615 -.6413893 .4492839 .4750123 27.122
 4 74 1   0  -.6643155   .4413151 1 1  -.3375342   .05461322 -.39327615 -.6413893 .4492839 .4750123 27.122
 5 74 1   0   -.606549  .36790165 1 1  -.3356403   .14136773 -.39327615 -.6413893 .4492839 .4750123 27.122
 6 74 1   0 -.54878235   .3011621 1 2  -.3322314    .1587186 -.39327615 -.6413893 .4492839 .4750123 27.122
 7 74 1   0  -.4910158  .24109654 2 3  -.3363979   .21944684 -.39327615 -.6413893 .4492839 .4750123 27.122
 8 74 1   0  -.4332492   .1877049 2 2   -.334504   .05461322 -.39327615 -.6413893 .4492839 .4750123 27.122
 9 74 1   0  -.3754827  .14098725 2 4  -.3299588   -.4138612 -.39327615 -.6413893 .4492839 .4750123 27.122
10 74 1   3  -.3177161  .10094354 2 4 -.29624802  .002560559 -.39327615 -.6413893 .4492839 .4750123 27.122
11 74 1   0 -.25994954  .06757376 2 3  -.2568557  -.24035217 -.39327615 -.6413893 .4492839 .4750123 27.119
12 74 1   6   -.202183  .04087796 2 3 -.26594624   -.6307475 -.39327615 -.6413893 .4492839 .4750123 27.119
13 74 1   6  -.1444164   .0208561 2 4  -.2716278   -.8042565 -.39327615 -.6413893 .4492839 .4750123 27.113
14 74 1  14 -.08664985 .007508196 2 5 -.24928026    -.604721 -.39327615 -.6413893 .4492839 .4750123 27.107
15 74 1   3 -.02888328 .000834244 3 4  -.2326143  -.17094846 -.39327615 -.6413893 .4492839 .4750123 27.093
16 74 1   6  .02888328 .000834244 3 4  -.2299629   .08931507 -.39327615 -.6413893 .4492839 .4750123  27.09
17 74 1   6  .08664985 .007508196 3 5  -.1913281 -.006114925 -.39327615 -.6413893 .4492839 .4750123 27.084
18 74 1  14   .1444164   .0208561 3 4 -.19890356    .3669295 -.39327615 -.6413893 .4492839 .4750123 27.078
19 74 1  14    .202183  .04087796 3 5  -.1966309    .4450085 -.39327615 -.6413893 .4492839 .4750123 27.064
20 74 1  14  .25994954  .06757376 3 5 -.17579845   .59249115 -.39327615 -.6413893 .4492839 .4750123  27.05
21 74 1  41   .3177161  .10094354 3 6 -.02088059    .6185175 -.39327615 -.6413893 .4492839 .4750123 27.036
22 74 1 136   .3754827  .14098725 3 8   .4128136   .59249115 -.39327615 -.6413893 .4492839 .4750123 26.995
23 74 1 272   .4332492   .1877049 3 8   1.637763 -.032141294 -.39327615 -.6413893 .4492839 .4750123 26.859
24 74 1 272   .4910158  .24109654 4 8   3.026342   -.1969749 -.39327615 -.6413893 .4492839 .4750123 26.587
25 74 1 136  .54878235   .3011621 4 8   3.214592  -.01479041 -.39327615 -.6413893 .4492839 .4750123 26.315
26 74 1 136    .606549  .36790165 4 8  1.8313158   -.3878348 -.39327615 -.6413893 .4492839 .4750123 26.179
27 74 1  68   .6643155   .4413151 4 6   1.193842   .01123596 -.39327615 -.6413893 .4492839 .4750123 26.043
28 74 1  68   .7220821  .52140254 4 6  .54235375   .09799048 -.39327615 -.6413893 .4492839 .4750123 25.975
29 74 1  68   .7798486   .6081639 4 6   .3870571    .2367977 -.39327615 -.6413893 .4492839 .4750123 25.907
30 74 1  41   .8376152   .7015992 4 8     .26585   .28017497 -.39327615 -.6413893 .4492839 .4750123 25.839
end
The categorical predictors excepted, all predictors are 2 SD standardized: grand-mean centered then divided by 2 standard deviations.
(Gelman A 2008 “Scaling regression inputs by dividing by two standard deviations” Statistics in Medicine).


The problem:

-A) when I use the # operator to create quadratic terms (weeknum##weeknum or weeknum#weeknum), the final iteration returns the error message shown below.
Code:
meglm countinf c.std2_weeknum##c.std2_weeknum ib2.weektp ib2.weektp#c.std2_weeknum ib2.weektp#c.std2_weeknum#c.std2_weeknum ///
ib1.tclass std2_pcrural ib1.tclass#c.std2_pcrural std2_L1weekstock std2_L1weekmob std2_pc65yover std2_pcforeign std2_medinc ///
, exposure(milp17) || distrid : || settlid: c.std2_weeknum##c.std2_weeknum ///
, difficult family(nbinomial mean) link(log) cov(unstructured) vce(robust)
or alternatively
Code:
meglm countinf std2_weeknum c.std2_weeknum#c.std2_weeknum ib2.weektp ib2.weektp#c.std2_weeknum ib2.weektp#c.std2_weeknum#c.std2_weeknum ///
ib1.tclass std2_pcrural ib1.tclass#c.std2_pcrural std2_L1weekstock std2_L1weekmob std2_pc65yover  std2_pcforeign  std2_medinc ///
, exposure(milp17) || distrid : || settlid: std2_weeknum c.std2_weeknum#c.std2_weeknum ///
, difficult family(nbinomial mean) link(log) cov(unstructured) vce(robust)
the iteration log ends with:
Code:
Iteration 11:  log pseudolikelihood = -131293.52  (not concave)
Iteration 12:  log pseudolikelihood = -131285.01  (not concave)
Iteration 13:  log pseudolikelihood = -131277.98  
Iteration 14:  log pseudolikelihood = -131272.86  
Iteration 15:  log pseudolikelihood =  -131271.5  
Iteration 16:  log pseudolikelihood = -131270.98  
Iteration 17:  log pseudolikelihood = -131270.98  
                       *:  3200  conformability error
     _gsem_ereturn__sd():     -  function returned error
         _gsem_ereturn():     -  function returned error
       st_gsem_ereturn():     -  function returned error
                 <istmt>:     -  function returned error
r(3200);

end of do-file

r(3200);

- B) when I do not use #, but instead create X^2 (weeknumsqr), the estimation procedure ends normally,
Code:
meglm countinf std2_weeknum std2_weeknumsqr ib2.weektp ib2.weektp#c.std2_weeknum ib2.weektp#c.std2_weeknumsqr ///
ib1.tclass std2_pcrural ib1.tclass#c.std2_pcrural std2_L1weekstock std2_L1weekmob std2_pc65yover  std2_pcforeign  std2_medinc ///
, exposure(milp17) || distrid : || settlid:  std2_weeknum std2_weeknumsqr ///
, difficult family(nbinomial mean) link(log) cov(unstructured) vce(robust)
leading to the following results:
Code:
Iteration 11:  log pseudolikelihood = -131293.52  (not concave)
Iteration 12:  log pseudolikelihood = -131285.01  (not concave)
Iteration 13:  log pseudolikelihood = -131277.98  
Iteration 14:  log pseudolikelihood = -131272.86  
Iteration 15:  log pseudolikelihood =  -131271.5  
Iteration 16:  log pseudolikelihood = -131270.98  
Iteration 17:  log pseudolikelihood = -131270.98  

Mixed-effects GLM                               Number of obs     =     51,300
Family: Negative binomial
Link:   Log
Overdispersion: mean

        Grouping information
        -------------------------------------------------------------
                        |     No. of       Observations per group
         Group variable |     groups    Minimum    Average    Maximum
        ----------------+--------------------------------------------
                distrid |         96         90      534.4      2,670
                settlid |      1,710         30       30.0         30
        -------------------------------------------------------------

Integration method: mvaghermite                 Integration pts.  =          7

                                                Wald chi2(31)     =   29843.31
Log pseudolikelihood = -131270.98               Prob > chi2       =     0.0000
                                                    (Std. err. adjusted for 96 clusters in distrid)
---------------------------------------------------------------------------------------------------
                                  |               Robust
                         countinf | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
----------------------------------+----------------------------------------------------------------
                     std2_weeknum |   3.298914   .7736116     4.26   0.000     1.782663    4.815165
                  std2_weeknumsqr |  -5.317793   1.640426    -3.24   0.001    -8.532969   -2.102617
                                  |
                           weektp |
                               1  |  -3.454767   2.946621    -1.17   0.241    -9.230038    2.320504
                               3  |  -.3165163   .0706098    -4.48   0.000     -.454909   -.1781236
                               4  |   4.843987   .4872818     9.94   0.000     3.888932    5.799042
                                  |
            weektp#c.std2_weeknum |
                               1  |   -3.15082   8.370738    -0.38   0.707    -19.55716    13.25553
                               3  |  -2.463644   .7949061    -3.10   0.002    -4.021631   -.9056565
                               4  |  -14.65873   1.526555    -9.60   0.000    -17.65073   -11.66674
                                  |
         weektp#c.std2_weeknumsqr |
                               1  |   7.216591   5.968627     1.21   0.227    -4.481703    18.91488
                               3  |   10.91105   2.087026     5.23   0.000     6.820551    15.00154
                               4  |   11.94664   1.911539     6.25   0.000     8.200094    15.69319
                                  |
                           tclass |
                               2  |   1.579361   .1812937     8.71   0.000     1.224032     1.93469
                               3  |    2.36908   .1987663    11.92   0.000     1.979505    2.758655
                               4  |   2.763744   .1977046    13.98   0.000      2.37625    3.151238
                               5  |   3.076674   .2004631    15.35   0.000     2.683773    3.469574
                               6  |   3.453519   .2009128    17.19   0.000     3.059737    3.847301
                               7  |   3.750355   .2029003    18.48   0.000     3.352678    4.148032
                               8  |   4.095897   .2064415    19.84   0.000     3.691279    4.500515
                                  |
                     std2_pcrural |    .624201   .2692187     2.32   0.020      .096542     1.15186
                                  |
            tclass#c.std2_pcrural |
                               2  |  -.7134393   .2503491    -2.85   0.004    -1.204115    -.222764
                               3  |  -.7459787   .2489973    -3.00   0.003    -1.234005   -.2579529
                               4  |  -.8330824    .251747    -3.31   0.001    -1.326498   -.3396672
                               5  |  -.7421679   .2588821    -2.87   0.004    -1.249568   -.2347682
                               6  |  -.6488989   .2634231    -2.46   0.014    -1.165199   -.1325992
                               7  |  -.5745428   .2704703    -2.12   0.034    -1.104655   -.0444308
                               8  |  -.5618221     .28133    -2.00   0.046    -1.113219   -.0104255
                                  |
                 std2_L1weekstock |   .2227779   .0439275     5.07   0.000     .1366815    .3088743
                   std2_L1weekmob |   .1751561   .0303288     5.78   0.000     .1157127    .2345994
                   std2_pc65yover |  -.0672125   .0224725    -2.99   0.003    -.1112578   -.0231672
                   std2_pcforeign |   .2212634   .0261615     8.46   0.000     .1699877     .272539
                      std2_medinc |   .0396633   .0216403     1.83   0.067    -.0027508    .0820775
                            _cons |  -4.043469   .2109965   -19.16   0.000    -4.457014   -3.629923
                       ln(milp17) |          1  (exposure)
----------------------------------+----------------------------------------------------------------
                         /lnalpha |  -1.695915   .0367845                     -1.768011   -1.623818
----------------------------------+----------------------------------------------------------------
distrid                           |
                        var(_cons)|   .0587462   .0118504                      .0395616    .0872341
----------------------------------+----------------------------------------------------------------
distrid>settlid                   |
                 var(std2_weeknum)|   1.195491   .1552879                      .9267871    1.542101
              var(std2_weeknumsqr)|   2.423071   .2574262                      1.967591    2.983991
                        var(_cons)|   .1038953   .0107819                      .0847737    .1273299
----------------------------------+----------------------------------------------------------------
distrid>settlid                   |
 cov(std2_weeknum,std2_weeknumsqr)|  -1.232618   .1880016    -6.56   0.000    -1.601095   -.8641421
           cov(std2_weeknum,_cons)|  -.1914082   .0298032    -6.42   0.000    -.2498215    -.132995
        cov(std2_weeknumsqr,_cons)|  -.0505272   .0422252    -1.20   0.231    -.1332871    .0322327
---------------------------------------------------------------------------------------------------

Specifying simpler models or more complex ones does not help.

I would like to use the # operator and then "margins" extensively.

What mistake am I making?






Sunday, February 27, 2022

How to get price spells ?

I have monthly price data for many goods over 10 years. I need to get the price spells, i.e. the duration for which prices remain unchanged. What commands should I use to get the price spells? I also need to study the relationship between price spells and other covariates. How do I use survival analysis on this kind of data? Please help.
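One manual sketch (the community-contributed tsspell from SSC automates much of this). It assumes panel identifiers `good` and `month` and a price variable `price`, all placeholder names:

```stata
* Sketch: number the spells of unchanged prices, then measure their length
bysort good (month): gen spell = sum(price != price[_n-1])
bysort good spell (month): gen spell_len = _N   // duration in months
bysort good spell (month): keep if _n == 1      // one observation per spell
* for survival analysis each spell is one observation at risk; the last
* spell of each good is typically right-censored -- adjust "ended" for that
gen byte ended = 1
stset spell_len, failure(ended)
* stcox / streg can then relate spell duration to covariates
```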

A squared term is negative and significant but the graph is not.

A squared term is negative and statistically significant, suggesting an inverted-U relationship as hypothesized, but the graph of the predicted probability of the variable indicates a linear relationship. What does this mean, and what should I do?
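One way to reconcile the two is to check where the implied turning point, -b1/(2*b2), falls relative to the observed range of x: if it lies near or beyond the edge of the data, the fitted curve looks essentially linear over the sample even though the squared term is significant. A sketch with placeholder names y and x (use your own model in place of regress):

```stata
* Sketch: locate the turning point of y = b0 + b1*x + b2*x^2
regress y c.x##c.x
nlcom -_b[x] / (2*_b[c.x#c.x])    // turning point, with a standard error
summarize x                        // is it inside [r(min), r(max)]?
margins, at(x = (0(1)10))          // adjust the grid to your x range
marginsplot
```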

Help with interpreting regression results

I run this code in Stata 17
Code:
reg sbpd ib6.group i.black_or_not ib6.group#i.black_or_not i.sex age stratum urpot, cformat(%9.2f) vce(cluster clinic)
testparm i.black_or_not#i.group
contrast r.black_or_not@group, cformat(%9.2f)
And the result is displayed below:

Code:
Linear regression                               Number of obs     =        884
                                                F(2, 3)           =          .
                                                Prob > F          =          .
                                                R-squared         =     0.1284
                                                Root MSE          =     10.644

                                          (Std. err. adjusted for 4 clusters in clinic)
---------------------------------------------------------------------------------------
                      |               Robust
                 sbpd | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
----------------------+----------------------------------------------------------------
                group |
          Acebutolol  |      -8.00       1.87    -4.27   0.024       -13.96       -2.04
          Amlodipine  |      -6.47       1.30    -4.97   0.016       -10.60       -2.33
      Chlorthalidone  |      -7.51       2.29    -3.28   0.046       -14.80       -0.22
           Doxazosin  |      -5.61       1.24    -4.52   0.020        -9.56       -1.66
           Enalapril  |      -5.77       2.56    -2.26   0.109       -13.90        2.37
                      |
         black_or_not |
               Black  |       3.10       1.55     2.01   0.139        -1.82        8.02
                      |
   group#black_or_not |
    Acebutolol#Black  |       0.27       0.70     0.39   0.722        -1.95        2.50
    Amlodipine#Black  |      -0.71       0.81    -0.88   0.444        -3.29        1.86
Chlorthalidone#Black  |      -4.16       2.25    -1.84   0.162       -11.33        3.02
     Doxazosin#Black  |       3.46       2.26     1.53   0.223        -3.74       10.67
     Enalapril#Black  |       1.35       4.73     0.29   0.793       -13.71       16.42
                      |
                2.sex |      -0.55       0.49    -1.12   0.345        -2.10        1.01
                  age |      -0.27       0.05    -5.71   0.011        -0.42       -0.12
              stratum |       2.15       0.65     3.31   0.045         0.08        4.22
                urpot |      -0.04       0.05    -0.82   0.471        -0.21        0.13
                _cons |       2.46       3.59     0.68   0.543        -8.95       13.86
---------------------------------------------------------------------------------------

.
. testparm i.black_or_not#i.group

 ( 1)  1.group#2.black_or_not = 0
 ( 2)  2.group#2.black_or_not = 0
 ( 3)  3.group#2.black_or_not = 0
 ( 4)  4.group#2.black_or_not = 0
 ( 5)  5.group#2.black_or_not = 0
       Constraint 1 dropped
       Constraint 2 dropped

       F(  3,     3) =   15.80
            Prob > F =    0.0242

.
. contrast r.black_or_not@group, cformat(%9.2f)

Contrasts of marginal linear predictions

Margins: asbalanced

------------------------------------------------------------------------
                                     |         df           F        P>F
-------------------------------------+----------------------------------
                  black_or_not@group |
    (Black vs Not Black) Acebutolol  |          1        7.58     0.0705
    (Black vs Not Black) Amlodipine  |          1        6.64     0.0820
(Black vs Not Black) Chlorthalidone  |          1        0.34     0.6012
     (Black vs Not Black) Doxazosin  |          1       27.40     0.0136
     (Black vs Not Black) Enalapril  |          1        1.52     0.3059
       (Black vs Not Black) Placebo  |          1        4.02     0.1386
                              Joint  |          3       19.14     0.0185
                                     |
                         Denominator |          3
------------------------------------------------------------------------

--------------------------------------------------------------------------------------
                                     |   Contrast   Std. err.     [95% conf. interval]
-------------------------------------+------------------------------------------------
                  black_or_not@group |
    (Black vs Not Black) Acebutolol  |       3.37       1.23         -0.53        7.28
    (Black vs Not Black) Amlodipine  |       2.39       0.93         -0.56        5.34
(Black vs Not Black) Chlorthalidone  |      -1.06       1.81         -6.82        4.71
     (Black vs Not Black) Doxazosin  |       6.57       1.25          2.57       10.56
     (Black vs Not Black) Enalapril  |       4.46       3.62         -7.06       15.97
       (Black vs Not Black) Placebo  |       3.10       1.55         -1.82        8.02
--------------------------------------------------------------------------------------

.

I am surprised to find that the main effect of race, "Black", has the same coefficient as the contrast "(Black vs Not Black) Placebo": both are 3.10 in the output above. Any guidance on my code and/or output would be appreciated.

Al Bothwell

Case control matching with age, gender and BMI

Hi
I am trying to match data by gender (exact), age within +/- 5 years, and BMI within +/- 3. The code below matches, but for some matches it includes BMI values outside the +/- 3 range. Could someone see what is wrong with this code? Thanks


clear

** creating matched data for age (+/- 5), gender(exact match) and BMI (-/+ 3)

****************************** Data preparation task ******************************

use "F:\OSA data\Latestcode\AllwithOSAdata_191121latest.dta"

** create cases subset
keep if id_casecntrl==1
keep if flag==1
rename id id_case
save "F:\OSA data\Latestcode\Cases1.dta", replace

** create controls subset
use "F:\OSA data\Latestcode\AllwithOSAdata_191121latest.dta"
keep if id_casecntrl==2
rename id id_cntl
save "F:\OSA data\Latestcode\Controls1.dta", replace

gen rand = runiform()
sort rand
drop rand

save "F:\OSA data\Latestcode\Controls2.dta", replace

*rename * *_cntl
*rename id_cntl id
*duplicates drop id, force
*save "C:\Users\venka\Desktop\NSWHealth\Venkatesha - Consults\1970 - Premala Sureshkumar\Controls3.dta", replace

****************************** End of Data preparation task ******************************


*Read the cases data file. Replace the file path of the data set appropriately in the program
use "F:\OSA data\Latestcode\Cases1.dta"

* matching (exact) on Gender, within +/- 5 years for age
compress
rangejoin ageatvisit -5 5 using "F:\OSA data\Latestcode\Controls2.dta", by (gender)

order id_case id_cntl gender ageatvisit
drop *_U

gen rand = runiform()
sort rand
drop rand

*rename *_U *_cntl
*rename id id_cases
*sort id_cases
*drop if id_casecntrl_cntl==.

*use matched control only twice for each matched case(preserving 1:2 case : control ratio)
*bysort id_cases: keep if _n <= 2

*Check how many controls were found for every case
*bysort id_cases: gen byte numcontrols = _N if _n == 1
*tab numcontrols
*drop if numcontrols == 1
*drop numcontrols

** Matching on age and gender is complete.

*rename id_cntl id
*drop *_cntl

*gen rand = runiform()
*sort rand
*drop rand

* matching within +/- 3 units of BMI

rangejoin bmi -3 3 using "F:\OSA data\Latestcode\Controls2.dta", by(id_cntl)
drop if ageatvisit_U==.
drop if gender_U==""

order id_case id_cntl gender gender_U ageatvisit ageatvisit_U bmi bmi_U

drop *_U
*sort id_case

*use matched control only twice for each matched case(preserving 1:2 case : control ratio)

bysort id_case id_cntl: keep if _n == 1
bysort id_case: keep if _n <= 2

*Check how many controls were found for every case
bysort id_case: gen byte numcontrols = _N if _n ==1
tab numcontrols
drop if numcontrols == 1
drop numcontrols

rename * *_case
rename (id_case_case id_cntl_case) (id_case id_cntl)

*drop *_U
*rename * *_case
*rename (id_cases_case id_case) (id_case id)

save "F:\OSA data\Latestcode\MatchedData_08December\Matched_Age GenderBMI0.dta", replace


use "F:\OSA data\Latestcode\Controls2.dta"

rename * *_cntl
rename id_cntl_cntl id_cntl
duplicates drop id_cntl, force

save "F:\OSA data\Latestcode\Controls3.dta", replace

use "F:\OSA data\Latestcode\MatchedData_08December\Matched_Age GenderBMI0.dta"

merge m:m id_cntl using "F:\OSA data\Latestcode\Controls3.dta"

order id_case id_cntl gender_case gender_cntl ageatvisit_case ageatvisit_cntl bmi_case bmi_cntl
drop if id_case==""
drop _merge

bysort id_case id_cntl: keep if _n == 1

*Check how many controls were found for every case
bysort id_case: gen byte numcontrols = _N if _n ==1
tab numcontrols
drop if numcontrols == 1
drop numcontrols
sort id_case

order id_case id_cntl gender_case gender_cntl ageatvisit_case ageatvisit_cntl bmi_case bmi_cntl

save "F:\OSA data\Latestcode\MatchedData_08December\Matched_Age GenderBMI1.dta", replace
** Matching on age,gender and BMI is complete.
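For reference, one way to enforce the BMI window directly is to filter the matched pairs before dropping the control-side variables. This is only a sketch, assuming (as rangejoin does by default) that the using file's variables arrive with a _U suffix after the first join:

```stata
* sketch: after the age/gender rangejoin, keep only case-control pairs
* whose control BMI (bmi_U) lies within +/- 3 of the case BMI
rangejoin ageatvisit -5 5 using "F:\OSA data\Latestcode\Controls2.dta", by(gender)
keep if abs(bmi - bmi_U) <= 3 & !missing(bmi, bmi_U)
```

This avoids the second rangejoin entirely, so there is no chance of the BMI interval being evaluated against the wrong row.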

Cannot install estout package

Hello everyone, I have been experiencing some issues while trying to install the estout package on my computer using Stata 17. The code I used was:

. ssc install estout

And I got the following message:

cannot write in directory C:\Users\Kaiser Máté\ado\plus\_

(my user name contains the letters á and é, which Stata apparently cannot handle properly, since my Windows language is not English)
This is where things started to get strange because my adopath command supplies this info:

adopath
[1] (BASE) "C:\Program Files\Stata17\ado\base/"
[2] (SITE) "C:\Program Files\Stata17\ado\site/"
[3] "."
[4] (PERSONAL) "C:\Users\Kaiser Máté\ado\personal/"
[5] (PLUS) "C:\Users\Kaiser Máté\ado\plus/"
[6] (OLDPLACE) "c:\ado/"

However, the plus and personal directories do not exist; the ado folders are not there. Even after setting the accessibility of the Users folder to all-around access, Stata does not seem to be able to download the estout package. Updates work properly, and this is my personal laptop: no other users, not a university network.
I have tried manually creating these folders, but the problem does not seem to be solvable. I have read all similar threads, but at this point I am out of ideas. I would be grateful for some insight and help!
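A common workaround (hedged: the non-ASCII characters in the profile path are the likely culprit here) is to point PLUS and PERSONAL at ASCII-only directories before installing:

```stata
* create ASCII-only ado directories and redirect Stata to them
mkdir "C:\ado\plus"
mkdir "C:\ado\personal"
sysdir set PLUS "C:\ado\plus"
sysdir set PERSONAL "C:\ado\personal"
ssc install estout
```

sysdir set only lasts for the current session, so the two sysdir lines would need to go into profile.do to make the change permanent.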

Saturday, February 26, 2022

Dealing with Highly Collinear Independent Variables

Dear Stata Members

I have panel data where my independent variables (Index1 to Index4) are highly collinear. In that case, rather than dropping one or more of the collinear variables, is it legitimate to transform them so that we can retain them? I will demonstrate my data and results with an example.

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(Index1 Index2 Index3 Index4) long id int year float dep_var
5.46687       .       .       . 1 1999        0
 3.5714 53.3333 49.2386 37.4359 1 2000 .0469986
3.77717       .       .       . 1 2001        0
3.97991  55.102 35.3535 34.1837 1 2002        0
4.09675 56.6326 44.9495 43.3674 1 2003        0
3.94243  55.665 34.6342  44.335 1 2004        0
3.94921 51.4706 33.1707      50 1 2005        0
4.05847 57.0732 37.0732 48.0392 1 2006        0
3.92085 59.2233 33.4951 50.9709 1 2007        0
4.64972 58.7379 36.4078 50.9709 1 2008        0
 4.8054 57.8947 36.8421  45.933 1 2009        0
4.70902 58.3732 33.3333 44.4976 1 2010        0
4.83402 58.7678 37.9147 44.5498 1 2011        0
4.82298 57.8199 40.2844 44.0758 1 2012        0
4.66564 54.9763 44.5498 44.0758 1 2013        0
4.55899 65.3846 45.6731   43.75 1 2014        0
4.52303 68.2692 48.5577 44.2308 1 2015        0
4.86224 66.8269 49.0385 44.2308 1 2016        0
5.33097 68.2692 46.1539 48.5577 1 2017        0
5.62695 69.2308 46.1539 47.1154 1 2018        0
5.89539 71.6346 45.1923 42.7885 1 2019        0
 3.5714 53.3333 49.2386 37.4359 2 2000        .
3.77717       .       .       . 2 2001        0
3.97991  55.102 35.3535 34.1837 2 2002        0
4.09675 56.6326 44.9495 43.3674 2 2003        0
3.94243  55.665 34.6342  44.335 2 2004        .
3.94921 51.4706 33.1707      50 2 2005        .
4.05847 57.0732 37.0732 48.0392 2 2006 .5771455
3.92085 59.2233 33.4951 50.9709 2 2007        .
4.64972 58.7379 36.4078 50.9709 2 2008        .
 4.8054 57.8947 36.8421  45.933 2 2009        0
4.70902 58.3732 33.3333 44.4976 2 2010        0
4.83402 58.7678 37.9147 44.5498 2 2011        0
4.82298 57.8199 40.2844 44.0758 2 2012        0
4.66564 54.9763 44.5498 44.0758 2 2013        0
4.55899 65.3846 45.6731   43.75 2 2014        0
4.52303 68.2692 48.5577 44.2308 2 2015        0
4.86224 66.8269 49.0385 44.2308 2 2016        0
5.33097 68.2692 46.1539 48.5577 2 2017        0
5.62695 69.2308 46.1539 47.1154 2 2018        .
5.89539 71.6346 45.1923 42.7885 2 2019        0
5.46687       .       .       . 3 1999        0
 3.5714 53.3333 49.2386 37.4359 3 2000        .
3.77717       .       .       . 3 2001        .
3.97991  55.102 35.3535 34.1837 3 2002        .
4.09675 56.6326 44.9495 43.3674 3 2003        .
3.94243  55.665 34.6342  44.335 3 2004        .
3.94921 51.4706 33.1707      50 3 2005        .
4.05847 57.0732 37.0732 48.0392 3 2006        .
3.92085 59.2233 33.4951 50.9709 3 2007        0
4.64972 58.7379 36.4078 50.9709 3 2008        0
 4.8054 57.8947 36.8421  45.933 3 2009        .
4.70902 58.3732 33.3333 44.4976 3 2010        0
4.83402 58.7678 37.9147 44.5498 3 2011        0
4.82298 57.8199 40.2844 44.0758 3 2012        0
4.66564 54.9763 44.5498 44.0758 3 2013        .
4.55899 65.3846 45.6731   43.75 3 2014        0
4.52303 68.2692 48.5577 44.2308 3 2015        .
4.86224 66.8269 49.0385 44.2308 3 2016        0
5.33097 68.2692 46.1539 48.5577 3 2017        0
5.62695 69.2308 46.1539 47.1154 3 2018        0
5.89539 71.6346 45.1923 42.7885 3 2019        0
5.46687       .       .       . 4 1999        0
 3.5714 53.3333 49.2386 37.4359 4 2000        0
3.77717       .       .       . 4 2001        0
3.97991  55.102 35.3535 34.1837 4 2002        0
4.09675 56.6326 44.9495 43.3674 4 2003        .
3.94243  55.665 34.6342  44.335 4 2004        0
3.94921 51.4706 33.1707      50 4 2005        0
4.05847 57.0732 37.0732 48.0392 4 2006        0
3.92085 59.2233 33.4951 50.9709 4 2007        0
4.64972 58.7379 36.4078 50.9709 4 2008        0
 4.8054 57.8947 36.8421  45.933 4 2009        0
4.70902 58.3732 33.3333 44.4976 4 2010        0
4.83402 58.7678 37.9147 44.5498 4 2011        0
4.82298 57.8199 40.2844 44.0758 4 2012        0
4.66564 54.9763 44.5498 44.0758 4 2013        0
4.55899 65.3846 45.6731   43.75 4 2014        0
4.52303 68.2692 48.5577 44.2308 4 2015        0
4.86224 66.8269 49.0385 44.2308 4 2016        0
5.33097 68.2692 46.1539 48.5577 4 2017        0
5.62695 69.2308 46.1539 47.1154 4 2018        0
5.89539 71.6346 45.1923 42.7885 4 2019        0
5.46687       .       .       . 5 1999        .
 3.5714 53.3333 49.2386 37.4359 5 2000        .
3.77717       .       .       . 5 2001        0
3.97991  55.102 35.3535 34.1837 5 2002        0
4.09675 56.6326 44.9495 43.3674 5 2003        0
3.94243  55.665 34.6342  44.335 5 2004        .
3.94921 51.4706 33.1707      50 5 2005        0
4.05847 57.0732 37.0732 48.0392 5 2006        .
3.92085 59.2233 33.4951 50.9709 5 2007        0
4.64972 58.7379 36.4078 50.9709 5 2008        .
 4.8054 57.8947 36.8421  45.933 5 2009        0
4.70902 58.3732 33.3333 44.4976 5 2010        0
4.83402 58.7678 37.9147 44.5498 5 2011        0
4.82298 57.8199 40.2844 44.0758 5 2012        0
4.66564 54.9763 44.5498 44.0758 5 2013        0
4.55899 65.3846 45.6731   43.75 5 2014        .
4.52303 68.2692 48.5577 44.2308 5 2015        0
end
label values id id
label def id 1 "000002.SZ", modify
label def id 2 "000004.SZ", modify
label def id 3 "000005.SZ", modify
label def id 4 "000006.SZ", modify
label def id 5 "000007.SZ", modify

Code:
pwcorr dep_var Index1 Index2 Index3 Index4 , sig star(.01)

             |  dep_var   Index1   Index2   Index3   Index4
-------------+---------------------------------------------
     dep_var |   1.0000 
             |
             |
      Index1 |  -0.1242   1.0000 
             |   0.2819
             |
      Index2 |  -0.0867   0.7584*  1.0000 
             |   0.4757   0.0000
             |
      Index3 |  -0.0658   0.3183*  0.5552*  1.0000 
             |   0.5884   0.0021   0.0000
             |
      Index4 |   0.0787   0.1896   0.1301  -0.3035*  1.0000 
             |   0.5172   0.0719   0.2190   0.0034
             |


Code:
 reg dep_var Index1 Index2 Index3 i.id  i.year
note: 2017.year omitted because of collinearity.
note: 2018.year omitted because of collinearity.
note: 2019.year omitted because of collinearity.

      Source |       SS           df       MS      Number of obs   =        70
-------------+----------------------------------   F(22, 47)       =      1.28
       Model |  .123540378        22  .005615472   Prob > F        =    0.2345
    Residual |  .206200354        47  .004387242   R-squared       =    0.3747
-------------+----------------------------------   Adj R-squared   =    0.0819
       Total |  .329740732        69  .004778851   Root MSE        =    .06624

------------------------------------------------------------------------------
     dep_var | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      Index1 |   .0607583   .2623863     0.23   0.818    -.4670948    .5886114
      Index2 |  -.0082303   .0411951    -0.20   0.843    -.0911041    .0746435
      Index3 |   .0068583   .1109061     0.06   0.951     -.216256    .2299725
             |
          id |
  000004.SZ  |   .0420866   .0246522     1.71   0.094    -.0075072    .0916804
  000005.SZ  |   .0090231   .0272016     0.33   0.742    -.0456995    .0637458
  000006.SZ  |  -.0035913   .0218879    -0.16   0.870     -.047624    .0404413
  000007.SZ  |   .0108503   .0272195     0.40   0.692    -.0439084    .0656089
             |
        year |
       2002  |   .0473329    1.50762     0.03   0.975    -2.985606    3.080272
       2003  |  -.0182899   .4098085    -0.04   0.965    -.8427182    .8061384
       2004  |    .073309   1.573491     0.05   0.963    -3.092147    3.238765
       2005  |   .0441975   1.844819     0.02   0.981    -3.667099    3.755494
       2006  |   .2388757   1.268526     0.19   0.851     -2.31307    2.790822
       2007  |   .1058522   1.615725     0.07   0.948    -3.144567    3.356271
       2008  |   .0398561   1.309828     0.03   0.976    -2.595177    2.674889
       2009  |   .0099532   1.289834     0.01   0.994    -2.584859    2.604765
       2010  |   .0444741   1.660176     0.03   0.979    -3.295369    3.384318
       2011  |   .0087066    1.14843     0.01   0.994    -2.301636    2.319049
       2012  |  -.0146761   .9171183    -0.02   0.987     -1.85968    1.830328
       2013  |  -.0584361   .5495863    -0.11   0.916    -1.164061    1.047189
       2014  |   .0264604   .1801508     0.15   0.884    -.3359562     .388877
       2015  |   .0321464   .3668656     0.09   0.931     -.705892    .7701847
       2016  |  -.0031748   .3119827    -0.01   0.992     -.630803    .6244534
       2017  |          0  (omitted)
       2018  |          0  (omitted)
       2019  |          0  (omitted)
             |
       _cons |  -.0904379   6.801528    -0.01   0.989    -13.77335    13.59247
------------------------------------------------------------------------------

. estat vif

    Variable |       VIF       1/VIF  
-------------+----------------------
      Index1 |    349.95    0.002858
      Index2 |    886.38    0.001128
      Index3 |   6173.44    0.000162
          id |
          2  |      1.47    0.681963
          3  |      1.45    0.691748
          4  |      1.46    0.684869
          5  |      1.45    0.690839
        year |
       2002  |   1953.88    0.000512
       2003  |    109.92    0.009098
       2004  |   1096.42    0.000912
       2005  |   2227.48    0.000449
       2006  |   1053.19    0.000949
       2007  |   2244.14    0.000446
       2008  |   1122.88    0.000891
       2009  |   1430.15    0.000699
       2010  |   2916.77    0.000343
       2011  |   1395.73    0.000716
       2012  |    890.11    0.001123
       2013  |    259.65    0.003851
       2014  |     27.90    0.035844
       2015  |    115.70    0.008643
       2016  |     83.67    0.011952
-------------+----------------------
    Mean VIF |   1106.51
.
So my question is: rather than dropping variables, can we do something else to deal with the multicollinearity?
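One hedged alternative to dropping variables is to replace the collinear indexes with their principal components, which are orthogonal by construction; a sketch using the variables from the example above:

```stata
* sketch: principal-component transform of the collinear indexes
pca Index1 Index2 Index3 Index4
predict pc1 pc2, score          // keep the first two components
reg dep_var pc1 pc2 i.id i.year
estat vif
```

The trade-off is interpretability: the coefficients on pc1 and pc2 describe linear combinations of the indexes rather than any single index.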

Trim and fill using metatrim

I looked through other forum posts (post1, post2, post3, post4) and this question has been asked many times in the forum, but it remained unanswered.

I am using the command 'metatrim' in Stata.
I am doing a meta-analysis of the effective reproduction number. The LFK index shows major asymmetry and publication bias, so I am trying to trim and fill the effect sizes.
But the result states: "No trimming was performed: data unchanged". I am not sure why it is not working. Can anyone please help?

Code:
metatrim rt lci uci, reffect funnel
where, rt is the effect size and lci and uci are the lower and upper confidence intervals respectively.

I expect to get some trimming and filling. However, the result states:
Code:
Note: no trimming performed; data unchanged

Oddly enough, I can reproduce my result using some fake data.

Here I am just manipulating the automiss data to have an effect size, mpg, with its lower and upper 95% confidence intervals, lci and uci.
Code:
webuse automiss.dta, clear
keep make mpg rep78
replace rep78 = 2 if rep78 == .
generate lci = mpg - rep78
generate uci = mpg + rep78
drop rep78
drop if mpg == .
Installing the metafunnel and metabias packages
package st0061 from http://www.stata-journal.com/software/sj4-2
Code:
net sj 4-2 st0061
Installing metabias command
package sbe19_6 from http://www.stata-journal.com/software/sj9-2
Code:
net sj 9-2 sbe19_6
I am creating a funnel plot and publication bias test. In my real data, the funnel plot looks asymmetrical, but the metabias command shows no bias.

Code:
metafunnel mpg lci uci, ci name(funnel_ci, replace) // funnel plot
metabias mpg se, egger //
I will thus use the LFK index to test for bias.

Code:
generate se = (uci-lci)/ 3.92
generate logmpg = log(mpg)
generate loglci = log(lci)
generate loguci = log(uci)
generate logse = (loguci-loglci)/ 3.92

admetan logmpg loglci loguci, model(reml) effect("mpg") eform
Now, for the LFK index, install the package 'lfk': module to compute LFK index and Doi plot for detection of publication bias in meta-analysis.

Code:
net describe lfk, from(http://fmwww.bc.edu/RePEc/bocode/l)
Running the code
Code:
lfk _ES _seES
Installing metatrim command:

sbe39_2 from http://www.stata.com/stb/stb61
Code:
net describe sbe39_2, from(http://www.stata.com/stb/stb61)

Code:
metatrim mpg lci uci, ci funnel reffect



Putting all of the same code together for easier copy-pasting into Stata:
Code:
webuse automiss.dta, clear
keep make mpg rep78
replace rep78 = 2 if rep78 == .
generate lci = mpg - rep78
generate uci = mpg + rep78
drop rep78
drop if mpg == .

generate se = (uci-lci)/ 3.92
generate logmpg = log(mpg)
generate loglci = log(lci)
generate loguci = log(uci)
generate logse = (loguci-loglci)/ 3.92

metafunnel mpg lci uci, ci name(funnel_ci, replace) // funnel plot
metabias mpg se, egger

admetan logmpg loglci loguci, model(reml) effect("mpg") eform

lfk _ES _seES

metatrim mpg lci uci, ci funnel reffect
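If Stata 16 or newer is available, the built-in meta suite has a maintained trim-and-fill routine, which may be easier to diagnose than the older user-written metatrim. A sketch, assuming the effect size (mpg here) and a standard error derived from the 95% CI:

```stata
* sketch: built-in trim-and-fill (Stata 16+)
generate se = (uci - lci) / 3.92      // SE recovered from a 95% CI
meta set mpg se, random(reml)
meta trimfill
```

Note also that "no trimming performed" is not necessarily an error: it can simply mean the algorithm estimates zero missing studies on the side being tested.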

Creating a new variable (that changes value in every 3 days....)

I am working with a survey dataset. It has 30 people (variable name = dlp, numeric) who start work on different dates (variable name = date, data type = date). I would like to create a new variable "work" reflecting the fact that each worker completes the work every 3 days (Round 1), then continues the same work for the next three days (Round 2), and after every 6 days takes 1 day off before continuing with Round 3. This pattern is likely to continue until December 2022.
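A sketch of one way to build such a round counter, assuming the data are (or are expanded to) one row per worker (dlp) per calendar day, with date holding the current day; the 7-day cycle is two 3-day rounds plus one day off:

```stata
* days elapsed since each worker's own start date
bysort dlp (date): gen days = date - date[1]
* position within the repeating 7-day cycle: 0-2 first round, 3-5 second, 6 off
gen cycle_day = mod(days, 7)
gen work = 2*floor(days/7) + 1 if inrange(cycle_day, 0, 2)
replace work = 2*floor(days/7) + 2 if inrange(cycle_day, 3, 5)
* work stays missing on cycle_day 6, the day off
```

With daily rows missing, expand (or tsfill after tsset dlp date) could generate them first; the exact layout of the survey file would determine which.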

encode with strange number?

Dear All, I have this data set in Stata format (dataex may not be appropriate for my purpose), encode.dta. I use
Code:
encode Journal_e, gen(id)
but the resulting `id' does not run from 1, 2, .... Please see
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str63 Journal_e long id
"Accountant"                    842
"Accountant"                    842
"Accountant"                    842
"Accountant"                    842
"Accountant"                    842
"Accountant"                    842
"Accountant"                    842
"Accountant"                    842
"Accountant"                    842
"Accountant"                    842
"Accountant"                    842
"Accountant"                    842
"Accounting Research"           843
"Accounting Research"           843
"Accounting Research"           843
"Accounting Research"           843
"Accounting Research"           843
"Accounting Research"           843
"Accounting Research"           843
"Accounting Research"           843
"Accounting Research"           843
"Accounting Research"           843
"Accounting Research"           843
"Accounting Research"           843
"Agricultural Economy"          844
"Agricultural Economy"          844
"Agricultural Economy"          844
"Agricultural Economy"          844
"Agricultural Economy"          844
"Agricultural Economy"          844
"Agricultural Economy"          844
"Agricultural Economy"          844
"Agricultural Economy"          844
"Agricultural Economy"          844
"Agricultural Economy"          844
"Agricultural Economy"          844
"Agricultural History of China" 845
"Agricultural History of China" 845
"Agricultural History of China" 845
"Agricultural History of China" 845
"Agricultural History of China" 845
"Agricultural History of China" 845
"Agricultural History of China" 845
"Agricultural History of China" 845
"Agricultural History of China" 845
"Agricultural History of China" 845
"Agricultural History of China" 845
"Agricultural History of China" 845
"Asia-pacific Economic Review"  846
"Asia-pacific Economic Review"  846
"Asia-pacific Economic Review"  846
"Asia-pacific Economic Review"  846
"Asia-pacific Economic Review"  846
"Asia-pacific Economic Review"  846
"Asia-pacific Economic Review"  846
"Asia-pacific Economic Review"  846
"Asia-pacific Economic Review"  846
"Asia-pacific Economic Review"  846
"Asia-pacific Economic Review"  846
"Asia-pacific Economic Review"  846
"Auditing Research"             847
"Auditing Research"             847
"Auditing Research"             847
"Auditing Research"             847
"Auditing Research"             847
"Auditing Research"             847
"Auditing Research"             847
"Auditing Research"             847
"Auditing Research"             847
"Auditing Research"             847
"Auditing Research"             847
"Auditing Research"             847
"Business & Economy"            848
"Business & Economy"            848
"Business & Economy"            848
"Business & Economy"            848
"Business & Economy"            848
"Business & Economy"            848
"Business & Economy"            848
"Business & Economy"            848
"Business & Economy"            848
"Business & Economy"            848
"Business & Economy"            848
"Business & Economy"            848
"Chemical Industry"             849
"Chemical Industry"             849
"Chemical Industry"             849
"Chemical Industry"             849
"Chemical Industry"             849
"Chemical Industry"             849
"Chemical Industry"             849
"Chemical Industry"             849
"Chemical Industry"             849
"Chemical Industry"             849
"Chemical Industry"             849
"Chemical Industry"             849
"China Business and Market"     850
"China Business and Market"     850
"China Business and Market"     850
"China Business and Market"     850
end
label values id id
label def id 842 "Accountant", modify
label def id 843 "Accounting Research", modify
label def id 844 "Agricultural Economy", modify
label def id 845 "Agricultural History of China", modify
label def id 846 "Asia-pacific Economic Review", modify
label def id 847 "Auditing Research", modify
label def id 848 "Business & Economy", modify
label def id 849 "Chemical Industry", modify
label def id 850 "China Business and Market", modify
Any suggestions? Thanks.
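The usual cause (hedged, but consistent with the values starting at 842) is that a value label named id already exists in memory or in the dataset, and encode extends the existing mapping rather than numbering from 1. Two sketches (id2 is a hypothetical name):

```stata
* option 1: remove the pre-existing variable and value label, then encode afresh
capture drop id
label drop id
encode Journal_e, gen(id)

* option 2: egen group() always numbers 1, 2, ... in sort order
egen long id2 = group(Journal_e), label
```

egen group() is often the safer choice when the only goal is a clean 1, 2, ... identifier.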

Convert date variable to scale variable

Hi all
I imported an excel dataset to Stata 15. Unfortunately a scale variable (Body Mass Index measured up to 2 decimal places) had a wrongly placed date entry in one observation. As a result Stata converted the whole variable "bmi" as a date variable displaying it in the %td format. I need to convert this "bmi" to the number format with 2 decimal places. Here is the example data:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double bmi
 -21893.3
 -21893.9
 -21892.5
 -21895.2
   -21892
 -21895.1
 -21892.8
 -21896.3
 -21895.7
   -21894
 -21893.5
 -21894.4
 -21892.7
 -21894.1
 -21895.2
 -21893.7
 -21886.5
 -21895.3
 -21896.5
 -21893.8
-21893.75
 -21896.3
   -21892
 -21891.9
 -21893.2
 -21891.8
 -21893.5
 -21895.3
 -21887.3
 -21887.5
 -21891.8
 -21889.9
 -21896.1
 -21893.5
   -21892
 -21894.1
 -21894.9
   -21896
 -21895.1
 -21885.2
   -21885
 -21890.9
 -21892.2
 -21891.9
 -21889.7
 -21887.8
 -21893.2
   -21893
   -21882
-21889.85
 -21892.3
-21896.25
-21894.35
-21891.88
 -21890.1
 -21889.8
 -21892.7
-21895.11
 -21890.9
 -21893.1
 -21893.9
-21895.11
-21896.22
 -21889.9
-21891.55
 -21892.2
-21890.46
 -21886.4
 -21891.5
 -21894.7
-21895.22
   -21896
 -21889.6
 -21888.8
   -21891
   -21884
 -21884.6
-21895.58
-21890.38
 -21890.9
 -21892.2
-21895.56
-21895.13
-21892.36
-21892.87
-21896.25
-21894.04
-21894.88
-21893.56
-21896.93
-21892.91
-21891.24
-21896.68
-21894.79
-21895.11
-21893.24
-21893.63
-21885.35
-21897.02
-21895.78
end
format %td bmi
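Since Excel's day zero (30 December 1899) corresponds to Stata daily date −21916, the original BMI values can likely be recovered by undoing the Excel-to-Stata date shift. A sketch; the recovered values should be verified against the source spreadsheet:

```stata
* undo the Excel-date conversion: Stata td = Excel serial - 21916
replace bmi = bmi + 21916       // e.g. -21893.3 + 21916 = 22.7
format bmi %9.2f
```

The stray date entry that caused the import problem will come out as a nonsensical BMI and can then be set to missing.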

Stata histogram Help

Hi,
I'm attempting to find the 80th percentile of sales transaction totals in Stata and graph it. I'd like the top of the first bar to end at the 80% mark. Is there a way to do this in Stata?

As an aside: would this even be the best way to represent sales transaction data? Sales range from pennies to 10K. I'm way too far over my skis with this one, so any suggestions would be appreciated.


Current Command:
histogram nrr_subtotal, bin(100) percent
(bin=100, start=0, width=105.381)



[attached image]
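One sketch: pull the 80th percentile from summarize, detail and mark it on the histogram with a vertical line:

```stata
summarize nrr_subtotal, detail
local p80 = r(p80)
histogram nrr_subtotal, bin(100) percent xline(`p80')
display "80th percentile = `p80'"
```

Given data spanning pennies to 10K, plotting a log-transformed copy of the variable (or restricting the x-range) would likely make the shape of the distribution far more readable than a raw-scale histogram.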

Mediation: Direct effect greater than the Total effect !

Dear all,

I hope you are doing well in this period of Covid.

In my work I have studied the following relationships:
  1. the impact of financial constraints on CEO stock options, using a threshold model
  2. the impact of CEO stock options on risk taking, using quantile regression
  3. the impact of financial constraints on risk taking, using a threshold model
Now I would like to study the mediating role of CEO stock options in the relationship between financial constraints and risk taking, following the Baron and Kenny (1986) method.

following are the results:


[attached images]


As shown in the tables above, there exists partial mediation. But, as marked in yellow, the direct effect is found to be greater than the total effect, and I am confused about how to interpret this. I have found that this is called "inconsistent mediation". How should I interpret it, and what exactly is the role of the mediator here?


Kind regards
SEDKI


Interpret IRF scale

Dear all,

I have got an annual database with a short term interest rate in percent, a long term interest rate in percent and consumption expenditures, computed as annual growth in percent. I set up a SVAR in that order and I would like to run an impulse response function next. The IRF chart shows that a shock of the short term interest rate (impulse variable) causes the consumption expenditures (response variable) to reach a value of -1.5 after two years before converging back to zero.
What are the units of the impulse variable and the response variable? Is my interpretation correct that the IRF by default uses a one-standard-deviation shock to the impulse variable, so that consumption expenditure growth would be 1.5 percentage points lower in the second year? I would also like to know the reaction of the response variable if the short-term interest rate rises by 1 percentage point. How could I recalibrate the IRF, or calculate the effect by hand?

Best regards
George
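A sketch of the rescaling logic (variable names here are hypothetical): the default orthogonalized IRF gives the response to a one-standard-deviation shock, so dividing the IRF values by that shock's standard deviation, taken from the estimated residual covariance matrix, converts them to a one-unit (1-percentage-point) shock:

```stata
* hypothetical names: stir (short rate), ltir (long rate), cons (consumption growth)
var stir ltir cons, lags(1/2)
irf create base, set(myirfs, replace) step(8)
irf table oirf, impulse(stir) response(cons)
* sd of the short-rate shock (first-ordered variable) from e(Sigma)
matrix S = e(Sigma)
display "shock sd = " sqrt(S[1,1])
* dividing the tabulated oirf values by this sd gives the 1-unit-shock response
```

Since the system is linear, scaling the IRF this way is exact; the shape of the response path is unchanged.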

Friday, February 25, 2022

How to create a binary variable that identifies whether another variable has duplicates in the dataset using Stata?

The dataset is laid out in longitudinal format, where the patient id (pid) is the first column and the second column, "PE", records whether the patient received a duplicated physical check-up during their hospital stay. I want to create a new variable "RE" that achieves this: if a PE value has duplicates (within a patient), then RE = 1; if not, RE = 0.

* Example generated by -dataex-.
clear
input byte (pid PE)
1 1
1 2
1 3
1 4
2 1
2 2
2 3
2 3
3 1
3 1
3 2
3 3
4 1
4 1
4 2
4 3
end
Thank you for your help!
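A sketch using duplicates tag, which counts how many extra copies each (pid, PE) pair has:

```stata
* ndup > 0 marks every row whose (pid, PE) combination appears more than once
duplicates tag pid PE, generate(ndup)
generate byte RE = ndup > 0
drop ndup
list pid PE RE, sepby(pid)
```

If instead RE should flag the whole patient whenever any of their PE values is duplicated, an extra step like bysort pid: egen byte RE2 = max(RE) would propagate the flag.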

Added a command, and all generated variables and data were lost.

rename __merge __merge1 /* rename previous merge commands in order to merge tax+financial data with hotel groups data*/
variable __merge not found
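A hedged note in code form: merge creates its tracking variable with a single leading underscore, _merge, not __merge, which would explain the "not found" error:

```stata
* merge's result variable has one leading underscore by default
rename _merge _merge1
```

Alternatively, merge's generate() option can name the tracking variable up front, avoiding the rename entirely.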

Please Help with Code

Hi All, I am trying to make a variable that identifies whether a person's second-to-last work history entry was employment (as opposed to being unemployed or abroad). The problem I am running into is that I do not have a traditional panel data set: each person's work history entries are classified by different start dates over a period of 30 years, and I want to identify each person's most recent and second most recent work history entries using the L. operator.

So far, I have sorted my data by worker id (fwid) and work history entry start date, and I have tried to generate a variable that identifies the work history entry number by generating a running sum of ones (called file_num) for each individual (after sorting the data). My idea was to use the xtset command with the worker's id as the panel variable and file_num as the time variable, then identify the max value of file_num and use the L. operator to identify the second-to-last entry for each person.

The weird thing is that when I run this same exact set of code from start to finish, my variable of interest (separated_to_employ_v2) winds up with different summary statistics every time, and I cannot figure out why. Any help you can provide to resolve this issue would be greatly appreciated, as it is preventing me from replicating my regression results when I re-run the code.


use "C:\Users\Zach\Dropbox\H-2A\Generated Data Files\NAWS Workgrid with Main File Merged 1989-2018.dta", clear
rename *, lower
encode c06, gen(work_type)
gen abroad = work_type==1
gen farm_work = work_type==2
gen non_farm_work = work_type==3
gen non_employed = work_type==4
gen file_num = .
gen ones = 1
sort fwid start_date
by fwid: replace file_num = sum(ones)
xtset fwid file_num
gen separated_to_employv2 = (l.farm_work==1 | l.non_farm_work==1) & l.end_date<start_date
sum separated_to_employv2, d

Here are the summary stats for the variable "separated_to_employ_v2" from two separate runs of the code from start to finish...note the difference in the means.

[attached images]
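The nondeterminism most likely comes from sort, which breaks ties in arbitrary (randomized) order by default: workers with identical start dates get a different file_num ordering on every run. A stable sort makes the ordering reproducible; a sketch of the relevant lines:

```stata
* stable sort keeps tied observations in their current order on every run
sort fwid start_date, stable
by fwid: gen file_num = _n
xtset fwid file_num
```

An even better fix, where possible, is to add an explicit tie-breaking variable (e.g. an original row id) to the sort key, so the ordering is fully determined by the data rather than by the current row order.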

Thursday, February 24, 2022

Invalid name in foreach loop

Hi all,

I'm trying to iterate over two variables, with the names "category" and "month_created". The code is below. I'm getting the r(198) invalid name error and I'm not sure why. I'm a newbie to Stata and I've looked over all the documentation for foreach and local macros, but I can't find anything wrong. Any help would be appreciated!

foreach var of varlist category month_created {
tabulate 'var'
}
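The quoting is the likely culprit: local macros are referenced with a left backtick and a right single quote, not two straight quotes, so 'var' is parsed as a literal (invalid) name. A sketch of the corrected loop:

```stata
foreach var of varlist category month_created {
    tabulate `var'
}
```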

Merging two data sets based on ranges and assign a number

I would like to merge two data sets where the first dataset contains probabilities. The second data set contains a column indicating the number of weeks and two more columns showing a minimum and maximum range. The probability in dataset 1 should be assigned a "weeks" value from dataset 2 based on the range in which it falls. There can be more than one "weeks" value assigned to each row. For example, in row 3, there would be a "weeks" value assigned to each 0.72, 0.79, 0.84, 0.43, and 0.33. Additionally, "weeks" values are only assigned to numbers following a "." or numbers preceding "." when there are no values following ".". For instance, in row 1, a "weeks" value would only be assigned to 0.12 and in row 4 a "weeks" value would only be assigned to 0.71.

I would appreciate any assistance with this.

Thanks,
A

Dataset 1
V1 V2 V3 V4 V5 V6
0.61 0.29 . . . 0.12
0.52 0.63 . . . 0.11
0.72 0.79 0.84 0.43 0.33 .
0.57 0.37 0.22 . 0.71 .
Dataset 2
Weeks Min Max
1 0 0.3
2 0.3 0.6
3 0.6 1
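One sketch of the mechanics, assuming dataset 1 gets a row id and the V* columns hold the probabilities: reshape to long form, pair every probability with every week band via cross, and keep the in-range pairs. Filenames here are hypothetical:

```stata
* dataset 1 in memory
gen row = _n
reshape long V, i(row) j(col)
drop if missing(V)
cross using "dataset2.dta"        // every probability x every week band
keep if V > Min & V <= Max        // assumes bands are half-open: (Min, Max]
rename Weeks weeks
```

The special rule about only assigning weeks to certain positions (values after the ".", or before it when nothing follows) would need one extra filter on col before the cross; how to encode it depends on exactly how those positions are defined.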

Negative statistic chi2 after hausman test

Hello everyone,
I have panel data and my dependent variable is a dummy variable. N = 801 (individuals) and T = 8 (periods).
I am running an xtlogit regression in Stata. In order to choose between the fixed-effects model and the random-effects model, I ran a Hausman test, but I got a negative chi2 statistic. Can someone help me?

Thanks

Stata vs Mplus for LPA?

Dear researchers,

I am trying to conduct Latent Profile Analysis (LPA) but wonder whether conducting it with Stata will get me into any trouble.
Mplus is the software most commonly used for LPA, but I wonder if it is okay to run it with Stata 15.
Do you know what strengths Mplus has for LPA over Stata?
I appreciate your comments in advance.

How should I fix this error in my scatter plot?

Hello,

I've been trying to make a more advanced scatter plot and have managed to figure out the bits and pieces of it. Essentially, I want the burden of disease on the y axis and GDP on the x axis, with horizontal and vertical lines going through their averages. I also want to weight the circles by population and have the country's names show up. However, I am having some issues when I put my individual codes together.

(I have added a photo of the graph I would like to produce)


One of my issues has been this error:

country: may not use time-series operators on string variables

Would someone be able to give me insight into how to get past this error?

Also, would somebody have tips on the order so that the graph can look like my drawing? This is my current code:

* Merged Data Set using "Combine Datasets" Button
* Generate log of both X & Y Variables
generate log_dalys = log(dalys)
* Find Means of both X & Y Variables
summarize log_dalys        // mean = 8.868019
summarize log_nhexp_reh    // mean = 16.7221
* Making Scatter Plot
scatter log_dalys log_nhexp_reh mlabel(country) msymbol(circle_hollow) [w=pop] || lfit dalys nhexp_reh || yline(8.868019) xline(16.7221)

My data set looks similar to this:
USA 2016 100 100 1 150 2
USA 2017 200 150 2 150 2
USA 2018 300 200 3 150 2
Korea 2016 100 100 1 150 2
Korea 2017 200 150 2 150 2
Korea 2018 300 200 3 150 2
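The error message arises because the options are not separated from the variable list: without a comma, Stata tries to parse mlabel(country) as a variable with a time-series operator, and country is a string. A hedged rewrite of the plot command (the weight goes right after the variable list; everything else is an option after the comma):

Code:
generate log_nhexp_reh = log(nhexp_reh)
scatter log_dalys log_nhexp_reh [w=pop], mlabel(country) msymbol(circle_hollow) ///
    || lfit log_dalys log_nhexp_reh ///
    ||, yline(8.868019) xline(16.7221)

Note that the lfit call should use the same logged variables as the scatter, and the yline()/xline() values are the means reported by summarize.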

Controlling for differences caused by occupation

Hello - I am new to the world of econometrics and am working on a small research paper. I am trying to identify the difference in hourly wages between second-generation Hispanic and Asian Americans.

A big thing that I've identified through my research is that Asian Americans and Hispanic Americans tend to enter different occupations, so for my analysis I think it's important to add occupation as a control. I am unsure of how to do this without listing out a dummy variable for each of the 25 occupation groups that I've put together. I came across the idea of adding i.occupation in my regression... is this the way to go?
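Factor-variable notation does exactly this: i.occupation expands into a full set of occupation dummies, omitting one group as the base, without creating them by hand. A minimal sketch with hypothetical variable names:

Code:
* hourly-wage regression with occupation controls;
* i.occupation generates a dummy for each of the 25 groups,
* omitting one as the base category
regress ln_hourly_wage i.ethnic_group years_educ experience i.occupation, vce(robust)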

Wednesday, February 23, 2022

Using the character ehat_t in the y-axis

Dear all,

I hope you are all keeping well!

I want to use the character ê (e-hat) with a t subscript on my y-axis.

So I tried adding the following option to my code:

ytitle(e{&and}{subscript:t})

But I cannot get it right. Any ideas?

Many many thanks in advance!!
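Since Stata 14, graphs accept Unicode, so the simplest route may be to type the accented character directly and let SMCL handle the subscript. A hedged sketch (y and x are hypothetical variables):

Code:
twoway scatter y x, ytitle("ê{subscript:t}")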

geonear r(459) error message advice

Hi All,

I am trying to calculate the distance of disaster occurrences from specific place names within a 50km radius.

The code that I am using is:

geonear place_name latitude_afro longitude_afro using "GDIS Geonear Test.dta", n(disastertype latitude longitude) ignoreself long within(50)

and it is coming up with the following error: latitude_afro or longitude_afro not constant within place_name group r(459);

I have tried two different variables at the start of the command, place_name and merge_location, and each produced the same error.

Any advice would be much appreciated.

Cheers,
Rob
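The r(459) error means geonear found base groups whose coordinates are not unique: it needs exactly one latitude/longitude pair per place_name. A sketch for locating the offending groups and then keeping one coordinate pair per place, assuming the variable names above:

Code:
* flag groups whose coordinates vary
bysort place_name (latitude_afro longitude_afro): gen byte varies = ///
    latitude_afro[1] != latitude_afro[_N] | longitude_afro[1] != longitude_afro[_N]
tab place_name if varies

* one option: keep a single coordinate pair per place before calling geonear
bysort place_name (latitude_afro longitude_afro): keep if _n == 1

Whether dropping duplicates or averaging the coordinates is appropriate depends on why they differ within a group in the first place.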

Different lagged variables

Hello, I am trying to generate a variable based on the lagged values of other variables, where the time variable is actually the individual's work-history entry number in the data set rather than a genuine time variable. Individuals are identified by the variable fwid, and most have multiple work-history entries (some only one). Each entry has its own start date and is one of four types: farm work, non-farm work, non-employed, or abroad.

I am trying to generate a variable that orders the entries by start date within each individual, so that I can use the L. operator to pull values from the previous work-history entry. My goal is to identify each person's most recent entry and determine whether their second most recent entry was employment (either farm work or non-farm work). So far, I sorted the data by worker id and start date and built a time variable, file_num, as the running sum of a column of ones within each worker.

However, every time I run this code I get a different mean for the variable of interest (separated_to_employv2), so my regression coefficients are always slightly different, although they are usually very close to each other. The summary statistics for two separate runs of the exact same code are shown below; note the different means. Can someone tell me what I am doing wrong? Any help would be greatly appreciated. Thanks in advance.

use "C:\Users\Zach\Dropbox\H-2A\Generated Data Files\NAWS Workgrid with Main File Merged 1989-2018.dta", clear
rename *, lower
encode c06, gen(work_type)
gen abroad = work_type==1
gen farm_work = work_type==2
gen non_farm_work = work_type==3
gen non_employed = work_type==4
gen start_date = c09a
gen end_date = c09b
gen file_num = .
gen ones = 1
sort fwid start_date
by fwid: replace file_num = sum(ones)
xtset fwid file_num
gen separated_to_employv2 = (l.farm_work==1 | l.non_farm_work==1) & l.end_date<start_date
sum separated_to_employv2, d

[attached: summary statistics from two runs of the same code]
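For what it is worth, results that change across runs of identical code in Stata almost always trace back to an unstable sort: when two entries share the same fwid and start_date, sort fwid start_date leaves their relative order to chance, so file_num, and therefore every L. value, can differ between runs. A hedged sketch of one fix, adding a deterministic tie-breaker:

Code:
* capture the original row order before sorting, then break
* start-date ties with end_date and that original order
gen long row0 = _n
bysort fwid (start_date end_date row0): gen file_num2 = _n
xtset fwid file_num2

If ties are common, it is worth deciding substantively which entry should count as earlier rather than relying on an arbitrary tie-breaker.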

Plotting nonlinear predicted probabilities from a multivariate model against a continuous variable

Apologies if this has already been answered, but I have been unable to find a clear answer and am not exactly sure what to search for. I would like to plot nonlinear predicted probabilities from a multivariate model against a continuous variable included in the model, at two different levels of an indicator variable. I estimate predicted probabilities of students passing an exam. Here's the model I'm fitting:

Code:
 areg pass i.low_income std_prior_test_score std_prior_test_score_sq std_prior_test_score_cubed other_covs, absorb(fixedeffect)
Here pass is a binary dummy equal to 1 if a student passed the exam, low_income is a binary dummy equal to 1 if the student is classified as low-income, std_prior_test_score is the student's standardized score on a prior exam, and std_prior_test_score_sq and std_prior_test_score_cubed are that score squared and cubed.

I want to plot the predicted probability of passing against std_prior_test_score at the means of the other covariates. I have played around with the lpoly and margins commands, but have run into issues. Here's what I've done with margins:
Code:
margins i.lowinc_max, at(std_prior_test_score_sq(-2.5(.1)1.5)) atmeans
marginsplot
and obtained the following:
[attached: marginsplot output]


Of course, I'm hoping to obtain the nonlinear relationship with the x-axis variable. Can I obtain these nonlinear predictions using the margins command? Or perhaps with a combination of predict and lpoly?

Eventually, I would like to plot the predictions from the following model as well:

Code:
 areg pass i.low_income i.low_income#c.std_prior_test_score i.low_income#c.std_prior_test_score_sq i.low_income#c.std_prior_test_score_cubed other_covs, absorb(fixedeffect)
I appreciate any and all input. I'm sure there exists an easy solution that I am just unfamiliar with. Thank you.
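margins can only trace the full polynomial if the model knows that the squared and cubed terms are functions of std_prior_test_score; with three separately generated variables, at() moves one term while the others stay fixed at their means. A hedged sketch using factor-variable notation instead (same variable names as above):

Code:
* c.x##c.x##c.x expands to x, x^2 and x^3, so margins can
* recompute all three terms as x moves
areg pass i.low_income c.std_prior_test_score##c.std_prior_test_score##c.std_prior_test_score ///
    other_covs, absorb(fixedeffect)
margins low_income, at(std_prior_test_score=(-2.5(0.25)1.5)) atmeans
marginsplot

The interacted model can be handled the same way by prefixing the polynomial with i.low_income## so margins plots a separate curve for each income group.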

Help manipulating data groups

Hello,
I am currently working with data about initial operations and subsequent readmissions, which are recorded as separate observations. The index procedures and subsequent readmissions are linked by a variable nrd_visitlink, which tracks all observations for a particular patient. I have a variable procname that stores the type of operation received at the index admission. I would like to add this information to all admissions for a patient, i.e. to the subsequent readmissions, where it is currently missing.


gen index_procname = ""
by nrd_visitlink, sort: gen pid = _n

gen procnamedummy = 0
replace procnamedummy = 1 if procname == "bpdds"
replace procnamedummy = 2 if procname == "roux"
replace procnamedummy = 3 if procname == "sleeve"

forvalues i=1/ ?? {
generate include = 1 if pid != `i' & procname != ""
by(nrd_visitlink): gen work = procnamedummy
replace index_procname = work if pid == `i'
drop include work
}

This is what I have so far but does not seem to be working. Anybody have any experience with anything similar? Thank you.
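A loop may not be needed here: if each nrd_visitlink group contains exactly one non-missing procname (the index admission), that value can be broadcast within the group directly. A hedged sketch:

Code:
* ascending string sort puts empty strings first, so the last
* observation in each group holds the non-missing procedure name
gen index_procname = ""
bysort nrd_visitlink (procname): replace index_procname = procname[_N]

If a patient could have more than one recorded procedure name, the sort key would need to incorporate the admission order (e.g. pid) so the index admission's value is the one chosen.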

technical inefficiency with determinants and marginal effects

I am doing research on tax efficiency. I would like to estimate tax effort (persistent and transient, including determinants) and marginal effects.

I would like to use the Kumbhakar et al. (2018) model explained in the article "Panel data stochastic frontier model with determinants of persistent and transient inefficiency".

I have "A Practitioner's Guide to Stochastic Frontier Analysis Using Stata" (2015), but the authors do not explain how to estimate persistent inefficiency with determinants, transient inefficiency with determinants, or the marginal effects. I attached the article to this message.

What Stata commands or packages could I use to estimate the persistent and transient components and the marginal effects?

Thanks, I really need your help.


Stata overrides the command for which level to omit in a regression

Hi Statalist,

I would like to seek help with a problem I encountered. I estimated a regression of the outcome variable IHS on the factor variable Chinese_event, choosing 7 as the base period, and Stata did use 7 as the base. However, when I added an interaction between Chinese_event and a continuous variable std_repub_share, again specifying 7 as the base period, Stata did not omit period 7. I also tried io7.Chinese_event; it still did not work. Please see the output below. What should I do? I appreciate any suggestions!

reg IHS ib7.Chinese_event ib7.Chinese_event#c.std_repub_share
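One hedged suggestion: let Stata construct the main effects and the interaction in a single factor-variable expression with ##, so the base-level specification is applied consistently to both parts:

Code:
reg IHS ib7.Chinese_event##c.std_repub_share

This expands to the Chinese_event dummies (with period 7 omitted), the main effect of std_repub_share, and the interaction terms, again with period 7 as the base.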


Tuesday, February 22, 2022

Calculating 95% CI of difference in AUC

After running two different logistic regressions with the predict command and then running roccomp, how do we calculate the 95% confidence interval for the difference in AUC? Stata only reports the p-value and the chi-squared statistic.
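roccomp itself does not report a confidence interval for the AUC difference, but one hedged way to obtain it is to bootstrap the difference directly. In this sketch y is the outcome and x1 x2 versus x1 x3 stand in for the two model specifications:

Code:
capture program drop aucdiff
program define aucdiff, rclass
    tempvar p1 p2
    // AUC of the first model
    logistic y x1 x2
    predict double `p1', pr
    roctab y `p1'
    local a1 = r(area)
    // AUC of the second model
    logistic y x1 x3
    predict double `p2', pr
    roctab y `p2'
    return scalar diff = `a1' - r(area)
end

bootstrap diff=r(diff), reps(1000) seed(12345): aucdiff
estat bootstrap, percentile

Refitting both models inside each replication lets the bootstrap capture the estimation uncertainty; bootstrapping fixed predicted probabilities would not.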



Accessing ado files (icalc) to save to a newer computer

Hi,

This is a basic question (for which I apologise), but it's tripping me up. I'm trying to access the "icalc" suite of commands as described in Kaufman (2019). The download website noted in the book (icalcrlk.com) no longer exists so using "net from..." doesn't work. So, I figured I would go back to an old computer and get it from there...this is where my problem arises. Following previous guidance found on Statalist, I type "adopath" into Stata, then try to locate the "Personal" and "Plus" folders listed, but they don't appear to exist on my computer. Clearly, "icalc" is there somewhere since I have used it in the past and can bring up the help file with "help icalc"; I just can't seem to find it.

If anyone could help me on this one, I would really appreciate it.

Thanks!

Owen Gallupe

Kaufman, R. L. (2019). Interaction Effects in Linear and Generalized Linear Models. SAGE Publications Inc. http://us.sagepub.com/en-us/nam/inte...els/book253602
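On the old computer, Stata can report the file's location directly, which avoids hunting for the PERSONAL and PLUS folders by hand. A sketch:

Code:
* print the full path of the installed command
which icalc
* or search every directory on the ado-path
findfile icalc.ado
* list the directories Stata maps to PERSONAL, PLUS, etc.
sysdir

Copying the reported icalc*.ado and help files into the new machine's PERSONAL directory (as shown by sysdir there) should make the suite available again.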


Margins for panel

Dear all,

I have a macro panel covering a long period (1945-2020) and a large number of countries, with the usual macro variables, dummies, categorical variables, and some custom quality indicators expressed as percentages; a data example with a shorter time span and fewer countries is below. According to my model, the indicators determine the macro variables. My main model is:

ΔY_i,t = (a - 1) + b1*y_i,t-1 + b2*T*y_i,t-1 + b3*T^2*y_i,t-1 + Z_i,t + D_i,t + e_i,t

where Y is the dependent variable of interest, Z a set of control variables, and D a set of dummy variables.

In short, I fit a model with polynomial terms, such as

y = a*x^2 + b*x + ...

so if the coefficient a is negative I have an inverted-U curve, and I can locate the maximum of y at x = -b/(2a). I try to achieve that with -margins-:

regress Dgdp2 cpi u industry dummy1 dummy2 dummy3 c.indicator1##c.indicator1, vce(robust)
margins, at(indicator1=(-16(1)48))
marginsplot

nlcom -_b[indicator1]/(2*_b[c.indicator1#c.indicator1])

However, the -regress- command with -vce(robust)- does not implement a panel estimation: it just pools the observations, treating them as cross-sectional. Running a separate regression for each id-year combination, each id, or each year is not an option either, so the xt- estimators seemed unavailable for my dataset, and I tried other panel commands without success. I am insisting on this because I want to be able to use -margins- to visualize the quadratic (inverted-U) relationship.

I would appreciate any help you can provide
Thank you in advance

Best.
Mario Ferri!

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(id ts) str97 country float(indicator1 cpi u gdp industry ka gw dummy1 dummy2 dummy3 Dgdp2)
1 1981 "Australia"       20.613514  9.487666  5.78  4.540717e+11 43.42793 .41687185   .20370182 0 0 0  146663.34
1 1982 "Australia"       20.613514  11.35182  7.16  4.691852e+11 43.09072 .41687185  -.15889087 0 0 0  151135.19
1 1983 "Australia"        6.500914  10.03891  9.96  4.587671e+11 41.44292 .41687185  .027918227 1 1 1 -104181.27
1 1984 "Australia"       3.3282194  3.960396  8.99  4.797846e+11 44.21151 .47684005   1.4786447 1 0 0  210174.94
1 1985 "Australia"             5.8  6.734694  8.26    5.0497e+11 46.74982  .8200954   -.4968416 0 0 0   251854.5
1 1986 "Australia"             5.8   9.05035  8.08  5.253583e+11 46.31632  .8800636  -.23686917 0 0 0  203883.16
1 1987 "Australia"        6.556594  8.533022  8.11  5.387611e+11 49.62373  .9400318   .18794964 1 0 0     134028
1 1988 "Australia"             7.5   7.21594  7.23  569704251392 53.17501         1    .4695499 0 0 0   309431.2
1 1989 "Australia"             7.5  7.533903  6.18  5.917015e+11 54.96535         1   .03558734 0 0 0  219972.23
1 1990 "Australia"       10.999176  7.333022  6.93  612831133696 56.63783         1   .06843487 1 1 1   211296.6
1 1991 "Australia"            12.2  3.176675  9.58  6.103932e+11 56.07949         1    1.616588 0 0 0  -24379.39
1 1992 "Australia"            12.2 1.0122311 10.73  6.129091e+11 56.31247         1      1.9267 0 0 0   25159.27
1 1993 "Australia"       17.553352 1.7536534 10.87  6.376088e+11 58.00952         1  -.55177194 1 1 1  246996.67
1 1994 "Australia"           19.11 1.9696348  9.72  6.630105e+11 60.82281         1  .007553967 0 0 0  254016.88
1 1995 "Australia"           19.11 4.6277666  8.47  6.884552e+11 62.44015         1   -.5324481 0 0 0   254446.8
1 1996 "Australia"       15.414823 2.6153846  8.51  7.151575e+11 64.93562  .9400318     .980418 1 1 1  267023.16
1 1997 "Australia"          14.538 .22488755  8.36  7.435245e+11 65.80802  .8800636   10.272155 0 0 0  283669.97
1 1998 "Australia"       17.287497  .8601346  7.68  7.775533e+11  67.9767  .8200954   -.7689896 1 0 0   340288.5
1 1999 "Australia"       28.634005 1.4831294  6.87  8.170032e+11 68.75488  .7601272   -.3804706 0 0 0   394499.2
1 2000 "Australia"          28.634  4.457435  6.28  8.491371e+11 72.54832   .700159   -.6822261 0 0 0  321338.75
2 1981 "Austria"                 2  6.803042  2.06  2.080323e+11 36.20454   .700159    -.200444 0 0 0  -3006.136
2 1982 "Austria"                 2  5.436031  3.35  212216348672 35.98919   .700159   .20921445 0 0 0   41840.48
2 1983 "Austria"         1.0946958 3.3391645  4.11  218525728768 36.29808   .700159    .5968089 1 1 1    63093.8
2 1984 "Austria"          .5089108  5.663186   3.8  2.186378e+11 38.19626   .700159  -.44796655 0 0 0  1120.5017
2 1985 "Austria"          .5089108  3.189517   3.6  224100843520 39.99393   .700159     .823105 0 0 0   54630.64
2 1986 "Austria"           .561659 1.7054446  3.12  2.292583e+11 40.40985   .700159    1.638252 0 0 0   51574.38
2 1987 "Austria"          12.24376 1.4019527  3.79  2.323697e+11 40.80097   .700159   .50470144 0 0 0    31114.2
2 1988 "Austria"         12.920382  1.915717  3.55  2.400283e+11 42.56515   .700159  -.23894365 0 0 0   76586.19
2 1989 "Austria"         12.920382 2.5683484  3.14  2.493584e+11 45.09982   .700159  -.26313263 0 0 0   93300.82
2 1990 "Austria"         12.619598  3.261872  3.25  260194631680 48.31841   .700159 -.008617969 1 0 0   108362.3
2 1991 "Austria"               5.1  3.337427  3.42  269149552640 49.15068  .7601272  .012651701 0 0 0   89549.21
2 1992 "Austria"               5.1  4.020848  3.59  274784272384 48.56378  .8200954  -.10854957 0 0 0    56347.2
2 1993 "Austria"               5.1  3.631785  4.25  276231847936  47.7724  .8800636   .03468369 0 0 0  14475.756
2 1994 "Austria"          5.248024 2.9534094  3.54  282867269632 49.68189  .9400318   .26526877 1 0 0   66354.22
2 1995 "Austria"          7.179246 2.2433662  4.35  290414133248 52.20742         1    .7674453 1 0 0   75468.63
2 1996 "Austria"         17.413164 1.8609712  5.28  2.972375e+11  52.7208         1   .16558754 0 0 0   68233.79
2 1997 "Austria"         17.166555 1.3059785  5.15  303460483072 56.07532         1   .13704391 0 0 0   62229.71
2 1998 "Austria"         17.155384  .9224672  5.52  314328678400  60.6142         1    .4728145 0 0 0  108681.95
2 1999 "Austria"          13.81384 .56899375   4.7  325507252224 64.24256         1    .5897323 1 0 0  111785.73
2 2000 "Austria"          3.427717  2.344863  4.69  3.364955e+11 70.08217         1   -.7848026 0 0 0  109882.24
3 1981 "France"          1.6382614 13.314405  7.54 1.4984657e+12 79.29216 .41687185   -.9999912 1 0 0   158494.9
3 1982 "France"          -4.953055 11.978472   8.2 1.5360083e+12 78.65401 .16434518    .0278443 0 0 0   375425.6
3 1983 "France"          -4.953055  9.459548  7.92  1.555068e+12 78.70774 .41687185   .17290956 0 0 0  190597.03
3 1984 "France"         -4.7091045  7.673803  9.53 1.5786074e+12 80.06466 .41687185   .13435355 0 0 0  235394.83
3 1985 "France"               -4.4    5.8311 10.26  1.604225e+12 80.26139 .41687185   .34410325 0 0 0  256173.67
3 1986 "France"           17.15525  2.538526 10.23   1.64172e+12 82.29907 .41687185   305361.44 1 1 1   374951.1
3 1987 "France"          23.033955  3.288898 10.74 1.6837792e+12 83.73698 .41687185  -.09669671 0 0 0     420593
3 1988 "France"           9.400383  2.700815 10.18 1.7636432e+12 86.72044 .41687185   .26424965 1 0 0   798640.1
3 1989 "France"                1.6  3.498302  9.62 1.8402534e+12 89.72697 .41687185   -.2393705 0 0 0   766101.4
3 1990 "France"                1.6 3.1942835  9.36  1.894061e+12 91.03416 .47684005  -.08574203 0 0 0  538078.06
3 1991 "France"                1.6  3.213407  9.13 1.9139143e+12 90.95167 .53680825    -.999992 0 0 0  198530.83
3 1992 "France"                1.6 2.3637605 10.21 1.9445244e+12 90.21667  .5967765       .4548 0 0 0  306101.63
3 1993 "France"           6.996447 2.1044629 11.32    1.9323e+12    86.85  .9400318  .022043044 1 1 1  -122245.6
3 1994 "France"            8.69136 1.6555153 12.59   1.97787e+12 90.03333         1    .2997584 0 0 0     455702
3 1995 "France"            8.69136 1.7964814 11.84 2.0195377e+12 92.69833         1   .05559472 0 0 0   416676.6
3 1996 "France"            8.69136 1.9828837 12.37 2.0480737e+12    93.46         1   118525.52 0 0 0  285359.47
3 1997 "France"           5.041234  1.203943 12.57  2.095923e+12 97.54333         1    .4296218 1 1 1   478491.4
3 1998 "France"           2.364474  .6511269 12.07 2.1711383e+12 101.5933         1    .8784832 0 0 0   752155.4
3 1999 "France"           2.364474  .5371416 11.98  2.245421e+12  104.295         1    .1964599 0 0 0   742825.7
3 2000 "France"           2.364474   1.67596 10.22 2.3335238e+12 108.6008         1   -.9999971 0 0 0   881029.3
4 1981 "United Kingdom"       18.6 11.876627  10.4 1.2182697e+12 77.52663  .8800636    .3102416 0 0 0  -96729.83
4 1982 "United Kingdom"       18.6  8.598864  10.9 1.2425728e+12  77.7118  .9400318     .230822 0 0 0   243031.1
4 1983 "United Kingdom"  18.766483 4.6093035 11.09 1.2950324e+12 79.73109         1    .6651978 1 0 0     524596
4 1984 "United Kingdom"       18.9  4.960711  10.9  1.324418e+12 80.24253         1  -.12341004 0 0 0  293856.88
4 1985 "United Kingdom"       18.9  6.071394 11.49 1.3793472e+12 84.36929         1  -.16183244 0 0 0   549291.3
4 1986 "United Kingdom"       18.9 3.4276094 11.51 1.4228014e+12 86.21222         1   1.1283993 0 0 0   434541.7
4 1987 "United Kingdom"  17.745354 4.1489224 11.02 1.4995292e+12 89.77464         1 -.015180643 1 0 0   767278.4
4 1988 "United Kingdom"     16.809 4.1553516  9.01 1.5854885e+12 94.41283         1   .18605655 0 0 0   859592.4
4 1989 "United Kingdom"     16.809  5.760249  7.41  1.626356e+12 96.38803         1  -.27347788 0 0 0   408675.9
4 1990 "United Kingdom"     16.809  8.063461  6.97 1.6382895e+12 96.41448         1    83413.91 0 0 0   119334.5
4 1991 "United Kingdom"     16.809  7.461783  8.55 1.6202173e+12  93.1948         1   .09337864 0 0 0  -180722.1
4 1992 "United Kingdom"  13.403038 4.5915494  9.78 1.6267156e+12 93.60826         1    .6429237 1 1 1   64982.88
4 1993 "United Kingdom"       12.1  2.558578 10.35  1.667218e+12 95.62276         1   -.9999845 0 0 0   405025.6
4 1994 "United Kingdom"       12.1 2.2190125  9.65 1.7313395e+12  100.681         1   123614.65 0 0 0   641213.4
4 1995 "United Kingdom"       12.1  2.697495  8.69 1.7751713e+12 102.4492         1  -.13095939 0 0 0   438317.9
4 1996 "United Kingdom"       12.1  2.851782  8.19 1.8194017e+12 103.8479         1 -.033561174 0 0 0   442303.8
4 1997 "United Kingdom"   5.227907  2.201143  7.07 1.9099222e+12 106.6101         1    .4296572 1 1 1   905205.5
4 1998 "United Kingdom"      1.806 1.8205616   6.2 1.9807378e+12 107.4667         1   .27417862 0 0 0   708155.8
4 1999 "United Kingdom"      1.806 1.7529508  6.04  2.046007e+12  108.652         1  .024280345 0 0 0   652692.7
4 2000 "United Kingdom"      1.806 1.1829562  5.56 2.1177446e+12 110.4949         1    .3995405 0 0 0   717375.4
5 1981 "United States"    7.720604 10.334715   7.6  6.661146e+12 50.72995         1    .4146612 0 0 0  1648576.4
5 1982 "United States"         8.3  6.131427   9.7  6.541054e+12 48.11071         1    .6862257 1 0 0 -1200923.6
5 1983 "United States"         8.3  3.212435   9.6  6.840891e+12 49.41756         1    .9804647 0 0 0  2998371.5
5 1984 "United States"         8.3 4.3005357   7.5   7.33594e+12 53.80223         1   -.1878409 1 0 0    4950495
5 1985 "United States"    16.16676  3.545644   7.2  7.641824e+12 54.46311         1   -.8751099 0 0 0  3058832.5
5 1986 "United States"        16.6 1.8980477     7  7.906433e+12 55.01276         1   17.963144 1 0 0    2646097
5 1987 "United States"        16.6  3.664563   6.2  8.179962e+12 57.87624         1     -.44877 0 0 0    2735289
5 1988 "United States"        16.6  4.077741   5.5  8.521643e+12 60.88547         1  -.05447189 1 0 0    3416806
5 1989 "United States"   14.988736  4.827003   5.3  8.834614e+12 61.43758         1   -.1320547 0 0 0    3129706
5 1990 "United States"        14.9  5.397956   5.6  9.001231e+12 62.05701         1  -.08746292 1 0 0  1666176.8
5 1991 "United States"        14.9  4.234964   6.8  8.991487e+12 61.15007         1    .2589748 0 0 0  -97444.16
5 1992 "United States"        14.9 3.0288196   7.5  9.308206e+12 62.92408         1   13.807285 1 0 0    3167192
5 1993 "United States"    .0375137  2.951657   6.9  9.564447e+12 64.99349         1   .04719354 0 0 0    2562405
5 1994 "United States"       -.781  2.607442  6.12  9.949782e+12 68.42391         1   .17382646 1 0 0  3853359.5
5 1995 "United States"       -.781   2.80542  5.65 1.0216863e+13 71.58787         1 -.032830253 0 0 0    2670807
5 1996 "United States"       -.781  2.931204  5.45 1.0602295e+13 74.84686         1  -.02388202 1 0 0    3854314
5 1997 "United States"   1.3458682 2.3376899     5 1.1073802e+13 80.20259         1   .29131374 0 0 0  4715079.5
5 1998 "United States"       1.463  1.552279  4.51 1.1570064e+13 84.89986         1   .55129653 1 0 0  4962616.5
5 1999 "United States"       1.463 2.1880271  4.22 1.2120017e+13 88.64145         1   -.9274685 0 0 0    5499530
5 2000 "United States"       1.463  3.376857  3.99  1.262027e+13 92.05371         1    5.664342 1 0 0    5002515
end
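For what it is worth, -margins- and -marginsplot- are fully supported after -xtreg-, so the quadratic can be fit as a proper panel model rather than a pooled regression. A hedged sketch using the variables above:

Code:
xtset id ts
xtreg Dgdp2 cpi u industry dummy1 dummy2 dummy3 ///
    c.indicator1##c.indicator1, fe vce(robust)
margins, at(indicator1=(-16(4)48))
marginsplot
* turning point of the quadratic
nlcom -_b[indicator1]/(2*_b[c.indicator1#c.indicator1])

With fe, the plotted predictions hold the fixed effects at zero; the shape of the quadratic in indicator1, which is the quantity of interest here, is unaffected by that choice.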