Saturday, April 30, 2022

Plotting event study by month


Hi,

I am trying to run and plot an event study, but my regression is not reporting any statistics other than the coefficients and I am unsure why. I've posted the code below and have attached an image of the regression result:

Code:
format date %td
gen metal_type=0
replace metal_type=1 if steel_prices
label var metal_type "Type of metal"
reg metal_price metal_type metal_type#i.date i.date
Also, in trying to plot the regression, the labels read "Steel#12815", but I want them to read as dates such as "01feb1995", and I am unsure how to specify this in the coefplot command.
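For what it's worth, coefficients reported without any other statistics usually mean the model has no residual degrees of freedom; metal_type#i.date together with i.date can saturate the data. For the labels, a hedged sketch builds them from the date values and passes them to coefplot's coeflabels():

Code:
levelsof date, local(dates)
local labs
foreach d of local dates {
    local labs `labs' 1.metal_type#`d'.date = "`: display %td `d''"
}
coefplot, keep(1.metal_type#*.date) coeflabels(`labs')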

Thanks!

Getting Predicted Values for Regressions

Hi all,

I have a dataset of about 800 observations with 19 variables. One of the variables is the dependent variable, while the others are independent (a mix of continuous, factor, and dummy variables). I found all possible combinations for my regression model, which comes to 1024 regression models. Now I am trying to figure out how to get the predicted y values (t_05, t_50, t_95, t_mean (0.5*mse), mmult (matrix multiplier), predicted mean, predicted t_05, predicted t_50, and predicted t_95) for each of the regression models, but I am unsure how to go about it and how to create a loop that would perform this task.
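One hedged route: the community-contributed tuples command (SSC) enumerates the subsets of a variable list, and predict can then store fitted values per model. y and x1-x10 are placeholder names; the quantile and MSE-based statistics described above could be computed after each regress call from e(rmse) and the stored predictions.

Code:
* ssc install tuples
tuples x1-x10
forvalues i = 1/`ntuples' {
    quietly regress y `tuple`i''
    predict double yhat`i', xb     // predicted y for model `i'
}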

All help would be greatly appreciated! Thanks so much.

"no observations" error with the code

Code:
asdoc cor gini_k decile incdegini eta agesquared  ncomp married self_employed smallcity disaving  debt_growthrate eyeah , label nonum replace
Once I execute the code, Stata informs me that there are no observations. What should I do to rectify this?
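For what it's worth, correlate (and hence asdoc cor) drops any observation with a missing value in any listed variable, so "no observations" typically means no observation is complete on all of them. A quick diagnostic, with pwcorr (pairwise deletion) as a possible fallback:

Code:
misstable summarize gini_k decile incdegini eta agesquared ncomp married ///
    self_employed smallcity disaving debt_growthrate eyeah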

Identify a segment of a string

Hi,

I want to assign a binary code to each column, 1 if it contains information from the first row and 0 otherwise.

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str33 WHODISCLOSURE2 str31 WHODISCLOSURE3 str26 WHODISCLOSURE4 str24 WHODISCLOSURE5
"Securities regulator - Encourages" "Securities regulator - Requires" "Corporate law - Encourages" "Corporate law - Requires"
""                                  "Requires"                        ""                           ""                        
""                                  ""                                ""                           ""                        
"Encourages"                        ""                                ""                           ""                        
""                                  "Requires"                        ""                           ""                        
"Encourages"                        ""                                ""                           ""                        
"Encourages"                        "Requires"                        ""                           ""                        
""                                  "Requires"                        ""                           "Requires"                
"Encourages"                        ""                                ""                           "Requires"                
""                                  "Requires"                        ""                           ""                        
""                                  ""                                ""                           ""                        
""                                  "Requires"                        "Encourages"                 ""                        
"Encourages"                        ""                                ""                           "Requires"                
"Encourages"                        ""                                ""                           ""                        
""                                  "Requires"                        ""                           ""                        
""                                  ""                                ""                           ""                        
"Encourages"                        ""                                ""                           "Requires"                
""                                  "Requires"                        "Encourages"                 ""                        
"Encourages"                        ""                                ""                           ""                        
""                                  "Requires"                        ""                           ""                        
""                                  ""                                ""                           ""                        
"Encourages"                        ""                                ""                           ""                        
""                                  "Requires"                        "Encourages"                 "Requires"                
""                                  ""                                ""                           ""                        
"Encourages"                        "Requires"                        ""                           ""                        
""                                  ""                                ""                           ""                        
""                                  "Requires"                        ""                           ""                        
""                                  ""                                ""                           ""                        
""                                  "Requires"                        ""                           "Requires"                
""                                  ""                                "Encourages"                 ""                        
""                                  "Requires"                        ""                           "Requires"                
"Encourages"                        "Requires"                        ""                           "Requires"                
""                                  "Requires"                        "Encourages"                 ""                        
""                                  ""                                ""                           ""                        
""                                  "Requires"                        ""                           ""                        
"Encourages"                        ""                                ""                           "Requires"                
""                                  "Requires"                        ""                           ""                        
"Encourages"                        "Requires"                        ""                           ""                        
""                                  "Requires"                        "Encourages"                 ""                        
""                                  ""                                ""                           ""                        
"Encourages"                        "Requires"                        ""                           ""                        
""                                  "Requires"                        ""                           "Requires"                
""                                  ""                                ""                           ""                        
""                                  ""                                ""                           ""                        
"Encourages"                        ""                                ""                           ""                        
""                                  "Requires"                        ""                           ""                        
"Encourages"                        ""                                "Encourages"                 ""                        
"Encourages"                        ""                                "Encourages"                 ""                        
""                                  ""                                ""                           ""                        
""                                  ""                                ""                           ""                        
""                                  ""                                ""                           ""                        
""                                  ""                                ""                           ""                        
"Encourages"                        ""                                "Encourages"                 ""                        
end
I have tried:

Code:
local N = _N
forvalues i = 2/`N' {
    foreach j of varlist WHODISCLOSURE* {
        local m1 = `j'[1]
        if `j'[`i'] == "`m1'" {
            replace `j' = "1" in `i' if `j' != ""
        }
        replace `j' = "0" in `i' if `j' == ""
    }
}
However, it doesn't work because the text doesn't exactly match what's in the first row. Does anyone know how to flag cells based on an inexact (substring) match?
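One hedged possibility: since the later cells ("Encourages", "Requires") are substrings of the first-row text, strpos() can test containment instead of exact equality; new flag variables avoid overwriting the originals.

Code:
* 1 if the cell's text appears inside the first-row header, 0 otherwise.
foreach j of varlist WHODISCLOSURE* {
    local m1 = `j'[1]
    gen byte `j'_flag = (`j' != "" & strpos(`"`m1'"', `j') > 0) if _n > 1
}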
Many thanks in advance!

Getting away from "command ab is unrecognized"

Hi, my question is simple: I get "command ab is unrecognized". How can I get away from this?
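A hedged pointer: "unrecognized" usually means the command is community-contributed and not installed on your machine; Stata's search can often locate it, assuming "ab" really is the command you meant to type.

Code:
search ab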

Including a ratio coefficient of other coefficients in the regression

Dear All,
I have been trying to run an analysis using the paper of Acemoglu et al. (2008) on the association between democracy and income. One of the quantities in their regression output is L1.gdppercapita / (1 - L1.democracy).
That is, it takes the coefficient on GDP per capita at t-1 divided by one minus the coefficient on lagged democracy status. I tried to generate new variables, one for 1 - L.democracy and one for the result of the division, unfortunately with no success, as Stata includes the new variable in the regression rather than computing the ratio between the estimated coefficients. How can I run this, given that it is a ratio between coefficients?
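For what it's worth, a ratio of estimated coefficients is usually obtained post-estimation with nlcom rather than by generating variables; a sketch, with variable names guessed from the post:

Code:
* Run the model first, then form the nonlinear combination of coefficients.
nlcom _b[L.gdppercapita] / (1 - _b[L.democracy])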

Additionally, I have been trying to include in my regression output (using outreg2) additional rows for robustness checks, such as the Hansen J test and the AR(2) test, in addition to the regular regression results. Does anyone know how this could be done?
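If the model is estimated with xtabond2 (an assumption), the test results live in e(), and outreg2's addstat() can append them as extra rows; check ereturn list for the exact scalar names.

Code:
* Append Hansen J and AR(2) p-values as extra rows in the table.
outreg2 using results.doc, replace ///
    addstat("Hansen J p-value", e(hansenp), "AR(2) p-value", e(ar2p))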

Thank you in advance.
Regards

Friday, April 29, 2022

meta-analysis - Can someone please help me with this meta-analysis


*Data is about: The effect of Maternal Lipid Levels on Pregnant Women Without Complication in Developing Risk of Large for Gestational Age Newborn

* Large for gestational age (LGA): refers to a fetus or infant who is larger than expected for their age and gender.

* Import the stata data file "REVMAN.dta"

*Read Description:
* The data provide the mean difference and standard error of each sample being compared, but do not provide the effect size and standard error for the comparisons!
* An alternative is to use Cohen's d, which is the standardized mean difference.
* Hint: use metan with the N, mean, and SD of the LGA and nLGA groups!
* Type "help metan" and go to "Meta-analysis of two-group comparison of continuous outcomes, using the sample size, mean and standard deviation in the treatment and control groups"
* metan with Cohen's d produces the effect size and standard error to be used in the later analysis.
* LGA is the group affected by LGA and nLGA is the group not affected by LGA
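A hedged sketch of the hinted metan call, assuming six variables holding the sample size, mean, and SD of the LGA and nLGA groups (hypothetical names); cohen requests Cohen's d, the standardized mean difference:

Code:
use REVMAN.dta, clear
metan n_lga mean_lga sd_lga n_nlga mean_nlga sd_nlga, cohen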

** Important ----> report up to 2 decimals and do not round up or down <----

ANSWER THESE QUESTIONS BASED ON THE "REVMAN.dta" file

Q1. What is the overall inverse-variance (IV) estimate?

Q2. What is the exact name of the study with highest SMD?

Q3. Use the estimated ES and seES from the previous questions and set them as ES and SE then
perform meta regress on Weight. What is the coefficient of Weight?

Q4. Is this effect statistically significant?

Log Transformation

Hello,

Should you ever log-transform ordinal data? Or can you only log-transform continuous variables?

Best,
Tess

Understand group differences with Likert item dependent and independent variable

Hi,

I'd like to understand what may drive differences in my dependent variable between two sub-populations (wealthy and poor for example) and was thinking of applying something similar to the Blinder-Oaxaca decomposition.

However, my dependent variable is a Likert item, and I may also be interested in including Likert items among the other independent variables (education, income, etc.). I am not sure whether this would be the right approach. Any suggestions would be greatly appreciated.
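For reference, the community-contributed oaxaca command (Ben Jann, SSC) implements the Blinder-Oaxaca decomposition. Whether it is defensible with Likert items is the substantive question, but mechanically it would look like this, with placeholder names:

Code:
ssc install oaxaca
oaxaca satisfaction educ income, by(wealth_group)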

Thank you

Reference Country (for industries)

Dear all,

Thank you so much for all your help!
In the data below, I want to compare value added per worker for a set of three industries (tech_intensity) using the US (country1==840) as the reference country. For instance, I want to obtain the ratio of value added per worker (r_valworker) in industry tech_intensity==1 for Bolivia (country1==68) over the value added per worker in the same industry for the US (country1==840). How can I build this variable for all countries using these industry classifications, and hence obtain three variables, one per industry type, with the US always as the reference country in the denominator? (A sketch follows the data example.) Thank you so much again!

Code:
 * Example generated by -dataex-. To install: ssc install dataex
clear
input int(country1 year) float tech_intensity double(Establishments Employment Wages OutputINDSTAT4 ValueAdded GrossFixed) float(r_valworker r_output_worker lval_per_worker share_emp_high share_emp_low)
4 1973 3                  .               51.8                  .                  . . . .         . . .015424144  .8860162
4 1973 2                  .                331                  .                  . . . .         . . .015424144  .8860162
4 1973 1                  . 2975.5714285714284                  .                  . . . .         . . .015424144  .8860162
4 1974 2                  .              332.4                  .                  . . . .         . .   .1526138  .7700793
4 1974 1                  . 3311.1428571428573                  .                  . . . .         . .   .1526138  .7700793
4 1974 3                  .              656.2                  .                  . . . .         . .   .1526138  .7700793
4 1975 2                  .              430.6                  .                  . . . .         . .    .158998  .7563943
4 1975 3                  .              809.2                  .                  . . . .         . .    .158998  .7563943
4 1975 1                  . 3849.5714285714284                  .                  . . . .         . .    .158998  .7563943
4 1976 2                  .              383.6                  .                  . . . .         . .  .16229185  .7682115
4 1976 1                  .  4240.285714285715                  .                  . . . .         . .  .16229185  .7682115
4 1976 3                  .              895.8                  .                  . . . .         . .  .16229185  .7682115
4 1977 1                  .  4536.571428571428                  .                  . . . .         . .   .1601448   .770586
4 1977 3                  .              942.8                  .                  . . . .         . .   .1601448   .770586
4 1977 2                  .              407.8                  .                  . . . .         . .   .1601448   .770586
4 1978 1                  .  4949.857142857143                  .                  . . . .         . .  .15292776   .768342
4 1978 3                  .              985.2                  .                  . . . .         . .  .15292776   .768342
4 1978 2                  .              507.2                  .                  . . . .         . .  .15292776   .768342
4 1979 3                  .              875.4                  .                  . . . .         . .  .13531454  .7813079
4 1979 1                  .  5054.571428571428                  .                  . . . .         . .  .13531454  .7813079
4 1979 2                  .              539.4                  .                  . . . .         . .  .13531454  .7813079
4 1980 2                  .                515                  .                  . . . .         . .  .14345832  .7692531
4 1980 1                  .  4538.571428571428                  .                  . . . .         . .  .14345832  .7692531
4 1980 3                  .              846.4                  .                  . . . .         . .  .14345832  .7692531
4 1981 3                  1                571                  .                  . . . .         . .  .10808428  .8097262
4 1981 2                8.2              434.2                  .                  . . . .         . .  .10808428  .8097262
4 1981 1 23.571428571428573  4277.714285714285                  .                  . . . .         . .  .10808428  .8097262
4 1982 3                  1              535.6                  .                  . . . .         . .  .11449337  .7896537
4 1982 2                7.4              448.4                  .                  . . . .         . .  .11449337  .7896537
4 1982 1 25.285714285714285               3694                  .                  . . . .         . .  .11449337  .7896537
4 1983 3                1.8              930.2                  .                  . . . .         . .  .21483813  .7188766
4 1983 1 24.285714285714285 3112.5714285714284                  .                  . . . .         . .  .21483813  .7188766
4 1983 2                  8                287                  .                  . . . .         . .  .21483813  .7188766
4 1984 2                  8              270.8                  .                  . . . .         . .  .20192805  .7342654
4 1984 3                1.8                857                  .                  . . . .         . .  .20192805  .7342654
4 1984 1 28.285714285714285  3116.285714285714                  .                  . . . .         . .  .20192805  .7342654
4 1985 1                 32 3275.5714285714284                  .                  . . . .         . .   .2065464   .732205
4 1985 3                1.8                924                  .                  . . . .         . .   .2065464   .732205
4 1985 2                8.8                274                  .                  . . . .         . .   .2065464   .732205
4 1986 1 39.285714285714285  4101.285714285715                  .                  . . . .         . .   .1600479  .7849823
4 1986 2                9.4              287.2                  .                  . . . .         . .   .1600479  .7849823
4 1986 3                2.2              836.2                  .                  . . . .         . .   .1600479  .7849823
4 1987 3                2.2              936.8                  .                  . . . .         . .   .1578889  .7862904
4 1987 1 44.285714285714285  4665.285714285715                  .                  . . . .         . .   .1578889  .7862904
4 1987 2               13.6              331.2                  .                  . . . .         . .   .1578889  .7862904
4 1988 3                2.4              990.4                  .                  . . . .         . .   .1737091  .7649385
4 1988 1 48.142857142857146  4361.285714285715                  .                  . . . .         . .   .1737091  .7649385
4 1988 2               14.8              349.8                  .                  . . . .         . .   .1737091  .7649385
4 1990 2                  .                  .                  .                  . . . .         . .          .         .
4 1990 3                  .                  .                  .                  . . . .         . .          .         .
4 1990 1                  .                  .                  .                  . . . .         . .          .         .
4 1991 3                  .                  .                  .                  . . . .         . .          .         .
4 1991 2                  .                  .                  .                  . . . .         . .          .         .
4 1991 1                  .                  .                  .                  . . . .         . .          .         .
4 1998 2                  .                  .                  .                  . . . .         . .          .         .
4 1998 3                  .                  .                  .                  . . . .         . .          .         .
4 1998 1                  .                  .                  .                  . . . .         . .          .         .
4 1999 3                  .                  .                  .                  . . . .         . .          .         .
4 1999 1                  .                  .                  .                  . . . .         . .          .         .
4 1999 2                  .                  .                  .                  . . . .         . .          .         .
4 2001 1                  .                  .                  .                  . . . .         . .          .         .
4 2001 2                  .                  .                  .                  . . . .         . .          .         .
4 2001 3                  .                  .                  .                  . . . .         . .          .         .
4 2002 1              28.75             907.75             944394         2671288.25 . . .  1642.744 .  .28250483  .3043843
4 2002 2               50.5               1232 1188568.3333333333           615059.5 . . . 597.68066 .  .28250483  .3043843
4 2002 3                5.5              842.5           905367.5              48240 . . . 26.487877 .  .28250483  .3043843
4 2003 1               33.5             1440.5         1454829.75            1927701 . . .  870.9985 .    .352271 .33841035
4 2003 2 52.666666666666664 1316.6666666666667 1231267.6666666667 1346656.3333333333 . . .  775.0224 .    .352271 .33841035
4 2003 3                 10             1499.5            1396271          1291969.5 . . .  876.4713 .    .352271 .33841035
4 2004 2 59.333333333333336 1446.6666666666667            1451344            2744957 . . . 1279.0132 .  .39170825  .3096508
4 2004 3               19.5             1897.5            1903635          2612586.5 . . . 1367.1324 .  .39170825  .3096508
4 2004 1               32.5               1500          1435877.5            4101760 . . . 1908.3418 .  .39170825  .3096508
4 2005 3                 65               2427          2369591.5           13365095 . . . 2914.0854 .   .3279951  .3963106
4 2005 1              68.25             2932.5         3023117.75        12556925.75 . . .  2521.687 .   .3279951  .3963106
4 2005 2 54.666666666666664               2040            2043644 15732356.333333334 . . . 4576.0645 .   .3279951  .3963106
4 2006 2  73.33333333333333               3050 3427117.6666666665 18293986.333333332 . . . 3503.0034 .   .3696422  .3481529
4 2006 3               85.5               3995          4321052.5           13940819 . . .  1822.082 .   .3696422  .3481529
4 2006 1              87.75            3762.75            3945432        14526693.75 . . .  2161.713 .   .3696422  .3481529
4 2007 2                 80               2490 3677740.3333333335           19694961 . . . 4658.5093 .   .3895239  .3752598
4 2007 3               97.5             4123.5            4753811           15091464 . . .  1886.143 .   .3895239  .3752598
4 2007 1              93.25             3972.5         4579799.25        16877821.25 . . . 2355.5303 .   .3895239  .3752598
4 2008 2                 79 3324.3333333333335 3982411.6666666665           20106290 . . .  3194.276 .   .3842091  .3377979
4 2008 1                 96             4039.5         4888196.75         16179228.5 . . . 2050.6348 .   .3842091  .3377979
4 2008 3                105             4594.5            5595744           22534501 . . .  3087.225 .   .3842091  .3377979
4 2009 1              91.25            4022.25         5083234.25         16283159.5 . . . 2346.4924 .    .384414  .3370483
4 2009 2                 81               3324            4431416 20853783.666666668 . . .  3622.996 .    .384414  .3370483
4 2009 3              101.5             4587.5          5791346.5         23306507.5 . . . 3562.5386 .    .384414  .3370483
4 2010 3                104             5192.5          7377542.5         25914665.5 . . .  3501.906 .   .4206014  .3123629
4 2010 1              90.75            3856.25          5478988.5         18891894.5 . . . 2730.8606 .   .4206014  .3123629
4 2010 2  78.33333333333333 3296.6666666666665  5135056.333333333 24767973.666666668 . . . 4130.6343 .   .4206014  .3123629
4 2011 3                102               5197          7604208.5           26116110 . . . 3247.0576 .  .42005575  .3128393
4 2011 1                 92             3870.5         5663292.75         20503237.5 . . .  2590.373 .  .42005575  .3128393
4 2011 2  78.33333333333333 3304.6666666666665            4835347 25112480.666666668 . . . 3987.5266 .  .42005575  .3128393
4 2012 2  76.66666666666667               3080  4246413.666666667           23288179 . . .  3829.427 .   .4237512  .3182113
4 2012 3                100               5058          6972510.5           23956529 . . .  3048.956 .   .4237512  .3182113
4 2012 1               89.5            3798.25         5232574.25         20110405.5 . . .  2418.603 .   .4237512  .3182113
4 2013 2  73.66666666666667 3096.3333333333335 4025702.6666666665 20724421.666666668 . . . 3402.9866 .   .4341611  .2997185
4 2013 3                 98             5051.5            6569455           22849533 . . .  2840.688 .   .4341611  .2997185
4 2013 1              89.75            3487.25            4533881         17272357.5 . . . 2296.0713 .   .4341611  .2997185
4 2014 1                  .                  .                  .           11528111 . . .         . .          .         .
end
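One hedged way to build these ratios, assuming one observation per country-year-industry:

Code:
* Broadcast the US value within each tech_intensity-year cell, then divide.
bysort tech_intensity year: egen double us_valworker = ///
    max(cond(country1 == 840, r_valworker, .))
gen double ratio_us = r_valworker / us_valworker
* Reshaping ratio_us wide on tech_intensity would give one variable per industry type.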





Panel data error due to several companies inside country

I am currently writing my master thesis and need to check how percentages of females on the management board of firms in the US and Germany impact sustainability practices of those firms.
Therefore I use 40 German firms and 95 US firms, and I wish to analyse them with a panel data regression in Stata. However, I seem to face some problems because of the way I organize my data: I put the country in the first column and a country ID (1 for Germany, 2 for the US) in the second column. Then I have the years, and then the ESG scores and female percentages as my dependent and independent variables in the last two columns.

However, when I wish to declare my data as panel data, it does not work:

. xtset compid Year
repeated time values within panel
r(451);

I think this is because I have several companies per country and therefore multiple observations per year. Do you know how to solve this issue, so as to show Stata that I have only 2 countries but several companies within each country? Or can I only control for the country, meaning I would have to aggregate the data (for example, add up all the firm data for 2014 in a country and divide by the number of companies in that country)? It seems that I have two panel dimensions: first the country, and second the companies within that country.
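A hedged sketch of one fix: the panel unit passed to xtset must uniquely identify each firm, not each country; country can then enter the model as a control. firmname is an assumed firm identifier.

Code:
* Create a firm-level panel ID and declare the panel on it.
egen firm_id = group(country firmname)
xtset firm_id Year
* The country ID column can still enter the regression as a control.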

Summing observations that fall below a score

I am hitting a wall trying to create a variable that is a conditional sum: for each observation, it should look at that observation's score and sum all scores less than it within the same day.

Here is some code to recreate the issue:

Code:
set obs 10000
set seed 1979

* Creating sample dataset
gen score = runiform()*100
gen day = runiformint(1,31)
bysort day (score): gen daily_rank = _n
So, for each sorted observation, there would be a variable that accumulates within each day. It would look at the score for that observation (say, 80) and sum all observations in that day with scores below it (i.e., from 0 to 79). I have not been able to figure out how to do this.
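One hedged approach: within each day, sort by score and take a running sum; subtracting the current score leaves the sum of all strictly smaller scores. (Ties are effectively impossible with runiform() but would need extra care with real data.)

Code:
* Running sum within day, ascending by score, minus the own score.
bysort day (score): gen double sum_below = sum(score) - score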

Thanks in advance!

generate mean value of a locality excluding the focal firm and firms in the same industry as the focal firm

Dear Statalist,

I am trying to generate the mean value for a locality, excluding the focal firm and firms in the same industry as the focal firm. Specifically, I have 100 firms located in 10 provinces across 13 industries. I want to calculate the average leverage of all firms located in province j, excluding focal firm i and its industry rivals.
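A hedged sketch using totals and counts, with province, industry, and leverage as assumed variable names: the leave-out mean is the province total minus the province-industry total, divided by the corresponding count.

Code:
* Province totals and counts.
bysort province: egen double tot_prov = total(leverage)
bysort province: gen n_prov = _N
* Province-industry totals and counts (the focal firm's industry cell).
bysort province industry: egen double tot_ind = total(leverage)
bysort province industry: gen n_ind = _N
* Mean leverage over firms in the same province but other industries.
gen double leaveout_mean = (tot_prov - tot_ind) / (n_prov - n_ind)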

Any suggestions would be highly appreciated!

Best regards,

Yiyi



Thursday, April 28, 2022

Messy string data: how to do crosstabulations and descriptives

Dear Stata community,

Please help!
I have messy data containing over thirty variations of different words (e.g., leadership) in four columns for over 800 observations. A screenshot of the data is attached. I could not use the dataex command due to the large size of the file.

How do I quickly calculate how many times each word appears across all four columns?

I also need to do crosstabulations between these words, two at a time. How would I do that?

Do I need to recode each word into a numeric value?
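A hedged sketch for the counting step, assuming the four string variables share a stub (word1-word4 below is hypothetical): stack them with reshape long, then tabulate. For the crosstabulations, one indicator per word of interest (built with strpos()) can be tabulated pairwise without recoding everything.

Code:
* Stack the four columns into one variable and count each distinct string.
gen long obsno = _n
reshape long word, i(obsno) j(col)
tab word, sort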

I would appreciate your help!
Olena





Labeling interaction terms in coefplot

Hi,

I am trying to create a coefficient plot using just an interaction term between a dummy variable and year, but the labels are rather busy; I just want them to read as the year.

Below is the code I used for the regression and coefplot:

Code:
reg employed i.num_kids##i.year age i.statefip [pw=wtsupp]
est sto model9


coefplot model9, keep(*.num_kids#*.year) coeflabels(1.num_kids##2000year= "2000")
I've also attached the coefplot that was generated. As opposed to "2 or more kids#year=1981", I want the label to just read "1981".
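A hedged possibility: build year-only labels in a loop and pass the list to coeflabels(); the num_kids level (2) and the year range are assumptions to adjust to the data.

Code:
local labs
forvalues y = 1981/2000 {
    local labs `labs' 2.num_kids#`y'.year = "`y'"
}
coefplot model9, keep(*.num_kids#*.year) coeflabels(`labs')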

Thanks in advance for any help.

Statistical inference

The country is Japan, the period is 1972-2020, and the data are annual.
The variables are the current account over GDP ratio (cay_t), the net foreign assets over GDP ratio (nfay_t), the fiscal deficit over GDP ratio (fdy_t), the real interest rate (rir_t), and the weighted sum of the current account over GDP ratios (cay*_t) of a sample of countries with which Japan trades. The vector of variables is Y_t = (cay_t, nfay_t, fdy_t, rir_t, cay*_t).
What I would like to know is three things. First, the purpose of the estimated model, along with the assumptions made about the stochastic variables that define Y_t.
Secondly, the statistical inference that these estimation results allow, indicating the null and alternative hypotheses.
Thirdly, whether it is possible to obtain a consistent and efficient estimate of the parameters that define the current account determinants function.

Generating new variables using macros and loops

Hello, I am trying to generate new variables with the use of macros and am having issues with the syntax. Would anyone be able to help me with this, please? Thank you!

Issue 1: I have 500 variables named bsw1-bsw500. I want to generate new variables from these 500 by multiplying each of them by a variable called "product1".
My desired output would be 500 new variables named p1bsw1 p1bsw2 p1bsw3 ... p1bsw500. This is the code I am using, but it returns an "invalid syntax" error.

Code:
gen product1=123456*54321
local bswtemp “bsw1-bsw500”
foreach x of local bswtemp{
gen p1`x'=`x'*product1
}

Issue 2: Again using the 500 variables bsw1-bsw500, I want to create new variables where each bsw variable has a suffix attached to it. The suffix comes from the categories of the variable wave, which consists of years 2003, 2004, 2005, and so on. My desired output: if bsw1 is found in wave==2003, the generated variable should be bsw1_2003. If bsw2 is found in both 2003 and 2004, two variables should be generated, named bsw2_2003 and bsw2_2004.

Code:
local bswtemp “bsw1-bsw500”
levelsof wave, local(levels)
foreach x of local bswtemp{
gen`x’`levels’=`x’ if `levels’==`levels’
}
This code returns an error that says "bsw1 is already defined".
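For what it's worth, two hedged sketches. For issue 1, the curly "smart" quotes and the unexpanded bsw1-bsw500 range are both problems; a varlist loop avoids them. For issue 2, looping over the levels of wave one value at a time avoids reusing the same variable name.

Code:
* Issue 1: let foreach expand the variable range itself.
foreach x of varlist bsw1-bsw500 {
    gen p1`x' = `x'*product1
}

* Issue 2: one new variable per bsw variable per wave value.
levelsof wave, local(levels)
foreach x of varlist bsw1-bsw500 {
    foreach w of local levels {
        gen `x'_`w' = `x' if wave == `w'
    }
}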

How to generate the Strong Parallel Trend Assumption presented in Callaway, Goodman-Bacon & Sant'Anna 2021

I'm running a DID model with multiple time periods and continuous treatment. I currently rely on the traditional parallel trends assumption presented in Callaway, Goodman-Bacon & Sant'Anna (2021):

E[Y_t(0) − Y_{t-1}(0) | D = d] = E[Y_t(0) − Y_{t-1}(0) | D = 0]

However, the paper explains that under continuous treatment, a stronger parallel trends assumption is needed. The "strong parallel trends assumption" is presented as:

E[Y_t(d) − Y_{t-1}(0)] = E[Y_t(d) − Y_{t-1}(0) | D = d]

Does anyone know whether Stata currently has a command that can run this stronger test? The paper notes that "pre-trend tests commonly used to detect violations of parallel trends cannot distinguish between 'standard' and 'strong' parallel trends". Has this changed since the paper was published?

Thanks in advance!

Help with clock() function

Dear all,

I had a time variable in string format and used the clock() function to convert it to Stata's internal format.
Here is the problem:
RegistrationDate = "2022-01-01 00:00:00" becomes RegTime = 31dec2021 23:59:33
My code is:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str19 RegistrationDate
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
"2022-01-01 00:00:00"
end
g RegTime = clock(RegistrationDate, "YMD hms")
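The behaviour described above is a precision problem: clock() returns a millisecond count too large for the default float type, so the variable must be created as a double (and given a %tc display format):

Code:
gen double RegTime = clock(RegistrationDate, "YMD hms")
format RegTime %tc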
Thanks in advance,
Best regards,
Cu

Wednesday, April 27, 2022

merging data with different time frequencies

Hello,
I have looked for guidance in other posts but could not solve this: I have two datasets. Dataset A has annual observations (for instance, on local infrastructure expenditure), while Dataset B has observations at intervals of 5 years (for instance, on elections: winners, number of votes, winners' details). There are key variables which can be used for merging Dataset A and Dataset B, such as the area code. But the person elected in 2001, for example, is responsible for infrastructure investment for the subsequent 5 years. How do I merge them so that each observation in Dataset A is matched with the corresponding leader's observation in Dataset B?
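One hedged approach, assuming elections in 2001, 2006, ... and a common areacode key: map each year in Dataset A to the election year that governs it, then do an m:1 merge (Dataset B would need its year variable renamed to elec_year first).

Code:
* In Dataset A: the governing election year for each observation year.
gen elec_year = year - mod(year - 2001, 5)
* One row per area and election year in Dataset B.
merge m:1 areacode elec_year using datasetB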

Stata not showing results/command/log window?

Hi Everyone,

I'm running into a weird issue with Stata: the main results/command/log window is invisible. Stata is open and running, but if you preview the Stata icon at the bottom of the screen, it just shows a white rectangle (like it's loading). When you select the white rectangle, the computer acts as though you have selected something, but the Stata results/command/log window is not visible (e.g., the Chrome window would still be visible on the screen). Weirdly, you can still interact with the results window (e.g., I blindly typed 'help regress' and a .pdf immediately opened, and typing 'edit' opens the Data Editor). I have tried repairing and reinstalling the program, but that hasn't fixed the issue.

This issue just started today and Stata was running fine previously.

I'm running Windows 10 Home on an Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz 2.11 GHz.

Has anyone else experienced this before? If so, how did you remedy it?

Cheers,

David.

Twoway subtitle options using -by-

How to adjust text options for subtitles in a simple twoway plot using -by- ?

Code:
sysuse auto, clear
tw scatter mpg disp, by(foreign, title("Not here", size(4)))
tw scatter mpg disp, by(foreign, subtitle("Or here", size(4)))
Various options using title/subtitle/subtitle1/subtitle2, both inside the -by()- option and outside, have failed. Surely it's right there!
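A hedged sketch: the panel headers produced by by() are subtitles, but their appearance seems to be controlled by a subtitle() option placed outside by(), with the text argument left empty so the existing text is kept:

Code:
sysuse auto, clear
tw scatter mpg disp, by(foreign) subtitle(, size(4))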

Is there a way to format variable?

Hello friends,

I am currently using data with one variable that encodes a date; for example, a variable named effective_date shows the value 20120105, meaning 2012-January-5th. I only need the year (for my analysis I need to create an arrival cohort based on years, not the exact date).

Right now I recode it like (20120000/20129999 = 2012). It works, but I am curious: is there a way to format the variable to show just the first four digits instead of recoding it?
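A display format cannot drop digits, but the year is recoverable arithmetically; a minimal sketch:

Code:
* Extract the year from a yyyymmdd-coded numeric variable.
gen year = floor(effective_date/10000)

If effective_date is stored as a string instead, real(substr(effective_date, 1, 4)) gives the same result.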

Please advise

Thanks

Difficulty in accessing my results tables via LaTex

Hello,

I am practicing how to create customised LaTeX regression results tables in Stata. However, when I try compiling the LaTeX results tables into a pdf, LaTeX gives me an error message that reads "LaTeX Error: Environment table undefined". Could there be a problem with the Stata code I wrote to create the tables? Here is the code:

use wagepan, clear

xtset nr year
gen lhour = ln(hour)

levelsof year, loc(yr)
loc nyr: word count `yr'
di "NOTE: This is an `nyr' panel data"

forval i = 1/`nyr' {
    loc ny: word `i' of `yr'
    quietly {
        reg lwage lhour educ if year == `ny'
        margins, dyex(educ)
        matrix eta = r(b)
        matrix s2eta = r(V)
        matrix list eta
        matrix list s2eta
        eststo, add(eta_educ eta[1, 1] s_educ sqrt(s2eta[1, 1]))
        reg lwage lhour educ black hisp if year == `ny'
        margins, dyex(educ)
        matrix eta = r(b)
        matrix s2eta = r(V)
        matrix list eta
        matrix list s2eta
        eststo, add(eta_educ eta[1, 1] s_educ sqrt(s2eta[1, 1]))
        reg lwage lhour educ c.exper##c.exper black hisp if year == `ny'
        margins, dyex(educ exper)
        matrix eta = r(b)
        matrix s2eta = r(V)
        matrix list eta
        matrix list s2eta
        eststo, add(eta_educ eta[1, 1] s_educ sqrt(s2eta[1, 1]) eta_exper eta[1, 2] s_exper sqrt(s2eta[1, 2]))
    }
    esttab _all using lwage`ny'.tex, replace ti("Wage Equation Estimation for `ny'") ///
        label nomtitle nodepvars not se noobs ar2 booktabs ///
        scalar(eta_educ s_educ eta_exper s_exper) ///
        addnotes("$\eta$: Semi-elasticity of lwage with respect to educ, exper" ///
            "$\eta_{se}$: Standard errors of the semi-elasticity") ///
        substitute("_cons" "Constant" "eta_educ" "$\eta_{educ}$" "s_educ" ///
            "$\eta_{se}$" "eta_exper" "$\eta_{exper}$" "s_exper" "$\eta_{se}$" ///
            "c.exper#c.exper" "Experience$^2$")
}
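One hedged guess at the error: esttab writes a table fragment, and "Environment table undefined" suggests the .tex file is being compiled on its own rather than \input into a complete document that loads the packages the options require (booktabs here). A minimal wrapper might look like this, where the file name is hypothetical, one of those written by the loop:

Code:
\documentclass{article}
\usepackage{booktabs}
\begin{document}
\input{lwage1980.tex}  % hypothetical: one of the files produced by esttab above
\end{document}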



I need help, please!

Unbalanced Panel Data Model

Hello Everybody,

We are working on our MBA thesis on the determinants of carbon emissions and have created a panel dataset consisting of 14 EU countries over a period of 20 years and several variables. For some of the variables we have missing data, but no more than 3-4% out of 280 observations, and only for some variables.

The steps we have followed so far are as follows:
1. Test for stationarity using the Levin-Lin-Chu test for the variables with balanced data; for the unbalanced ones we used the Fisher-type test. For the variables that had a unit root, we took first differences, which made the data stationary.

2. We tried panel regression with random effects, then ran xttest0 (the Breusch-Pagan LM test), which failed to reject the null; thus random effects seem inappropriate.

3. We tried panel regression with fixed effects, then looked at the F-statistic, which indicates that fixed effects seem inappropriate as well. At this stage we also tried the fixed effects model with dummies for the individual countries, and the F-statistic was, as expected, the same as in the model without the dummies.

4. Running "hausman fixed random" returns Prob > chi2 = 0.0643, which is more than 0.05 and so fails to indicate fixed effects. We also ran "hausman fixed random, sigmamore", which returned chi2(10) = 17.38 and Prob > chi2 = 0.0664, again failing to reject the null hypothesis that the difference in coefficients is not systematic.


5. Running "hausman random fixed" returns a negative chi2 and a warning about failure to meet the asymptotic assumptions of the Hausman test.

6. We also checked for time fixed effects and got F(18, 159) = 0.77, Prob > F = 0.7291. Since this is higher than 0.05, we fail to reject the null that the year dummies are jointly zero, i.e., time fixed effects are not needed. (Maybe this was not necessary.)

7. As this was not conclusive for random effects, we took it as an indication to go with pooled OLS, knowing that this is the least preferable method.

8. After that we ran a simple VIF test, which looks good, but hettest is not.

After this detailed description of the steps followed so far, my thesis partner and I would appreciate a hand with some queries we are struggling with:

- Do you think that running pooled OLS with the robust option would remove the need for any additional heteroskedasticity test? (See the sketch after this list.)
- Are there any other tests that could be run after the pooled OLS regression to argue for a solid estimation model?
- After running the initial model we noticed that some of the introduced control variables severely affect the significance of other variables, so we removed them from the model. Since we were following a process of elimination, we removed the variable with the highest p-value at each step. Is this a rational approach, or are there other suggestions?
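A hedged sketch of the robust pooled OLS raised in the first query; with panel data, clustering by country also guards against within-country serial correlation (variable names are placeholders):

Code:
* Pooled OLS with country-clustered standard errors.
reg co2_emissions x1 x2 x3, vce(cluster country)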

Of course, any comments are more than welcome! I read a great saying in this forum, that "we are all beginners", but I have noticed that there are some very experienced people here!

Grateful in advance!



Tuesday, April 26, 2022

Difference-in-differences with time varying differentiation.

I have a panel dataset with information on natural disasters that occurred in different municipalities at different points in time. A municipality can have zero, one, or more than one disaster during the period. For example, municipality X suffered an earthquake in t, t+2, and t+5.

I want to measure the effect of the disasters on educational outcomes such as the dropout rate, using a difference-in-differences model with time-varying treatment.

I need help testing the parallel trends assumption. Any advice on how to do it in this setting? I'd appreciate it a lot. (A sketch follows the data below.)
year  municipality  disasters
2011  A             24
2012  A             10
2013  A             32
2014  A             27
2015  A             21
2016  A             20
2017  A             15
2011  B              1
2012  B              1
2013  B              0
2014  B              0
2015  B              3
2016  B              1
2017  B              0
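One hedged possibility for the pre-trends check: an event-study-style regression with leads and lags of the disaster count plus municipality and year fixed effects, using the community-contributed reghdfe; dropout is an assumed outcome name.

Code:
* Numeric panel ID so the L./F. time-series operators work.
egen mid = group(municipality)
xtset mid year
* Leads (F.) should be jointly insignificant under parallel pre-trends.
reghdfe dropout L(0/3).disasters F(1/3).disasters, ///
    absorb(mid year) vce(cluster mid)
testparm F1.disasters F2.disasters F3.disasters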

sts var, trend

Dear All, what is the minimal group size per level sufficient to use "sts test var, trend"? Specifically, having many levels in var, say var = 1, 2, 3, ..., 15, I want to check the trend in the survivor functions across those levels, but I am not sure what minimal group size is required for each level. I know the test can also be used for, say, var = 1, 1.5, 2.5, ..., but I cannot find any suggested minimal group size per var value that would let me use "sts test var, trend" safely. Stata calculates the test even when some levels have only one observation (so it looks as if I could do a kind of univariate analysis of a continuous variable, but a one-observation survivor function entering a trend test across all levels rather doesn't make sense). If using "sts test var, trend strata(dummy)", what would be the minimal group size required for all levels within dummy==0 and within dummy==1? Many thanks.

A need for post-estimation tool returning sample units (Countries)

Dear friends,

I have estimated models for time-series cross-sectional data with 172 countries in my sample. After running the regressions, Stata reports "(Std. err. adjusted for 159 clusters in id)". I have a large number of missing values for different countries, so the number of countries drops to 159. However, I don't want to identify the excluded countries by manually reviewing my dataset; there should be a command along the lines of "tab country if ...".

How can I learn automatically which countries are included in the estimation? I am preparing an appendix and want to report these 159 countries by name.
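The function e(sample) marks the estimation sample, so the included countries can be listed right after the regression:

Code:
* Countries that entered the last estimation.
tab country if e(sample)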

Best,

Diff in Diff (xtdidreg) with Multiple Treatment Times: Granger Test and Parallel Trends

Hi everyone,

I am using the new -xtdidreg- in Stata 17 to estimate the effects of a policy treatment on GDP per capita. Using these models is straightforward, but the problem is that the treatment time varies across countries (so it is not possible to run the Granger test or the post-estimation commands). Even more complicated is the fact that the treatment varies a lot within a country. It can occur in just one year or two followed by no treatment, or one year with and one year without, followed by a lull, followed by a year with it, and so on. It also frequently occurs in consecutive years, sometimes for a long period (e.g. 10-12 years). This is vexing because I would expect these policies to have short-term effects, raising the issue of whether it is worth counting "post-treatment" after a 12-year spell, for example. Indeed, it is difficult overall to determine exactly when the "post-treatment" period technically begins. If it's one year followed by no policy years for many subsequent years, it's easy. But if the treatment lasts a long time, or if it's on and off for a while, it's tricky. I don't know if anyone has thoughts on this. I understand it's a "theoretical" question relevant to the literature, but I thought I would raise it. I was thinking of defining "post-treatment" in several ways to account for this variation and comparing the results.

The more pressing issue is how I can run a Granger test or graphically show pre/post-treatment effects (i.e., the parallel trends assumption). Are there any substitutes or workarounds with this kind of data structure? I could define years leading up to treatment (negative values) and years after (positive values) and take the average. But this gets back to the issue of knowing when pre/post-treatment even is.
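For what it's worth, a hedged sketch of the event-time construction suggested above, measured from each country's first treated year:

Code:
* Event time relative to the first year with treat_policy == 1.
bysort country_ID: egen first_treat = min(cond(treat_policy == 1, year, .))
gen event_time = year - first_treat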

Here is an example of my data. Note that in this case I define "post-treatment" as running from the year after the first treatment ends until the end of the period (imperfect though it is; I have other ideas here, but give just one example for simplicity).

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str55 country float country_ID str9 ISO int year double GDP_capita float(treat_policy ever_policy) byte(_seq _spell _end) float post_treatment
"Afghanistan" 1 "AFG" 1981                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1982                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1983                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1984                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1985                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1986                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1987                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1988                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1989                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1990                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1991                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1992                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1993                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1994                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1995                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1996                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1997                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1998                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 1999                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 2000                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 2001                 . 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 2002 1189.784667657181 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 2003  1235.81006329565 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 2004 1200.278013217345 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 2005 1286.793658939272 0 1  0 0 0 0
"Afghanistan" 1 "AFG" 2006 1315.789117418347 1 1  1 1 0 0
"Afghanistan" 1 "AFG" 2007 1460.825751379395 1 1  2 1 0 0
"Afghanistan" 1 "AFG" 2008 1484.114461325385 1 1  3 1 0 0
"Afghanistan" 1 "AFG" 2009 1758.904476637649 1 1  4 1 0 0
"Afghanistan" 1 "AFG" 2010 1957.029069908116 1 1  5 1 1 0
"Afghanistan" 1 "AFG" 2011 1904.559925655938 0 1  0 0 0 1
"Afghanistan" 1 "AFG" 2012 2075.491614353309 1 1  1 2 0 1
"Afghanistan" 1 "AFG" 2013 2116.465257712514 1 1  2 2 0 1
"Afghanistan" 1 "AFG" 2014 2102.384603759743 1 1  3 2 1 1
"Afghanistan" 1 "AFG" 2015 2068.265904133638 0 1  0 0 0 1
"Afghanistan" 1 "AFG" 2016 2057.067977553294 1 1  1 3 0 1
"Afghanistan" 1 "AFG" 2017 2058.400221069795 1 1  2 3 0 1
"Afghanistan" 1 "AFG" 2018 2033.804388937174 1 1  3 3 1 1
"Albania"     2 "ALB" 1981                 . 0 1  0 0 0 0
"Albania"     2 "ALB" 1982                 . 0 1  0 0 0 0
"Albania"     2 "ALB" 1983                 . 0 1  0 0 0 0
"Albania"     2 "ALB" 1984                 . 0 1  0 0 0 0
"Albania"     2 "ALB" 1985                 . 0 1  0 0 0 0
"Albania"     2 "ALB" 1986                 . 0 1  0 0 0 0
"Albania"     2 "ALB" 1987                 . 0 1  0 0 0 0
"Albania"     2 "ALB" 1988                 . 0 1  0 0 0 0
"Albania"     2 "ALB" 1989                 . 0 1  0 0 0 0
"Albania"     2 "ALB" 1990 4827.318483754909 0 1  0 0 0 0
"Albania"     2 "ALB" 1991 3496.580246100285 0 1  0 0 0 0
"Albania"     2 "ALB" 1992  3265.01742857304 0 1  0 0 0 0
"Albania"     2 "ALB" 1993 3599.027058196382 1 1  1 1 0 0
"Albania"     2 "ALB" 1994 3921.851207130857 1 1  2 1 0 0
"Albania"     2 "ALB" 1995 4471.871069987153 1 1  3 1 0 0
"Albania"     2 "ALB" 1996 4909.228104923629 1 1  4 1 1 0
"Albania"     2 "ALB" 1997 4400.577827365508 0 1  0 0 0 1
"Albania"     2 "ALB" 1998 4819.387533604832 1 1  1 2 0 1
"Albania"     2 "ALB" 1999 5475.169135425933 1 1  2 2 0 1
"Albania"     2 "ALB" 2000 5893.136232563291 1 1  3 2 0 1
"Albania"     2 "ALB" 2001 6441.853452386757 1 1  4 2 0 1
"Albania"     2 "ALB" 2002 6754.536003013315 1 1  5 2 0 1
"Albania"     2 "ALB" 2003 7154.784825121795 1 1  6 2 0 1
"Albania"     2 "ALB" 2004 7580.629091086668 1 1  7 2 0 1
"Albania"     2 "ALB" 2005 8040.878716781119 1 1  8 2 0 1
"Albania"     2 "ALB" 2006  8569.19111251045 1 1  9 2 0 1
"Albania"     2 "ALB" 2007 9150.518747043685 1 1 10 2 0 1
"Albania"     2 "ALB" 2008 9912.577242429737 1 1 11 2 1 1
"Albania"     2 "ALB" 2009 10313.92644130402 0 1  0 0 0 1
"Albania"     2 "ALB" 2010 10749.48744816581 0 1  0 0 0 1
"Albania"     2 "ALB" 2011 11052.79046351362 0 1  0 0 0 1
"Albania"     2 "ALB" 2012 11227.99448719723 0 1  0 0 0 1
"Albania"     2 "ALB" 2013 11361.29367637877 0 1  0 0 0 1
"Albania"     2 "ALB" 2014  11586.8637666424 1 1  1 3 1 1
"Albania"     2 "ALB" 2015 11878.48809334456 0 1  0 0 0 1
"Albania"     2 "ALB" 2016 12291.87337739279 0 1  0 0 0 1
"Albania"     2 "ALB" 2017 12770.97503708106 0 1  0 0 0 1
"Albania"     2 "ALB" 2018  13323.7533559132 0 1  0 0 0 1
"Argentina"   3 "ARG" 1981                 . 0 1  0 0 0 0
"Argentina"   3 "ARG" 1982                 . 0 1  0 0 0 0
"Argentina"   3 "ARG" 1983                 . 1 1  1 1 1 0
"Argentina"   3 "ARG" 1984                 . 0 1  0 0 0 1
"Argentina"   3 "ARG" 1985                 . 1 1  1 2 0 1
"Argentina"   3 "ARG" 1986                 . 1 1  2 2 0 1
"Argentina"   3 "ARG" 1987                 . 1 1  3 2 0 1
"Argentina"   3 "ARG" 1988                 . 1 1  4 2 1 1
"Argentina"   3 "ARG" 1989                 . 0 1  0 0 0 1
"Argentina"   3 "ARG" 1990 14144.76367018504 1 1  1 3 1 1
"Argentina"   3 "ARG" 1991  15221.7921466357 0 1  0 0 0 1
"Argentina"   3 "ARG" 1992 16209.32597721907 1 1  1 4 0 1
"Argentina"   3 "ARG" 1993 17312.03457467553 1 1  2 4 0 1
"Argentina"   3 "ARG" 1994 18092.02081388348 1 1  3 4 0 1
"Argentina"   3 "ARG" 1995 17362.52180256983 1 1  4 4 0 1
"Argentina"   3 "ARG" 1996 18104.69780949586 1 1  5 4 0 1
"Argentina"   3 "ARG" 1997 19347.53704007929 1 1  6 4 0 1
"Argentina"   3 "ARG" 1998 19866.24505296556 1 1  7 4 0 1
"Argentina"   3 "ARG" 1999 18981.16838273078 1 1  8 4 0 1
"Argentina"   3 "ARG" 2000 18625.28355378113 1 1  9 4 0 1
"Argentina"   3 "ARG" 2001 17610.75538632018 1 1 10 4 0 1
"Argentina"   3 "ARG" 2002 15523.03877631015 1 1 11 4 0 1
"Argentina"   3 "ARG" 2003 16714.67013318461 1 1 12 4 0 1
"Argentina"   3 "ARG" 2004  18032.6114274365 1 1 13 4 0 1
end
format %ty year
This is the code I use to run my models (without any controls, for simplicity):
Code:
xtdidreg (GDP_capita  i.post_treatment i.ever_policy) (treat_policy), group(country_ID) time(year) nogteffects

loop variables with * wildcard

I have a list of variables that I want to loop through:

yr2008_q110atte
yr2009_q112atte
yr2010_q112atte
yr2011_q110atte
yr2012_q110atte

As you can see, they all start with "yr" and end with "atte", although the question number differs across years.

So, I write the loop below:

foreach i in 2008 2009 2010 2011 2012 {
    tab yr`i'_*atte yr`i'_age_grp
    gen yr`i'_atte = . if yr`i'*atte == 1
    gen yr`i'_atte = 1 if yr`i'*atte == 1
    replace yr`i'_atte = 0 if yr`i'*atte == 1
}

The first line, which tabulates the variables, works, but from the second line within the loop onwards the yr`i'*atte condition doesn't work: Stata returns "yr2008 ambiguous abbreviation".

Any clue on how to resolve this? Thank you!
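One hedged fix: wildcards are expanded only in varlists, not in expressions, so resolve the name first with unab. (Note also that the three gen/replace lines above target the same variable name, which would fail on its own.)

Code:
foreach i in 2008 2009 2010 2011 2012 {
    unab v : yr`i'_*atte              // expands to e.g. yr2008_q110atte
    gen byte yr`i'_atte = (`v' == 1) if !missing(`v')
}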

Using Repeated Imputation Inference

Does anyone know how to use the rii command to whittle down the number of observations in a dataset? I merged a main file, a summary extract file, and weights, and the number of observations is now 28,888, but that is about 5 times too many: the correct number of observations is 5,777. How do I incorporate rii into the code to get 5,777 as the final n, please?

Thank you.

Panel data regression with fixed effects

Hi Statalist,

Recently I have been using state-level mental health data, trying to find its association with state-level literacy. However, I have difficulty understanding the regression results when I use different commands, so I list them below:
1. Panel regression with fixed effects (I first run a baseline regression including only y and x: xtreg mental literacy, fe).
In this regression, the coefficient on x is significantly negative.
2. Panel regression with year fixed effects (I use xtreg y x i.year, fe, which gives the same results as xtreg y x i.year i.state).
In this regression, the coefficient on x is no longer significant; my understanding is that the time fixed effects can explain and 'absorb' the significance of x.

3. I also ran reg y x i.state i.year, r and obtained the same results.
Above all, I want to ask: what is the difference between xtreg y x, fe and xtreg y x i.year, fe? Does the first control only for state fixed effects, and the second for both state and time fixed effects? It is hard for me to understand why the first regression shows a significant coefficient on x while the coefficients are insignificant in regressions 2 and 3. More specifically, how should I interpret the first regression (xtreg y x, fe)? I look forward to your reply!
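For reference, a hedged restatement of the two specifications (an xtset by state is assumed). The first controls only for state fixed effects; adding i.year also absorbs shocks common to all states in a given year, and it is plausible that such common shocks were driving the earlier significance.

Code:
xtset state year
xtreg mental literacy, fe          // state fixed effects only
xtreg mental literacy i.year, fe   // state plus year (two-way) fixed effects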

Best regards,
Suyi Liu

Monday, April 25, 2022

Website that converts SAS code into Stata?

Hi
Does anyone know of a website or platform that can automatically generate STATA codes from SAS codes? For e.g. , below is a SAS code but I want to generate it in STATA.
thank you
marital status of the reference person: 1=married/living with partner, 2=neither married nor living with partner
Code:
IF (X8023 IN (1 2)) THEN MARRIED=1; ELSE MARRIED=2;
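
In the meantime, a direct Stata translation of that SAS line (a sketch, keeping the SAS variable names) could be:

Code:
gen MARRIED = 2                              // default: neither married nor living with partner
replace MARRIED = 1 if inlist(X8023, 1, 2)   // married/living with partner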

Canonical Correlation Analysis

Hello,

I would like to run a canonical correlation analysis with:

y1 y2 y3 y4 y5 as dependent variables

x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 as independent variables.

I would like to get the canonical coefficients for each independent variable in each linear combination, as well as the F-ratio and significance of each linear combination.

Any suggestion?

Thanks!
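
Stata's built-in canon command handles this; a minimal sketch, assuming the variables are named as above (the test() option reports significance tests for the specified canonical correlations):

Code:
canon (y1 y2 y3 y4 y5) (x1 x2 x3 x4 x5 x6 x7 x8 x9 x10), test(1 2 3 4 5)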

Forest plot

Hello, I was wondering if anyone could help me with code to reproduce the following graph, please?

[attached: example forest plot]

Here is a photo of the start of my Excel spreadsheet with headings:

[attached: screenshot of the Excel column headings]


I would be very grateful if you could suggest some appropriate code that I could follow! Best wishes, Dearbhla

Clustered standard errors for a single variable in panel data

Dear Stata users,

I am working with panel data on funds and am looking for a way to calculate standard errors (SEs) of a single variable (return) on a given day t. These SEs need to be clustered by cluster_variable (which refers to different investment styles in this case). That is, I want the SEs to be calculated only over the observations sharing the same cluster_variable on day t, not over the whole sample for that day. As you can see, the cluster_variable is static over time for each fund.

Here is a short example.



Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input byte(fund t) double return byte cluster_variable
1 1  .1 1
2 1  .2 1
3 1 .08 2
4 1  .9 2
5 1  .7 2
1 2  .4 1
2 2  .5 1
3 2 .03 2
4 2  .2 2
5 2  .4 2
end




I have contemplated producing the SDs and then counting the observations (obs) for each cluster to produce SEs, following SE = SD/sqrt(obs). So I started with egen SD = sd(return), by(cluster_variable t) to generate the following.

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input byte(fund t) double return byte cluster_variable float SD
1 1  .1 1 .07071068
2 1  .2 1 .07071068
3 1 .08 2  .4275512
4 1  .9 2  .4275512
5 1  .7 2  .4275512
1 2  .4 1 .07071068
2 2  .5 1 .07071068
3 2 .03 2  .1852026
4 2  .2 2  .1852026
5 2  .4 2  .1852026
end


Can anyone suggest a more elegant way to derive the desired SEs, or help with counting the number of observations sharing the same cluster_variable on a given day t?
The counting result (obs) should look like this in a new variable:

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input byte(fund t) double return byte cluster_variable float(SD obs)
1 1  .1 1 .07071068 2
2 1  .2 1 .07071068 2
3 1 .08 2  .4275512 3
4 1  .9 2  .4275512 3
5 1  .7 2  .4275512 3
1 2  .4 1 .07071068 2
2 2  .5 1 .07071068 2
3 2 .03 2  .1852026 3
4 2  .2 2  .1852026 3
5 2  .4 2  .1852026 3
end



The data above are a simplified example. The real dataset has more than 1,000 funds and around 12 distinct values of cluster_variable.

Best,
Daniel
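
One way to get the counts and then the SEs directly (a sketch using the variable names above):

Code:
bysort cluster_variable t: egen obs = count(return)   // observations per style on each day
gen SE = SD / sqrt(obs)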

Sunday, April 24, 2022

m:m merging - clarification

Hi there,

I have two datasets: one contains data on patients who were diagnosed with cancers, and the other contains details of the medications they use. Both datasets have multiple rows per patient (e.g., multiple cancers per patient in the cancer data and multiple medicines per patient in the medication data). I want to merge these two datasets to examine the medication use of patients with cancers. As this would be an m:m merge, I'm not sure the usual Stata code will work appropriately.

Below are parts of the two datasets that I want to merge. I would highly appreciate your help in finding a suitable command to merge them properly.

Thank you in advance!
Thushani

Medication dataset
Code:
input str11 ID str1 gender long medication_code float(frequency daily_dose)
"1" "M" 70238 1 1
"1" "M" 70238 1 1
"1" "M" 70238 1 1
"2" "F" 70238 1 1
"2" "F" 67117 1 1
"2" "F" 67117 1 1
"3" "M" 67117 1 1
"3" "M" 70238 1 1
"4" "F" 74121 1 1
"4" "F" 67265 1 1
"4" "F" 67265 1 1
"5" "M" 70238 2 2
"5" "M" 70238 2 2
"5" "M" 70238 2 2
"5" "M" 70238 2 2

Cancer dataset
input str11 ID str1 gender int age str4 site int morph
"1" "M" 79 "C160" 8070
"2" "F" 74 "C20" 8140
"2" "F" 74 "C187" 8140
"2" "F" 74 "C250" 8140
"3" "M" 75 "C187" 8140
"3" "M" 75 "C250" 8000
"4" "F" 85 "C259" 8000
"4" "F" 85 "C187" 8140
"4" "F" 85 "C250" 8000
"5" "M" 78 "C187" 8140
"5" "M" 78 "C187" 8140
"5" "M" 78 "C221" 8160
"5" "M" 78 "C20" 8140
"5" "M" 78 "C20" 8140

Subgroup Analysis and Sample Selection Bias

I employ a DiD model and want to estimate the effect of the independent variable (post##treatment) on a dependent variable (vote), comparing this effect across two sub-groups:
  • Votes where the crime involved was violent (i.e. violentcrime==1)
  • Votes where the crime involved was white-collar (i.e. whitecollar==1)
There are other subgroups, which I am not interested in. I started with a subgroup analysis, where I estimated the regression reg vote post##treatment if violentcrime==1 and did the same for the variable whitecollar. However, these two sub-groups form only a portion of the full sample. I was wondering whether this approach is viable, or whether it creates selection bias and requires a Heckman two-stage model as a correction.

Thank you for your help,

Karan
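
For concreteness, the subgroup regressions described above (a sketch; the i. prefixes are assumed for the factor interaction):

Code:
reg vote i.post##i.treatment if violentcrime == 1
reg vote i.post##i.treatment if whitecollar == 1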

How to drop repeated years in a panel dataset

Hi everyone,

I have a panel dataset of countries and years from 1999 to 2019. However, for each country some years are repeated; for example, Bolivia has 2002 recorded 4 times. How do I drop the repeated years for each country while keeping the first occurrence, so that 2002 appears only once for Bolivia, and likewise for the other repeated years in other countries?

Many thanks,
Yasmine
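
A sketch of one approach, assuming the variables are named country and year (duplicates drop keeps the first occurrence in the current sort order, and force acknowledges that the dropped rows may differ on variables outside the varlist):

Code:
sort country year
duplicates drop country year, force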

Baseline table package showing p-values and test statistics


There are some useful table-one packages, such as table1 and table1_mc. However, they cannot show the test statistic itself (chi-square, t-value, and so on).

Is there a package that covers this function, like the example below?

Thank you!

[attached: example baseline table]

Dummy dependent: ANOVA v logit

This is a statistics question rather than a Stata question, but I would appreciate any advice.

After running a between-subjects experiment in which participants were randomly assigned to one of four groups, I am trying to analyse my data.
The outcome was a binary response from the participants, so my dependent variable is a dummy variable.

Usually in experiments with a small number of observations, ANOVA works well, as the predictor variable (the group to which the participant was assigned) is categorical.

Since the dependent variable is a dummy/binary variable, can ANOVA (or a variation of it) be used? Is it suitable? Or should I simply run a logit with my eyes closed?
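
For concreteness, a sketch of the two candidate analyses side by side (the variable names outcome and group are placeholders):

Code:
logit outcome i.group    // logistic regression of the binary outcome on assigned group
anova outcome group      // one-way ANOVA treating the 0/1 outcome as continuous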

Create dataset - longitudinal

Hello,

I have a dataset (example below) to which I use the command "append" to add the datasets i_indresp, ce_indresp_w and cg_indresp_w.
I need to use the template below to prepare the data as longitudinal data. However, I'm having difficulties using this code to achieve the required result. Can you help me with this?
The code that I need to use:

Code:
foreach w in a b c d e {
    use `w'_indresp, clear
    renpfix `w'_
    gen wave = strpos("abcde", "`w'")
    save temp`w', replace
}
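
A sketch of how that template might be adapted to the three files named above (the file names `w'_indresp.dta are an assumption, so adjust them to the actual names such as ce_indresp_w.dta; note that rename `w'_* * is the modern replacement for renpfix):

Code:
local n 0
foreach w in i ce cg {
    local ++n
    use `w'_indresp, clear     // assumed file name; adjust if needed
    rename `w'_* *             // strip the wave prefix from every variable
    gen wave = `n'             // 1, 2, 3 for the three waves
    save temp`w', replace
}
use tempi, clear
append using tempce tempcg     // stack the three waves into one long file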


Example dataset:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input long(pidp pid i_hidp) byte(i_pno i_sex) int i_dvage byte(cg_sex_cv ce_sex_cv ce_scghqa i_scghqa cg_scghqa cg_scghqd cg_scghqe)
   22445  10127798 277344816 1 2 33 . . .  2 . . .
   29925  10192697 619024416 1 2 40 . . .  1 . . .
   76165  10689869 141657616 1 2 34 . . .  2 . . .
  280165  12430439 754793216 1 2 38 . . .  2 . . .
  333205  12908843 415106696 1 2 26 . . .  2 . . .
  469205  13857142 415059096 1 2 27 . . .  2 . . .
  599765  14757249 211282816 1 2 30 . . .  2 . . .
  732365  15752658 619371216 1 1 32 . . .  4 . . .
 1587125  17870879 618269616 1 2 51 . . .  2 . . .
 1697285  42526507 210766016 1 1 44 . . .  2 . . .
 1833965  50832336 753508016 1 1 52 . . .  2 . . .
 2270525  76446336 823677216 1 2 20 . . .  3 . . .
 2626845  93565895 822997216 1 1 39 . . .  2 . . .
 2888645  96949503  76479616 1 2 28 . . .  2 . . .
 3587685 118781707 483813216 1 1 36 . . . -7 . . .
 3663845 119065835 416323216 1 1 33 . . . -8 . . .
 3667245 119074613 280309616 1 2 29 . . .  4 . . .
 3705325 119277506 144867216 1 2 63 . . .  2 . . .
 4454005 154358304 482840816 1 1 72 . . .  2 . . .
 4849085 176725733 347554816 1 1 34 . . .  4 . . .
 4853165 176977635 824289216 1 1 47 . . .  2 . . .
68002045  10017933  73018416 1 2 74 . . . -8 . . .
68002725  10023526  73025216 1 2 63 . . .  2 . . .
68004087        -8  68006816 1 1 67 . . .  2 . . .
68006127        -8  68013616 1 2 47 . . .  2 . . .
68006807        -8  68034016 1 2 80 . . .  3 . . .
68008847        -8  68040816 1 2 59 . . .  3 . . .
68009527        -8  68047616 1 1 39 . . .  2 . . .
68010887        -8  68054416 1 2 53 . . .  2 . . .
68011567        -8  68061216 1 1 43 . . .  2 . . .
68020407        -8  68095216 1 2 80 . . .  2 . . .
68020564        -8  68013616 2 1 46 . . .  3 . . .
68021765  10200436  73032016 1 2 58 . . .  4 . . .
68021781  30139368  73032016 2 1 23 . . .  2 . . .
68028567        -8  68108816 1 2 46 . . .  3 . . .
68028571        -8  68108816 2 1 50 . . .  2 . . .
68028575        -8  68129216 1 2 26 . . .  2 . . .
68029927        -8  68136016 1 2 45 . . .  2 . . .
68029931        -8  68136016 2 1 48 . . .  2 . . .
68031967        -8  68142816 1 2 69 . . .  2 . . .
68035365  10403086  73045616 1 1 65 . . .  2 . . .
68035367        -8  68156416 1 1 36 . . .  2 . . .
68036727        -8  68163216 1 1 85 . . .  2 . . .
68037407        -8  68170016 1 2 48 . . .  2 . . .
68041487        -8  68176816 1 2 47 . . .  3 . . .
68041491        -8  68176816 2 1 44 . . .  1 . . .
68042167        -8  68183616 1 1 47 . . .  2 . . .
68042171        -8  68183616 2 2 46 . . .  2 . . .
68043527        -8  68190416 1 1 63 . . .  2 . . .
68044207        -8  68197216 1 2 42 . . .  3 . . .
68044211        -8  68197216 2 1 44 . . .  2 . . .
68044887        -8  68204016 1 2 70 . . .  2 . . .
68045567        -8  68210816 1 2 55 . . .  1 . . .
68045571        -8  68210816 2 1 57 . . .  2 . . .
68046927        -8  68231216 1 2 44 . . .  2 . . .
68046935        -8  68231216 2 2 17 . . .  2 . . .
68048287        -8  68238016 1 1 70 . . .  2 . . .
68049647        -8  68244816 1 1 59 . . .  2 . . .
68049651        -8  68244816 2 2 57 . . .  2 . . .
68051007        -8  68251616 1 1 56 . . .  1 . . .
68051011        -8  68251616 2 2 49 . . .  3 . . .
68056447        -8  68258416 1 1 54 . . .  3 . . .
68056451        -8  68258416 2 2 52 . . .  2 . . .
68056455        -8  68265216 1 2 22 . . .  3 . . .
68056459        -8  68258416 3 2 17 . . .  2 . . .
68058485  10628126  73052416 1 1 72 . . .  2 . . .
68058487        -8  68272016 1 1 77 . . .  2 . . .
68058489  10628169  73052416 2 2 72 . . .  2 . . .
68058491        -8  68272016 2 2 68 . . .  3 . . .
68059171        -8  68278816 1 2 27 . . .  2 . . .
68060525  10641556  73059216 1 1 90 . . .  3 . . .
68060527        -8  68285616 1 1 43 . . .  2 . . .
68060531        -8  68285616 2 2 44 . . .  2 . . .
68060533 160066204  73059216 2 2 61 . . .  3 . . .
68060537 160066239  73059216 3 1 73 . . .  2 . . .
68061288        -8  68047616 2 2 31 . . .  2 . . .
68063247        -8  68292416 1 2 50 . . .  2 . . .
68063251        -8  68292416 2 1 52 . . .  2 . . .
68063255        -8  68292416 3 1 17 . . .  1 . . .
68063927        -8  68299216 1 2 47 . . .  2 . . .
68063931        -8  68299216 2 1 49 . . .  2 . . .
68064605  10653872  73066016 1 1 68 . . .  2 . . .
68064609  10653902  73066016 2 2 65 . . .  2 . . .
68068007        -8  68326416 1 1 50 . . .  2 . . .
68068011        -8  68326416 2 2 50 . . .  2 . . .
68068015        -8  68327096 1 2 25 . . .  2 . . .
68068082        -8  68054416 2 1 56 . . .  2 . . .
68069367        -8  68333216 1 1 89 . . .  2 . . .
68071407        -8  68340016 1 1 28 . . .  2 . . .
68072087        -8  68346816 1 1 63 . . .  2 . . .
68076167        -8  68360416 1 2 68 . . .  3 . . .
68076171        -8  68360416 2 1 71 . . .  2 . . .
68086371        -8  68367216 1 2 28 . . .  2 . . .
68087727        -8  68380816 1 2 67 . . .  2 . . .
68090447        -8  68387616 1 2 64 . . .  2 . . .
68091127        -8  68394416 1 2 43 . . .  3 . . .
68091135        -8  68401216 1 2 20 . . .  4 . . .
68091139        -8  68394416 2 1 18 . . .  3 . . .
68097245  10913629  73072816 1 2 67 . . .  2 . . .
68097927        -8  68421616 1 2 68 . . .  2 . . .
end
label values pid pid
label def pid -8 "inapplicable", modify
label values i_sex i_sex
label def i_sex 1 "male", modify
label def i_sex 2 "female", modify
label values i_dvage i_dvage
label values cg_sex_cv cg_sex_cv
label values ce_sex_cv ce_sex_cv
label values ce_scghqa ce_scghqa
label values i_scghqa i_scghqa
label def i_scghqa -8 "inapplicable", modify
label def i_scghqa -7 "proxy", modify
label def i_scghqa 1 "Better than usual", modify
label def i_scghqa 2 "Same as usual", modify
label def i_scghqa 3 "Less than usual", modify
label def i_scghqa 4 "Much less than usual", modify
label values cg_scghqa cg_scghqa
label values cg_scghqd cg_scghqd
label values cg_scghqe cg_scghqe
Thank you in advance

Creating a Graph That Shows Treatment Status by Group Over Years of A Panel

Hi all,

I'm stuck on a graph I'm trying to figure out. I am in one of those difference-in-differences with staggered rollout situations, and I'm trying to create a graph that will nicely summarize the treatment status of each state in each year of my panel. I took a run at catplot and looked at the new waffle-plot command, but I don't think either of them can achieve what I'm looking for.

Here's a very rough Microsoft Paint sketch of what I have in my head (with blue meaning treated and light gray meaning untreated):

[attached: rough sketch of the desired treatment-status graph]

This is just a sketch, but my data is available here, and the setup is that each observation is a state in a fiscal year, with treatment status indicated by the "post" variable, which takes on values 0 or 1.

Any suggestions are most welcome!
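
One possible starting point (a sketch with built-in twoway, assuming a numeric, value-labeled state identifier stateid and a fiscal-year variable fy alongside the 0/1 post variable):

Code:
twoway (scatter stateid fy if post == 0, msymbol(square) mcolor(gs13))  ///
       (scatter stateid fy if post == 1, msymbol(square) mcolor(navy)), ///
       ylabel(1(1)50, valuelabel angle(0) labsize(tiny))                 ///
       ytitle("State") xtitle("Fiscal year")                             ///
       legend(order(2 "Treated" 1 "Untreated"))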


Estimate whether coefficient of different models are JOINTLY equal to zero

Dear network,

I have 10 independent variables (x1, x2, x3, x4, x5, x6, x7, x8, x9, and x10) and five dependent variables (y1, y2, y3, y4, y5). I run the following regressions:

Code:
regress y1 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
regress y2 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
regress y3 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
regress y4 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
regress y5 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10

I would like to test whether the coefficient on x1 is JOINTLY equal to 0 across all five models, and similarly for the rest of the coefficients. That is, I would like to run 10 joint tests.

Any suggestion?

Thanks!
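
One standard route (a sketch, assuming x1 through x10 are stored adjacently so that x1-x10 expands correctly) is to fit the five equations jointly with sureg and then test each coefficient across equations; note the joint test after sureg is a Wald chi-squared rather than an F-test:

Code:
sureg (y1 x1-x10) (y2 x1-x10) (y3 x1-x10) (y4 x1-x10) (y5 x1-x10)
test [y1]x1 [y2]x1 [y3]x1 [y4]x1 [y5]x1   // x1 jointly zero across the five equations
* repeat the test line for x2 through x10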

Saturday, April 23, 2022

DiD Regression with panel data measuring two factors before and after treatment

Hi all! I'd appreciate some help with running a regression in Stata. I am relatively new to Stata, so please bear with me.

I am trying to find the correct way to perform a staggered DiD regression where the outcome variable Y is the value of transactions over fortnightly time intervals in different countries, before and after a treatment event. With basic panel data I know to xtset in Stata by providing the panel id and time variable. However, in my data I am looking at two different types of transactions before and after the event (a control group and a treatment group, in the form of a dummy variable), which complicates the panel id.

"bifirst_case" represents the number of periods before the treatment takes effect in that country
"countryid" is the country in which the deal is taking place, and "pandemic2" is the dummy measuring whether a deal belongs to the control group or treatment group. "bimonth" is the time variable representing a fortnightly period. "logval" is log of "value" of transaction.

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input byte(bifirst_case bimonth) int countryid double logval byte pandemic2 int panelid double value
54 1  22    1.81827 0  32  6.1611904
55 1  48  2.7242925 0  75          15.245625
54 1  45 -.41551545 0  70                .66
54 1 112   2.442263 0 183 11.499035
57 1 104  3.5263605 0 168                 34
56 1  52   1.252763 0  82                3.5
57 1  75  -.8556661 0 117               .425
55 1  38    3.98549 0  60  53.8116
55 1 100  1.2919532 0 161 3.63988
55 1  12  2.2940488 0  16              9.915
55 1 111   .8916701 0 181 2.4392000
54 1  36   1.109034 0  57 3.0314
54 1   5  .59332687 0   5               1.81
53 1  25  3.2770064 0  38 26.49633
54 1 106 -2.4769385 0 172               .084
54 1  99  1.7084503 0 159 5.5203
55 1  35    -.47965 0  55               .619
57 1  79  2.2016592 0 125               9.04
54 1  56   1.926436 0  88              6.865
57 1  50   1.460087 0  79  4.30633
57 1  11  -3.218876 0  14                .04
54 1 114  -.8209805 0 186                .44
57 1  47   3.582963 0  74              35.98
54 1  95  1.7340715 0 151  5.663
57 1  78 -.25489226 0 123               .775
58 1  23  1.0030187 0  34 2.72649
57 1  31  2.3066518 0  48           10.04075
55 1  87  -3.218876 0 140                .04
55 1 103     1.9268 0 166             6.8675
55 1 100   2.484907 1 162                 12
54 1 114  1.0986123 1 187                  3
53 1  25   3.822098 1  39               45.7
57 1  11  -3.218876 1  15                .04
57 1  98   .8837675 1 158               2.42
55 1 111   2.944439 1 182                 19
54 1  22  1.9603295 1  33 7.10166
57 1  50 -.10314076 1  80               .902
55 1  12  1.2837077 1  17               3.61
55 1  87  -3.218876 1 141                .04
54 1  56   2.926382 1  89              18.66
54 1  36  1.1118575 1  58               3.04
55 1  53  1.9789304 1  85              7.235
57 1  17  1.9600948 1  24                7.1
54 1 112  2.0537095 1 184  7.796769
54 1  68   .9555115 1 107                2.6
54 1   5   1.974081 1   6                7.2
55 2 103  1.5534317 0 166  4.727666
57 2  84  -1.258781 0 134               .284
57 2  31  -.8891621 0  48               .411
58 2  55   .6523252 0  87               1.92
54 2  68  1.2933564 0 106              3.645
54 2  99   3.391551 0 159 29.7119
57 2  78   2.589642 0 123             13.325
54 2  56  1.8112516 0  88             6.1181
54 2   5  .11800523 0   5 1.12524999
55 2 100  .22314355 0 161               1.25
57 2  15   3.753027 0  21              42.65
57 2  93  -3.296837 0 148               .037
54 2 112  1.8774613 0 183  6.536888
57 2  49   1.050909 0  77            2.86025
57 2  17   .9243912 0  23 2.52033
57 2   6    .896088 0   7               2.45
54 2 106   2.683074 0 172              14.63
53 2  25  3.8385315 0  38  46.4571
55 2  53  1.3763703 0  84             3.9605
58 2 102  -.4684049 0 165               .626
56 2  52  2.5971186 0  82             13.425
54 2  95   .4187103 0 151               1.52
55 2 111   1.314249 0 181 3.721954
55 2  35  .13540463 0  55              1.145
54 2  22   2.206967 0  32  9.0881111
58 2  29  2.1494339 0  44               8.58
57 2  75  2.0524557 0 117  7.78700
57 2  50  -.3930426 0  79               .675
55 2  12   1.510722 0  16               4.53
57 2  83   .9038132 0 132 2.469000
55 2  48  1.0598137 0  75 2.8858
55 2  87  -.3047152 0 140  .737333
54 2  36  2.0859354 0  57            8.05212
54 2  45  2.3042505 0  70 10.01666
55 2  38  1.5882143 0  60 4.89500
57 2 104  1.0078211 0 168 2.7396
55 2 110 -1.2039728 1 180                 .3
54 2  36  1.1776289 1  58  3.246
54 2 112   2.398777 1 184 11.0
55 2 111   .9341307 1 182              2.545
54 2  56   1.113501 1  89              3.045
55 2  48   2.605894 1  76 13.5433
54 2  95  -.2015044 1 152              .8175
55 2 103  4.2195077 1 167                 68
55 2 100   .8754687 1 162                2.4
54 3 112   2.501697 0 183 12.2
55 3  35   2.912799 0  55           18.4
54 3  95   1.041272 0 151  2.8328
55 3 110   2.995732 0 179                 20
57 3  81  1.5040774 0 129                4.5
54 3  22  2.2976012 0  32  9.9502
54 3  56   1.907829 0  88  6.73
54 3 114   3.633895 0 186              37.86
57 3  50  -.4764242 0  79               .621
end

Would doing the following be correct for declaring panel data?

Code:
 egen panelid = group(countryid pandemic2)
xtset panelid bimonth

Thank you in advance


Panel Data Regression: reg vs xtreg

Hello Everyone,

I have panel data that I am trying to run a regression on, and I am confused about which regression command I should use: "reg" or "xtreg".

Below is a subsample of my data generated with the dataex command, but I couldn't upload it properly.

Thanks in advance
Code:
input str52 CompanyName int Year double(CAPEX NetPPE DepAmor)
"American Electric Power Company, Inc. (NasdaqGS:AEP)" 2010 2591 35674 1780
"American Electric Power Company, Inc. (NasdaqGS:AEP)" 2011 2794 36971 1792
"American Electric Power Company, Inc. (NasdaqGS:AEP)" 2012 3156 38763 1641
"American Electric Power Company, Inc. (NasdaqGS:AEP)" 2013 3802.1 40997 1572
"American Electric Power Company, Inc. (NasdaqGS:AEP)" 2014 4311 43635.1 1717.9
"American Electric Power Company, Inc. (NasdaqGS:AEP)" 2015 4605.3 46133.2 1819.3
"American Electric Power Company, Inc. (NasdaqGS:AEP)" 2016 5017.5 45639.3 1817.1
"American Electric Power Company, Inc. (NasdaqGS:AEP)" 2017 5799.3 50261.5 1838.2
"American Electric Power Company, Inc. (NasdaqGS:AEP)" 2018 6357 55099.1 2078.8
"American Electric Power Company, Inc. (NasdaqGS:AEP)" 2019 6143.7 61095.5 2429.3
"ALLETE, Inc. (NYSE:ALE)" 2010 248.9 1805.6 80.5
"ALLETE, Inc. (NYSE:ALE)" 2011 239.2 1982.7 90.4
"ALLETE, Inc. (NYSE:ALE)" 2012 405.8 2347.6 100.2
"ALLETE, Inc. (NYSE:ALE)" 2013 328.5 2576.5 116.6
"ALLETE, Inc. (NYSE:ALE)" 2014 598.5 3284.8 136.3
"ALLETE, Inc. (NYSE:ALE)" 2015 286.8 3669.1 167.5
"ALLETE, Inc. (NYSE:ALE)" 2016 265.6 3741.2 195.7
"ALLETE, Inc. (NYSE:ALE)" 2017 208.5 3822.4 182.1
"ALLETE, Inc. (NYSE:ALE)" 2018 312.4 3904.4 209.7
"ALLETE, Inc. (NYSE:ALE)" 2019 597.1 4405.6 212.3
"Amgen Inc. (NasdaqGS:AMGN)" 2010 580 5522 594
"Amgen Inc. (NasdaqGS:AMGN)" 2011 567 5420 680
"Amgen Inc. (NasdaqGS:AMGN)" 2012 689 5326 691
"Amgen Inc. (NasdaqGS:AMGN)" 2013 693 5349 644
"Amgen Inc. (NasdaqGS:AMGN)" 2014 718 5223 637
"Amgen Inc. (NasdaqGS:AMGN)" 2015 594 4907 608
"Amgen Inc. (NasdaqGS:AMGN)" 2016 738 4961 605
"Amgen Inc. (NasdaqGS:AMGN)" 2017 664 4989 604
"Amgen Inc. (NasdaqGS:AMGN)" 2018 738 4958 630
"Amgen Inc. (NasdaqGS:AMGN)" 2019 618 5397 635
"Anika Therapeutics, Inc. (NasdaqGS:ANIK)" 2010 2.79 37 1.31
"Anika Therapeutics, Inc. (NasdaqGS:ANIK)" 2011 1.4 36.5 1.82
"Anika Therapeutics, Inc. (NasdaqGS:ANIK)" 2012 1.51 35.1 2.5
"Anika Therapeutics, Inc. (NasdaqGS:ANIK)" 2013 .441 32.9 2.67
"Anika Therapeutics, Inc. (NasdaqGS:ANIK)" 2014 1.55 31.7 2.61
"Anika Therapeutics, Inc. (NasdaqGS:ANIK)" 2015 9.23 40.1 2.68
"Anika Therapeutics, Inc. (NasdaqGS:ANIK)" 2016 14 52.3 2.63
"Anika Therapeutics, Inc. (NasdaqGS:ANIK)" 2017 8.98 56.2 3.29
"Anika Therapeutics, Inc. (NasdaqGS:ANIK)" 2018 4.66 54.1 4.91
"Anika Therapeutics, Inc. (NasdaqGS:ANIK)" 2019 2.83 73.6 4.96
"ANSYS, Inc. (NasdaqGS:ANSS)" 2010 14.3 36.9 11
"ANSYS, Inc. (NasdaqGS:ANSS)" 2011 22.1 45.6 13.3
"ANSYS, Inc. (NasdaqGS:ANSS)" 2012 24 52.3 17.4
"ANSYS, Inc. (NasdaqGS:ANSS)" 2013 28.8 78.7 19.9
"ANSYS, Inc. (NasdaqGS:ANSS)" 2014 26 64.6 20.9
"ANSYS, Inc. (NasdaqGS:ANSS)" 2015 16.1 61.9 19.5
"ANSYS, Inc. (NasdaqGS:ANSS)" 2016 12.4 54.7 18.7
"ANSYS, Inc. (NasdaqGS:ANSS)" 2017 19.1 57.1 17.9
"ANSYS, Inc. (NasdaqGS:ANSS)" 2018 21.8 61.7 18.4
"ANSYS, Inc. (NasdaqGS:ANSS)" 2019 44.9 189.3 23.6
"APA Corporation (NasdaqGS:APA)" 2010 4922 38151 3083
"APA Corporation (NasdaqGS:APA)" 2011 7078 45448 4095
"APA Corporation (NasdaqGS:APA)" 2012 9464 53280 4955
"APA Corporation (NasdaqGS:APA)" 2013 9556 52421 4871
"APA Corporation (NasdaqGS:APA)" 2014 10964 48076 4526
"APA Corporation (NasdaqGS:APA)" 2015 4808 20838 3300
"APA Corporation (NasdaqGS:APA)" 2016 1949 18867 2618
"APA Corporation (NasdaqGS:APA)" 2017 2760 17759 2280
"APA Corporation (NasdaqGS:APA)" 2018 3904 18421 2405
"APA Corporation (NasdaqGS:APA)" 2019 2961 14158 2680
"The Boeing Company (NYSE:BA)" 2010 1125 8931 1746
"The Boeing Company (NYSE:BA)" 2011 1713 9313 1675
"The Boeing Company (NYSE:BA)" 2012 1703 9660 1811
"The Boeing Company (NYSE:BA)" 2013 2098 10224 1844
"The Boeing Company (NYSE:BA)" 2014 2236 11007 1906
"The Boeing Company (NYSE:BA)" 2015 2450 12076 1833
"The Boeing Company (NYSE:BA)" 2016 2613 12807 1889
"The Boeing Company (NYSE:BA)" 2017 1739 12672 2047
"The Boeing Company (NYSE:BA)" 2018 1722 12645 2114
"The Boeing Company (NYSE:BA)" 2019 1834 13684 2271
"Biogen Inc. (NasdaqGS:BIIB)" 2010 173.1 1641.6 146.8
"Biogen Inc. (NasdaqGS:BIIB)" 2011 208 1571.4 145.7
"Biogen Inc. (NasdaqGS:BIIB)" 2012 254.5 1742.2 163.4
"Biogen Inc. (NasdaqGS:BIIB)" 2013 246.3 1750.7 188.8
"Biogen Inc. (NasdaqGS:BIIB)" 2014 287.8 1765.7 198.3
"Biogen Inc. (NasdaqGS:BIIB)" 2015 643 2187.6 217.8
"Biogen Inc. (NasdaqGS:BIIB)" 2016 616.1 2501.8 263.8
"Biogen Inc. (NasdaqGS:BIIB)" 2017 867.4 3182.4 266.3
"Biogen Inc. (NasdaqGS:BIIB)" 2018 770.6 3601.2 269.3
"Biogen Inc. (NasdaqGS:BIIB)" 2019 514.5 3674.3 190.7
"Bristol0Myers Squibb Company (NYSE:BMY)" 2010 424 4664 473
"Bristol0Myers Squibb Company (NYSE:BMY)" 2011 367 4521 448
"Bristol0Myers Squibb Company (NYSE:BMY)" 2012 548 5333 235
"Bristol0Myers Squibb Company (NYSE:BMY)" 2013 537 4579 417
"Bristol0Myers Squibb Company (NYSE:BMY)" 2014 526 4417 392
"Bristol0Myers Squibb Company (NYSE:BMY)" 2015 820 4412 385
"Bristol0Myers Squibb Company (NYSE:BMY)" 2016 1215 4980 344
"Bristol0Myers Squibb Company (NYSE:BMY)" 2017 1055 5001 393
"Bristol0Myers Squibb Company (NYSE:BMY)" 2018 951 5027 392
"Bristol0Myers Squibb Company (NYSE:BMY)" 2019 836 6956 421
"Caterpillar Inc. (NYSE:CAT)" 2010 2586 12539 2220
"Caterpillar Inc. (NYSE:CAT)" 2011 3924 14395 2294
"Caterpillar Inc. (NYSE:CAT)" 2012 5076 16461 2426
"Caterpillar Inc. (NYSE:CAT)" 2013 4446 13078 2716
"Caterpillar Inc. (NYSE:CAT)" 2014 3379 12392 2798
"Caterpillar Inc. (NYSE:CAT)" 2015 3261 11888 2709
"Caterpillar Inc. (NYSE:CAT)" 2016 2928 10899 2708
"Caterpillar Inc. (NYSE:CAT)" 2017 2336 9823 2554
"Caterpillar Inc. (NYSE:CAT)" 2018 2916 9085 2435
"Caterpillar Inc. (NYSE:CAT)" 2019 2669 9230 2253