Wednesday, May 24, 2023

McNemar's chi2 test for proportion

Dear Community,
In an RCT design with 2 groups (control group, n=18; Exercise Training group, n=17) that were tested at baseline and at 2 months, I would like to know:
1/ whether the proportion of participants reporting symptoms such as "muscle pain" decreases significantly after the 2-month follow-up in each group;
2/ whether there is a significant difference in the change in proportions between the two groups.

For question 1/, I am not sure whether McNemar's test is the test I have to use.
For question 2/, I have no idea which variable and which test I can use.

Here is an example:
control group n=18; Exercise Training group n=17.
In the control group, 11 participants report muscle pain at baseline and 8 at the end of the study.
In the exercise training group, 11 participants report muscle pain at baseline and 6 at the end of the study.
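Not an authoritative answer, but for question 1/ McNemar's test can be run directly on paired binary indicators. A minimal sketch, assuming hypothetical variables pain_t0 and pain_t2 (0/1, one row per participant) and a numeric group identifier; if only the four cell counts of the paired 2x2 table are known, the immediate form mcci takes them directly:

Code:
* sketch only: McNemar's test within each group (variable names are placeholders)
mcc pain_t2 pain_t0 if group == 1    // control group
mcc pain_t2 pain_t0 if group == 2    // exercise training group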

thank you for your precious insight
Merci
F

Interpretation Magnitude Interaction Two Continuous Variables Economic Significance

Hi all,

I have read a few threads regarding this topic (e.g. https://www.statalist.org/forums/for...uous-variables) but have not yet found an answer to my question. Apologies if the identical question has already been answered.

I have a count dependent variable (Y) (linear TWFE results are shown, but marginal effects from Poisson are virtually identical), which I have regressed on two continuous variables (X1 and X2) and their interaction.

The summary statistics of the variables are:

Code:
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
  Y |    373,763    .1933016    2.363418          0        502
X1 |    373,763    .5695304    .6594848          0      4.364
X2 |    373,763   -2246.553     773.906   -3651.53    316.896
These are the results:

Code:
HDFE Linear regression                            Number of obs   =    373,763
Absorbing 3 HDFE groups                           F( 141,   7317) =    1955.95
Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                  R-squared       =     0.0578
                                                  Adj R-squared   =     0.0292
                                                  Within R-sq.    =     0.0016
Number of clusters (token1)  =      7,318         Root MSE        =     2.3286

                                                       (Std. err. adjusted for 7,318 clusters in ID)
--------------------------------------------------------------------------------------------------------
                                       |               Robust
                            Y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
---------------------------------------+----------------------------------------------------------------
                        X1 |  -.5015123   .1355351    -3.70   0.000    -.7672003   -.2358244
                                       |
c.X1#c.X2 |  -.0002419   .0000586    -4.13   0.000    -.0003567    -.000127

//The coefficient on X2 alone is perfectly collinear with the fixed-effects
I am fully aware that the economic significance of a coefficient depends on the field, research question, etc. However, what is the methodology / logic to assess the economic significance of an interaction term between two continuous variables?

For instance, one often compares the magnitude of the coefficient on a dummy variable to the mean of the dependent variable to assess economic significance and whether the effect is large enough to be "interesting". Similarly, what would one compare the coefficient on c.X1#c.X2 to in order to assess its magnitude?
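Not an answer, but a sketch of one bookkeeping exercise that may help frame the magnitude question, assuming the TWFE estimates above are still in memory: evaluate dY/dX1 at representative values of X2, scale by a one-SD change in X1, and compare with the mean of Y.

Code:
* sketch: marginal effect of X1 at the mean of X2 and one SD above it
summarize X2 if e(sample)
local x2m  = r(mean)
local x2sd = r(sd)
summarize X1 if e(sample)
local x1sd = r(sd)
summarize Y if e(sample)
local ym   = r(mean)

lincom X1 + `x2m'*c.X1#c.X2                     // dY/dX1 at mean X2
lincom X1 + (`x2m' + `x2sd')*c.X1#c.X2          // dY/dX1 at mean + 1 SD of X2

* a 1-SD move in X1, evaluated at X2 one SD above its mean, relative to mean Y
display `x1sd'*(_b[X1] + (`x2m' + `x2sd')*_b[c.X1#c.X2]) / `ym'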

Please let me know if my question is unclear; I would be happy to rephrase it.







Thoughts about Cross-lagged panel

I am currently working on the modeling for a paper and had assumed that I was in good shape with my model. However, after consulting my supervisor, doubts were sown. Since my supervisor has no experience at all with cross-lagged models, I am looking for help here. My data consist of 3 waves in which I measure different constructs. Each of these constructs enters my model as a latent variable. I have simplified my model in the attached picture.
I have a set of questions which form my independent latent construct X. I measured it, just like the mediator M and the dependent latent variable Y (also sets of questions), at 3 time points. I had previously read up on cross-lagged models; however, in such a model the relationship Y-->X would also be modeled, which is not supposed to be the case for me. Additionally, in a cross-lagged model no effects within a wave are modeled.
My results for this model are very good and fit my hypotheses. Nevertheless, my supervisor said that he has not seen within-wave effects like this in such a model and wonders whether it can be done this way. The missing relation Y-->X is not a problem for him, but he still questions whether we may call it a cross-lagged model and whether the modeling can be done that way at all.
This would be my first question to the community. My second one concerns my supervisor's remark that, if such a model is possible, I would still have to include control variables such as age. Would this even be possible within the framework of this model? Adding a control variable is technically fine, but I don't know where to insert it. The only thing that comes to mind is as an additional mediator.
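Not an answer to whether this may still be called cross-lagged, but a bare-bones sketch of how the structural part with within-wave paths and a control variable might look in Stata. Everything here is an assumption for brevity: the waves are written as observed scores X1-X3, M1-M3, Y1-Y3 (the latent measurement models are omitted), and age enters simply as an exogenous predictor of the endogenous variables rather than as an extra mediator.

Code:
* sketch only: autoregressive paths plus within-wave X -> M -> Y, age as control
sem (M1 <- X1 age) (Y1 <- M1 X1 age)                        ///
    (X2 <- X1 age) (M2 <- M1 X2 age) (Y2 <- Y1 M2 X2 age)   ///
    (X3 <- X2 age) (M3 <- M2 X3 age) (Y3 <- Y2 M3 X3 age)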

Thanks for your help in advance.
Christopher


[attached: simplified diagram of the model]

Tuesday, May 23, 2023

Order variables from left to right based on their latest value

I would like to order the variables (as with the Stata command "order") from left to right based on the latest observation (here, the only observation of each variable that matters is the one corresponding to w_date == "2022w46"). Here is the dataset:
Code:
clear all
input str7 w_date cpd cpdpf cpdpm
"2022w44" -.3522595 -.15837106 .45831277
"2022w45" -.05552628 .00728419 .63357966
"2022w46" -.04414876 .08082671 .65427268
end

In other words, I would like to automate ordering the variables as follows:
Code:
order w_date cpdpm cpdpf cpd
To give a bit of context, I have 100+ variables that are cumulative returns. What matters in my context is the highest generated return at the latest date of the sample, in this example 2022w46. I would like to see the variables that generate the highest returns from left to right on the screen so that I can select them later.

Can you help me automate that?
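A minimal sketch of one way to automate this (untested), assuming the data are sorted by w_date so the last row holds the latest observation and that w_date is the only non-return variable: transpose the last row with xpose, sort the variables by that value, and feed the resulting list to order.

Code:
preserve
keep if _n == _N                 // latest week only (here 2022w46)
drop w_date
xpose, clear varname             // one row per original variable, value in v1
gsort -v1                        // highest latest return first
local neworder
forvalues i = 1/`=_N' {
    local neworder `neworder' `=_varname[`i']'
}
restore
order w_date `neworder'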

Rolling windows VAR IRF and graphs

Dear Stata users,
I am trying to implement the var (vector autoregression) and panel VAR (pvar) Stata commands using rolling windows. While they work perfectly on their own, I am not able to produce rolling impulse-response function graphs or to check stability and Granger causality under rolling windows.
The code I am using is

rolling, window(12) clear : pvar var1 var2 var3 vr4 var5 var6, lags(1)


pvarirf, step(12) impulse(var1 var2 var3) response(vr4 var5 var6) cum oirf mc(2000)

However, I have difficulty when using the -rolling- prefix.

I would be grateful if you could point me in the right direction on how to obtain rolling impulse responses. It will probably need to be programmed manually.
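Not a definitive solution, but a rough manual sketch of the kind of loop that could replace the -rolling- prefix, assuming the panel is xtset with a consecutive integer time variable t; the pvarstable, pvargranger, and pvarirf calls are simply re-issued window by window.

Code:
summarize t, meanonly
local tmin = r(min)
local tmax = r(max)
forvalues s = `tmin'/`=`tmax'-11' {
    preserve
    quietly keep if inrange(t, `s', `s' + 11)
    capture noisily pvar var1 var2 var3 vr4 var5 var6, lags(1)
    if _rc == 0 {
        pvarstable                       // stability check for this window
        pvargranger                      // Granger causality for this window
        pvarirf, step(12) impulse(var1 var2 var3) response(vr4 var5 var6) cum oirf mc(200)
        graph export irf_window_`s'.png, replace
    }
    restore
}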

Thank you so much for your time and consideration. I could provide data should that be necessary.

Best regards,
Mario

Repeated measures ANOVA

Dear Statalist,

I'm currently trying to replicate a piece of analysis originally done on SPSS in Stata 17 (Mac OS); specifically, a repeated-measures ANOVA.

The original SPSS code is:

Code:
GLM confrontT1 confrontT3 WITH changeconfrontCOVID  
/WSFACTOR=toename_confront_speed 2 Polynomial    
  /METHOD=SSTYPE(3)    
   /EMMEANS=TABLES(toename_confront_speed) WITH(changeconfrontCOVID=-30)COMPARE ADJ(LSD)    
   /EMMEANS=TABLES(toename_confront_speed) WITH(changeconfrontCOVID=0)COMPARE ADJ(LSD)    
   /EMMEANS=TABLES(toename_confront_speed) WITH(changeconfrontCOVID=30)COMPARE ADJ(LSD)  
 /PRINT=DESCRIPTIVE ETASQ    
 /CRITERIA=ALPHA(.05)  
 /WSDESIGN=toename_confront_speed    
 /DESIGN=changeconfrontCOVID.
The variables confrontT1 and confrontT3 are measures of the intention to confront individuals who violate a social norm, measured at times 1 and 3; changeconfrontCOVID is a control variable about perceptions of a Covid-related norm.

The analysis was done on a dataset in wide format. I first reshaped the dataset into long format. The variables "confrontT1" and "confrontT3" are now "confrontT", "real_id" is an individual identifier, and "t" is time. The variable changeconfrontCOVID is a non-integer ranging from -100 to 100. A sample of the long format dataset is below,

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input double confrontT float real_id byte t double changeconfrontCOVID
  5  3 1 -29.5
  4  3 3 -29.5
  6  5 1    18
  6  5 3    18
4.5  7 1     .
  1  7 3     .
  6 19 1  -7.5
  7 19 3  -7.5
1.5 21 1    30
1.5 21 3    30
  5 23 1   -65
3.5 23 3   -65
  1 24 1  22.5
  1 24 3  22.5
  4 32 1    62
end
Based on the Stata documentation, I believe the correct syntax should be:

Code:
anova confrontT c.changeconfrontCOVID / i.real_id | c.changeconfrontCOVID t c.changeconfrontCOVID#t, repeat(t)
However, I get the error message

invalid interaction specification;
'|' requires a factor variable on each side
If I remove the factor-variable operators and type real_id | changeconfrontCOVID, I get the following error message

changeconfrontCOVID: factor variables may not contain noninteger values
I was wondering if you could tell me how to work around this issue. I would be grateful for any help.
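Not a literal replication of the SPSS GLM, but one workaround sketch (an assumption on my part) is to cast the same design as a mixed model, where a continuous between-subject covariate poses no problem; the EMMEANS lines would then correspond to margins at fixed covariate values.

Code:
mixed confrontT i.t##c.changeconfrontCOVID || real_id:, reml
margins t, at(changeconfrontCOVID = (-30 0 30))
marginsplot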

Best regards,
Miguel Fonseca

Order data in descending order

Dear all,

I use the TNIC data from Hoberg & Phillips and want to sort the scores per competitor_rank1 group (i.e., per gvkey1) in descending order, in order to keep only the top 5 competitors for each gvkey1. I am not experienced with Stata and tried the following code, but the score still ends up sorted in ascending order:

clear
cd "C:\Users\etc"
use tnic3
sort gvkey1 score
gen competitor_rank = _n
sort competitor_rank gvkey1
egen competitor_rank1 = group(gvkey1)
bysort competitor_rank1 (score): gen descending_order = _n
sort competitor_rank1 -descending_order

alternative:
use tnic3
sort gvkey1 score
gen competitor_rank= _n
sort competitor_rank gvkey1
egen competitor_rank1=group(gvkey1)
sort -score competitor_rank1

(here the error is "- invalid name", even though the column is named "score")
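For what it's worth, a minimal sketch of how the top 5 could be kept (assuming a higher score means a closer competitor): gsort can sort in descending order, which plain sort cannot, and the within-gvkey1 rank then comes from _n.

Code:
use tnic3, clear
gsort gvkey1 -score
by gvkey1: gen competitor_rank = _n
keep if competitor_rank <= 5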


Any help, however small would be greatly appreciated! Thank you in advance!

generating many dummy variables with the var name and label name

Hi all,
Although I have gone through some of the website links, I could not solve my issue, so I am posting here to seek help. My issue is that I need to generate many new dummy variables from a categorical variable. For example, the variable name is districtname_main, which is a string variable with 32 districts (see the attached picture). I have to turn all of them into new binary dummy variables, with the variable name and label name of that particular district. I tried the following command, but it did not work.

foreach var of districtname_main {
gen dist_`var'=0
replace dist_`var'=1 if districtname_main ==real("`var'")
}

I also tried many other methods suggested on various websites, but nothing worked. I wonder if someone could help me out of this issue.
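A minimal sketch of one way this is often done, assuming districtname_main is a string variable: loop over its distinct values with levelsof, and use strtoname() to turn each district name into a legal variable name.

Code:
levelsof districtname_main, local(districts)
foreach d of local districts {
    local vname = strtoname("dist_`d'")
    generate byte `vname' = (districtname_main == "`d'")
    label variable `vname' "`d'"
}

* built-in alternative with numbered names and automatic labels:
* tabulate districtname_main, generate(dist_)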

Normalize variable as an expanding window

Hi,

I would like to normalize the price series below as an expanding window, taking into account all information up to time t (but not after t, to avoid look-ahead bias). So far, the code I have takes into account the entire dataset and hence induces a look-ahead bias. Can someone help me edit the code? Thanks.

Code:
clear
local ticker "BTC-USD"
getsymbols `ticker', fm(1) fd(1) fy(2012) lm(12) frequency(d) price(adjclose) yahoo clear
keep period p_adjclose_BTC_USD

* Normalization:
local To_Norm p_adjclose_BTC_USD
foreach var in `To_Norm' {
    sum `var', meanonly 
    gen n_`var' = (`var' - r(min)) / (r(max) - r(min)) 
}
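A sketch of one way to replace the normalization block with an expanding-window version (untested), assuming the data are sorted by period: keep running minima and maxima so each row uses only information up to time t.

Code:
sort period
local To_Norm p_adjclose_BTC_USD
foreach var in `To_Norm' {
    generate double emin_`var' = `var'[1]
    generate double emax_`var' = `var'[1]
    replace emin_`var' = min(emin_`var'[_n-1], `var') if _n > 1
    replace emax_`var' = max(emax_`var'[_n-1], `var') if _n > 1
    * first row is missing by construction (running min == running max)
    generate double en_`var' = (`var' - emin_`var') / (emax_`var' - emin_`var')
}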

Monday, May 22, 2023

How do I find out which Excel file was imported into Stata (origin of the data)?

Hey all,


Months ago, I imported an Excel file into Stata for analysis. I would now like to do this again in a new .dta file, but I cannot remember which Excel file I imported. Is there a way to find this out through the history?
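Stata does not record the source file in the .dta by default, but a couple of places sometimes hold clues; a small sketch, where "analysis.dta" stands in for your file: any notes or characteristics saved with the dataset, and any log or cmdlog files from that session, which would contain the original import excel line.

Code:
use analysis.dta, clear   // hypothetical filename
notes                     // lists any notes saved with the dataset
char list                 // lists dataset and variable characteristics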


Thank you,

Rajiv

Query on ordering coefficients from multiple models when using coefplot

Hi All

We need help with the coefplot command. Briefly, we have run a series of logistic regression models (5 models in total, with a loop to make it easier). These regression models also include interaction terms between two categorical variables (ACEscore and sexual). This is followed by 'margins' to obtain predicted probabilities for the interaction categories/groups:

Code:
foreach var of varlist kesscat docdep selfharm suicide victim {
svy: logit `var' i.ACEscore##i.sexual 
eststo `var': margins i.ACEscore#i.sexual, post
}
We then use 'coefplot' to plot just the predicted probabilities in a figure:

Code:
coefplot kesscat docdep selfharm suicide victim
By default, coefplot orders the estimates/margins in the figure (see attached) according to the 'ACEscore' variable's categories. However, we would like to order the margins by model (kesscat docdep selfharm suicide victim). It doesn't seem like the order function is helpful here (unless we're doing something wrong!).

Any tips would be really helpful!

Many thanks
/Amal


Reminder: UK Stata Conference submission deadline 26 May

UK Stata Conference, 7-8 September 2023: reminder

I'm bumping the thread at https://www.statalist.org/forums/for...and-first-call to remind you of the upcoming submission deadline.

Please follow that link for information about how to submit. Registration information will be coming shortly.

The headline new information is that the venue has had to be changed from UCL (where we were last year) to the LSE's Marshall Building -- further details to come. (As an LSE faculty member, I assure you that it's a great venue -- brand new -- and very conveniently located. See: https://www.lse.ac.uk/lse-information/campus-map)

We look forward to hearing from you,
Stephen (and Tim and Roger)

Stata Command: including interaction terms of the endogenous variable in 2SLS using xtivreg

Hi,

I am wondering if anyone had any experience in including an interaction term in 2SLS using xtivreg?

I currently have a working 2SLS specification, not including any interaction terms, as follows:

Code:
xtivreg PatMV_real_log Lev CAPX_AT Size rd_log sale_log number_log (inverse_D = toughness_normalized) i.fyear, fe
inverse_D is the endogenous variable and toughness_normalized is the instrumental variable.

This is successful, and I have the results like this:

Code:
Fixed-effects (within) IV regression            Number of obs     =      5,066
Group variable: all_cluster                     Number of groups  =        554

R-squared:                                      Obs per group:
     Within  =      .                                         min =          1
     Between = 0.2823                                         avg =        9.1
     Overall = 0.3891                                         max =         31

                                                Wald chi2(37)     =   19922.00
corr(u_i, Xb) = 0.0856                          Prob > chi2       =     0.0000

------------------------------------------------------------------------------
PatMV_real~g | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
   inverse_D |   13.87709   5.667382     2.45   0.014     2.769226    24.98496
         Lev |   .0032969   .0029638     1.11   0.266     -.002512    .0091058
     CAPX_AT |   1.526791   .6927957     2.20   0.028     .1689361    2.884645
        Size |   .0862929   .0621718     1.39   0.165    -.0355617    .2081474
      rd_log |   -.084102   .3179503    -0.26   0.791    -.7072731     .539069
    sale_log |   .1517564   .0322944     4.70   0.000     .0884605    .2150523
  number_log |   .0404435   .0334691     1.21   0.227    -.0251548    .1060417
             |
       fyear |
       1986  |  -.1775167    .298694    -0.59   0.552    -.7629462    .4079129
       1987  |  -.0519888   .3001776    -0.17   0.862    -.6403262    .5363485
       1988  |  -.0481431   .3076404    -0.16   0.876    -.6511073     .554821
       1989  |   .0163748   .2844115     0.06   0.954    -.5410614     .573811
       1990  |  -.0718744   .2990964    -0.24   0.810    -.6580925    .5143437
       1991  |  -.2075746   .3433348    -0.60   0.545    -.8804984    .4653492
       1992  |  -.2268283   .3502615    -0.65   0.517    -.9133282    .4596715
       1993  |   .1302594   .3597807     0.36   0.717    -.5748978    .8354166
       1994  |    .394627    .391701     1.01   0.314    -.3730928    1.162347
       1995  |  -.2333602   .4519216    -0.52   0.606     -1.11911    .6523898
       1996  |   .1275317   .4482286     0.28   0.776    -.7509803    1.006044
       1997  |  -.1892146     .55809    -0.34   0.735    -1.283051    .9046217
       1998  |  -.3070482   .5836953    -0.53   0.599     -1.45107    .8369736
       1999  |  -.6370051   .6263369    -1.02   0.309    -1.864603    .5905927
       2000  |  -.8418151   .6641659    -1.27   0.205    -2.143556    .4599261
       2001  |  -1.119607   .7252079    -1.54   0.123    -2.540988    .3017744
       2002  |  -1.545625   .8150542    -1.90   0.058    -3.143102    .0518518
       2003  |  -1.636655   .8022847    -2.04   0.041    -3.209104   -.0642054
       2004  |  -1.786408   .8443835    -2.12   0.034    -3.441369   -.1314466
       2005  |  -1.878826   .8648506    -2.17   0.030    -3.573902   -.1837496
       2006  |  -2.023935   .8749888    -2.31   0.021    -3.738882   -.3089884
       2007  |  -2.223262   .9041965    -2.46   0.014    -3.995455   -.4510695
       2008  |  -2.062127   .8859147    -2.33   0.020    -3.798488   -.3257664
       2009  |   -2.24123   .9245712    -2.42   0.015    -4.053357   -.4291043
       2010  |  -2.249556   .9657848    -2.33   0.020    -4.142459   -.3566523
       2011  |  -2.138431    .923046    -2.32   0.021    -3.947568   -.3292945
       2012  |  -2.460573   .9528901    -2.58   0.010    -4.328203    -.592943
       2013  |  -3.264231   .9082721    -3.59   0.000    -5.044412   -1.484051
       2014  |  -4.738765   .8172058    -5.80   0.000    -6.340459   -3.137071
       2015  |  -7.135273     .62943   -11.34   0.000    -8.368933   -5.901613
             |
       _cons |   1.398011    .499853     2.80   0.005     .4183173    2.377705
-------------+----------------------------------------------------------------
     sigma_u |  1.6160709
     sigma_e |  1.3406031
         rho |   .5923664   (fraction of variance due to u_i)
------------------------------------------------------------------------------
 F test that all u_i=0: F(553,4475) =     4.74            Prob > F    = 0.0000
------------------------------------------------------------------------------
Instrumented: inverse_D
 Instruments: Lev CAPX_AT Size rd_log sale_log number_log 1986.fyear
              1987.fyear 1988.fyear 1989.fyear 1990.fyear 1991.fyear
              1992.fyear 1993.fyear 1994.fyear 1995.fyear 1996.fyear
              1997.fyear 1998.fyear 1999.fyear 2000.fyear 2001.fyear
              2002.fyear 2003.fyear 2004.fyear 2005.fyear 2006.fyear
              2007.fyear 2008.fyear 2009.fyear 2010.fyear 2011.fyear
              2012.fyear 2013.fyear 2014.fyear 2015.fyear toughness_normalized
And now, I want to add an interaction with log_analysts. I found in previous posts that creating new variables for the interaction terms is recommended, so I did:

Code:
gen int_log_analysts = inverse_D*log_analysts

gen int_log_analysts_iv = toughness_normalized*log_analysts
And then I suppose that my code should be:

Code:
xtivreg PatMV_real_log Lev CAPX_AT Size rd_log sale_log number_log log_analysts (inverse_D int_log_analysts= toughness_normalized int_log_analysts_iv) i.fyear, fe
I wonder if this is correct. In particular, I want to know whether the variable 'log_analysts' should appear outside the parentheses, and more importantly, whether this means that I am using one instrumental variable, 'toughness_normalized', for one endogenous variable, 'inverse_D'. I was worried that this implies using 2 instruments for 2 endogenous variables, with 2x2 = 4 first-stage estimations.
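For what it's worth, with two variables inside the parentheses xtivreg treats both inverse_D and int_log_analysts as endogenous and runs two first-stage regressions, each using both excluded instruments (not 2x2 = 4). A small sketch, assuming the -first- option behaves as in official xtivreg, to inspect them:

Code:
xtivreg PatMV_real_log Lev CAPX_AT Size rd_log sale_log number_log log_analysts ///
    (inverse_D int_log_analysts = toughness_normalized int_log_analysts_iv)     ///
    i.fyear, fe first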

For more information, I am trying to mimic the code using ivreg2:

Code:
ivreg2 y w (x c.x#c.w= z c.z#c.w)
where w is the interaction variable, x is the endogenous variable, and z is the instrument. I don't know whether it will be the same in xtivreg, though. Any other suggestions or recommendations would be much appreciated.

Many Thanks,
Harry

Sunday, May 21, 2023

number increased or decreased?

Dear All, I found this question here (in Chinese). The variable x denotes the product.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input id year str1 x
1 2011 "a"
1 2011 "b"
1 2012 "a"
1 2012 "d"
1 2013 "a"
1 2013 "d"
1 2013 "c"
2 2011 "c"
2 2011 "d"
2 2012 "c"
2 2012 "a"
2 2013 "c"
2 2013 "b"
end
The question is how to obtain (1) the number of newly added products and (2) the number of retired products, compared with the previous year, for each firm (id) and each year. Any suggestions are highly appreciated.
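A sketch of one possible approach (untested): shift last year's product list forward one year and merge it against the current year's list, so unmatched rows identify additions and retirements. The first sample year (nothing to compare with) and the artificial year after the last observed one would need to be handled separately.

Code:
tempfile prev
preserve
    keep id year x
    duplicates drop
    replace year = year + 1        // last year's products, shifted to this year
    save `prev'
restore
duplicates drop id year x, force
merge 1:1 id year x using `prev'
gen byte added   = (_merge == 1)   // in this year's list but not last year's
gen byte retired = (_merge == 2)   // in last year's list but not this year's
collapse (sum) n_added = added n_retired = retired, by(id year)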

inter-religious marriage

Dear sir

I am working with Census data. To make it simple, consider a database as follows:
id   family_id   position (in family)   religion
1    1           head (1)               A
2    1           partner (2)            A
3    1           son (3)                A
4    2           head (1)               A
5    2           partner (2)            B
6    3           head (1)               A
7    4           head (1)               B
8    4           partner (2)            A
9    4           son (3)                A
I need to count the number of same-religion and different-religion marriages.

In this very simple database above, the result should be: AA = 1; AB = 1; BA = 1.

So far I have managed to create a new variable "position_religion" (1-A; 2-A; 3-A; 1-A; 2-B; ...).

I guess I have to create another new variable, assigning the position_religion of the head of the family to all the other ids in the same family_id. If I manage to do that, a simple frequency table will provide the result.

Could you please help me in creating this new variable?
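A minimal sketch of that step, assuming a numeric position variable where 1 = head and 2 = partner, and religion stored as a string:

Code:
gen head_religion = religion if position == 1
bysort family_id (position): replace head_religion = head_religion[1]

* cross-tabulate the head's religion against the partner's
tab head_religion religion if position == 2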

Thanks in advance
Sergio Goldbaum

Problem with csdid command

I have the following dataset:

Code:
id_municipio    ano    uf    munic    imposto_renda    taxpayers    ano_3g    ano_4g    treatment_3g
1200013    2013    AC    Acrelândia    430014    128    2011    2018    2011
1200054    2013    AC    Assis Brasil    205203    97    2015    2021    0
1200104    2013    AC    Brasiléia    1708153    361    2015    2017    0
1200138    2013    AC    Bujari    243841    89    2010    2017    2010
1200179    2013    AC    Capixaba    236769    102    2015    2019    0
1200203    2013    AC    Cruzeiro do Sul    8797109    1031    2008    2016    2008
1200252    2013    AC    Epitaciolândia    2199140    252    2013    2017    2013
1200302    2013    AC    Feijó    447522    282    2015    2017    0
1200328    2013    AC    Jordão    36289    35    2016    2020    0
1200336    2013    AC    Mâncio Lima    597157    91    2011    2017    2011
1200344    2013    AC    Manoel Urbano    105124    66    2016    2021    0
1200351    2013    AC    Marechal Thaumaturgo    47086    20    2016    2020    0
1200385    2013    AC    Plácido de Castro    262876    146    2015    2019    0
1200807    2013    AC    Porto Acre    145717    87    2014    2018    0
1200393    2013    AC    Porto Walter    23606    15    2016    2021    0
1200401    2013    AC    Rio Branco    1.594e+08    10252    2008    2014    2008
1200427    2013    AC    Rodrigues Alves    54578    63    2015    2018    0
1200435    2013    AC    Santa Rosa do Purus    51197    34    2016    2020    0
1200500    2013    AC    Sena Madureira    1006259    347    2015    2017    0
1200450    2013    AC    Senador Guiomard    1706722    327    2011    2017    2011
I am trying to obtain a DiD estimator à la Callaway and Sant'Anna by running the following command:

Code:
clear

import delimited "C:\Users\mateu\OneDrive\Documentos\base_tax.csv"

keep if ano >= 2004 & ano <= 2013

generate treatment_3g = 0
replace treatment_3g = ano_3g if ano >= ano_3g

xtset id_municipio ano

csdid imposto_renda , ivar(id_municipio) time(ano) gvar(treatment_3g) method(dripw)
However, I obtain the following result:

Code:
 csdid imposto_renda , ivar(id_municipio) time(ano) gvar(treatment_3g) method(dripw)
Panel is not balanced
Will use observations with Pair balanced (observed at t0 and t1)
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxx
Difference-in-difference with Multiple Time Periods

                                                             Number of obs = 0
Outcome model  : least squares
Treatment model: inverse probability
------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
g2008        |
 t_2004_2005 |          0  (omitted)
 t_2005_2006 |          0  (omitted)
 t_2006_2007 |          0  (omitted)
 t_2007_2008 |          0  (omitted)
 t_2007_2009 |          0  (omitted)
 t_2007_2010 |          0  (omitted)
 t_2007_2011 |          0  (omitted)
 t_2007_2012 |          0  (omitted)
 t_2007_2013 |          0  (omitted)
-------------+----------------------------------------------------------------
g2009        |
 t_2004_2005 |          0  (omitted)
 t_2005_2006 |          0  (omitted)
 t_2006_2007 |          0  (omitted)
 t_2007_2008 |          0  (omitted)
 t_2008_2009 |          0  (omitted)
 t_2008_2010 |          0  (omitted)
 t_2008_2011 |          0  (omitted)
 t_2008_2012 |          0  (omitted)
 t_2008_2013 |          0  (omitted)
-------------+----------------------------------------------------------------
g2010        |
 t_2004_2005 |          0  (omitted)
 t_2005_2006 |          0  (omitted)
 t_2006_2007 |          0  (omitted)
 t_2007_2008 |          0  (omitted)
 t_2008_2009 |          0  (omitted)
 t_2009_2010 |          0  (omitted)
 t_2009_2011 |          0  (omitted)
 t_2009_2012 |          0  (omitted)
 t_2009_2013 |          0  (omitted)
-------------+----------------------------------------------------------------
g2011        |
 t_2004_2005 |          0  (omitted)
 t_2005_2006 |          0  (omitted)
 t_2006_2007 |          0  (omitted)
 t_2007_2008 |          0  (omitted)
 t_2008_2009 |          0  (omitted)
 t_2009_2010 |          0  (omitted)
 t_2010_2011 |          0  (omitted)
 t_2010_2012 |          0  (omitted)
 t_2010_2013 |          0  (omitted)
-------------+----------------------------------------------------------------
g2012        |
 t_2004_2005 |          0  (omitted)
 t_2005_2006 |          0  (omitted)
 t_2006_2007 |          0  (omitted)
 t_2007_2008 |          0  (omitted)
 t_2008_2009 |          0  (omitted)
 t_2009_2010 |          0  (omitted)
 t_2010_2011 |          0  (omitted)
 t_2011_2012 |          0  (omitted)
 t_2011_2013 |          0  (omitted)
-------------+----------------------------------------------------------------
g2013        |
 t_2004_2005 |          0  (omitted)
 t_2005_2006 |          0  (omitted)
 t_2006_2007 |          0  (omitted)
 t_2007_2008 |          0  (omitted)
 t_2008_2009 |          0  (omitted)
 t_2009_2010 |          0  (omitted)
 t_2010_2011 |          0  (omitted)
 t_2011_2012 |          0  (omitted)
 t_2012_2013 |          0  (omitted)
------------------------------------------------------------------------------
Control: Never Treated

See Callaway and Sant'Anna (2021) for details

. 
end of do-file
I would appreciate it if FernandoRios could help me.
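One thing that may be worth checking (offered only as a sketch of the usual setup, not a diagnosis): for csdid, gvar is expected to be constant within each municipality, equal to the first treatment period for treated units and 0 for units never treated within the sample window, rather than switching from 0 to ano_3g over time.

Code:
* assumption: municipalities with ano_3g > 2013 are treated as never treated
* within the 2004-2013 window
generate gvar_3g = cond(ano_3g <= 2013, ano_3g, 0)
csdid imposto_renda, ivar(id_municipio) time(ano) gvar(gvar_3g) method(dripw)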




Hausman test after reghdfe with two-way cluster

Hello,

I am a complete novice in terms of Stata and have encountered a challenge I can’t seem to overcome. I have tried searching the forum (and the web) for answers but haven’t found one that lets me overcome the challenge.

I have a two-way fixed effects model with two-way clustering using reghdfe on panel data with T = 10 and N = 423. To test the use of FE I would like to run a Hausman test. However, I can't seem to figure out how to run a Hausman test with two-way clustering, nor am I sure how to run an equivalent model with RE since I am using reghdfe.

Code:
. reghdfe BVLEV_1 L.INNO L.SIZE L.AGE L.TANG L.PROF L.GRTH L.NDTS L.MrktD c.L.INNO#i.L.MrktD, absorb(Year FIRM) vce(cluster Year FIRM)
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(BVLEV_1 INNO) double(SIZE AGE) float(TANG PROF GRTH NDTS MrktD)
        .         0 25.140104442084407  4.736198448394496   .2805809   .1225458  1.728044          0 1
1.0106446         0 25.271209371482893   4.74493212836325  .26030242    .102776 1.0134124          0 1
 .9059817  .6931472  25.31613062108747 4.7535901911063645  .24522354  .12964672 1.2477313          0 1
1.0244187         0 25.199010475505425  4.762173934797756   .2690079  .09200912  1.211202          0 1
 .9241343         0 25.212149367154357  4.770684624465665  .25968078  .09518524  .9013919          0 1
 .9028375         0 25.177729405521426   4.77912349311153   .2600798  .07181802  .9193362          0 1
 .8666971         0  25.12733523500807  4.787491742782046  .25805798   .1064541 1.3683108          0 1
 .7276743         0 25.279035672649883  4.795790545596741   .2284465   .1694506 1.6841967          0 1
 .6681173         0 25.342871330740614  4.804021044733257  .21429476  .15791163 1.3381064          0 1
 .3958571         0 25.371792403193993  4.812184355372417  .23927324  .11115817  1.897365          0 1
        .         0 25.533537795756093  4.736198448394496  .07599856  .07023368  .7052656          0 1
1.1333276         0  25.54093107974409   4.74493212836325  .08478917   .1016431 .56339264          0 1
1.2531534         0  25.59322043341546 4.7535901911063645   .0899643  .04553749  .4952209          0 1
1.2157818         0 25.646176772676757  4.762173934797756  .08490727  .06337555  .6159959          0 1
1.2367824         0  25.69303746899461  4.770684624465665   .0767672  .05830297  .7429931          0 1
1.2092685         0 25.762287725487898   4.77912349311153  .06659363  .06440251  .6952531          0 1
1.0835882         0 25.717032993922302  4.787491742782046  .06419417  .06779025   .830492          0 1
1.2075226         0  25.79551036601393  4.795790545596741 .062812395  .04183229  .6352618          0 1
 1.202427         0  25.87415570568361  4.804021044733257  .06573743  .04855713  .4949357          0 1
1.1884935         0 25.879080258005892  4.812184355372417  .10492945  .05894396  .6899122          0 1
        .  5.755742 24.196535787431234  4.736198448394496  .14523625  .08608548 1.1286103          0 1
 .6924891  6.234411 24.136589398439337   4.74493212836325  .13476273  .05329347  .6220146          0 1
 .7580355  5.958425 24.152208078949545 4.7535901911063645   .1254282  .05762918  .8005841          0 1
 .7836558  5.720312  24.13471053509311  4.762173934797756  .13485539   .0600852  .8276075          0 1
 .7642021  6.570883 24.215400213631103  4.770684624465665  .15209243  .05421892 1.1268517          0 1
 .7537746  6.526495   24.3121315669416   4.77912349311153   .1557181  .09528464 1.0803771          0 1
 .7622334  6.122493 24.306479173183142  4.787491742782046   .1659288   .0975802 1.2334186          0 1
 .7618092  6.113682  24.39860402051902  4.795790545596741  .16392794  .10700773 1.2668626          0 1
 .8007717  5.940171 24.440441160239498  4.804021044733257   .1601782  .05361722  .9788643          0 1
 .8087863  4.867535 24.472299213903636  4.812184355372417          0   .0878969 1.0270094          0 1
        .         0  21.31712734644807 4.7535901911063645   .6161835 .035405505  .4728191   .0820877 0
 .4618512         0 21.337435284753997  4.762173934797756  .55176055  .04723222 .56485796  .07206534 0
 .5941403         0  21.36078455087655  4.770684624465665   .6694127  .05526852  .5964243     .04811 0
 .6679801         0   21.4577257720642   4.77912349311153     .60712  .08645748  .5824009  .04505779 0
        .         0 23.695985336242494  4.736198448394496   .7190117  .04887533   .553095  .03741924 1
 .7705187         0  23.86249221018245   4.74493212836325   .7456001  .14748636  .4454397 .033855498 1
  .682672  1.609438  23.65903576938081 4.7535901911063645   .3385791  .04103007 .43538275 .035442423 1
 .6286231  2.484907 23.587521747769042  4.762173934797756   .3293337 .029031644  .5320368  .03727587 1
.59591955  2.564949 23.578585956841195  4.770684624465665   .3091892  .03524181  .6112404 .034720317 1
.57921785         0  23.57379891306601   4.77912349311153   .2910932 .021688854  .6228262  .05062613 1
 .5702116 1.0986123  23.59059639575273  4.787491742782046  .26903787  .05534378  .7834185  .02917658 1
 .5314447   1.94591  23.58866160882936  4.795790545596741   .2601817          0 1.0465562  .02840274 1
 .5513356  .6931472 23.624852252665107  4.804021044733257   .2459092  .06453186  .7979235  .02741656 1
 .4822459         0 24.043795315577245  4.812184355372417   .8499157   .1873104  .7865514  .02106505 1
        .         0  22.28098931390445  4.770684624465665   .3840807  .09461883  .8451321  .04506727 1
 .6773132         0  22.42692242480151   4.77912349311153   .3791458  .12221717 1.1784273  .04725125 1
 .9319066 3.2580965 22.698318613027467  4.787491742782046   .4210063  .07018868  .8177284 .033333335 1
 .8060787  2.772589 23.159944664508608  4.795790545596741   .3842118   .1145391  .7923383  .04621534 1
 .8309887 1.3862944  23.28126804188242  4.804021044733257   .4161632  .10714693  .6933689  .04023709 1
  .908798  .6931472 23.206337470872167  4.812184355372417  .48673666    .079771  .7126593  .04398855 1
        .         0 24.037888110112725  4.663439094112067  .20535256  .07454053  .7056163          0 1
   .78619         0 24.107147496432795  4.672828834461906  .20766094   .0847304  .5626268          0 1
  .759084         0   23.7972083710832   4.68213122712422   .1808103   .0906814  .8042296          0 1
 .6937635         0  23.81739443240191 4.6913478822291435  .18839782  .09374084 1.2695315          0 1
 .7142321         0   23.8684520637118  4.700480365792417  .18411104  .09293253 1.0812703          0 1
 .7059544         0 23.980323373893167  4.709530201312334   .1874382   .1009305  1.300577          0 1
 .7414396         0 24.046249393101697  4.718498871295094  .19512346 .064213924 1.0034713          0 1
 .6974292         0 24.196225584866482  4.727387818712341  .19427302  .08273678 1.0555807          0 1
 .6635369         0 24.272739535271963  4.736198448394496  .20506677  .08730604  .7278484          0 1
 .7489944         0 24.335264246751386   4.74493212836325  .23616904   .0170746  .8092557          0 1
        .  6.767343  24.84321316656426 4.6443908991413725  .23399247   .1530494 1.5778688          0 1
 .6145601  6.999423 24.929092141846183  4.653960350157523  .22023107  .16188905 1.1156516          0 1
 .8525485   7.23201 24.901093336322354  4.663439094112067   .2153826  .12069391 1.2227247          0 1
 .6608409  7.525101 24.884739312357365  4.672828834461906   .1985463  .05202068 1.0817511          0 1
 .6709611  7.751045  24.99147966916149   4.68213122712422  .18963976  .09555482  .9194262          0 1
 .6702911  7.624131 25.054709451257363 4.6913478822291435  .19192806  .08739167  .7840677          0 1
  .656939  6.656726  25.01080320478844  4.700480365792417  .18766014  .08970646  .9115818          0 1
 .8626414  6.364751  25.07917947577587  4.709530201312334  .19410613   .1058089 1.0186787          0 1
 .6024605  5.986452  25.19152750567851  4.718498871295094  .19143543  .12674797  .7005265          0 1
 .8392656  5.631212  25.18092160765754  4.727387818712341   .2275152   .0998321  .9155501          0 1
        .         0 21.286085060398946  4.634728988229636  .18472242 -.03785776 .26592264          0 1
  .799298         0 21.361346297431304 4.6443908991413725   .1597743  .05477314  .1627046          0 1
 .8547568         0 21.417737140967827  4.653960350157523   .1535219  .05243392   .216123          0 1
 .8669059         0 21.487178517425455  4.663439094112067   .1421967  .05316325  .2241691          0 1
 .9250264         0  22.04845471091829  4.672828834461906  .10988519  .04891023  .2852529          0 1
 .8776872         0  22.17889431155691   4.68213122712422  .09379284  .08198924  .5102885          0 1
 .9845775         0 22.577186002682694 4.6913478822291435    .118811    .050721 .55521816          0 1
1.0301828         0 22.966175232858753  4.700480365792417  .11179797  .04159862  .3913315          0 1
 .9712481         0  23.10771659050218  4.709530201312334  .10200204   .0592987  .3879843          0 1
1.0335312         0  23.14959836545931  4.718498871295094 .067471266  .03896626   .333397          0 1
        .  4.875197 25.389907345032007   4.61512051684126  .19899076  .07385645  .7468041          0 1
 .8884309  4.990433 25.346550952369494  4.624972813284271    .244148   .0394978  .4098155          0 1
1.0635477  5.257495 25.424400534081013  4.634728988229636   .2192063  .05449627   .640307          0 1
1.1581262  5.521461  25.41631868691919 4.6443908991413725   .2271549   .0207892  .6342526          0 1
 .9200982  5.416101  25.44460887405928  4.653960350157523  .22096443  .04179115  .7652075          0 1
 .9452122 4.4426513 25.540138372607856  4.663439094112067  .22103485 .032837752  .7096379          0 1
 .9826649 4.4998097 25.521623333047007  4.672828834461906  .21811807 .073082656  .7592581          0 1
   .94481 4.1743875 25.530462163010963   4.68213122712422  .21400774  .08259459  .8449045          0 1
 .9603899  2.833213 25.546004059923572 4.6913478822291435  .21670502  .05456676  .5521333          0 1
1.0113537 2.0794415  25.51107419627515  4.700480365792417   .2304509 .029857313  .6177271          0 1
        .  .6931472  26.30204804427632  4.564348191467836  .23235023  .05660253  .7365565          0 1
1.0989978         0 26.461021305173926  4.574710978503383  .22211842   .0761485  .4265231          0 1
1.0847838         0 26.439131679448664  4.584967478670572   .2480531   .0520219  .5241512          0 1
1.1244308         0  26.33223200982412   4.59511985013459  .22592357 .020700116  .4890538          0 1
 1.299305         0 26.368868199539683  4.605170185988092  .22564612 .015210397  .4428461          0 1
1.0879376         0 26.482558701533332   4.61512051684126  .23024334          0 .42624265          0 1
1.0736886         0 26.434295321811224  4.624972813284271  .22703527  .05220648 .53496367          0 1
1.0511376 2.0794415 26.541242693517702  4.634728988229636  .21943107  .07352107  .7395294          0 1
1.1086005         0  26.69660714579725 4.6443908991413725   .2080971   .0726368  .4884022          0 1
1.0815631         0  26.79659604452628  4.653960350157523  .18448013  .09437406  .6085445          0 1
end
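Not an answer from the original thread, but one commonly used alternative when the classical Hausman test is unavailable with clustered errors is a Mundlak-style (correlated random effects) test: add the firm-level means of the regressors to an RE model and test them jointly. A rough sketch under that assumption (for brevity it uses contemporaneous regressors and one-way clustering, and assumes FIRM and Year are the panel identifiers from the reghdfe call above; with the lag structure above, the means of the lagged regressors would play the same role):

Code:
xtset FIRM Year
foreach v in INNO SIZE AGE TANG PROF GRTH NDTS MrktD {
    bysort FIRM: egen mbar_`v' = mean(`v')
}
xtreg BVLEV_1 INNO SIZE AGE TANG PROF GRTH NDTS MrktD            ///
    mbar_INNO mbar_SIZE mbar_AGE mbar_TANG mbar_PROF mbar_GRTH   ///
    mbar_NDTS mbar_MrktD i.Year, re vce(cluster FIRM)
test mbar_INNO mbar_SIZE mbar_AGE mbar_TANG mbar_PROF mbar_GRTH mbar_NDTS mbar_MrktD
* a joint rejection is evidence against the RE assumption, favouring FE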


Saturday, May 20, 2023

Converting Unix time to human readable time format

Hi, I am having an issue converting what I assumed to be a Unix timestamp into a human-readable time format.

* Example generated by -dataex-. For more info, type help dataex
Code:
clear
input long ORDER_TIME
205151
134137
145136
204606
211504
210942
212004
204921
205859
212712
202832
163404
212028
205544
142451
143439
155042
154539
180347
180922
180004
121746
142007
152555
162113
153127
171127
151331
161443
113412
131426
123955
123619
194452
150749
213547
131254
122751
125313
143205
171811
143902
140321
161717
140404
140210
140659
140311
213227
213112
212630
152042
153656
152006
152628
142653
140852
202313
201223
200808
144346
143442
145750
145629
151253
150323
153620
160810
154556
150338
152300
164043
191606
191427
174344
175629
180221
175809
135401
161011
160827
162033
163723
160521
 94127
165558
174701
193920
102503
182007
190233
165509
185454
184617
170928
140620
180016
180752
152800
164021
end
When I convert them using:

Code:
generate double statatime = ORDER_TIME*1000 + mdyhms(1,1,1970,0,0,0)

format statatime %tC

list ORDER_TIME statatime, noobs

the time window of the variable falls in the 1970s, whereas the actual time window is 2019-2020.

I have searched through other threads regarding similar issues, and tried converting the ORDER_TIME variable to float,

Code:
recast float ORDER_TIME
but the results are way off from 2019-2020 time window.

So, at this point I am assuming that this might not be a Unix timestamp, or that I have missed something important in converting it.

[attached: summarize output for ORDER_TIME]

Summarizing ORDER_TIME shows a minimum value of 2, which I assume is not a valid Unix timestamp.

I'm starting to think that this might be some sort of time-of-day indicator, but I am not sure what exactly it is indicating.
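If it is indeed a time of day coded as HHMMSS (so 205151 would be 20:51:51, 94127 would be 09:41:27, and 2 would be 00:00:02), a small sketch of the conversion under that assumption:

Code:
gen hh = floor(ORDER_TIME/10000)
gen mm = floor(mod(ORDER_TIME, 10000)/100)
gen ss = mod(ORDER_TIME, 100)
gen double order_tod = hms(hh, mm, ss)    // milliseconds since midnight
format order_tod %tcHH:MM:SS
list ORDER_TIME order_tod in 1/5, noobs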

Can anyone provide advice on the matter?

Thanks in advance.

Advice on backtesting when an interaction is significant

Hi all,

I am currently writing my thesis, where one of the regressions is the following:

[attached: regression equation]
rhp = Real House Prices
shock = Start of unconventional monetary policy
hsr = housing supply
hhdi = household income
mr = mortgage rate
unem = unemployment
hhd = household debt

Basically, I am testing whether UMP had an effect on house prices in the EZ when accounting for the housing supply (which seems to be the case). However, for the period Q1 2010 - Q1 2021 (shock = Q1 2015) I want to find out, in a backtesting manner, in which quarter the interaction variable became significant. I used the following code, but it drops all variables due to collinearity (I assume the collinearity between quarters):

gen significance = .

forval i = 1/44 {
    local quarter = 200 + `i'

    xtreg rhp hsr shock hsrxshock hhdi mr unem hhd if quarter == `quarter', fe

    // check the significance of the interaction term via its t statistic
    local t_of_interaction = _b[hsrxshock]/_se[hsrxshock]
    if abs(`t_of_interaction') > 1.96 {
        replace significance = `i' if missing(significance)
    }
}

Does anyone have ideas on how to determine in which quarter the interaction variable became significant?
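Not a definitive fix, but a sketch of an expanding-window variant that avoids estimating within a single quarter (where shock and the interaction are constant and get dropped): re-estimate on all data up to each quarter and record the first quarter in which the interaction's t statistic exceeds 1.96. It assumes quarter is a Stata quarterly date (200 = 2010q1) and the panel is already xtset.

Code:
local first_sig .
forvalues q = 201/244 {
    quietly xtreg rhp hsr shock hsrxshock hhdi mr unem hhd if quarter <= `q', fe
    if "`first_sig'" == "." & _se[hsrxshock] > 0 {
        local t = _b[hsrxshock]/_se[hsrxshock]
        if abs(`t') > invnormal(0.975) local first_sig = `q'
    }
}
display "interaction first significant in quarter: " %tq `first_sig'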

Many thanks!

Matthias

catplot with "empty" categories

I'd like to plot the response percentages for a question using catplot or tabplot. However, some of the response categories are empty (i.e., have a value of 0), and I cannot figure out how to include these response categories in the plot.

Here is a minimal working example:

Code:
use "https://www.dropbox.com/s/obmd1nqc4vvdd0l/example.dta?dl=1", clear

label define placementl 10 "Bottom 10%" 9 "2nd decile" 8 "3rd decile" 7 "4th decile" 6 "6th decile" 5 "5th decile" 4 "4th decile" 3 "3rd decile" 2 "2nd decile" 1 "Top 10%"

label val placement placementl

catplot placement, ///
    title("Decision Making Ability, Relative to Others", color(black)) ///
    percent ///
    blabel(bar, position(outside) size(medsmall) format(%9.1f))  ///
    bar(1, bcolor(orange) blcolor(none)) ///
    graphregion(color(white)) ///
    ylabel(0(5)30, nogrid) ///
    ytitle("")
[attached: catplot graph]


The graph is missing the 7th, 8th, 9th, and 10th deciles because there is no data to plot for these categories, but I would still like to have them as part of the graph.

Alternatively, I can get close with tabplot, but I haven't been able to figure out how to get the value labels to display outside each bar (as in the graph above using catplot). Here is some code and the graph it produces:

Code:
use "https://www.dropbox.com/s/obmd1nqc4vvdd0l/example.dta?dl=1", clear

gen placement2 = 11 - placement

tabplot placement2, ///
     horizontal ///
     yasis ///
     percent ///
     bcolor(orange) blcolor(none) ///
     graphregion(color(white)) ///
     ytitle("") ///
     title("Decision making ability", color(black)) ///
     note("") ///
     subtitle("") ///
     xsize(4) ysize(6) ///
     ylabel(1 "Bottom 10%" 2 "2nd decile" 3 "3rd decile" 4 "4th decile" 5 "6th decile" 6 "5th decile" 7 "4th decile" 8 "3rd decile" 9 "2nd decile" 10 "Top 10%")
[attached: tabplot graph]


The graph gets me there, except that I'd like to add values just outside each bar (my understanding is that with tabplot I can only place them "underneath" each bar, rather than "on top" of each bar).

Would be grateful for any suggestions or tips, using either catplot or tabplot (or even hbar).
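One possible workaround, offered only as a sketch: build the percentage distribution yourself with contract, append zero rows for the unobserved deciles, and plot with graph hbar, so the empty categories keep their place and blabel still puts the values outside the bars.

Code:
preserve
contract placement, percent(pct)
forvalues k = 1/10 {
    quietly count if placement == `k'
    if r(N) == 0 {                      // add an empty category
        set obs `=_N + 1'
        replace placement = `k' in l
        replace pct = 0 in l
    }
}
graph hbar (asis) pct, over(placement) ///
    title("Decision Making Ability, Relative to Others", color(black)) ///
    blabel(bar, position(outside) size(medsmall) format(%9.1f)) ///
    bar(1, bcolor(orange) blcolor(none)) ///
    graphregion(color(white)) ylabel(0(5)30, nogrid) ytitle("")
restore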

coefplot line color

Hi
I was wondering how I can change the color of a graph line generated by coefplot. Here is my code:
Code:
coefplot , drop(_cons) vertical baselevels ///
ci(95) ciopts(recast(rcap) alcolor(%60)) recast(connected) ///
ylabel(#9, grid) ytitle("Percentage, %") ///
And I get the attached graph. The color of the line is blue; I want to change it to other colors. I'd appreciate your insights on that.
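For what it's worth, my understanding (please treat this as an assumption) is that with recast(connected) the line and marker colours can be set by passing lcolor()/mcolor() through coefplot to the plot, and the CI caps can be recoloured inside ciopts(); something along these lines:

Code:
coefplot, drop(_cons) vertical baselevels ///
    ci(95) ciopts(recast(rcap) lcolor(cranberry%60)) ///
    recast(connected) lcolor(cranberry) mcolor(cranberry) ///
    ylabel(#9, grid) ytitle("Percentage, %")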

Thanks,

Friday, May 19, 2023

Graph legend(off) does not work


Hello,

I have health expenditure data for 20 countries covering 1971-2019, which looks like the example below. I also calculated a Fourier function to see how well it fits the data for each country.

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float year long country float id double health float fourier
1971 1 1  -.048536042892761824  -.03952992
1972 1 1   -.08843363377541565  -.04224652
1973 1 1     -.092863008288085  -.04566855
1974 1 1   -.04994182142543355  -.04974203
1975 1 1    -.0123082348645286  -.05440232
1976 1 1   -.02792827260278471  -.05957511
1977 1 1  -.025661616531818306  -.06517772
1978 1 1   -.07116244745290785  -.07112036
1979 1 1   -.09433644681980902   -.0773077
1980 1 1   -.09588154779403167  -.08364037
1981 1 1   -.07324787825170909   -.0900166
1982 1 1   -.09699708181124123  -.09633397
1983 1 1   -.09129861506032695  -.10249094
1984 1 1   -.07801576288065505  -.10838865
1985 1 1   -.08787130215726849   -.1139325
1986 1 1   -.10368869339759779  -.11903369
1987 1 1   -.12984473701132185  -.12361068
1988 1 1   -.13601273697759464  -.12759055
1989 1 1   -.13522305909266027   -.1309102
1990 1 1   -.16410929161392002  -.13351732
1991 1 1    -.1749034081219039  -.13537136
1992 1 1   -.17267929170185997  -.13644409
1993 1 1   -.15660589634579344  -.13672014
1994 1 1   -.14090811627074754   -.1361972
1995 1 1    -.1284335655668775  -.13488609
1996 1 1   -.11190290870963263  -.13281055
1997 1 1   -.11524180552098832  -.13000692
1998 1 1   -.11115354361157273  -.12652346
1999 1 1    -.1129532612256796   -.1224196
2000 1 1   -.10835246407053643  -.11776494
2001 1 1   -.11400252493988818  -.11263816
2002 1 1   -.11788591728754902  -.10712566
2003 1 1   -.11254442813587473   -.1013202
2004 1 1   -.08363546588831329   -.0953193
2005 1 1   -.09284501401780126  -.08922377
2006 1 1   -.07979523121839557   -.0831359
2007 1 1   -.06995821404301965  -.07715791
2008 1 1   -.05752852459211174  -.07139017
2009 1 1   -.07075001456909177 -.065929614
2010 1 1   -.04969200603502319  -.06086815
2011 1 1   -.03984778668039857   -.0562911
2012 1 1   -.05154969019569319  -.05227587
2013 1 1    -.0497840265353243   -.0488906
2014 1 1  -.048793576715315486  -.04619313
2015 1 1   -.04948574453129099  -.04422996
2016 1 1  -.047359177099794286  -.04303557
2017 1 1  -.044218573712996125   -.0426318
2018 1 1  -.051062423551047474  -.04302751
2019 1 1   -.06104803674991116  -.04421844
1971 2 2  -.052224026069246726   .04733511
1972 2 2   -.05553294294401772   .04135935
1973 2 2   -.03474559274618068   .03687684
1974 2 2    -.0182180229388336  .033952687
1975 2 2    .13326770223542225   .03262639
1976 2 2    .16351073041347156   .03291123
1977 2 2    .16362296723203298  .034794025
1978 2 2    .16296694361277436   .03823534
1979 2 2    .18836892476247213   .04317018
1980 2 2    .19872515475682395     .049509
1981 2 2    .02624677561653282   .05713921
1982 2 2   .013270242662924406  .065927014
1983 2 2  -.007179835216951704   .07571962
1984 2 2  -.008342505302864376   .08634771
1985 2 2  -.007111580631541289   .09762829
1986 2 2  -.008340801048958653    .1093676
1987 2 2     .0135341918959422    .1213644
1988 2 2 -.0013931907919246195   .13341318
1989 2 2    .03495434943710156    .1453076
1990 2 2    .18795793711315376   .15684384
1991 2 2    .18789604274914862     .167824
1992 2 2    .20764800747810722    .1780592
1993 2 2    .25366938276007367     .187373
1994 2 2     .2984252024669359   .19560385
1995 2 2     .2890248838655149   .20260814
1996 2 2    .27253282714672866   .20826234
1997 2 2     .2654448894076852    .2124651
1998 2 2    .26901558010899657   .21513894
1999 2 2    .26938020863676154    .2162314
2000 2 2    .24211126692168153    .2157161
2001 2 2    .20775990079827972   .21359293
2002 2 2    .18572497089675324   .20988826
2003 2 2     .1631756520898294   .20465444
2004 2 2     .1665089246092707    .1979689
2005 2 2    .14807873089046864   .18993287
2006 2 2     .1416541922510765   .18066983
2007 2 2    .14415632984664126    .1703234
2008 2 2    .14638311927339548    .1590549
2009 2 2    .13329847280662777   .14704087
2010 2 2    .13033381207252606    .1344701
2011 2 2    .10474485404573974    .1215405
2012 2 2    .11025954819939078   .10845583
2013 2 2    .10129144374916023   .09542245
2014 2 2     .0931375767153433   .08264589
2015 2 2    .07071030201701785  .070327386
2016 2 2     .0581133122787421   .05866073
2017 2 2    .05343910999252845   .04782898
2018 2 2    .04770897486693318   .03800148
2019 2 2    .04395661815196764    .0293311
1971 3 3   -.16552630479892813   -.0653828
1972 3 3   -.14708090543709273  -.05491805
end
My problem is that even though I use the option legend(off), I keep getting the legend.

Code:
forvalues i=1(1)20{
    xtline health fourier if id==`i', name(g`i', replace) xtitle("")  ylabel(,angle(0)) xlabel( 1971 1980 1990 2000 2010 2019, angle(90)) legend(off)
    }
gr combine g1 g2 g3 g4 g5 g6 g7 g8 g9 g10 g11 g12 g13 g14 g15 g16 g17 g18 g19 g20 ,name(Fourier_1, replace)

Why do I get the legend? What am I missing? I also wonder why some graphs (i.e., those for Australia, Ireland, and New Zealand) are smaller than the others. Many thanks in advance.
[attached combined graphs omitted]

Analysis of multiple outcome measurements with censored dependent variables

Our working group has a data set that contains about 10 measured urinary pesticides or their metabolites in a sample of about 1000 women. We want to see if we can relate exposure patterns in this sample to different socioeconomic variables, for example state of residence, age, parity, body mass index, diet, etc. Some of these explanatory variables are continuous, some ordinal, and some categorical.

The dependent variables (the pesticides) are essentially censored continuous variables. Many have only 10-30% of measurements above the detection limit of the apparatus, and, within the group of measurements for each pesticide, the distribution is highly skewed.

We are looking for help in designing a procedure that takes into account the multiple measurements of pesticides within each subject, the censored nature of the dependent variables and covariance among the various pesticides to see if we can detect patterns of pesticide exposure that might be associated with the personal characteristics of the subjects.

We’ve thought of using principal components or factor analysis, but the lack of multivariate normality of the group of outcome measurements and the censored nature of the measurements suggest that the results of that analysis would be suspect. We are looking at “gsem” with “intreg” for solutions but aren’t sure how to specify the model.

Can anyone provide pointers that would help us solve this thorny analysis problem?

Insufficient observations with bsrifhdreg

Hello,

I was trying to conduct quantile regressions with fixed effects using the following command (for the 25th quantile):

Code:
bsrifhdreg share process prod ln_sales capital fem_owner exported_sales foreign_own fem_leader smal_or_med, abs(country year sector) cluster(sector) rif(q(25))
However, I keep getting an "insufficient observations" error like the following:

Code:
insufficient observations
an error occurred when bootstrap executed rifhdreg
r(2001);
I have about 26,000 observations, with few missing values. Is anyone familiar with this type of problem? I would be grateful if someone could help.

Regards,

Mismatch between date formatted as string and numeric date

Dear all,

I am processing hourly time series data (DD.MM.YYYY hh:mm), where the date variable is formatted as a string and looks like the following:

. list datestring

datestring

1. 01.04.2018 00:00
2. 01.04.2018 01:00
3. 01.04.2018 02:00
4. 01.04.2018 03:00
5. 01.04.2018 04:00


To convert the string variable into a date variable, I run the following code:

gen Datetime = clock(datestring, "DMYhm")
format Datetime %tc



This is what I get:

. list Datetime datestring

+---------------------------------------+
| Datetime datestring |
|---------------------------------------|
1. | 01apr2018 00:00:19 01.04.2018 00:00 |
2. | 01apr2018 00:59:18 01.04.2018 01:00 |
3. | 01apr2018 02:00:28 01.04.2018 02:00 |
4. | 01apr2018 02:59:27 01.04.2018 03:00 |
5. | 01apr2018 04:00:37 01.04.2018 04:00 |
+---------------------------------------+



However, "Datetime" and "datestring" are not equivalent. For example, in row 2 the clock time is 00:59:18, whereas it should be 00:01:00.

I believe that this "mismatch" stems from the missing seconds in the variable "datestring", but I do not know how to fix this. Any help would be much appreciated!
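One thing worth checking (a sketch based on an assumption, not a confirmed diagnosis): clock() returns milliseconds since 1960, which is too large for float precision, so the new variable should be stored as a double.

Code:
* store the clock value as a double to avoid float rounding
gen double Datetime = clock(datestring, "DMYhm")
format Datetime %tc
list Datetime datestring in 1/5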

Many thanks in advance!

Mario




How to expand the variable within group?

Here is what I am trying to do: I want to make each one of the observations in the following dataset go from left to right. Namely, within the group, I want each firm to have an observation with the rest of the group variables. How can I achieve this?
[screenshot of the dataset omitted]

One-step stochastic frontier using Stata

Greetings, dear Statalist members!
Could I use Stata to estimate a one-stage stochastic frontier with exponential, log-normal and half-normal distributions?
I run the command below using Stata:
sfcross lny lnland lnlabour lnfertilizer lnseed, dist(hn) 0rt(o) emean (age sex education extension credit)
But I got the error message "Conditional mean model is only allowed for distribution(tnormal)".
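Going only by that error message, a minimal sketch of one option is to switch to the truncated-normal distribution when emean() is specified (keeping the rest of the command as posted):

Code:
sfcross lny lnland lnlabour lnfertilizer lnseed, distribution(tnormal) emean(age sex education extension credit)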
I kindly ask you to share the commands with me.

Thank you in advance for taking time out of your busy schedule.

Thursday, May 18, 2023

Force negative time values in 'stset'

Dear Stata users,

I wonder if there is any way to force negative time values (age-centred) in stset.

Code:
stset stop, enter(start) f(event=1) id(ID)
Many thanks,
Oyun

Stata 18 data editor displaying ampersands incorrectly in string variable observations

I am experiencing an issue with the data editor displaying string ampersands (ASCII Code 38) in Stata 18.0 MP (current update level 5/15/23). Stata 17.0 MP displays them fine. Both instances running on Windows 11.

When the ampersand is between two strings, the Stata 18 data editor displays a (half) underscore in place of the ampersand and the space after it. An ampersand alone displays as blank, and a string of three ampersands with no spaces displays as a single ampersand. For example, this



Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str12 var1
"Jack & Jill"
"Arm & Hammer"
"&"          
"& & & &"    
"&&&"        
" &&  & &"    
end
will display in Stata 17 MP as

[screenshot of the Stata 17 data editor omitted]

in Stata 18 MP the data editor displays as

[screenshot of the Stata 18 data editor omitted]

Is there a setting that needs to be changed or is this a bug? Aside from font and alignment settings in Stata 17, all settings from query are the same between Stata 17 and 18, and Stata 18 is at the default settings. Changing the font did not change the display.



Thanks for any assistance.

Fixed effect regression questions

Hello,

I'm fairly new to Stata and empirical research, so sorry if this question is very basic.

1: I have observations for different companies in different years. I have a dependent variable and several independent variables. I would like to include the company's industry as well as the year in which the observation was recorded as fixed effects.
To my understanding, one does that with

xtset industry year
xtreg DV IV1 IV2...., fe

However, I have repeated time values within this data, so that doesn't work with xtset. Can I handle that simply by combining the industry and year into one variable and then running the regression, or am I missing something substantial?

egen industry_year = group(industry year)
xtset industry_year
xtreg DV IV1 IV2...., fe


2: When reporting R-squared from xtreg, the overall R-squared is the one to use, right?


3: In my mind, simply inserting the different years and industries as dummy variables in the regression should yield the same result as in 1.

So just doing:

regress DV IV1 IV2..... year1 year2 year3.....industry1 industry2.....

should be the same as:

xtset industry_year
xtreg DV IV1 IV2...., fe


But doing so, I get a slightly different result. Why is that, or is my approach in 1 flawed?
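For reference, a sketch of the distinction using the placeholder names from above: the approach in 1 absorbs one effect per industry-year combination, whereas separate year and industry dummies fit additive effects, which is a different model.

Code:
* industry-by-year fixed effects (equivalent to the xtset/xtreg approach in 1)
egen industry_year = group(industry year)
areg DV IV1 IV2, absorb(industry_year)

* additive industry and year fixed effects (what the dummy-variable regression fits)
regress DV IV1 IV2 i.industry i.year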




Thank you in advance for your answers, and sorry if these questions are basic.

ASGEN command for weighted average

Hello everyone,


I have to compute the weighted average of democracy scores in the destination countries, weighted by my shares of emigrants from a certain province in a year.
This is the Stata command i am using:

bysort provincia year: asgen weighted_average= democracy_polity if democracy_polity!=. & share_emigration!=., weight(share_emigration)

However, the new variable does not make much sense, since the values do not sum up to 1.

Thank you,
Best,
Margherita


Wednesday, May 17, 2023

[Error: icio] Missing adb tables (2001-2006)

Hello everyone, I want to ask how I can contact someone about, or report issues with, the icio command. I am currently doing a study on GVCs using the ADB tables, but the periods 2001-2006 are missing, as seen from the screenshot:


[screenshot omitted]

I understand that this usually happens when they are updating the table, but I have been trying for at least a week now and it is still missing. Is this covered by Stata technical support, or should I contact the icio authors?

Thank you.

two graphs

Hello


I want to combine these two graphs into one graph, please.

twoway (scatter ed yearofbirth) || (lfit ed yearofbirth if yearofbirth>=1967, lstyle(dash)) || (lfit ed yearofbirth if yearofbirth < 1964, lstyle(dash)) if rural==1 , xline(1967 1965) legend(off) title(Years of School) xtitle(Birth year )
graph save ed_rural, replace




twoway (scatter ed yearofbirth) || (lfit ed yearofbirth if yearofbirth>=1967, lstyle(dash)) || (lfit ed yearofbirth if yearofbirth < 1964, lstyle(dash)) if rural==0 , xline(1967 1965) legend(off) title(Years of School) xtitle(Birth year )
graph save ed_urban, replace
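If the goal is simply to put the two saved graphs side by side, a minimal sketch (file names as saved above):

Code:
graph combine ed_rural.gph ed_urban.gph, cols(2)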



data



----------------------- copy starting from the next line -----------------------
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input int yearofbirth float(rural edu)
1945 0  5.461538
1945 1 3.0535715
1946 0 4.4444447
1946 1  2.981132
1947 0 4.4583335
1947 1 3.0229886
1948 0  7.466667
1948 1  3.486111
1949 0  6.466667
1949 1 3.4285715
1950 0  5.333333
1950 1  3.837838
1951 0  7.793103
1951 1  3.505618
1952 0      7.36
1952 1  3.639535
1953 0  6.193548
1953 1         4
1954 0  6.254546
1954 1  3.862857
1955 0      6.86
1955 1  3.469388
1956 0  7.218391
1956 1 3.8533835
1957 0  6.893617
1957 1 4.0705395
1958 0  6.971428
1958 1 4.1626296
1959 0  7.011765
1959 1  3.993243
1960 0    7.4375
1960 1 3.8481014
1961 0  7.695652
1961 1 4.3138685
1962 0  7.971292
1962 1 4.3483872
1963 0  8.376344
1963 1  4.818792
1967 0  9.774648
1967 1  7.380208
1968 0  9.903571
1968 1  7.065299
1969 0  9.743494
1969 1  7.844485
1970 0  9.916924
1970 1  7.581955
1971 0 10.474684
1971 1  7.606625
1972 0 10.030556
1972 1  7.788915
1973 0 10.155216
1973 1  7.918845
1974 0  9.981396
1974 1  7.819808
1975 0   10.1309
1975 1  7.725707
1976 0 10.116773
1976 1  7.886999
1977 0   10.1833
1977 1  7.752955
1978 0  10.25701
1978 1   7.91217
1979 0 10.060362
1979 1  7.930493
1980 0 10.401316
1980 1  8.386111
1981 0 10.675324
1981 1  8.188822
1982 0 10.276224
1982 1  8.335929
1983 0 10.234568
1983 1   8.07248
1984 0 10.370723
1984 1  8.129973
1985 0  10.37156
1985 1  8.451258
1986 0     10.67
1986 1  8.335877
1987 0  10.58913
1987 1  8.426045
1988 0 10.391775
1988 1  8.287082
1989 0 10.039095
1989 1   8.25811
1990 0 10.204603
1990 1  8.161189
1991 0 10.863905
1991 1  8.934037
1992 0  10.68661
1992 1 8.7576475
1993 0  10.43131
1993 1  8.633952
1994 0 10.023728
1994 1  8.825364
1995 0  9.856209
1995 1  8.348605
1996 0 10.740332
1996 1  9.220834
1997 0  10.17801
1997 1   8.86036
end

Help to combine graph bar and connected into single graph, by group(s)

I have been trying without luck to combine the two graph types (bar and connected line) so that they are overlaid. Both are to be displayed by group (organization) and time period (quarter). This requires the option over() used twice; I can successfully create the graphs separately, but when I try: tw (bar...) || (connected ..., c(l)), it just fails at every turn. Thanks for any help.

ERROR Fixed effect estimation (reghdfe) : treatment variable is collinear with the fixed effects

Hi all,

I am conducting a study to estimate the effect of Medicaid expansion on the uninsured rate using a classic difference-in-differences (DID) design with a two-way fixed-effects (TWFE) model. My mathematical model is as follows:

UNINS_ist = α_s + δ_t + β·EXPANSION_ist + ε_ist

In this model:


UNINS_ist is a binary variable indicating whether an individual in the survey is uninsured (1) or insured (0) in state s and year t.
α_s represents state fixed effects, capturing time-invariant differences across states.
δ_t represents time fixed effects, capturing common time trends across all states.
β is the parameter of interest, representing the causal effect of Medicaid expansion on the uninsured rate.
EXPANSION_ist is a binary treatment variable that equals 1 for states that adopted Medicaid expansion and 0 for states that did not.
ε_ist is the error term accounting for unobserved factors and random variation.

I have data from the American Community Survey (ACS) for the years 2011 to 2019, which consists of repeated cross-sectional data. Here are the top 15 observations of my dataset:

[screenshot of the first 15 observations omitted]


To estimate this model, I am using the reghdfe command

Code:
reghdfe UNINS expansion , absorb(ST YEAR) cluster(ST)
Even though I got a regression result, I also got the following message:
Code:
  
note: expansion is probably collinear with the fixed effects (all partialled-out values are close to zero; tol = 1.0e-09)
(MWFE estimator converged in 4 iterations)
note: expansion omitted because of collinearity
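A small diagnostic sketch that may help (variable names as in the post; this is only one check, not a full answer): if expansion never changes within a state over 2011-2019, the state fixed effects absorb it entirely.

Code:
* flag states where the treatment indicator never varies over time
bysort ST: egen double sd_expansion = sd(expansion)
tab ST if sd_expansion == 0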
I tried using the xtreg command in Stata instead but encountered a challenge. Since my data is in a repeated cross-sectional format, the xtreg command requires me to define the panel structure using xtset ST YEAR.

To proceed with the xtreg command, I would need to aggregate the individual observations and take the mean uninsured rate for each state and year. This would transform my repeated cross-sectional data into a panel structure. However, I have a few concerns regarding this approach.

Firstly, my dataset includes several demographic variables such as sex, race, and education level, which are categorical variables. Aggregating the data by taking the mean may not be appropriate for categorical variables, as it could lead to the loss of valuable information. I am unsure how to handle these categorical variables effectively while converting the data to a panel structure.

Secondly, my dataset also includes survey weights. Considering that the survey weights are specific to each individual, taking the mean uninsured rate for each state and year may not accurately account for the survey design and could potentially introduce biases into the analysis.

Given these concerns, I am uncertain whether taking the average across individuals to obtain one observation per year per state is a suitable approach for my analysis. I also do not know whether this approach would resolve the collinearity between the treatment and the fixed effects.


I am seeking guidance on how to address this issue and estimate the classic DID TWFE model.


Thank you for your assistance!

I'm using Stata 17

nested CES function

Hello,
I have a function with 4 inputs including capital(K), labor(L), energy(E), and material(M) in a nested CES form: Y=A⋅ {a[b(c⋅K^(-α)+(1-c)⋅E^(-α) )^(ρ⁄α)+(1-b)⋅L^(-ρ) ]^(β⁄ρ)+(1-a)⋅M^β }^((-1)⁄β). My dataset consists of Y, K, L, E, and M data for some manufacturing industry sub-sectors between 1975-2005. Based on this, what is the stata command for the substitution elasticity between K and E? When I use the command nlsur (Y = A * (a * (b*((cK)^(-alpha)+(1-c)((E)^(-alpha)))^(rho/alpha) + (1-b)(L)^(-rho))^(beta/rho) + (1-a)(M)^beta)^(-1/beta)), start(a = 0.5, b = 0.5, c = 0.5, alpha = 0.5, rho = 0.5, beta = 0.5, A = 1), it gives me an initial value error.
It would make me very happy if you could help.
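For what it may be worth, a sketch only (the starting values are purely illustrative): nl expects parameters in curly braces and explicit multiplication signs. For the inner (K, E) nest of this CES, the elasticity of substitution between K and E would then be 1/(1+alpha).

Code:
nl (Y = {A=1}*( {a=0.5}*( {b=0.5}*( {c=0.5}*K^(-{alpha=0.5}) + (1-{c})*E^(-{alpha}) )^({rho=0.5}/{alpha}) ///
    + (1-{b})*L^(-{rho}) )^({beta=0.5}/{rho}) + (1-{a})*M^({beta}) )^(-1/{beta}))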

How can I add the decomposition of the R² into "between" and "within" to a table (xtreg)?

I am studying the relationship between university rankings (dependent variable) and academic freedom (independent variable). During my research, I found out that I get much more consistent results if I use cross-sectional databases instead of panel databases.

I want to explain this in my report, but also want to show some proof of this. When using the
Code:
xtreg
command, I get these results:

[screenshot of xtreg output omitted]

I want to draw your attention to the part circled in blue. It shows a decomposition of the R² into "within" (the time-series dimension of the data) and "between" (the cross-sectional dimension of the data). Showing this in my LaTeX regression table would help demonstrate that a big part of the variance of my model is explained by the cross-sectional aspect of my data rather than the time-series aspect.

The problem is, I don't know how I could include this in the regression table. I know how to use
Code:
esttab
and
Code:
outreg
, but I don't know if there are any options for these commands to include this particular information in the final table. Worst case scenario, I can always add it manually in LaTeX, but if there is another way to do it, I would prefer it.
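A minimal sketch, assuming esttab from the estout package is installed: xtreg, fe stores the decomposition in e(r2_w), e(r2_b), and e(r2_o), and esttab's stats() option can report stored scalars (the variable names below are placeholders).

Code:
eststo clear
eststo: xtreg ranking academic_freedom, fe
esttab using table.tex, tex replace ///
    stats(r2_w r2_b r2_o N, labels("Within R-sq." "Between R-sq." "Overall R-sq." "Observations"))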


ds - how to display fullnames

Dear Statalisters -

I have a large dataset and am looking to list the names of all my string variables. I have found the command ds, which lets me limit the output to string variables: ds, has(type string)

Unfortunately, we have some longer variable names in the dataset, which means that a number of them are abbreviated with a ~ in the middle of the variable name. For example:
cf_antibio~r
qol_entere~y
qol_startt~e

I'd like to have the full variable names output here instead, yet the fullnames option (used with the describe command) doesn't work with ds.

Does anyone have suggestions for outputting the names of all string variables from a dataset as full names?
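One possible workaround (a sketch): ds abbreviates names only in what it displays, but it leaves the full names in r(varlist), so they can be listed from there.

Code:
ds, has(type string)
foreach v in `r(varlist)' {
    display "`v'"
}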

Thanks in advance for your advice.

Best,
Alison

renaming multi-variables using foreach

Hi,

I am asking for this clarification after going through the Stata documentation and some web content. Although I have not fully understood how to use "foreach" with "rename" (I am still learning that command), the examples in the documentation and on the web have not helped me. Hence, I am posting my question here: I would like to run the rename command for many variables in one go. For example, the existing variables b13_q2, b13_q3, and b13_q4 have to be renamed as b13q2, b13q3, and b13q4, and I have to do this for many variables. Until now I have been doing this in MS Excel, but I would like to do it in Stata. I would like to know whether "foreach" would work, or whether some other command should be used. Could someone help me out with this?
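For what it's worth, two sketches using the variable names from the post (the wildcard form assumes a Stata version that supports rename groups):

Code:
* wildcard rename: every variable starting with b13_q becomes b13q...
rename b13_q* b13q*

* or an explicit foreach loop that drops the underscore before q
foreach v of varlist b13_q* {
    local new = subinstr("`v'", "_q", "q", 1)
    rename `v' `new'
}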


Tuesday, May 16, 2023

command table with option markvar(newvar) creates error __000001 not found

Why do the commands table and table twoway with the option markvar(newvar) create the error "__000001 not found r(111);"? The error occurs in Stata/MP 18.0 for Windows (64-bit x86-64) and also in Stata/MP 17.0 on the same platform.

Code:
. use https://www.stata-press.com/data/r18/nhanes2l, clear
(Second National Health and Nutrition Examination Survey)

. table sex, markvar(mynewvar)

---------------------
         |  Frequency
---------+-----------
Sex      |           
  Male   |      4,915
  Female |      5,436
  Total  |     10,351
---------------------
__000001 not found
r(111);

. noisily table sex diabetes, markvar(mynewvar2)

--------------------------------------------
         |           Diabetes status        
         |  Not diabetic   Diabetic    Total
---------+----------------------------------
Sex      |                                  
  Male   |         4,698        217    4,915
  Female |         5,152        282    5,434
  Total  |         9,850        499   10,349
--------------------------------------------
__000001 not found
r(111);

. tab mynewvar mynewvar2, missing

           |       mynewvar2
  mynewvar |         0          1 |     Total
-----------+----------------------+----------
         1 |         2     10,349 |    10,351 
-----------+----------------------+----------
     Total |         2     10,349 |    10,351
The Stata 18 help files say:
markvar(newvar) generates an indicator variable that identifies the observations used in the tabulation.

Yes, the table and table twoway option markvar(newvar) does what the help files say, but why does it also create an error? It is possibly my own user error (despite being a long-time user-programmer).

How to reshape data with dates as variable names?

Greetings everyone,

I need your help reshaping my district-level daily temperature data from a wide to a long format.

I have provided a snapshot of my data below.

In the full dataset, the dates range from 1 January 2010 to 31 December 2010.

After reshaping, my final data should have columns titled "districts", "dates", and "temperature".

22/12/2010 23/12/2010 24/12/2010 25/12/2010 26/12/2010 27/12/2010 28/12/2010 29/12/2010 30/12/2010 31/12/2010 district
290.42316 290.901957 290.734693 290.79091 290.825307 290.452145 289.745414 290.255769 289.986327 290.213195 Kiambu
289.481211 289.885756 289.451988 289.767927 289.649287 289.153357 288.50605 289.048894 288.718642 288.92301 Kirinyaga
293.614437 293.97406 293.782521 294.27897 294.155162 293.696184 293.107893 293.79007 293.65899 293.840799 Machakos
289.207302 289.550169 289.3492 289.693667 289.658836 289.180405 288.488866 289.215417 288.847019 288.924391 Murang'a
289.288837 289.7971 289.618359 289.631393 290.327304 289.866269 288.860669 289.378584 289.493381 289.696221 Nyandarua
289.563779 290.08223 289.613974 289.742309 290.000783 289.236923 288.369696 289.150477 288.96267 289.007754 Nyeri
300.711855 300.66441 300.18528 300.410653 300.731155 300.436221 300.53266 300.752484 300.353821 300.444164 Kilifi
299.950987 300.133282 299.710488 299.877987 299.976884 299.445853 299.535585 300.018341 299.801684 299.864323 Kwale
301.295454 301.08952 300.471525 300.529891 301.209784 300.572103 300.901886 301.213524 300.948301 300.673431 Lamu
300.134964 300.254665 299.859483 300.240489 300.404015 299.917327 299.962994 300.233772 299.975998 300.343877 Mombasa
298.248242 298.587084 298.223512 298.31255 298.57019 297.938847 297.78606 298.416717 297.907583 297.759021 Taita Taveta
302.242976 302.18383 301.706147 301.55664 301.987383 301.566071 301.623938 301.890389 301.22068 301.330906 Tana River
292.718341 293.402177 292.95594 293.089097 293.058324 292.351082 291.739324 292.541505 292.338334 292.59647 Embu
300.488048 301.121816 300.67348 300.509714 300.712365 300.257019 300.060362 300.069667 299.507594 299.979497 Isiolo
298.730816 299.010331 298.981171 299.147314 299.156926 298.213189 297.696936 298.756034 298.212052 298.317001 Kitui
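For reference, a sketch under an assumption about the imported variable names: Stata cannot hold a name like 22/12/2010, so suppose the daily columns were read in as t20100101, t20100102, ..., t20101231 (the stub and date mask below are hypothetical).

Code:
* go from one row per district to one row per district-day
reshape long t, i(district) j(datestr) string
rename t temperature
gen date = daily(datestr, "YMD")
format date %td
drop datestr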