Thursday, October 31, 2019

Sequence Index Plot

Dear Stata Users

I am trying to show how individuals move between a set of conditions or states over time, and I think a sequence index plot should be able to visualize that. I am following this Stata Journal article:
https://www.stata-journal.com/sjpdf.html?articlenum=gr0022
My data summarise different horse behaviours over time. There are 6 behaviour classes and 19 individuals. Each row accounts for one second, and each second carries a classified behaviour (coded 1 to 6).

Code:
. separate begin, by(type)
. separate end, by(type)
Code:
graph twoway ///
    (rbar begin1 end1 id, horizontal) ///
    (rbar begin2 end2 id, horizontal) ///
    (rbar begin3 end3 id, horizontal) ///
    (rbar begin4 end4 id, horizontal) ///
    (rbar begin5 end5 id, horizontal) ///
    , legend(order(1 "education" 2 "apprenticeship" ///
      3 "employment" 4 "unemployment" 5 "inactivity") ///
      cols(1) pos(2) symxsize(5)) ///
    xtitle("months") yla(, angle(h)) yscale(reverse)
This code comes from the article linked above.

My data set
HorseID   Time       PredictedLF
1         15:06:57   1
1         15:06:58   1
1         15:06:59   5
1         15:07:00   5
1         15:07:01   2
1         15:07:02   4
1         15:07:03   4
2         09:38:10   4
2         09:38:11   2
2         09:38:12   3
2         09:38:13   3
2         09:38:14   4
2         09:38:15   4
2         09:38:16   4
2         09:38:17   2
2         09:38:18   4
2         09:38:19   6
2         09:38:20   6
2         09:38:21   6
Code:
separate Time, by(PredictedLF)
This gave me six variables, Time1 to Time6, which I then used to derive start and end times for each classified behaviour.

For example:
Code:
gen LFendLyingLF = 1 if Time1 != . & Time1[_n+1] == .
However, when I used the code from above, it did not work.

Code:
graph twoway ///
    (rbar LFStartLyingLF LFendLyingLF HorseID, horizontal) ///
    (rbar LFStartStep LFendStep HorseID, horizontal) ///
    (rbar LFStartPaw LFendPaw HorseID, horizontal) ///
    (rbar LFStartStand LFendStand HorseID, horizontal) ///
    (rbar LFStartWS LFendWS HorseID, horizontal) ///
    (rbar LFStartLyingR LFendLyingR HorseID, horizontal) ///
    , legend(order(1 "Lying Left" 2 "Step" 3 "Pawing ground" ///
      4 "Standing" 5 "Weight shift" 6 "Lying Right") cols(1) pos(2) symxsize(6))
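For concreteness, a minimal alternative sketch for building one row per behaviour spell before plotting (this assumes Time is a string in HH:MM:SS format; the names t, spell, begin, end, and type are introduced here, not taken from the original post):

Code:
* convert the string clock time to a numeric time-of-day in milliseconds
gen double t = clock(Time, "hms")
* number consecutive runs of the same behaviour within each horse
bysort HorseID (t): gen spell = sum(PredictedLF != PredictedLF[_n-1])
* collapse to one row per spell, holding its start, end, and behaviour type
collapse (min) begin=t (max) end=t (first) type=PredictedLF, by(HorseID spell)
separate begin, by(type)
separate end, by(type)
* the begin1-end1 ... begin6-end6 pairs can now feed the rbar overlays above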
Any advice would be much appreciated. Thank you for your time.

Katrina

problem with Ivreg2 result

Dear All,
After running my regression with ivreg2, the last portion of the output contained the following warning:
Warning: estimated covariance matrix of moment conditions not of full rank.
         overidentification statistics not reported, and standard errors and
         model tests should be interpreted with caution.
Possible causes:
         number of clusters insufficient to calculate robust covariance matrix
         singleton dummy variable (dummy with one 1 and N-1 0s or vice versa)
         partial option may address problem.
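The warning itself points to ivreg2's partial() option as one remedy; a minimal sketch of its use (all variable names here are placeholders, not from the original post):

Code:
* partial out the offending regressor (e.g. a singleton dummy) before estimation
ivreg2 y x1 singleton_dummy (x2 = z1 z2), cluster(id) partial(singleton_dummy)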

Could you please help me solve this problem? Thank you in advance.

Complex Reshape Long to Wide

Dear Stata Users

I am dealing with a complex reshape wide problem.

Here are the input data:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float date str6 code str1(buy_sell tradercode) double tradetime long tradenumber str5 ordernumber double tradeprice long tradevolume str4 tradingreport str1(ordertype invtype) str4 ordernumber1
19725 "1512  " "B" "0" 32400000      1 "m5584"   8.8  2000 "0057" "0" "I" "476R"
19725 "1512  " "S" "0" 32400000      1 "N5683"   8.8  2000 "0307" "3" "I" "3554"
19725 "1512  " "B" "0" 32400000      2 "C5523"   8.8  8000 "1926" "0" "I" "7631"
19725 "1512  " "S" "0" 32400000      2 "N5683"   8.8  8000 "0307" "3" "I" "3554"
19725 "1512  " "B" "0" 32400000      3 "C5523"   8.8  2000 "1926" "0" "I" "7631"
19725 "1512  " "S" "0" 32400000      3 "45557"   8.8  2000 "1131" "0" "J" "7966"
19725 "0050  " "B" "0" 32400000      4 "U5556"  58.7  2000 "0298" "3" "I" "722E"
19725 "0050  " "S" "0" 32400000      4 "N5883"  58.7  2000 "0614" "4" "I" "882H"
19725 "2311  " "B" "0" 32404000   2414 "T5582"    28  1000 "1031" "3" "I" "7036"
19725 "2311  " "S" "0" 32404000   2414 "C5501"    28  5000 "0437" "3" "I" "4909"
19725 "2311  " "B" "0" 32404000   2414 "T5582"    28  4000 "1031" "0" "I" "7036"
19725 "2330  " "B" "0" 32405000   3004 "o5551"   105  3000 "2612" "0" "I" "4768"
19725 "2330  " "B" "0" 32405000   3004 "o5551"   105  2000 "2612" "4" "I" "4768"
19725 "2330  " "S" "0" 32405000   3004 "N5819"   105  5000 "0054" "6" "F" "8465"
19725 "2406  " "B" "0" 32405000   3197 "G5551"  34.3  8000 "1920" "0" "I" "3345"
19725 "2406  " "S" "0" 32405000   3197 "G5584"  34.3 25000 "1044" "3" "I" "490I"
19725 "2406  " "B" "0" 32405000   3197 "G5551"  34.3 17000 "1920" "1" "I" "3345"
19725 "0050  " "S" "2" 52200000 884583 "28269" 58.55    70 "1135" "0" "I" "001A"
19725 "0050  " "B" "2" 52200000 884583 "F5354" 58.55    70 "0920" "0" "I" "708R"
19725 "0050  " "B" "2" 52200000 884584 "F5354" 58.55    30 "0920" "0" "I" "708R"
19725 "0050  " "S" "2" 52200000 884584 "12556" 58.55    30 "0252" "0" "I" "4906"
end
format %tdDD/NN/CCYY date
format %tc_HH:MM:SS tradetime
I have attached an output file showing what I would like to produce in Stata.

I have tried the following code, which did not work:

Code:
reshape wide tradevolume tradeprice, i(tradenumber) j(invtype ordertype buy_sell) string
Any idea of how to perform the desired reshape?
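Note that reshape's j() option accepts a single variable, which is likely why the attempt above failed. A minimal sketch of one possible approach, first combining the three identifiers into a single j variable (this assumes each invtype-ordertype-buy_sell combination occurs once per tradenumber, which holds in the example above):

Code:
egen jvar = concat(invtype ordertype buy_sell)
* keep only variables constant within tradenumber, plus the ones to spread
keep tradenumber jvar tradevolume tradeprice date code tradetime
reshape wide tradevolume tradeprice, i(tradenumber) j(jvar) string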

Thank you.

multicollinearity test variables in a panel data

Hi,

How can I run a multicollinearity test in Stata 12 among the variables in a panel dataset?

Thanks!

Solve Macroeconometrics Model and Forecast

I am trying to solve a macroeconometric model of my country using Stata. The model has 7 endogenous variables and 7 equations, with variables such as consumption, investment, TFP, and capital. I have already solved it in EViews; however, I am now trying to write the do-file in Stata and have not been able to. Does anyone have sample do-file code for solving a model and forecasting out-of-sample GDP growth, assuming trend growth of the exogenous variables in future years such as 2020, 2021, and 2022? Where can I find such Stata resources?
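For reference, a minimal sketch of Stata's forecast suite (available from Stata 13 onward), which is the built-in route for solving a system of estimated equations and identities; the equation and all variable names below (gdp, consumption, investment, tfp) are placeholders, and only one of the seven equations is shown:

Code:
tsset year
* estimate each behavioural equation and store it
regress consumption gdp
estimates store eq_cons
* build the model, declare identities and exogenous drivers, then solve
forecast create macromodel, replace
forecast estimates eq_cons
forecast identity gdp = consumption + investment
forecast exogenous tfp
forecast solve, begin(2020) end(2022)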

T Tests and Wilcoxon Tests export to MS Word or MS Excel

Hello Members

I had been working in EViews for some time but have now shifted to Stata. Since I am new, I have been trying to calculate t tests and Wilcoxon tests of difference (of two variables).
I have 24 variables: VarA, VarB, VarC, VarD, and Var1 to Var20 (these are not the original names of the 24 variables I need to test).
I have a huge number of observations, 19,393 to be precise; these are firm-year observations.
I have two variable sets:
VarA to VarD
Var1 to Var20



I need to calculate T and Wilcoxon tests in this way:

ttest VarA = Var1
ttest VarB = Var2
ttest VarC= Var3
ttest VarD = Var4

ttest VarA = Var5
ttest VarB = Var6
ttest VarC = Var7
ttest VarD = Var8

ttest VarA = Var9
ttest VarB = Var10
ttest VarC = Var11
ttest VarD = Var12

ttest VarA = Var13
ttest VarB = Var14
ttest VarC = Var15
ttest VarD = Var16

ttest VarA = Var17
ttest VarB = Var18
ttest VarC = Var19
ttest VarD = Var20

I need to do this for both tests, which will take a lot of time, and I also need to export the results to Excel or Word. The important results I need from these tests are the t and z values, the means and their differences, and the p-values.

I am using simple command for T-Test:

ttest VarA=Var1 if sample_common==1 // for common sample
ttest VarB=Var2 if sample_common==1 // for common sample
and so on
ttest VarD=Var20 if sample_common==1


and for Wilcoxon:

signrank VarA=Var1 if sample_common==1
signrank VarB=Var2 if sample_common==1
and so on until:
signrank VarD=Var20 if sample_common==1


I did it in EViews and it took days.

Question 1: Is there a loop that can run these tests (22 for the t test and 22 for the Wilcoxon test)?
Question 2: I could do it manually, and it shouldn't take much time in Stata anyway, but I have to export the results to Excel or Word. Is it possible to store all 44 results from the t and Wilcoxon tests and export them at once, instead of using asdoc or estpost to export them one by one?
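For concreteness, a minimal sketch of one possible approach, posting every result to a new dataset and exporting once at the end (this assumes the literal names VarA-VarD and Var1-Var20 and the indicator sample_common; the postfile and output file names are placeholders):

Code:
tempname pf
postfile `pf' str8 lhs str8 rhs double(mean1 mean2 diff t p_t z p_z) ///
    using test_results, replace
local letters "A B C D"
forvalues k = 1/20 {
    * Var1 pairs with VarA, Var2 with VarB, ..., Var5 with VarA again, etc.
    local g : word `=mod(`k'-1, 4) + 1' of `letters'
    quietly ttest Var`g' == Var`k' if sample_common == 1
    local m1 = r(mu_1)
    local m2 = r(mu_2)
    local t  = r(t)
    local pt = r(p)
    quietly signrank Var`g' = Var`k' if sample_common == 1
    local z  = r(z)
    local pz = 2 * normal(-abs(r(z)))    // two-sided p from the z statistic
    post `pf' ("Var`g'") ("Var`k'") (`m1') (`m2') (`m1' - `m2') (`t') (`pt') (`z') (`pz')
}
postclose `pf'
use test_results, clear
export excel using test_results.xlsx, firstrow(variables) replace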

PS: My Stata file uses other names for these variables, but if applying a loop required renaming them, I would do so.
PS: I am posting this question for the second time; this time I have tried to be more precise and clearer.

Many thanks in advance
Qureshi Qazi


Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str90 Name int Data_Year float(VarA VarB VarC VarD Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10)
"3Com Corp"                     2006   .063745834    .08969723      .069065    .09025278   .23625404   .21030265    .23093487    .2097471    .28277314    .25682175    .27745396    .25626618    .033384956    .007433569
"3Com Corp"                     2007    .16977616    .19370146     .1685596     .1953236   .21452543   .23845074    .2133089    .24007288    .1668212    .1907465    .16560467    .19236866    .04178643    .06571174
"3Com Corp"                     2008    .01767699     .0154646   .019336283   .034823008  .030258216  .028045826    .031917505    .04740423    .033407442    .03119505    .03506673    .05055346    .04746863    .04525624
"3Com Corp"                     2009    .14623202            0    .15350877            0   .14623202           0    .15350877    0    .14623202    0    .15350877    0    .14623202    0
"3D Systems Corporation"        2006      .039646 .00020166667     .0490905 .00020166667   .03763147 .0018128604    .04707598    .0018128604    .02352965    .015914686    .03297415    .015914686    .004353501    .04379784
"3D Systems Corporation"        2007  .0022994361    .00480564  .0022994361    .00480564  .009673823  .007167619    .009673823    .007167619    .0152432    .012736997    .0152432    .012736997    .024439054    .02193285
"3D Systems Corporation"        2008   .006780514  .0019229947   .006780514  .0019229947  .006426498 .0015689787    .006426498    .0015689787    .005438766    .00058124587    .005438766    .00058124587    .0339265    .02906898
"3D Systems Corporation"        2009   .013091595  .0003426961    .01351137  .0003426961   .04784323   .03440894    .04826301    .03440894    .032321885    .018887596    .032741662    .018887596    .015538788    .02897308
"3D Systems Corporation"        2010   .010408957  .0018122155   .010408957  .0015592063   .08288364    .0742869    .08288364    .074033886    .07043226    .06183552    .07043226    .06158251    .05507143    .06366817
"3D Systems Corporation"        2011  .0021587943  .0008884781 .00012632662  .0007614846  .003817905  .002547588    .0017854348    .002420593    .0023259483    .003596265    .0043584183    .00372326    .0009045005    .002174817
"3D Systems Corporation"        2012   .034877814   .036073748   .034530625   .036266666   .04522194   .04641788    .04487475    .0466108    .04598092    .04717685    .04563373    .04736977    .027396336    .028592274
"3D Systems Corporation"        2013   .014645678   .011313648     .0146691   .011295148   .02164196  .018309928    .02166538    .018291429    .02130642    .017974392    .02132984    .017955892    .0020181816    .005350212
"3D Systems Corporation"        2014   .006814484   .006349909   .006749919   .006373292    .0076736  .007209024    .007609035    .007232408    .006485644    .006021068    .006421079    .006044452    .00010808185    .0005726572
"3D Systems Corporation"        2015    .18139833     .1863056     .1819054     .1863167   .20626158   .21116886    .20676865    .21117996    .16404065    .1689479    .1645477    .168959    .14261276    .14752004
"3D Systems Corporation"        2016    .09154546    .09252198    .09154546     .0926962   .05696752   .05794405    .05696752    .05811827    .04235609    .04333261    .04235609    .04350684    .068535976    .0695125
"3D Systems Corporation"        2017     .0422009    .04366238    .04201279    .04389391   .01915851  .020619985    .0189704    .02085151    .007609781    .009071256    .00742167    .009302784    .0043729735    .005834448
"3D Systems Corporation"        2018    .04688773   .064143635    .04688773    .06424884   .05238211  .069638014    .05238211    .069743216    .030802766    .04805867    .030802766    .04816388    .04165558    .024399675
"3M Company"                    2006   .007902322   .008422193    .00789471   .008539871 .0027746186   .00329449    .002767008    .0034121685    .001578681    .0010588095    .001586292    .0009411313    .022377644    .02185777
"3M Company"                    2007   .009525856   .009252021   .009461696   .009269216  .003089823  .002815984    .0030256584    .00283318    .0018383823    .002112221    .0019025467    .002095025    .02678877    .027062606
"3M Company"                    2008   .002358515  .0021152752     .0020338   .002626779 .0036961474 .0039393865    .0040208623    .003427882    .000344336    .0005875751    .000669051    .00007607043    .03844815    .03820491
"3M Company"                    2009 .00028154327  .0021755302 .00014198819   .002291102  .000513874  .002407864    .00009034574    .002523437    .005798586    .007692575    .005375057    .007808149    .063203245    .065097235
"3M Company"                    2010 .00012979316  .0002218459 .00019729044  .0003182533  .001575403  .001667455    .0016428977    .001763858    .005023457    .005115509    .005090952    .005211912    .0443816    .04447365
"3M Company"                    2011    .00144438  .0010095017  .0016821553  .0009869061   .01167959  .012114465    .011441812    .012137063    .015261516    .015696391    .015023738    .01571899    .0389358    .03937067
"3M Company"                    2012  .0011355683  .0010253274  .0011939312  .0010324238  .001641661 .0017518997    .0015833005    .0017448068    .00642407    .006534308    .006365709    .006527215    .04044257    .04055281
"3M Company"                    2013  .0012451266   .001289607  .0012641896  .0012103393   .02736242    .0274069    .027381487    .027327634    .023072034    .023116514    .0230911    .02303725    .04288026    .04283578
"3M Company"                    2014  .0011115152  .0010260963  .0010981105  .0010268093 .0033697374 .0032843165    .00335633    .003285032    .00008535385    6.7055225e-08    .00007194653    6.482005e-07    .03796756    .03805298
"3M Company"                    2015  .0010885468  .0008731743  .0010673078  .0008847371  .007901259  .007685889    .007880021    .007697452    .004525643    .0043102726    .004504405    .004321836    .033128984    .033344354
"3M Company"                    2016   .001311073  .0013082847  .0013359665   .001269583   .00561408  .005611293    .005638976    .005572591    .0019335747    .001930788    .0019584708    .001892086    .03909488    .03909767
"3M Company"                    2017  .0020558885  .0014581957   .001880831  .0013208266  .005040143  .005637836    .005215202    .005775206    .0084559955    .0090536885    .008631054    .009191059    .02746413    .028061824
"3M Company"                    2018   .002394655  .0024441944  .0023865404   .002451757  .004556708  .004507169    .0045648217    .004499607    .0021658428    .002116304    .0021739565    .0021087416    .025048865    .025098404
"3Par Inc."                     2008  .0013813954   .001285969  .0010992248   .000324031   .05037327   .05027785    .04789265    .04866785    .04598048    .04588505    .04349986    .04427505    .01888669    .018982116
"3Par Inc."                     2009   .015931848   .020027524   .014621233   .019863695   .05308522    .0571809    .05177461    .05701707    .05260266    .05669833    .05129204    .0565345    .011556804    .01565248
"4Kids Entertainment Inc."      2006   .010503505   .004155513   .010503505   .004155513   .06159928    .0762583    .06159928    .0762583    .06085967    .0755187    .06085967    .0755187    .06401655    .07867557
"51job Inc Sponsored ADR"       2006  .0009983673  .0021634013  .0009983673  .0007284353  .014379997  .013214964    .014379997    .01464993    .007414099    .006249066    .007414099    .007684033    .0009983685    .0021634009
"51job Inc Sponsored ADR"       2007   .006390861   .004669244  .0035826596   .006235618  .001592569  .003314186    .004400771    .0017478094    .0039211884    .0021995716    .0011129864    .003765948    .00639086    .004669243
"51job Inc Sponsored ADR"       2008    .00554606   .003545061   .007244617   .003545061  .018473737  .016472738    .020172294    .016472738    .022097744    .020096745    .0237963    .020096745    .00554606    .003545061
"51job Inc Sponsored ADR"       2010   .003085282  .0045155487  .0028973166    .00542219   .02915079  .030581053    .028962824    .0314877    .04217244    .04360271    .04198448    .04450935    .003085278    .0045155436
"51job Inc Sponsored ADR"       2011 .00009827411  .0020869034 .00009827411  .0020869034   .09238305   .09437168    .09238305    .09437168    .10061754    .10260616    .10061754    .10260616    .00009827316    .002086904
"51job Inc Sponsored ADR"       2013    .03462349   .036381774    .03462349   .036592685   .06918992    .0709482    .06918992    .07115911    .07451138    .07626967    .07451138    .07648058    .03462349    .036381774
"51job Inc Sponsored ADR"       2014   .015530784    .02824252   .015530784    .02821915    .0295283    .0733016    .0295283    .07327823    .031733766    .07550707    .031733766    .0754837    .015530783    .02824252
"51job Inc Sponsored ADR"       2016    .04755431    .01190903    .04755431   .012312967   .05298883  .006474514    .05298883    .00687845    .0479803    .011483036    .0479803    .011886973    .04755431    .01190903
"51job Inc Sponsored ADR"       2017   .036593165     .0425553   .036593165    .04254997   .07838735   .08434948    .07838735    .08434416    .07865115    .08461328    .07865115    .08460795    .036593165    .0425553
"51job Inc Sponsored ADR"       2018   .002892358 .00020230074   .002892358 .00020230074   .05686961   .05996427    .05686961    .05996427    .0634903    .06658496    .0634903    .06658496    .00289236    .00020229816
"8x8, Inc."                     2010   .004706667      .015818   .004706667   .004706667   .01256238 .0014510453    .01256238    .01256238    .011538368    .0226497    .011538368    .011538368    .07998595    .09109729
"8x8, Inc."                     2011     .1906471      .174274    .19297796     .1768446    .3657363    .3684912    .3678564    .3691403    .3684702    .3591835    .3704267    .3581331    .3582836    .346729
"8x8, Inc."                     2012  .0011356467   .002758044 .00050473184   .003659306  .018737625  .017115228    .01936854    .016213968    .009267982    .007645585    .009898897    .006744325    .07238847    .074010864
"8x8, Inc."                     2013   .021082655    .02054065    .02162466    .02026965   .08328375   .08274175    .08382576    .08247074    .08150784    .08096583    .08204985    .08069483    .01442558    .013883575
"8x8, Inc."                     2014    .01338719    .01550542   .013633498    .01560394   .10547615   .10759437    .10572246    .1076929    .10175686    .10387509    .10200316    .1039736    .0045912536    .0024730265
"8x8, Inc."                     2015    .02160153    .02419432    .02160153   .023784934   .02858303  .031175824    .02858303    .030766435    .021868344    .02446114    .021868344    .02405175    .0010313708    .0036241664
"8x8, Inc."                     2016    .02249694    .02477738    .02249694    .02468035   .02249694   .02477738    .02249694    .02468035    .02249694    .02477738    .02249694    .02468035    .008465279    .010745715
"8x8, Inc."                     2017    .08363133     .0822119     .0836105     .0822119    .0866249   .08520546    .08660406    .08520546    .064408265    .062988825    .064387426    .062988825    .065191776    .063772336
"8x8, Inc."                     2018    .05333638     .0508778        .0531    .05097234   .03116936  .028710777    .03093298    .028805315    .01252138    .010062797    .012284997    .010157336    .0097591    .007300517
"99 Cents Only Stores LLC"      2006  .0008692161 .00002581262  .0005038241 .00002581262  .014675104  .015570132    .016048145    .015570132    .01109101    .01198604    .01246405    .01198604    .06838533    .06749031
"99 Cents Only Stores LLC"      2007   .005973706 .00013968776   .004823336 .00010682005    .0395574   .04539142    .04070777    .04563793    .0435509    .04938492    .04470127    .04963142    .036372766    .04220679
"99 Cents Only Stores LLC"      2008   .015178392    .01901005   .013670854    .01932412  .004027253   .00785891    .0025197156    .008172981    .001565991    .0022656657    .0030735284    .0025797375    .022838546    .01900689
"99 Cents Only Stores LLC"      2009   .020643184   .000774108   .020643184  .0009725526   .09633993   .11775722    .09633993    .11601056    .09880747    .12022476    .09880747    .1184781    .029092923    .007675633
"99 Cents Only Stores LLC"      2010  .0003136955  .0010114766  .0001606733  .0009257842   .07991765   .07921986    .08007067    .07930555    .08273792    .08204013    .08289094    .08212582    .012712806    .01341059
"99 Cents Only Stores LLC"      2011    .07256336            0     .0727729            0    .3657363    .3684912    .3678564    .3691403    .3684702    .3758102    .3704267    .37686735    .50595725    .53290296
"A. M. Castle & Co."            2007  .0022954813    .00261611  .0022954813   .003205501   .05004604   .05495764    .05004604    .05554703    .04524966    .05016125    .04524966    .05075064    .04394333    .03903174
"A. M. Castle & Co."            2008     .0913299     .0957433     .0913299     .0957433   .14210442    .1465178    .14210442    .1465178    .1251683    .12958169    .1251683    .12958169    .02929031    .0337037
"A. M. Castle & Co."            2009    .04319113   .017429363    .04319113   .014474608   .07120092    .0969627    .07120092    .09991745    .11264203    .1384038    .11264203    .14135855    .0634171    .08917887
"A. M. Castle & Co."            2010   .009157779   .004994156   .009157779   .004994156   .11226244   .11642607    .11226244    .11642607    .13546339    .13962701    .13546339    .13962701    .04370815    .04787178
"A. M. Castle & Co."            2011    .03772841   .034577947   .036750678   .034577947    .3717508    .3686003    .37077305    .3686003    .356984    .3538335    .3560063    .3538335    .174707    .17785747
"A. M. Castle & Co."            2012     .0990796   .072723046    .09978436    .07230021    .2180187   .19166213    .21872343    .1912393    .1678234    .14146683    .16852814    .141044    .24592987    .27228642
"A. M. Castle & Co."            2013     .0543656    .03608531     .0543656   .036423832   .19709557   .21537587    .19709557    .21503735    .24010104    .25838134    .24010104    .2580428    .021030463    .002750166
"A. M. Castle & Co."            2014    .17762762    .10856872    .17627352    .10856872    .6831738    .6143943    .6763842    .6143943    .5966482    .5275893    .5952941    .5275893    .25206226    .3211212
"A. O. Smith Corporation"       2006   .005301367  .0010280343   .005301367  .0010282052  .005866192  .010139525    .005866192    .010139354    .0009403452    .005213678    .0009403452    .005213507    .03901444    .034741104
"A. O. Smith Corporation"       2007   .005001438        .0058   .004735304   .007130991  .032263838  .031465277    .032529972    .030134283    .04065246    .0398539    .0409186    .03852291    .008792259    .00959082
"A. O. Smith Corporation"       2008    .04446613    .04557883    .04517945    .04574984    .1710858    .1721985    .17179912    .1723695    .1774951    .1786078    .17820844    .17877883    .01363136    .012518667
"A. O. Smith Corporation"       2009   .015428863   .022937804    .01458191    .02169573   .08539367   .07788473    .08624062    .0791268    .09121913    .08371019    .09206608    .08495226    .14876318    .15627213
"A. O. Smith Corporation"       2010    .05127329   .036748406    .05094681    .04766264   .10471006   .09018518    .10438358    .10109942    .1002719    .085747    .0999454    .09666125    .030802824    .01627794
"A. O. Smith Corporation"       2011  .0032867645    .00437542   .008171218   .007908613  .011902422  .012991074    .016786873    .01652427    .013516717    .01460537    .018401168    .018138565    .06654631    .06763497
"A. O. Smith Corporation"       2012   .014697907   .012114756   .014947158   .012205384  .036519736   .03910289    .036270484    .03901226    .03697255    .0395557    .036723297    .03946507    .04051258    .03792942
"A. O. Smith Corporation"       2013    .00643095   .006938322    .00643095   .006906612   .03886429   .03835692    .03886429    .03838863    .03821892    .03771155    .03821892    .03774326    .033338398    .033845775
"A. O. Smith Corporation"       2014  .0017912495  .0023225804  .0018283278   .002384501   .03116286  .031694196    .03119994    .03175611    .032190673    .032722004    .032227755    .032783926    .01116112    .011692453
"A. O. Smith Corporation"       2015  .0012660875  .0004506293  .0011597235  .0004506293 .0006649531 .0014804117    .0007713176    .0014804117    .003146861    .0039623193    .003253225    .0039623193    .02236493    .02318039
"A. O. Smith Corporation"       2016  .0007080015 .00049027544   .000751338 .00049027544 .0011721104  .001389835    .0011287741    .001389835    .003645189    .003862914    .003601853    .003862914    .029173976    .0293917
"A. O. Smith Corporation"       2017  .0011812038  .0012057022  .0011812038  .0004420274   .01487348   .01484898    .01487348    .015612656    .017123219    .017098717    .017123219    .017862394    .005184133    .005159631
"A. O. Smith Corporation"       2018 .00049902085 .00011406658  .0005533616  .0000993799  .006449405  .005836319    .006503746    .005851004    .0040893666    .003476281    .0041437075    .003490966    .013753258    .014366344
"A.C. Moore Arts & Crafts Inc." 2006   .036451545   .018925773   .035764262   .018238489   .08194973    .0994755    .08263701    .1001628    .0876755    .10520127    .08836278    .10588856    .006163504    .01136227
"A.C. Moore Arts & Crafts Inc." 2007    .01998846   .001622058    .01998846 .00014536225   .05620247   .03783607    .05620247    .03635937    .05513395    .036767542    .05513395    .035290845    .03702586    .05539227
"A.C. Moore Arts & Crafts Inc." 2008    .05464363    .05737091    .05482545    .05737091  .020912945  .018185671    .020731125    .018185671    .032008216    .02928094    .031826396    .02928094    .005490936    .0027636625
"A.C. Moore Arts & Crafts Inc." 2009    .17124286  .0019714285     .1623143  .0019714285    .3867867   .56000096    .3957153    .56000096    .3137952    .4870095    .3227238    .4870095    .50595725    .53290296
"A.C. Moore Arts & Crafts Inc." 2010    .06962925   .001602041    .06962925   .001602041   .12150179     .189529    .12150179    .189529    .16627245    .23429967    .16627245    .23429967    .26878652    .33681375
"AAON, Inc."                    2015   .001612327 .00027244305   .001612327 .00027244305   .00467164  .006011527    .00467164    .006011527    .008669252    .010009138    .008669252    .010009138    .028037423    .02937731
"AAON, Inc."                    2016   .000798019   .001228682   .000798019   .001228682 .0043858737   .00481654    .0043858737    .00481654    .0004182495    .0008489154    .0004182495    .0008489154    .03378661    .033355944
"AAON, Inc."                    2017  .0002139183  .0016257185  .0002139183  .0023319214 .0017332844 .0003214851    .0017332844    .0003847182    .0009333566    .002345156    .0009333566    .003051359    .019920645    .021332445
"AAON, Inc."                    2018  .0020643051 .00015694823  .0020643051 .00011553134   .00259787 .0006905142    .00259787    .0004180353    .0044903774    .002583021    .0044903774    .0023105424    .016668528    .01476117
"AAR CORP."                     2006  .0082943635   .008225052  .0082943635   .009338205   .04046746   .04053678    .04046746    .03942362    .03831016    .03837947    .03831016    .03726632    .1783592    .1782899
"AAR CORP."                     2007   .003636862  .0026947584    .00286605  .0026947584   .05068785   .05162995    .05145866    .05162995    .0410632    .0420053    .04183401    .0420053    .14101297    .14007086
"AAR CORP."                     2008   .009105706   .006269787   .009030765   .006401262   .05749735   .05466143    .05742241    .0547929    .06026013    .05742421    .06018519    .05755569    .04801719    .05085311
"AAR CORP."                     2009   .013209126  .0004790875    .00984139  .0004790875  .024330314    .0106421    .02096258    .0106421    .02627539    .012587175    .02290765    .012587175    .023599077    .009910863
"AAR CORP."                     2010   .006699739    .00102698   .006140122  .0013533507   .03872751  .033054754    .03816789    .03338112    .06101678    .05534403    .06045717    .0556704    .1517052    .14603247
"AAR CORP."                     2011   .007712414   .001479432   .007712414  .0013884237   .06646837   .05727653    .06646837    .05736754    .05592322    .04673138    .05592322    .04682239    .12315482    .13234666
"AAR CORP."                     2012    .02956286   .008245696    .03015493   .008245696   .08219576    .0608786    .08278784    .0608786    .0916058    .07028863    .09219787    .07028863    .013701297    .03501846
"AAR CORP."                     2013   .014343148    .00706424   .014075482   .006848501   .07274885   .06546994    .072481185    .065254204    .07527548    .06799658    .07500782    .06778084    .08090564    .08818454
"AAR CORP."                     2014    .12776116    .07647776    .12808639    .07647776   .00966556   .06094896    .00934032    .06094896    .04312363    .09440703    .04279839    .09440703    .13066672    .1819501
"AAR CORP."                     2016    .00358197  .0021240015    .00358197  .0022506656  .022760343  .024218313    .022760343    .024091646    .01924402    .020701993    .01924402    .020575326    .17214277    .1706848
"AAR CORP."                     2017   .000751891   .005411498  .0007609682  .0019621786   .01958217   .02424177    .018069308    .020792454    .009424418    .014084022    .007911559    .010634705    .11471897    .11005937
"AAR CORP."                     2018   .007679816  .0041674725   .008188852   .002589463  .010508485   .01402083    .009999454    .01559884    .017906226    .02141857    .017397195    .02299658    .11140133    .11491367
end

Nested forvalue loops - referring to previous macros in a relevant macro

Good afternoon.

Could anyone help me decipher what syntax error I am producing in the command sequence below?

/*Given*/
regress offspringheight mpheight if(tallmpheight==1), robust
global mp_b = _b[mpheight]
global cons_b = _b[_cons]
global new_meanmpheight1 = 68.78209

/*Then*/
forvalues j = 1(1)100 {
forvalues i = `=`j'+1' {
global new_meanmpheight`i' = $cons_b + ($mp_b * $new_meanmpheight`i')
}
}

That is, I seek to perform multiple regression predictions over successive iterations, which requires local macro `i' to be defined in terms of another local macro `j'. From that, I seek to save in Stata's global memory all predictions made on the basis of the original regression regress offspringheight mpheight if(tallmpheight==1), robust (in an attempt to demonstrate a convergence of heights).

Please let me know what mistake I am making in defining the above nested loop.
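A minimal sketch of one possible repair, assuming each prediction is meant to feed on the previous one: forvalues requires a range rather than a single computed value, so the inner loop can be replaced by a plain local assignment, and the right-hand side should reference iteration `j', not `i':

Code:
forvalues j = 1(1)100 {
    local i = `j' + 1
    * each new prediction is built from the previous one
    global new_meanmpheight`i' = $cons_b + ($mp_b * ${new_meanmpheight`j'})
}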

Why the different bootstrap errors in panel data random effects?

Hi,

in help xt_vce_options I found the following recommendation:

When working with panel-data models, we strongly encourage you to use the vce(bootstrap) or vce(jackknife) options instead of the corresponding prefix command.

Naturally, this piqued my curiosity, and I wondered whether the recommendation is due to the clustered nature of the data (to avoid mistakes when using the prefix) or whether there is something else to it. So I decided to try it out. With a fixed-effects estimation, I found no difference between the two methods:
Code:
. clear all

. set more off

. webuse nlswork
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. local xv "c.age##c.age c.ttl_exp##c.ttl_exp south"

. xtset idcode year
       panel variable:  idcode (unbalanced)
        time variable:  year, 68 to 88, but with gaps
                delta:  1 unit

. xtreg ln_w `xv', fe vce(boot, reps(50) seed(1234))
(running xtreg on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50

Fixed-effects (within) regression               Number of obs     =     28,502
Group variable: idcode                          Number of groups  =      4,710

R-sq:                                           Obs per group:
     within  = 0.1546                                         min =          1
     between = 0.2856                                         avg =        6.1
     overall = 0.2149                                         max =         15

                                                Wald chi2(5)      =    1521.76
corr(u_i, Xb)  = 0.1348                         Prob > chi2       =     0.0000

                                     (Replications based on 4,710 clusters in idcode)
-------------------------------------------------------------------------------------
                    |   Observed   Bootstrap                         Normal-based
            ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
                age |   .0291285   .0055949     5.21   0.000     .0181626    .0400943
                    |
        c.age#c.age |  -.0006749   .0000913    -7.39   0.000    -.0008539    -.000496
                    |
            ttl_exp |   .0617062   .0035824    17.22   0.000     .0546848    .0687275
                    |
c.ttl_exp#c.ttl_exp |   -.000893   .0001529    -5.84   0.000    -.0011927   -.0005933
                    |
              south |  -.0684464   .0200641    -3.41   0.001    -.1077714   -.0291214
              _cons |   1.126962   .0780397    14.44   0.000     .9740066    1.279917
--------------------+----------------------------------------------------------------
            sigma_u |  .36581516
            sigma_e |  .29463102
                rho |  .60654417   (fraction of variance due to u_i)
-------------------------------------------------------------------------------------

. bs, reps(50) cl(idcode) id(cid) group(year) seed(1234): xtreg ln_w `xv', fe
(running xtreg on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50

Fixed-effects (within) regression               Number of obs     =     28,502
Group variable: idcode                          Number of groups  =      4,710

R-sq:                                           Obs per group:
     within  = 0.1546                                         min =          1
     between = 0.2856                                         avg =        6.1
     overall = 0.2149                                         max =         15

                                                Wald chi2(5)      =    1521.76
corr(u_i, Xb)  = 0.1348                         Prob > chi2       =     0.0000

                                     (Replications based on 4,710 clusters in idcode)
-------------------------------------------------------------------------------------
                    |   Observed   Bootstrap                         Normal-based
            ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
                age |   .0291285   .0055949     5.21   0.000     .0181626    .0400943
                    |
        c.age#c.age |  -.0006749   .0000913    -7.39   0.000    -.0008539    -.000496
                    |
            ttl_exp |   .0617062   .0035824    17.22   0.000     .0546848    .0687275
                    |
c.ttl_exp#c.ttl_exp |   -.000893   .0001529    -5.84   0.000    -.0011927   -.0005933
                    |
              south |  -.0684464   .0200641    -3.41   0.001    -.1077714   -.0291214
              _cons |   1.126962   .0780397    14.44   0.000     .9740066    1.279917
--------------------+----------------------------------------------------------------
            sigma_u |  .36581516
            sigma_e |  .29463102
                rho |  .60654417   (fraction of variance due to u_i)
-------------------------------------------------------------------------------------
Fitting random effects, however, presents a different picture:
Code:
. xtreg ln_w `xv', re vce(boot, reps(50) seed(1234))
(running xtreg on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50

Random-effects GLS regression                   Number of obs     =     28,502
Group variable: idcode                          Number of groups  =      4,710

R-sq:                                           Obs per group:
     within  = 0.1538                                         min =          1
     between = 0.2971                                         avg =        6.1
     overall = 0.2249                                         max =         15

                                                Wald chi2(5)      =    2034.78
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

                                     (Replications based on 4,710 clusters in idcode)
-------------------------------------------------------------------------------------
                    |   Observed   Bootstrap                         Normal-based
            ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
                age |   .0325329   .0050847     6.40   0.000     .0225671    .0424987
                    |
        c.age#c.age |  -.0007202   .0000839    -8.58   0.000    -.0008847   -.0005557
                    |
            ttl_exp |   .0639336   .0027724    23.06   0.000     .0584998    .0693674
                    |
c.ttl_exp#c.ttl_exp |   -.000943   .0001341    -7.03   0.000    -.0012059   -.0006801
                    |
              south |  -.1253318   .0116613   -10.75   0.000    -.1481877    -.102476
              _cons |    1.08762   .0695168    15.65   0.000     .9513691     1.22387
--------------------+----------------------------------------------------------------
            sigma_u |  .31293049
            sigma_e |  .29463102
                rho |  .53009223   (fraction of variance due to u_i)
-------------------------------------------------------------------------------------

. bs, reps(50) cl(idcode) id(cid) group(year) seed(1234): xtreg ln_w `xv', re
(running xtreg on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50

Random-effects GLS regression                   Number of obs     =     28,502
Group variable: idcode                          Number of groups  =      4,710

R-sq:                                           Obs per group:
     within  = 0.1538                                         min =          1
     between = 0.2971                                         avg =        6.1
     overall = 0.2249                                         max =         15

                                                Wald chi2(5)      =    1979.20
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

                                     (Replications based on 4,710 clusters in idcode)
-------------------------------------------------------------------------------------
                    |   Observed   Bootstrap                         Normal-based
            ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
                age |   .0325329   .0052315     6.22   0.000     .0222793    .0427865
                    |
        c.age#c.age |  -.0007202   .0000858    -8.39   0.000    -.0008884    -.000552
                    |
            ttl_exp |   .0639336    .002964    21.57   0.000     .0581244    .0697429
                    |
c.ttl_exp#c.ttl_exp |   -.000943   .0001408    -6.70   0.000     -.001219    -.000667
                    |
              south |  -.1253318     .01277    -9.81   0.000    -.1503606   -.1003031
              _cons |    1.08762   .0719991    15.11   0.000     .9465039    1.228735
--------------------+----------------------------------------------------------------
            sigma_u |  .31293049
            sigma_e |  .29463102
                rho |  .53009223   (fraction of variance due to u_i)
-------------------------------------------------------------------------------------
The question, then, is why? It seems odd that there is no difference with a fixed-effects estimation but there is with a random-effects estimation. It is unfortunate, because there may be applications where the statistic we want to bootstrap is some post-estimation quantity that is based on those standard errors, not simply the standard errors of the coefficients themselves. In any case, can someone explain the different results?

Thanks!!!

Detecting inverted duplicates over two columns

Hello, I have searched previous forum posts for an answer to this simple question and cannot find the specific situation I am hoping to rectify.

I am looking to detect duplicates, but they appear inverted across two columns, so this is not the typical "duplicate" usually referred to in Stata. As an example:

Obs   Var1   Var2
1     A      B
2     B      A
3     A      C
4     C      A

(This is a simplification of the content of the cells.)

I need Stata to flag observations 2 and 4, because I need to delete them. The variables are in string format. I have tried identifying them with loops over levelsof locals, but I cannot seem to get the right identification of the repeated content.
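A minimal sketch of one possible approach, building an order-independent key so that (A,B) and (B,A) compare equal (key and obsno are names introduced here):

Code:
* order-independent key: the two values sorted alphabetically
gen key = cond(Var1 <= Var2, Var1 + " " + Var2, Var2 + " " + Var1)
* keep the first occurrence of each unordered pair, in current observation order
gen long obsno = _n
bysort key (obsno): keep if _n == 1
drop key obsno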

Thank you.


Is it possible to not set values of optional integer arguments with the syntax command

When specifying optional integer arguments with the syntax command, it is usually necessary to set a default value. Is it possible to get around this feature?

For example, I am writing a program in which an argument is optional as long as option1 has not also been given. If option1 has been given, I want the program to produce an error if the argument has not been supplied.

A workaround is to set the default to a value that would be implausible for the user to enter, but this does not seem ideal. Is there a better way to do this?

Code:
prog define foo
    syntax , [option1 a_number(integer -99999)]
    if `a_number' == -99999 & "`option1'" != "" {
        di as error "a_number must be provided if option1 is specified"
        exit 198
    }
end
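A hedged alternative sketch: take the option as a string, so that "not specified" is detectable as an empty macro, then validate it as an integer explicitly:

Code:
prog define foo2
    syntax , [option1 a_number(string)]
    if "`option1'" != "" & "`a_number'" == "" {
        di as error "a_number must be provided if option1 is specified"
        exit 198
    }
    if "`a_number'" != "" {
        confirm integer number `a_number'
    }
end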

Keep the first observation within a moving 30-day window

I have the following dataset. I want to keep d=1 for the first observation within each 30-day window and replace d=0 for the second through nth observations within that window.
My difficulty is that this is not a moving window. For example, the first 30-day window runs from 02jan1990 to 01feb1990; the second 30-day window then begins from 02feb1990, not from 03jan1990. Is there any way to do this, or any trick for it?

My gut feeling is that playing around with rangestat could solve the question, but I have spent three hours and could not think of a solution.

Any help is greatly appreciated.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double permno long date float d
10001 10959 1
10001 10965 1
10001 10966 1
10001 10967 1
10001 10968 1
10001 10969 1
10001 10973 1
10001 10974 1
10001 11021 1
10001 11023 1
10001 11024 1
10001 11029 1
10001 11035 1
10001 11037 1
10001 11042 1
10001 11043 1
10001 11044 1
10001 11120 1
10001 11128 1
10001 11130 1
10001 11140 1
10001 11143 1
10001 11144 1
10001 11148 1
10001 11149 1
10001 11150 1
10001 11151 1
10001 11154 1
10001 11164 1
10001 11168 1
10001 11170 1
10001 11175 1
10001 11176 1
10001 11177 1
10001 11183 1
10001 11497 1
10001 11501 1
10001 11505 1
10001 11506 1
10001 11513 1
10001 11514 1
10001 11518 1
10001 11519 1
10001 11527 1
10001 11528 1
10001 11532 1
10001 11533 1
10001 11535 1
10001 11536 1
10001 11542 1
10001 11543 1
10001 11546 1
10001 11547 1
10001 11548 1
10001 11549 1
10001 11550 1
10001 11554 1
10001 11555 1
10001 11556 1
10001 11557 1
end
format %d date
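A minimal sketch of one sequential (non-rangestat) approach, assuming dates are unique within permno: anchor carries the date that opened the current window, and an observation opens a new window when it falls more than 30 days after the current anchor:

Code:
sort permno date
by permno: gen long anchor = date if _n == 1
* replace works through observations in order, so the anchor carries forward
by permno: replace anchor = cond(date > anchor[_n-1] + 30, date, anchor[_n-1]) if _n > 1
replace d = (date == anchor)   // 1 = first observation of its 30-day window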

Effect of schooling on menstruation methods

Hello,
I am trying to find the effect of schooling on menstruation methods in the Indian context. These methods may include locally prepared napkins, cloths, sanitary napkins, tampons, etc. My dependent variable is the menstruation method used during periods, and my independent variables are the girl's own schooling and her mother's schooling. The problem is that my data have no income measure, even though income can affect the girl's schooling.
So, to control for income, I am using household fixed effects and district fixed effects, which capture caste, religion, region, age, and living standard. I want to know whether it is okay to use these effects, because as far as I know we use fixed effects with panel data, and my data are not a panel; the only year included is 2015. Please help me out with this.
Another doubt I have is that for menstruation methods I am using 4 dummies:
Dummy = 1 if sanitary napkins are used
Dummy = 1 if sanitary napkins or local napkins are used
Dummy = 1 if sanitary napkins or tampons are used
Dummy = 1 if sanitary napkins, local napkins, or tampons are used
I am confused about why my professor has asked me to create these dummies, as my data are already in yes/no format.
Please help me clarify this doubt.

summary forest plot for meta-analysis

Hello,

I am doing a study-level meta-analysis with different outcomes. In addition to the standard forest plots, I would like to produce a summary forest plot by removing the individual studies and keeping only the summary effect for each outcome, and then displaying all the summary effects (obtained with a random-effects model) in a single graph. Is there any command that could do this in Stata?

Thanks in advance

Nicoletta

2:Setting manual coefficients

Hi Listers,

Y = β1 + β2*Area + β3*Labor + β4*Fertilizer + u

This time it is given that β4 = 3*β3.
Which model should you use? Give the estimates of β2, β3, and β4.
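A minimal sketch of one possible approach, substituting the restriction into the model so that ordinary regress suffices (variable names follow the companion "Setting manual coefficients" post: prod, area, L, F):

Code:
* under b4 = 3*b3:  Y = b1 + b2*Area + b3*(Labor + 3*Fertilizer) + u
gen LF3 = L + 3*F
regress prod area LF3
di "b2 = " _b[area]
di "b3 = " _b[LF3]
di "b4 = " 3*_b[LF3]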

Thanks!

Applying Bai-Perron Test to Time Series data structured in panel format


Hello,
I have a panel dataset with data for each country. I want to run the Stata command separately for each country.

https://www.wiwiss.fu-berlin.de/fach...G_V02_Main.pdf
In this paper, the author has developed a Stata package "sbbpm".

The syntax is:
sbbpm depvar timevar, [minspan(#) maxbreaks(#) alpha(#) trimming(#) het(string) prewhit(#) method(string)]

I tried to run the syntax for each country using an if condition, like:
sbbpm depvar timevar, [minspan(#) maxbreaks(#) alpha(#) trimming(#) het(string) prewhit(#) method(string)] if id==1
but it does not allow if.

I'm wondering whether this can be done with some local or global looping...
I would highly appreciate your help.
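A minimal sketch of one possible approach, restricting the data to one country at a time instead of using if (the sbbpm options are omitted for brevity, and id is assumed to be a numeric country identifier):

Code:
levelsof id, local(countries)
foreach c of local countries {
    preserve
    keep if id == `c'
    sbbpm depvar timevar
    restore
}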

Forecast errors

Hi Statlist,

I would like to compute the variance of forecast errors in a panel data setting, as indicated in the attached manuscript. In particular, I have built what is called formula (2) in the manuscript as:

Code:
xtreg tasso_crescita_sales_prod L.log_sales L.dummy_2 L2.dummy_2 L3.dummy_2 mean_gr_rate_atc2 recalls_sales ageprodcat1 ageprodcat2 ageprodcat3 ageprodcat4 newmolfirm newmolmarket i.Year, fe vce(cluster idpr)
Now the tricky part: forecast errors. To compute them I simply ran:

Code:
predict yhat, xbu
gen forecast_errors = tasso_crescita_sales - yhat
I have noticed that for the regress command there is a predict option (stdr) that computes the standard errors of the residuals. This is not the case with xtreg, so I am wondering how I can compute the variance of each single error, to obtain what is called Var(AV_it_hat) in the numerator of formula (4) in the attached manuscript (and then collapse by time in order to obtain (4)).

Thank you very much,

Federico

Issue with xlabel command

Hello, I am trying to change the labels of a graph (without using the graph editor).

Code:
graph box job_hours if expectation == 2, over(gift_received)
Now, as you can see from the attachment, the x-axis has the labels 1 and 2. I would like to change these to "Gift received" for 1 and "No gift received" for 2.

I tried

Code:
graph box job_hours if expectation == 2, over(gift_received) ytitle(Hours worked per Week) xlabels(1 "Gift received" 2 "No gift received")
This follows the help file, but it doesn't work; I get the error message "xlabels(1 Gift not found" (without the quotes).

What am I doing wrong? I checked the variable and it only contains the values 1 and 2; these were not renamed or changed in any way.
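A minimal sketch of one possible fix: with graph box, the group axis produced by over() is relabeled through its relabel() suboption rather than through xlabel():

Code:
graph box job_hours if expectation == 2, ///
    over(gift_received, relabel(1 "Gift received" 2 "No gift received")) ///
    ytitle(Hours worked per Week)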

Panel Regression versus Dynamic Models

Dear all,
I am trying to replicate a study in which the change in capital expenditures (DELCAPEX) is regressed on cash flows (CAFLOW) and the market-to-book ratio (MB). I also want to add lags of my dependent variable (here, L.DELCAPEX) and of an independent variable (L.CAFLOW). I ran the following fixed-effects regression code in Stata.

Code:
xtreg DELCAPEX CAFLOW MB L.DELCAPEX L.CAFLOW i.year, fe vce(robust)
I have only 7 years of data, and I don't know whether the above panel regression model is correct. Also:
1. Can I include lagged values of the dependent variable and independent variables in the model and estimate it via panel regression?
2. Does the number of years (7 in my case) matter here?
Any help in this regard will be highly helpful to me.
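For reference, with a lagged dependent variable and a short panel (T = 7), the within estimator is subject to Nickell bias, and a dynamic panel estimator such as xtabond is the usual alternative; a minimal sketch (firm_id is a placeholder for the panel identifier):

Code:
xtset firm_id year
xtabond DELCAPEX CAFLOW L.CAFLOW MB i.year, lags(1) vce(robust)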

generate or replace a variable -- with a string type

Hi guys

I have a string variable in the dataset. It contains a long list of heart diseases. I would like to categorise it into a few groups using keywords (e.g. one of the heart diseases contains the word "muscular").

Can I create/replace a variable using those keywords? So it could be something like this:

replace x if y == "muscular"
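A minimal sketch of one possible approach, using substring matching rather than exact equality (the names disease and category are placeholders):

Code:
gen category = ""
replace category = "muscular" if strpos(lower(disease), "muscular") > 0
replace category = "valvular" if strpos(lower(disease), "valve") > 0
tab category, missing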
Thanks!

Setting manual coefficients

For a hypothetical case in class the following question is asked:

We have the following model: Y = β1 + β2*Area + β3*Labor + β4*Fertilizer + u

Each labor day increases production by 0.01 ton, so β3 = 0.01; show how to estimate β2 and β4.
(Do not use Stata's possibility to do linear regression with linear restrictions)

The following variables are used in Stata:

prod (for Y)
area (for Area)
L (for labor)
F (for Fertilizer)

How can we fix a coefficient at 0.01, or are there other ways to solve this?
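A minimal sketch of one possible approach, moving the known term to the left-hand side and regressing the adjusted outcome on the remaining regressors (using the variable names above):

Code:
* with b3 fixed at 0.01:  prod - 0.01*L = b1 + b2*area + b4*F + u
gen prod_adj = prod - 0.01*L
regress prod_adj area F
di "b2 = " _b[area]
di "b4 = " _b[F]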

Thanks in advance!

With kind regards,
Pete

Handling large percentage of zero-valued observations in the dependent variable in a panel dataset

I am writing a paper using a panel dataset in which my dependent variable has a large percentage of zero-valued observations. These zeros are real zeros; they are not missing data. I have looked at the literature, and many models can be applied in this case. I am aware of the following: Tobit (Tobin, 1958), the two-stage model (Heckman, 1979), the two-part model (Duan et al., 1984), PPML (Silva & Tenreyro, 2006), and double-hurdle models (Dong & Kaiser, 2008). Which one should I use, and how should I justify the choice?

correlation and proportional weights

Hello,
I am not able to use probability weights (pweights) with correlation but can use analytic weights.
Can anyone help me understand the reason?

I understand some of the intuition, e.g. that correlation is not affected by a multiplicative factor, but then I am not sure why analytic weights are allowed!

commands:
corr x y [pw=freq]
error message:
pweights not allowed
r(101);

Competing risks in survival analyses

Dear reader,

I would like to understand the difference between approaches in my survival analysis.
I have 4 types of events that I want to examine, and not everyone experiences an event.
These events are dummies in my dataset.
I use the log-logistic approach and consider one event at a time, so the other events are censored, and everyone who experiences no event is likewise treated as censored.

(1) My question is, what am I really examining when I do this for each of the four event types separately? I don't understand how this differs from competing risks. It looks like I am already using a form of competing risks when examining the event types separately, albeit without the specific command; is this correct? I mean, when running the regression with only event type 1 as the failure, event types 2, 3, and 4 and everything else are treated as censored, just as in the competing-risks regression, right?

Or is it only with competing risks that I consider event type 1 versus event type 2, thereby treating event types 3 and 4 and the others as censored, so that the results involve only event types 1 and 2?

Is my approach, regarding the first question, meaningful?
Or should I run the log-logistic regression for all the event types together, and use the competing-risks approach only for the comparison of the event types?

(2) My second question involves the model distribution when using a competing-risks approach. I have read papers that mention 'a log-logistic AFT competing risks model', but how should I determine the distribution for the competing-risks regression? I don't see this option in the stcrreg command window.

Thanks in advance.

Kind regards,

Michael

Commands:

Separate log-logistic analysis:

Code:
 stset E_Date, failure(Event==1) id(ID) enter(Date1) origin(Date1)
streg $xlist, dist(loglogistic)

stset E_Date, failure(Event==2) id(ID) enter(Date1) origin(Date1)
streg $xlist, dist(loglogistic)

stset E_Date, failure(Event==3) id(ID) enter(Date1) origin(Date1)
streg $xlist, dist(loglogistic)

etc.
and the competing risks regression:

Code:
stset E_Date, failure(Event==1) id(ID) enter(Date1) origin(Date1)
stcrreg $xlist, nohr compete(Event==2) offset(Event==1)





Two-way table where cells are %s and Totals are frequencies

Hello,

Is there a way to create a two-way table where the cells are percentages and the totals are the actual frequencies? I'm sure I am missing something obvious. So something like:

Code:
ta var1 var2, col nofreq
However, instead of the totals being totals of the percentages, I want them to be totals of the frequencies.

Thanks!

reghdfe - "noconstant" as standard option

Dear community,

I would please like to ask for your help concerning the following question.

Recently, the server I am allowed to work on was upgraded to Stata 16, in the course of which a clean re-install of the much-appreciated reghdfe and its dependencies was performed.

Running previous code now results in a constant being reported; is there a way to declare noconstant as the default, so that I would not need to change all code using reghdfe?

Yours sincerely,
Sinistrum

test for statistical difference

Hi Stata fam
I would like to test for a statistical difference between responses from two different populations, an intervention site and a control site.
For example, I would like to know whether there is a statistical difference between people who responded that they had never attended school at the intervention site and those who had never attended school at the control site, as in the table below.

Thank you
Highest level of education completed     Intervention    Control        p-value
Never attended school                    350 (90.0%)     300 (79.5%)
Preschool/Primary school (incomplete)    18 (6.5%)       36 (10.8%)
Hh members under five years
None                                     132 (26.0%)     124 (22.2%)
1                                        176 (40.8%)     184 (43.4%)
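A minimal sketch of one possible test for the first row, using Stata's immediate two-sample test of proportions; the sample sizes 389 and 377 are back-calculated from the percentages shown and are therefore assumptions, so substitute the true denominators:

Code:
* two-sample test of proportions: 350/389 = 90.0% vs 300/377 = 79.5%
prtesti 389 0.900 377 0.795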

Transforming time series data into Panel data


Hello, I have a time series dataset downloaded from UNCTAD.
I then pasted the data into the Stata Data Editor.
The data look like this:

var1   var2   var3   var4   var5
year   1980   1981   1982   1983
x      1.2    2.3    3.4    4.5
y      2.1    2.2    2.3    2.4
z      3.1    3.2    3.3    3.4

where x, y, and z are countries and var2 to var5 hold the years.

I tried to transform it using the reshape command, but that did not solve the problem.
I appreciate your reply and help.
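A minimal sketch of one possible approach, assuming the pasted layout above (the variables arrive as strings, with the first data row holding the year labels):

Code:
drop in 1                          // drop the row holding the year labels
rename var1 country
rename (var2 var3 var4 var5) (y1980 y1981 y1982 y1983)
destring y*, replace
reshape long y, i(country) j(year)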

Wednesday, October 30, 2019

Interacting an independent variable with fixed effects

Dear Statalist members,

Thank you so much for reading this post and helping me answer my question! In short, my question is how I can run a regression in Stata which contains not only fixed effects, but also the interaction of an independent variable with the fixed effects. I will explain the details below.

I am trying to replicate a paper which examines how the stock price reaction to earnings (a.k.a. earnings response coefficient) changes after a certain regulation was introduced. It is a classic difference-in-differences design complicated by the fact that the variable of interest is not a stand-alone variable but the coefficient obtained from a regression.

Specifically, the earnings response coefficient is the "b1" coefficient in the regression below:
CAR_{i,t} = b0 + b1*UE_{i,t} + bm*Controls_{i,t} + e_{i,t}
where
CAR_{i,t} is the stock return around the earnings announcement, for company i and year-quarter t;
UE_{i,t} is the earnings being announced, for company i and year-quarter t;
Controls_{i,t} is a list of control variables.

The regression I am trying to run is specified as follows (Table 3 in the attached paper https://www.sciencedirect.com/scienc...300417#bib0026):
CAR = b0 + b1*UE*Post*Treat + b2*UE + b3*Treat + b4*Post + b5*Treat*Post + b6*UE*Post + b7*UE*Treat
    + bm*Controls + bn*UE*Controls + Year-quarter fixed effects + Industry fixed effects
    + UE*Year-quarter fixed effects + UE*Industry fixed effects


In Stata 14, I tried to run the following code:
reghdfe car ue_post_treat ue ue_treat ue_post controls ue_controls, ///
absorb(industry yearquarter ue_industry ue_yearquarter) vce (cluster announcement_date)


The regression ran, but "ue" was omitted due to collinearity. This is expected because "ue_industry" = ue*i.industry and "ue_yearquarter" = ue*yearquarter are obviously collinear with "ue". This brings me to my questions:

1. Is my regression correctly specified? The paper cited above mentioned that they included the "interactions of UE with year-quarter fixed effect and industry fixed effects" and Table 3 also shows "UE*fixed effects", but does that actually mean taking the product of each year-quarter (or industry) dummy and UE? I am really confused, as that seems inevitably to result in collinearity problems.

2. Supposing my specification is correct, how should I run this regression in Stata (or other software)? In the paper, none of the variables were omitted due to collinearity, so it is definitely achievable. Is there another command I should use to run this regression?

Again, thank you so much for your help, I truly appreciate any suggestions or reference to any post or paper that addresses a similar problem.
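
A hedged sketch, not the paper's code: reghdfe can absorb the UE-by-FE interactions directly as heterogeneous slopes, in which case the stand-alone UE term is unidentified and dropped, exactly as observed above (post and treat are assumed 0/1 indicators; controls and ue_controls are the placeholders from the post):

Code:
reghdfe car c.ue#i.post#i.treat c.ue#i.post c.ue#i.treat i.post##i.treat ///
    controls ue_controls, ///
    absorb(industry yearquarter industry#c.ue yearquarter#c.ue) ///
    vce(cluster announcement_date)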



Correct Modeling of Data

Stata Experts:

I have been struggling for several days with my data analysis plan and I am hoping for some clarity.
1. I am working with ECLSK:2011, a nationally representative dataset which requires the use of the .svyset command (the data are nested: children in schools, and schools in districts, the districts being the primary sampling units). I have done that and all is fine with respect to the descriptive analysis.
2. I am looking at the effect of maternal education (a 5-level categorical predictor based on educational attainment: below high school, high school, some college, etc.) on children's math achievement (represented by a z-scored continuous variable) in the fall of kindergarten.
3. I would like to analyze the data using multilevel modeling with fixed effect to control for school effects.
4. This is a four level model:
Step 1: Regress math achievement on maternal characteristics:
... maternal characteristics include maternal age, poverty level, work status, occupational prestige
Step 2: Add maternal activities at home, i.e. frequencies of activities like building, working with numbers, playing with puzzles. The frequencies are: never, once or twice/week, three to six times/week, and every day. I used PCA to summarize these activities as PC_1 and PC_2.
Step 3. Add maternal expectations for children's educational attainment (also a 5-level categorical variable: below high school to professional degree); and finally
Step 4. Add covariates found in research to be associated with math achievement, i.e. like single parent hh, poverty level, number of siblings, child age.

In Stata, the svy prefix cannot be used with .xtreg, fe. Is there a different way to do this analysis?
For example, what about just using the cluster option with OLS regression, i.e. cluster(school_id)?
Is .mixed a possibility? I am not measuring level-2 predictors and I am not an expert in this type of analysis. Would .areg work by absorbing school_id? (A sketch of these two routes follows.)
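
A hedged sketch of the two routes mentioned above; all variable names are placeholders, and the svyset call must follow the ECLSK documentation:

Code:
* route 1: survey-design standard errors, no school fixed effects
svy: regress math_z i.mat_educ c.mat_age i.work_status c.occ_prestige
* route 2: school fixed effects with cluster-robust SEs (no svy prefix)
areg math_z i.mat_educ c.mat_age [pweight = w1], absorb(school_id) vce(cluster psu_id)
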
Any help would be gratefully accepted.
Best,
Pat


General xtabond2 queries

Dear Statalist,

I am examining quarterly data for an unbalanced panel of 11,000+ banks from 1996:Q1 to 2016:Q4 using Stata/IC. I am using a system gmm model through xtabond2. My regression code is as follows:

Code:
xtset id dateq

xtabond2 LCCF l.LCCF T1RAT SIZE RISK ROE MS MNA COMP POP INCG GDPG UNEM FFR q2_1997- q4_2016, ///
gmm(l.LCCF T1RAT SIZE RISK ROE MS MNA, lag(3 4) collapse) iv(COMP POP INCG GDPG UNEM FFR q2_1997- q4_2016) ///
twostep robust small nodiffsargan
For completeness, variables q2_1997- q4_2016 are dummies for each quarter to introduce time fixed effects. I have included all bank-level variables as endogenous in the -gmm- option, and all other variables as instruments in the -iv- option.

I understand that the -collapse- suboption of -gmm()- is meant for datasets where the instrument count would otherwise approach or exceed the number of groups. That seems unlikely to bind here, as I have 11,000+ banks, so I do not think it is necessary or justified to include it. However, if I exclude the -collapse- suboption, using this code:

Code:
xtset id dateq

xtabond2 LCCF l.LCCF T1RAT SIZE RISK ROE MS MNA COMP POP INCG GDPG UNEM FFR q2_1997- q4_2016, ///
gmm(l.LCCF T1RAT SIZE RISK ROE MS MNA, lag(3 4)) iv(COMP POP INCG GDPG UNEM FFR q2_1997- q4_2016) ///
twostep robust small nodiffsargan
I get the following error:

Code:
                     J():  3900  unable to allocate real <tmp>[927864,588]
              _Explode():     -  function returned error
           _ParseInsts():     -  function returned error
         xtabond2_mata():     -  function returned error
                 <istmt>:     -  function returned error

Why does this error occur? xtabond2 seems very limited in the detail of its error reporting.
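
For what it is worth, a rough reading of the error (an interpretation, not documented xtabond2 behaviour): Mata failed to allocate a 927,864 x 588 real matrix, which at 8 bytes per element is roughly 4 GiB, so the uncollapsed instrument set appears simply too large for available memory, independent of whether collapsing is econometrically justified.

Code:
display 927864 * 588 * 8 / 2^30    // ≈ 4.06 GiB needed for that single matrix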

PPML for one country

Hi,

If I estimate a gravity model using PPML from a balanced data panel for several countries, but I would like to focus on only one specific country, AAA, is the following correct?


Code:
egen exp_time = group(exporter year)
quietly tabulate exp_time, gen(EXPORTER_TIME_FE)

egen imp_time = group(importer year)
quietly tabulate imp_time, gen(IMPORTER_TIME_FE)

ppml TRADE EXPORTER_TIME_FE* IMPORTER_TIME_FE* ln_DIST CNTG LANG CLNY RTA ///
    if exporter == "AAA" | importer == "AAA", cluster(DIST)

Or should I drop exporters and importers that are different from country AAA?
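
Restricting with -if- limits the estimation sample, though many of the pre-generated dummies will then be collinear or empty on the subsample. A hedged alternative, assuming the community-contributed ppmlhdfe (ssc install ppmlhdfe) is acceptable, absorbs the fixed effects directly:

Code:
ppmlhdfe TRADE ln_DIST CNTG LANG CLNY RTA ///
    if exporter == "AAA" | importer == "AAA", ///
    absorb(exp_time imp_time) cluster(DIST)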

Thanks!

Not-concave issue with multinomial logit regression

Dear Statalist,

I hope you are well. I would like to ask about the problem of 'not concave' iterations. I have a dataset of 300 firms, one of which appears to be an outlier. When I excluded the outlier (leaving 299 firms) from the multinomial logit analysis (mlogit), the maximization took a long time and showed non-concavity in later iterations. How can I solve this problem? Importantly, when the outlier is not excluded the regression runs perfectly, but I get a very small value for the marginal effects (for instance, 3.94E) for one of the outcome categories.



. mlogit App_status i.I_sec i.AF_LEG i.AF_AGE i.AF_SIZE i.I_loct2 i.I_expt2 i.AF_GRWT i.BO_GEN i.BO_CIT i.BO_AGE i.ow_Exper2 i.BO_FINT i.BO_EDU i.CR_LEN i.CR
> _BS1 i.CR_BS2 i.CR_BS3 i.CR_BS4 i.CR_BS5 i.CR_BS6 i.CR_BS7 i.CR_BS8 i.CR_SAT i.DE_ADS1 i.DE_ADS2 i.DE_ADS3 i.DE_ADS4 i.DE_ADS5 i.DE_ADS6 i.DE_ADS72 i.EI_BP i.EI_AUDFR

Iteration 0: log likelihood = -376.13767
Iteration 1: log likelihood = -240.52371
Iteration 2: log likelihood = -204.10375
Iteration 3: log likelihood = -190.49561
Iteration 4: log likelihood = -182.16412
Iteration 5: log likelihood = -174.29973
Iteration 6: log likelihood = -169.14731
Iteration 7: log likelihood = -167.60249
Iteration 8: log likelihood = -167.36518
Iteration 9: log likelihood = -167.30934
Iteration 10: log likelihood = -167.29729
Iteration 11: log likelihood = -167.29478
Iteration 12: log likelihood = -167.29422
Iteration 13: log likelihood = -167.29408
Iteration 14: log likelihood = -167.29405
Iteration 15: log likelihood = -167.29405 (not concave)
  (iterations 16 through 58 repeat the same log likelihood, each flagged "not concave")

--Break--
r(1);




Could you please advise on how to solve the problem of the non-concavity with mlogit analysis?
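
A hedged sketch of things to try, not a guaranteed fix: switching maximization techniques and allowing more iterations sometimes crosses a flat region; if the log likelihood is genuinely flat, check for sparse cells or separation among the many indicator variables. The macro below abbreviates the covariate list for readability (an assumption):

Code:
local rhs i.I_sec i.AF_LEG i.AF_AGE i.AF_SIZE    // ... remaining covariates as in the command above
mlogit App_status `rhs', difficult technique(bfgs 10 nr 10) iterate(300)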


Appreciate your kind help and cooperation

Best regards,
Rabab

covariance structure in mixed model

Dear all,

I've been having problems selecting the correct covariance structure for my mixed model (employees nested within firms).
When I add a (categorical) random slope for firm size to the model, it does not converge with the default (independent) covariance structure and only converges with the cov(exchangeable) option (for the R. notation, only the independent and exchangeable covariance structures are available).
Code:
mixed jobsat level1fixedeffects level2fixedeffects || firmid: R.size
Did I understand correctly that the exchangeable covariance structure is used only for repeated-measures/panel/longitudinal data? I am using pooled cross-sectional data, looking at firms for 2 consecutive years and treating the year variable as a fixed effect, so the level-one observations (employees) are not the same in both years. I believe the covariance structure should therefore be independent rather than exchangeable, and that the non-convergence simply tells me the model is not adequate and I should not use it.

Please tell me if you need further information in order to answer my question regarding the covariance structure.
Your input will be greatly appreciated.

Felicia




How to save graphs from loop in particular folder

How to save graphs from loop in a particular folder.


I have my out folder:

global MY_OUT "C:\Users\out"

and the loop as below:




foreach y of varlist reer cpi gdpgr {
local graphtitle : var lab `y'
twoway (line c tp,title("`graphtitle'") legend(off) lwidth(0.25 ) ///
lcolor(black) lpattern(solid))(line trq tp,lwidth(0.4) lcolor(black) ///
lpattern(solid))(line l tp,lwidth(0.25 ) ///
lcolor(black) lpattern(longdash)) (line u tp ,lwidth(0.25 ) lcolor(black) ///
lpattern(longdash) ///
xlabel(-3 "t-3" -2 "t-2" -1 "t-1" 0 "t" 1 "t+1" 2 "t+2" 3 "t+3") ///
yline(0,lwidth(0.25 ) lcolor(black) lpattern(shortdash)) saving(`y' , replace ))
drop tp c trq l u
local graphnames `graphnames' `y'
}

I was using something like saving($MY_OUT\`y' , replace ), but it still saves the graphs in my do-file folder and not in my out folder.
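
A likely fix, based on documented macro behaviour: a backslash placed immediately before a macro character suppresses the macro's expansion, so $MY_OUT\`y' never substitutes `y' as intended. Windows accepts forward slashes in paths:

Code:
saving("$MY_OUT/`y'", replace)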

How to extract a country specific value for each country in a panel data set

I have a panel data set which consists of 49 countries, and I want to attach the US federal funds rate to every country. The variable interestrate holds each country's interest rate; I want to copy the US value of interestrate to every country in the data set.
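
A minimal sketch, assuming a string country identifier and a year variable (names are assumptions): copy the US value of interestrate to every country within each year.

Code:
egen fedfunds = max(cond(country == "US", interestrate, .)), by(year)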

Unobserved Component Model and State Space Model

Dear All

I am trying to construct an index using an Unobserved Component Model (UCM) and a State Space Model (SSM) in Stata, but I am really struggling with the right syntax/commands.
I have already used Principal Component Analysis (PCA), but a lot of observations were dropped in the process.
On UCM and SSM, I have read a number of articles/journals, but I do not have a good grasp of the commands. I get different kinds of errors when I follow the ‘Statistics > Time Series > UCM’ or ‘Statistics > Time Series > SSM’ routes in Stata.

I understand UCM and SSM are primarily designed for forecasting purposes but they have also been used to construct indices.


The whole idea of what I am doing is to use 3 different methods/models to construct the same index and see how they perform when used for further analysis:

-I am keeping the index constructed with PCA.

-I want to use UCM to construct another index without using Kalman Filter to address missing observation:
After dropping a variable with too many missing observations, I have 5 variables left to construct the index.
3 of the 5 variables are fairly complete (they have negligible missing observations) while 2 of them have substantial missing observations.
Following Kaufmann et al. (2010) in the construction of the Worldwide Governance Indicators, I want the UCM to construct the index from any combination of 3 (or more) of the 5 variables; each combination must include at least 2 of the 3 fairly complete variables and at least 1 of the 2 variables with substantial missing observations.

-For SSM; I intend to use Kalman Filter to address missing observations when constructing the index with the model.

I need assistance on the UCM and SSM commands I need to construct the index with Stata.
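
As a starting point, a hedged sketch (not the Kaufmann et al. procedure itself): ucm is univariate, so for a common index from 5 indicators a small state-space formulation such as a one-factor dynamic-factor model may be closer to what is needed; v1-v5 are placeholder names for the standardized indicators.

Code:
ucm v1, model(rwalk)                       // ucm handles one dependent variable at a time
dfactor (v1 v2 v3 v4 v5 = , noconstant) (f = , ar(1))
predict index, factors smethod(smooth)     // smoothed estimate of the latent index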


I will provide further information if needed.


Your assistance will be very much appreciated.

Thank you.



Variance of each single residual

Hi Statalist,

I am struggling to construct residual-specific variances for the following model:

Code:
xtreg tasso_crescita_sales_prod L.log_sales L.dummy_2 L2.dummy_2 L3.dummy_2 mean_gr_rate_atc2 recalls_sales ageprodcat1 ageprodcat2 ageprodcat3 ageprodcat4 newmolfirm newmolmarket i.Year, fe vce(cluster idpr)
What I would like to do is generate a variable holding the variance of the regression residuals such that each residual has its own variance (not gen var = e(sigma_e), which produces a single value), as in
\overline{AV_{t}} = \sum_{i=1}^{N_{t}} \widehat{AV}_{it}
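
A hedged sketch of one reading of this: a panel-specific residual variance, computed from the residuals of the model above (whether this matches the AV formula is for you to judge).

Code:
predict double ehat if e(sample), e          // idiosyncratic residuals from xtreg, fe
bysort idpr: egen double sd_ehat = sd(ehat)
gen double var_ehat = sd_ehat^2              // panel-specific residual variance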


Stata: RR Mantel-Haenszel

I have SMR results from launching this command: strate maxexpo_ardbis, per(100000) smr(taux_esto)
and I want the Mantel-Haenszel RR, which measures the tendency of the SMRs to increase or decrease across the classes of the variable maxexpo_ardbis.
Do you know which command I have to launch for that?
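
A hedged pointer: stmh computes Mantel-Haenszel rate ratios, including a test for trend across the levels of a categorical exposure (see help stmh), so something like the following may be what is needed:

Code:
stmh maxexpo_ardbis
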
Thank you so much for your reply,
Kind regards,

only the estimated coefficient

Hi,

What could be the reason that Stata shows me only the estimated coefficients, with missing standard errors?


------------------------------------------------------------------------------
             |               Robust
    ln_TRADE |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     ln_DIST |  -.7334557          .        .       .            .           .
        CNTG |   1.421494          .        .       .            .           .
        LANG |  -3.674322          .        .       .            .           .
        CLNY |   6.189377          .        .       .            .           .
       _cons |   17.52462          .        .       .            .           .
------------------------------------------------------------------------------


Thanks!


Ramsey test

Hi,

What is the command in Stata 12 for the Ramsey RESET test with panel data?
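
There is no built-in ovtest for xtreg, but a hedged manual RESET sketch (y and x1 x2 are placeholders) looks like this:

Code:
xtreg y x1 x2, fe
predict double yhat, xb
gen double yhat2 = yhat^2
gen double yhat3 = yhat^3
xtreg y x1 x2 yhat2 yhat3, fe
test yhat2 yhat3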

Thanks!


Number of rows and columns of subplots using xtline

Hi,

how can I define the number of rows/columns of subplots when using "xtline"? The options "rows()" and "cols()" do not work:

input: "xtline var1, rows(2)"
output: "option rows() not allowed"

Same with "cols(#)". Any ideas?
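
A hedged guess at the cause: xtline builds the subplots as a by() graph, so the layout options belong inside byopts():

Code:
xtline var1, byopts(rows(2))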

Thanks

help with local macros

Dear all,
I would like to create several box plots at once and combine them. I found the code below and it works very well. Can someone help me understand the different sections? I understand the foreach command, but I am confused about the rest, mainly:
local names `names' graph`j'
local ++j

Thanks in advance


Code:
local j = 1
local names
foreach var of varlist A-X {
     graph box  `var', name(graph`j')
     local names `names' graph`j'
     local ++j
}

graph combine `names'
Stata 15.1 Mac
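
For reference, here is the same loop with comments added (and assuming the increment was meant to be local ++j; with a bare "local j" the counter would be cleared and the graph names would collide):

Code:
local j = 1                               // counter used to build graph names
local names                               // start with an empty list of names
foreach var of varlist A-X {
     graph box `var', name(graph`j')      // one box plot per variable: graph1, graph2, ...
     local names `names' graph`j'         // append the new name to the running list
     local ++j                            // increment the counter
}
graph combine `names'                     // combine all the named graphs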

Problem with Reshape on Panel Data

Hi all,
My aim is to make the variable id unique, and I think reshape (rather than collapse) is the right command.
When I tried reshape from long to wide, errors appeared in the output.
Some errors are:

Code: reshape wide hi06 hi07, i(id) j(hi01) string
Output: r(109); variable hi01 is numeric

I then converted hi01 to string:
Command: tostring hi01, replace
hi01 was byte, now str1

and ran: . reshape wide hi06 hi07, i(id) j(hi01) string
output :
(note: j = 1 3 8)
values of variable hi01 not unique within id
Your data are currently long. You are performing a reshape wide. You
specified i(id) and j(hi01). There are observations within i(id) with the
same value of j(hi01). In the long data, variables i() and j() together
must uniquely identify the observations.

        long                                   wide
+-----------------+                   +--------------------+
| i   j   a   b   |                   | i   a1  a2  b1  b2 |
|-----------------| <--- reshape ---> |--------------------|
| 1   1   1   2   |                   | 1   1   3   2   4  |
| 1   2   3   4   |                   | 2   5   7   6   8  |
| 2   1   5   6   |                   +--------------------+
| 2   2   7   8   |
+-----------------+
Type reshape error for a list of the problem variables.
r(9);

I also tried several reshape combinations, including dropping some variables that are not really needed for my analysis (especially variables with missing values), leaving id, hi01, and hi06, but the result was much the same. Probably I do not really understand the reshape structure.
Thanks in advance.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str1(hi01 hitype) byte(hi02 hi06 hi07 hi08x) str9 id
"3" "." . 3 . . "001220001"
"3" "." . 3 . . "001220002"
"3" "." . 3 . . "001220003"
"3" "." . 3 . . "001220004"
"3" "." . 3 . . "001220005"
"3" "." . 3 . . "001220011"
"3" "." . 3 . . "001250001"
"3" "." . 3 . . "001250002"
"3" "." . 3 . . "001250003"
"3" "." . 3 . . "001290001"
"3" "." . 3 . . "001290002"
"3" "." . 3 . . "001290003"
"3" "." . 3 . . "002010001"
"3" "." . 3 . . "002010002"
"3" "." . 3 . . "002010003"
"3" "." . 3 . . "002010004"
"3" "." . 3 . . "002020003"
"3" "." . 3 . . "002020007"
"3" "." . 3 . . "002030001"
"3" "." . 3 . . "002030002"
"3" "." . 3 . . "002040001"
"3" "." . 3 . . "002040002"
"3" "." . 3 . . "002090001"
"3" "." . 3 . . "002090002"
"1" "A" 3 3 . . "002090003"
"1" "D" 3 3 . . "002090003"
"1" "C" 3 3 . . "002090003"
"1" "B" 1 3 . . "002090003"
"1" "V" 3 3 . . "002090003"
"1" "C" 3 3 . . "002090006"
"1" "D" 3 3 . . "002090006"
"1" "A" 3 3 . . "002090006"
"1" "B" 1 3 . . "002090006"
"1" "V" 3 3 . . "002090006"
"3" "." . 3 . . "002090007"
"1" "D" 3 3 . . "002093102"
"1" "V" 3 3 . . "002093102"
"1" "A" 3 3 . . "002093102"
"1" "C" 3 3 . . "002093102"
"1" "B" 1 3 . . "002093102"
"3" "." . 3 . . "002093202"
"1" "D" 3 3 . . "002093301"
"1" "V" 3 3 . . "002093301"
"1" "A" 3 3 . . "002093301"
"1" "C" 3 3 . . "002093301"
"1" "B" 1 3 . . "002093301"
"3" "." . 3 . . "002100001"
"3" "." . 3 . . "002100002"
"3" "." . 3 . . "002110001"
"3" "." . 3 . . "002110002"
"3" "." . 3 . . "002120001"
"3" "." . 3 . . "002140001"
"3" "." . 3 . . "002140002"
"3" "." . 3 . . "002160001"
"3" "." . 3 . . "002180003"
"3" "." . 3 . . "002180004"
"3" "." . 3 . . "002180005"
"3" "." . 3 . . "002183101"
"3" "." . 3 . . "002190001"
"1" "B" 3 3 . . "002200001"
"1" "A" 1 3 . . "002200001"
"1" "V" 3 3 . . "002200001"
"1" "C" 3 3 . . "002200001"
"1" "D" 3 3 . . "002200001"
"3" "." . 3 . . "002210001"
"3" "." . 3 . . "002250001"
"3" "." . 3 . . "002250002"
"3" "." . 3 . . "002250007"
"3" "." . 3 . . "002260001"
"3" "." . 3 . . "002260002"
"3" "." . 3 . . "002260005"
"3" "." . 3 . . "002260006"
"3" "." . 3 . . "002270001"
"3" "." . 3 . . "002270002"
"3" "." . 3 . . "002270003"
"3" "." . 3 . . "002270004"
"3" "." . 3 . . "002270006"
"3" "." . 3 . . "002290001"
"3" "." . 3 . . "002290002"
"3" "." . 3 . . "002290006"
"3" "." . 3 . . "002300001"
"3" "." . 3 . . "002300006"
"3" "." . 3 . . "002300011"
"3" "." . 3 . . "003010001"
"3" "." . 3 . . "003010002"
"3" "." . 3 . . "003010005"
"3" "." . 3 . . "003020002"
"3" "." . 3 . . "003020003"
"3" "." . 3 . . "003030001"
"3" "." . 3 . . "003030004"
"3" "." . 3 . . "003030005"
"3" "." . 3 . . "003031201"
"3" "." . 3 . . "003040002"
"3" "." . 3 . . "003050001"
"3" "." . 3 . . "003050002"
"3" "." . 3 . . "003050004"
"3" "." . 3 . . "003050006"
"3" "." . 3 . . "003070002"
"3" "." . 3 . . "003070005"
"3" "." . 3 . . "003080001"

Data containing multiple follow-ups with hourly variables

Dear Statalist Users,

I hope all you experts are doing great. I want to understand how to analyze data containing multiple follow-ups: for example, the data contain 47 follow-ups on different days, and each follow-up records a baby's breastfeeding practices over 24 hours. The data were recorded hourly (how many times the baby was fed in each hour).
Can anyone reply, please?

Tuesday, October 29, 2019

non-linear optimisation with linear restrictions

Dear all: I need to find the vector B that minimises the quadratic form (B-A)'*IM*(B-A) subject to a linear restriction (also involving the vector B) being fulfilled. It is not necessary to write the restriction in this post. A is the vector of coefficients from my unrestricted model and IM is the inverse of the var/cov matrix from my unrestricted model.
Is it possible to do this in Mata? I use Stata 15. I haven't found any tutorial or examples of this. I would appreciate your advice ... Many thanks, Juan
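
It is possible; in fact, with a linear restriction R*B = q this minimum-distance problem has a closed-form solution, B = A + V*R'*(R*V*R')^(-1)*(q - R*A) with V = IM^(-1). A hedged Mata sketch, in which R and q are illustrative assumptions:

Code:
mata:
    A = st_matrix("e(b)")'                  // unrestricted coefficients, as a column
    V = st_matrix("e(V)")                   // their var/cov matrix, so IM = invsym(V)
    R = (1, -1, 0)                          // example restriction b1 - b2 = 0 (an assumption)
    q = 0
    B = A + V*R'*invsym(R*V*R')*(q - R*A)   // restricted minimiser of (B-A)' IM (B-A)
    B'
end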

Fixed Effect troubles

Hello, I'm a first-time poster and newbie to the world of Stata. I am trying to estimate a fixed effects model with city as the entity variable and year as the time variable in a panel data set I have. I am typing "xtreg city year, fe" and getting an error message "must specify panelvar; use xtset."

I then entered the command "xtset city year", then "xtreg city year, fe" and got a different error message. This time it says "the panel variable city may not be included as an independent variable."

I have no idea what this means or how to fix it. I would appreciate any tips. Thank you very much for your time.
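
A hedged sketch of the likely intent: city and year define the panel and belong in xtset, while the regression itself lists an outcome and regressors (y and x are placeholders for your actual variables):

Code:
xtset city year
xtreg y x, fe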

Scatter of a local

Hi, is it not possible to create a scatter plot of a local? I am evaluating the difference in the probabilities of Y between men and women across a range of another variable. Through some loops I get a list of these differences, but when I try to plot them, I can't. Is it not possible to make a scatter of a local?

This code is working (with it I get the mentioned list):

quietly {

logit pmvotegrp lrself male retnat income age edulevel

local maleA=0
local incomeA=3
local lrselfA=5
local edulevelA=3
local retnatA=3

local maleB=1
local incomeB=3
local lrselfB=5
local edulevelB=3
local retnatB=3

capture drop predA
capture drop predB

foreach nn of numlist 1/100 {

local ageA=`nn'
local ageB=`nn'


# delimit ;
local sysA=_b[_cons]+
_b[lrself]*`lrselfA'+
_b[male]*`maleA'+
_b[retnat]*`retnatA'+
_b[income]*`incomeA'+
_b[age]*`ageA'+
_b[edulevel]*`edulevelA';

local sysB=_b[_cons]+
_b[lrself]*`lrselfB'+
_b[male]*`maleB'+
_b[retnat]*`retnatB'+
_b[income]*`incomeB'+
_b[age]*`ageB'+
_b[edulevel]*`edulevelB';

local predA=exp(`sysA')/(1+exp(`sysA'));
local predB=exp(`sysB')/(1+exp(`sysB'));

# delimit cr

local diffBA=`predB'-`predA'

noisily: display in r "Case `nn' (Age `ageB'): `diffBA' "

}
}

When I change the noisily part to what follows, I get nothing:

noisily: twoway scatter `diffBA' age, msize(small) mcolor(red)
, name(dd_age, replace) legend(off)
title("Probability of voting PM's party, diff btn men and women", size(vsmall))
xtitle("Age") ytitle("Diff prob men and women")
nodraw
xlabel(0(1)100,grid angle(45) labsize(vsmall))
ylabel(1(0.001)1,grid labsize(vsmall))

Even if just coding scatter `diffBA' age, I get nothing. Any advice?
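
A hedged sketch of one way out: twoway plots variables, not locals, so each difference needs to be stored in a variable with one row per age value (diff_grid and age_grid are made-up names):

Code:
capture drop diff_grid age_grid
gen age_grid = _n in 1/100
gen diff_grid = .
* inside the foreach loop, after computing `diffBA':
*     replace diff_grid = `diffBA' in `nn'
* then, after the loop:
twoway scatter diff_grid age_grid, msize(small) mcolor(red) ///
    xtitle("Age") ytitle("Diff prob men and women")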

Reg without constant term

Hi,

I know that when I estimate a regression with fixed effects, the constant term should not be included.
However, when I run the regression in Stata, it estimates the constant term.

How can I run my regression without a constant term?
Or what is the interpretation of the constant term in the presence of the fixed effects?
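
A hedged sketch of both answers (y, x, and id are placeholders): with explicit fixed-effect dummies the constant can be suppressed directly, while with xtreg, fe the reported _cons is the average of the fixed effects rather than a free constant:

Code:
regress y x i.id, noconstant
xtreg y x, fe        // _cons here is the average fixed effect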

Thanks!





Help with Stata's Graphs

Hi

I'm trying to build a graph in Stata, but the variability of my data is very high.
I need to graph Brazilian inflation from 1980 to 2012, but in the middle of this period we had very high values compared with the series average. My objective is, only in the graph, to cap the higher values at 100 (e.g., where inflation was 300, convert this value to 100 in the graph). Is it possible to do this?
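
A minimal sketch, assuming variables inflation and year: cap the plotted series at 100 without altering the underlying data.

Code:
gen inflation_capped = min(inflation, 100)
twoway line inflation_capped year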

Thanks!

Attrition and OLS estimators

How can I show whether attrition bias affects my OLS estimator?
Can I do it with a balance table comparing the group for which I have information with the group for which I lost information? How would I interpret that balance table? (I have complete data for both groups on other variables.)
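
A hedged sketch of a simple attrition check (variable names are placeholders): regress an attrition indicator on baseline covariates and test them jointly; balance is consistent with, though no proof of, ignorable attrition.

Code:
gen byte attrited = missing(y_followup)
regress attrited x1 x2 x3, robust
test x1 x2 x3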

Analyzing two time points with regression

I am analyzing data with the mental health component score (measured 0-100) for 4500 individuals. This score was measured at baseline and then 30 days following an intervention.
I would like to use regression to see the average change in score over time.

I used the following codes to reshape my data:

rename preop_mcs test1
rename x30d_mcs test2
reshape long test, i(sampleid) j(time)
xtset sampleid
xtreg test

I then conducted a mixed effects regression with the following code:

meglm test time || sampleid:

Then I added covariates:

meglm test time age sex smoking race bmi || sampleid:

I've noticed that no matter how many or which covariates I add, the coefficient for time does not change. Because I know age is significantly associated with the score from my previous analysis, I'd expect to see at least a little change. This makes me think something is wrong with my coding and/or approach.

I appreciate any guidance with the above coding and how to proceed with a regression.
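
One possibility worth checking, offered tentatively: with complete data at both time points, person-level covariates are orthogonal to time and so leave the time coefficient unchanged by construction; for age to alter the change itself, it must be interacted with time, as in this hedged sketch:

Code:
meglm test i.time c.age i.time#c.age i.sex i.smoking i.race c.bmi || sampleid: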


longitudinal population survey data

Hello,

I have longitudinal population survey data. This is actually my first time working with such a huge dataset, and I am having a really hard time setting it up. I describe the dataset below. I would be really grateful if anyone could point me in the right direction.

It is a population survey. The strata are the districts. The primary sampling units are individuals. The time period is from 1980 to 2019. The variables include both continuous and categorical variables; the categorical variables represent demographic characteristics of the sampled primary sampling units. What I am interested in is finding the patterns in one of the continuous and one of the categorical variables, given the others, through time.

I have tried to set up the data with svyset. Since the survey was with replacement, there is no FPC. However, this was not fruitful; most commands were still reporting repeated time values or too many values. Furthermore, I am not sure whether I should subset the dataset, because I cannot figure out what the variables would represent then.

I am sure that I am lacking some basics for data like this. If anyone can point to the right resource, that would be very helpful too.

Thank you.

mi predict problem

Dear all,

I am having problems trying to get the baseline survivor function after mi predict.
I am using the following code:
mi predict basesurv using miest, basesurv

and get the following error:
option basesurv not allowed r(198)

Is there a way to calculate predicted probabilities after the mi predict command? I can only calculate XB after mi predict, but I would like to calculate 10-year risks for each person in my dataset, so I first need the baseline survivor values. Any suggestions are very welcome.
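
A very rough workaround sketch, assuming a Cox model (x1 x2 are placeholders): mi predict supplies xb, and a baseline survivor curve can be taken from one completed dataset via mi extract; whether a single imputation's S0 is adequate, and how best to combine across imputations, is a substantive judgment.

Code:
mi predict double xbhat using miest
preserve
mi extract 1, clear
stcox x1 x2
predict double S0, basesurv
* note S0 at analysis time 10, then, back in the full data:
restore
* 10-year risk ≈ 1 - S0(10)^exp(xbhat)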

Thanks!

groups command, export/copy table to excel

The groups command is very useful; however, the output does not seem very transferable. Is there a way to export the results to Excel, either with the same mechanism as "Copy Table" for other output, or with putexcel? "Copy Table" does not work for -groups- output.
Thanks

How to recreate longitudinal analyses from SAS in Stata using xtgee, with only the SAS output?

Hello all,

I have been trying to recreate SAS output in Stata but keep getting discrepancies. Unfortunately, the biostatistician only gave me the output and not the SAS code so I am trying to recreate the analyses in Stata by using the output.

First, I tried to check skewness and kurtosis and there was quite a discrepancy but I soon learned that is because SAS and Stata use different formulas.

Now, I am trying to recreate the GEE analyses, but I am not getting the same results.

This is what I am using to decipher the SAS analyses:
Distribution: Normal; Link Function: Identity; Observations Used: 1375
GEE Model Information: Correlation Structure AR(1)
Subject Effect: SBA (55 levels); Number of Clusters: 55
Correlation Matrix / Cluster Size: 25 (years)
tm = years; tm5 = (tm-5)*(tm>=5); tm10 = (tm-10)*(tm>=10); tm19 = (tm-19)*(tm>=19)

Dependent Variable: sallfd = sqrt(All_food); Algorithm converged.
GEE Fit Criteria

I have tried the following:
xtset SBA year
xtgee sAllfd tm tm5 tm10 tm19 i.Gcode Gcode#c.tm Gcode#c.tm5 Gcode#c.tm10 Gcode#c.tm19, family(gaussian) link(identity) corr(ar1)


Would this be incorrect, since xtgee fits population-averaged panel-data models using GEE?
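
A hedged sketch of reproducing the spline terms shown in the SAS output (assuming year is coded 1 to 25 within cluster, per "Cluster Size 25"; recode first if it is a calendar year):

Code:
gen tm   = year
gen tm5  = (tm - 5)  * (tm >= 5)
gen tm10 = (tm - 10) * (tm >= 10)
gen tm19 = (tm - 19) * (tm >= 19)
gen sAllfd = sqrt(All_food)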

Any help is most welcomed.