Friday, September 30, 2022

Coefplot graph error despite running the regression

I'm trying to produce a coefplot graph using the following data and code, but I keep running into the error below. I simply don't understand why the lths variable is not allowed.

Code:
option lths not allowed
r(198);
The code is as follows:

Code:
eststo clear
reghdfe ln_incwage black  ///
hks mks lks ///
age ismarried /// 
somecollege hsdegree  /// 
lths  [aw=asecwt] , absorb(i.county) vce(cluster county) 
eststo e1 

coefplot (e1 , mcolor(red%50) msymbol(O) msize(large) ciopts(recast(rcap)) color(red%50)) keep(black)) ///
(e1 , mcolor(green%50) msymbol(S) msize(large) keep(hks mks lks) ciopts(recast(rcap)) color(green%50)))  ///
(e1 , mcolor(purple%50) msymbol(T) msize(large) keep(age ismarried)  ciopts(recast(rcap)) color(purple%50))) ///
(auto_dube , mcolor(blue%50) msymbol(D) msize(large) keep(lths)  ciopts(recast(rcap)) color(blue%50))) , ///
scheme(plotplain) xtitle("Coefplot",size(large)) /// 
xline(0, lcolor(gs7) lpat(solid)) legend(off) grid(none) ///
yline(-1 -2 -3, lcolor(gs7%40) lpat(dot)) ///
groups(black ="{bf:Local}") ///
hks mks lks ="{bf:Spectrum}") ///
age ismarried  = "{bf:Industry}") /// 
lths = "{bf:Rec}") ///
graph export "coefplot.png"
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float ln_incwage byte black float(hks mks lks) byte(age ismarried somecollege hsdegree lths) double(asecwt county)
10.414393 0 1 0 0 54 0 0 0 0 1434.86 36055
 10.46846 0 0 1 0 38 1 0 1 0  606.54     0
 9.538924 0 0 1 0 37 0 0 0 1 1000.35     0
11.436045 0 1 0 0 55 1 0 0 0  463.44     0
        . 0 0 0 0 46 1 0 0 1 1215.63  6037
10.008928 0 0 1 0 30 1 1 0 0 2010.14     0
10.037678 0 0 1 0 44 1 0 1 0  459.67     0
 12.08503 0 0 0 1 37 1 0 0 0 1296.95 34039
 5.914584 0 0 0 1 18 0 0 0 1 1544.17 42091
10.232072 0 0 0 1 18 1 1 0 0 1599.78  4013
 9.775313 1 0 0 1 26 1 0 1 0 1329.36 42011
10.033458 0 0 0 1 35 1 1 0 0 1222.95     0
        . 0 0 0 0 63 0 0 1 0   910.7     0
 9.469932 1 0 1 0 42 0 0 1 0 1269.54     0
 9.826607 0 0 0 1 36 1 0 0 1  1672.3     0
        . 0 0 0 0 18 0 0 0 1 1224.82 34023
        . 0 0 0 0 60 0 0 0 1  547.61     0
8.9103155 0 0 0 1 62 1 0 1 0 1492.59     0
10.519753 0 0 0 1 47 0 0 0 1 1284.65  6037
 11.30821 0 1 0 0 39 1 1 0 0 1230.16 34029
        . 0 0 0 0 49 1 0 1 0 1377.73 12011
 8.555781 0 0 1 0 46 1 0 0 0 1059.27     0
10.519753 0 0 0 1 31 0 1 0 0 1243.65 12025
10.163078 0 0 0 1 47 1 0 1 0  125.48     0
        . 0 0 0 0 36 1 0 1 0   445.1     0
11.455847 0 1 0 0 36 1 0 0 0  606.54  8059
10.491868 0 1 0 0 38 0 0 0 0 1482.28 36047
        . 0 0 0 0 30 0 1 0 0   557.1     0
 9.826607 0 0 1 0 42 1 0 1 0   928.9     0
        . 0 0 0 0 18 0 1 0 0 1261.18 12011
10.659515 0 0 1 0 38 1 0 1 0  744.99     0
9.3958235 0 0 1 0 27 1 1 0 0  369.12     0
 10.04975 0 0 0 1 38 1 0 1 0 1397.47     0
10.008928 0 0 1 0 40 1 0 0 1 1841.38     0
11.772516 0 0 0 1 39 1 1 0 0 1207.02 12025
        . 0 0 0 0 27 1 1 0 0  994.94     0
10.568543 0 0 1 0 35 1 1 0 0  801.43  8031
10.592074 0 1 0 0 28 1 0 0 0 1149.52     0
        . 0 0 0 0 25 0 1 0 0 2619.46  6085
10.568543 0 0 1 0 44 1 0 1 0  1546.2 12011
        . 0 0 0 0 48 1 0 0 0 1737.04 34003
 9.826607 0 0 1 0 26 1 0 0 0 1421.61     0
        . 0 0 0 1 53 1 1 0 0  446.02     0
 8.776784 0 1 0 0 18 0 0 1 0 1067.93     0
        . 0 1 0 0 44 1 0 0 0 1204.65  6037
        . 0 0 0 0 50 1 0 1 0  724.72     0
10.386222 0 0 1 0 23 0 0 1 0  352.97     0
        . 0 0 0 0 18 0 0 0 1 1059.78     0
 9.603463 0 0 0 1 18 0 0 0 1 1421.59     0
11.035567 0 0 1 0 31 0 0 0 1 2132.08  6079
 9.415627 0 0 0 1 40 1 0 1 0 1191.55 27137
 9.469932 0 0 0 1 50 1 0 1 0  960.59     0
 6.830874 1 0 1 0 22 0 0 1 0  1716.9     0
end
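
For reference, a hedged sketch of how the coefplot call might be written with balanced parentheses, the CI color nested inside ciopts(), and the groups() option laid out as in coefplot's help file. It assumes all four plots draw on the stored results e1 (the original also references an estimate set named auto_dube, which is left out here); this is a sketch, not a verified fix for the r(198) error:

Code:
* minimal sketch: parentheses balanced, groups() written as one option
coefplot (e1, keep(black)         mcolor(red%50)    msymbol(O) msize(large) ciopts(recast(rcap) color(red%50)))    ///
         (e1, keep(hks mks lks)   mcolor(green%50)  msymbol(S) msize(large) ciopts(recast(rcap) color(green%50)))  ///
         (e1, keep(age ismarried) mcolor(purple%50) msymbol(T) msize(large) ciopts(recast(rcap) color(purple%50))) ///
         (e1, keep(lths)          mcolor(blue%50)   msymbol(D) msize(large) ciopts(recast(rcap) color(blue%50))),  ///
    scheme(plotplain) xtitle("Coefplot", size(large))               ///
    xline(0, lcolor(gs7) lpat(solid)) legend(off) grid(none)        ///
    groups(black = "{bf:Local}"                                     ///
           hks mks lks = "{bf:Spectrum}"                            ///
           age ismarried = "{bf:Industry}"                          ///
           lths = "{bf:Rec}")
graph export "coefplot.png", replace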

anycount

Dear all

I would like to count the number of values across 5 numeric variables. I tried using "anycount", but the range of values is too wide to list them all in the option, and the syntax won't let me specify a range:

The two syntaxes I tried are below:

egen sum_xy = anycount (x y z v w), value (1,2,3,4......1000)

egen sum_xy = anycount (x y z v w), value (1-1000)
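
A hedged sketch: egen's function is anycount() with a values() option, and values() accepts a numlist, so a range can be written with a slash (worth double-checking against help egen):

Code:
* minimal sketch, assuming the values of interest are the integers 1-1000
egen sum_xy = anycount(x y z v w), values(1/1000)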

Multiple commands in one line stored in a local

I have a problem executing a local with multiple Stata commands separated by ";".

If I use this code
Code:
sysuse auto, clear

#delimit ;
regress price weight; su price;
#delimit cr
the two commands are executed correctly.

However, if I try to create a local containing the commands,
Code:
local multiple_commands = "regress price weight; su price"

#delimit ;
`multiple_commands';
#delimit cr
Stata returns
; invalid name
r(198);
.

Can someone tell me how to fix this? I tried to use different characters before the delimiter ";" such as `;' or \;, but it did not work.
"
Thank you in advance!
Martin
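
A hedged sketch of one possible workaround (an assumption, not a verified fix): store each command in its own local and loop over them, rather than asking #delimit to split a single macro:

Code:
sysuse auto, clear

* hypothetical locals cmd1 and cmd2, each holding one command
local cmd1 "regress price weight"
local cmd2 "summarize price"

forvalues k = 1/2 {
    `cmd`k''
}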

Problems with regressions (regress.ado r(199))

Hi,

I have run into some problems running regressions in Stata (version 16). I have been using Stata on my MacBook Air (mid 2017) for about a month without any problems. Now, all of a sudden, I get the error message "command Regress not defined by Regress.ado r(199);" when I try to run a regression. I have tried reinstalling the software, but it did not help. Does anyone have any suggestions on how I can fix this?

"

Thursday, September 29, 2022

formula into regression

Hi

Is there any way to run a regression of this form

regress ln(y) indep vars

without generating a new variable such as lny = ln(y)?

I'm sorry if this is a basic question.
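
A hedged sketch of one common pattern: regress does not accept expressions such as ln(y) in a varlist, but a tempvar is dropped automatically when the do-file ends, so no permanent variable is added to the dataset (x1 and x2 are placeholders for the actual regressors):

Code:
tempvar lny
generate double `lny' = ln(y)
regress `lny' x1 x2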

Generate New Variable = Average Price Per Type Per Year

I am using Stata version 15.1 and I have 47 variables and 1,010 observations in my dataset. I am doing educational research with 112 colleges included (this is set up as a panel data set). However, I have expanded it so that I can capture the average prices for all colleges in set categories. I now have 69,999 observations in my data set.

My goal:
I am trying to create a variable that will represent the average price of colleges within a year range in my panel data set. I need to generate a yearly average for the following categories of colleges: 1—Public, four-year or above; 2—Private not-for-profit, four-year or above; 3—Private for-profit, four-year or above; 4—Public, two-year. I have all the individual colleges in the data set and I know their prices. The variable for the price I want to average is "avgprice_per_year."

My first thought was to do the following:

sort unitid year
egen avgprice_pub_4yr_year=mean(avg_tuition_fees_ft), by(year)
but this only generates the average across all colleges for each year. I want it to generate a new variable that shows the average price for each type of college (1—Public, four-year or above; 2—Private not-for-profit, four-year or above; 3—Private for-profit, four-year or above; 4—Public, two-year). So I would then have a new variable that averages the prices of all Public, four-year colleges and attaches that value to every observation in, for example, 2009. I would need to do this four times to generate each of these four variables, which I would later use in my regression to see how alternative college pricing may affect the enrollment rate.
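
A hedged sketch, assuming the four college categories are coded in a single categorical variable (called sector here, a hypothetical name); egen with by(sector year) gives the type-by-year average in one pass:

Code:
* -sector- is a hypothetical variable coded 1-4 for the four college types
egen avgprice_type_year = mean(avg_tuition_fees_ft), by(sector year)

* optional: one variable per type, e.g. for public four-year colleges
gen avgprice_pub_4yr_year = avgprice_type_year if sector == 1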


Monte Carlo Simulation

Hi all,

I want to run a Monte Carlo simulation for an econometrics assignment. From what I can tell, I need to design an experiment with sample sizes n = {10, 100, 1000, 10000}, each with R = 1000 replications. For each sample size, I need to calculate the 25th, 50th, and 75th percentile values. The Stata code I have for this experiment is listed below. My question is: how can I create a loop within this program that changes n, the number of observations?

clear

local mc = 1000
set seed 368
set obs `mc'
gen data_store_x = .
gen data_store_cons = .
quietly {
forvalues i = 1(1) `mc' {
if floor((`i'-1)/100) == (`i' -1)/100 {
noisily display "Working on `i' out of `mc' at $S_TIME"
}
preserve

clear

set obs 1000

gen x = rnormal() *3 + 6

gen e = runiform() - 0.5

gen y = 3 + 4*x + e

reg y x, robust

local xcoef = _b[x]
local const = _b[_cons]

restore

replace data_store_x = `xcoef' in `i'
replace data_store_cons = `const' in `i'
}
}
summ data_store_x data_store_cons
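
A hedged sketch of one way to do this, keeping the same data-generating process as above: wrap the replication loop in an outer loop over the sample sizes and report the percentiles with centile:

Code:
clear all
set seed 368
local mc = 1000

foreach n in 10 100 1000 10000 {
    clear
    quietly {
        set obs `mc'
        gen data_store_x    = .
        gen data_store_cons = .
        forvalues i = 1/`mc' {
            preserve
            clear
            set obs `n'
            gen x = rnormal()*3 + 6
            gen e = runiform() - 0.5
            gen y = 3 + 4*x + e
            reg y x, robust
            local xcoef = _b[x]
            local const = _b[_cons]
            restore
            replace data_store_x    = `xcoef' in `i'
            replace data_store_cons = `const' in `i'
        }
        * 25th, 50th, and 75th percentiles of the estimates for this n
        noisily display as text _n "Sample size n = `n'"
        noisily centile data_store_x data_store_cons, centile(25 50 75)
    }
}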

Endogeneity correction via Gaussian Copulas - generating normal CDF of a non-normal empirical distribution function

Dear Stata experts,

I am a passionate (if passive) follower of the forum; you have helped me a lot over the past 2 years of my research!

I am now facing an "apparently" easy task that, according to the professor, takes only 3 lines of code, but I have been struggling with it for 3 hours.

It's a class about endogeneity correction; this specific assignment uses the Gaussian copula approach.

For this, we need to:
1) Compute a dataset with an endogenous regressor correlated with the error term that is non-normally distributed
2) Then we need to generate the inverse of the normal cumulative distribution function (CDF) of this non-normally distributed error term
3) We integrate this into the actual estimation

I am really struggling to generate a non-normally distributed variable and then compute from it the inverse of the normal CDF.

I had a look at this post, but I am not at all sure how to interpret the first two results and whether they give me what I need:
https://www.statalist.org/forums/for...aussian-copula

Any idea dear Stata experts? That would be amazing.


Best regards,
Alexander
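
A hedged sketch of the standard construction (variable names u, ecdf, and pstar are illustrative, and this is not a verified solution to the assignment): generate a non-normal variable, take its empirical CDF with cumul, and transform it with invnormal():

Code:
clear
set obs 1000
set seed 12345
gen double u = exp(rnormal())        // log-normal, so clearly non-normal

cumul u, gen(ecdf)                   // empirical CDF, values in (0, 1]
* invnormal(1) is missing, so one common (assumed) adjustment rescales
* the ECDF away from the boundary before transforming
replace ecdf = (ecdf*_N - 0.5)/_N
gen double pstar = invnormal(ecdf)   // copula term to carry into the estimation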

Help generating this type of plot

Hey all, I need some assistance with the appropriate data shaping and plot type to generate this simple plot (see attachment).

Basically, all I'm trying to do is plot the change in a continuous outcome (6MWD) pre and post treatment for two different groups. The continuous scale is on the y axis, and the x axis has only two data points - pre-treatment and post-treatment (averages). I would like the lines connected for each treatment.

I've been fiddling with Stata for a while trying to figure this out, but to no avail. Does anyone have suggestions on the type of plot and how my data should be laid out? Thank you!
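
A hedged sketch, assuming the data can be collapsed to group means in long form with hypothetical variables sixmwd, group, and time (0 = pre, 1 = post):

Code:
collapse (mean) sixmwd, by(group time)
twoway (connected sixmwd time if group == 1)          ///
       (connected sixmwd time if group == 2),         ///
    xlabel(0 "Pre-treatment" 1 "Post-treatment")      ///
    xscale(range(-0.25 1.25)) ytitle("6MWD")          ///
    legend(order(1 "Group 1" 2 "Group 2"))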

Lag of variable with condition of not being missing

I want to create the lag of a variable with one caveat: if the lagged value is missing, I want to use the previous period's value, and so on. In other words, I want the most recent non-missing lag. Any ideas?
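
A hedged sketch, assuming panel data identified by hypothetical variables id and time: carry the last non-missing value forward within each panel, then lag the filled series:

Code:
xtset id time
gen x_filled = x
bysort id (time): replace x_filled = x_filled[_n-1] if missing(x_filled)
gen x_lastlag = L.x_filled   // lag equal to the most recent non-missing value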

Wednesday, September 28, 2022

Post estimation models for latent class analyses

I recently ran a latent class analysis for my choice experiment data using the "lclogit" command. I am trying to get the latent class model but keep getting the error "varlist not allowed."

I ran lclogit with my dependent variable "Select", the dummy-coded opt-out, and my attributes. I included 2 demographic variables in the membership equation. This coding worked.

My exact coding was: lclogit Select optout quail_rec tort_rec water_med water_high scenic_med scenic_high Tax, group(groupid) id(Respondent) nclasses(3) membership(gender_f age)

For the latent class model, I used the coding: "lclogitml iterate (50)", but I got the error "varlist not allowed"

I am trying to get the latent class model with standard errors and coefficients. Can anyone help? Thanks.

Margins after csdid.

Dear All,
I want to update my traditional DiD analyses with Callaway and Sant’Anna's (2021) semi-parametric DiD estimator using the csdid command. Here is a dataex of my sample:

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(emp_ratio_sa intsmall_ta_1 intsmall_tb_1 intsmall_ta_2 intsmall_tb_2 intsmall_ta_3 intsmall_tb_3 intsmall_ta_4 intsmall_tb_4 intsmall_ta_5 intsmall_tb_5 intsmall_ta_6 intsmall_tb_6 intsmall_0 statefip time treat_qpdmp eventT)
.7219443 0 0 0 0 0 0 0 0 0 0 0 0 0 1  96 0 6
.7270641 0 0 0 0 0 0 0 0 0 0 0 0 0 1  97 0 6
 .722241 0 0 0 0 0 0 0 0 0 0 0 0 0 1  98 0 6
 .731342 0 0 0 0 0 0 0 0 0 0 0 0 0 1  99 0 6
.7264431 0 0 0 0 0 0 0 0 0 0 0 0 0 1 100 0 6
.7278489 0 0 0 0 0 0 0 0 0 0 0 0 0 1 101 0 6
.7415584 0 0 0 0 0 0 0 0 0 0 0 0 0 1 102 0 6
.7661154 0 0 0 0 0 0 0 0 0 0 0 0 0 1 103 0 6
 .742871 0 0 0 0 0 0 0 0 0 0 0 0 0 1 104 0 6
.7310079 0 0 0 0 0 0 0 0 0 0 0 0 0 1 105 0 6
.7459943 0 0 0 0 0 0 0 0 0 0 0 0 0 1 106 0 6
.7434144 0 0 0 0 0 0 0 0 0 0 0 0 0 1 107 0 6
.7512739 0 0 0 0 0 0 0 0 0 0 0 0 0 1 108 0 6
.7619339 0 0 0 0 0 0 0 0 0 0 0 0 0 1 109 0 6
.7568708 0 0 0 0 0 0 0 0 0 0 0 0 0 1 110 0 6
.7509231 0 0 0 0 0 0 0 0 0 0 0 0 0 1 111 0 6
  .75106 0 0 0 0 0 0 0 0 0 0 0 0 0 1 112 0 6
.7469623 0 0 0 0 0 0 0 0 0 0 0 0 0 1 113 0 6
.7410953 0 0 0 0 0 0 0 0 0 0 0 0 0 1 114 0 6
.7331818 0 0 0 0 0 0 0 0 0 0 0 0 0 1 115 0 6
.7517717 0 0 0 0 0 0 0 0 0 0 0 0 0 1 116 0 6
.7682488 0 0 0 0 0 0 0 0 0 0 0 0 0 1 117 0 6
.7614667 0 0 0 0 0 0 0 0 0 0 0 0 0 1 118 0 6
.7746972 0 0 0 0 0 0 0 0 0 0 0 0 0 1 119 0 6
.7701825 0 0 0 0 0 0 0 0 0 0 0 0 0 1 120 0 6
.7691069 0 0 0 0 0 0 0 0 0 0 0 0 0 1 121 0 6
.7497196 0 0 0 0 0 0 0 0 0 0 0 0 0 1 122 0 6
.7749722 0 0 0 0 0 0 0 0 0 0 0 0 0 1 123 0 6
.7229774 0 0 0 0 0 0 0 0 0 0 0 0 0 1 124 0 6
.7317652 0 0 0 0 0 0 0 0 0 0 0 0 0 1 125 0 6
   .7542 0 0 0 0 0 0 0 0 0 0 0 0 0 1 126 0 6
.7434831 0 0 0 0 0 0 0 0 0 0 0 0 0 1 127 0 6
.7431633 0 0 0 0 0 0 0 0 0 0 0 0 0 1 128 0 6
.7390506 0 0 0 0 0 0 0 0 0 0 0 0 0 1 129 0 6
.7738222 0 0 0 0 0 0 0 0 0 0 0 0 0 1 130 0 6
.7451106 0 0 0 0 0 0 0 0 0 0 0 0 0 1 131 0 6
.7595577 0 0 0 0 0 0 0 0 0 0 0 0 0 1 132 0 6
.7639228 0 0 0 0 0 0 0 0 0 0 0 0 0 1 133 0 6
.7485735 0 0 0 0 0 0 0 0 0 0 0 0 0 1 134 0 6
.7524864 0 0 0 0 0 0 0 0 0 0 0 0 0 1 135 0 6
.7646362 0 0 0 0 0 0 0 0 0 0 0 0 0 1 136 0 6
.7876217 0 0 0 0 0 0 0 0 0 0 0 0 0 1 137 0 6
.7759739 0 0 0 0 0 0 0 0 0 0 0 0 0 1 138 0 6
.7897649 0 0 0 0 0 0 0 0 0 0 0 0 0 1 139 0 6
.7795818 0 0 0 0 0 0 0 0 0 0 0 0 0 1 140 0 6
.7496525 0 0 0 0 0 0 0 0 0 0 0 0 0 1 141 0 6
.7717182 0 0 0 0 0 0 0 0 0 0 0 0 0 1 142 0 6
.7859297 0 0 0 0 0 0 0 0 0 0 0 0 0 1 143 0 6
.7857348 0 0 0 0 0 0 0 0 0 0 0 0 0 1 144 0 6
.7631013 0 0 0 0 0 0 0 0 0 0 0 0 0 1 145 0 6
.8003568 0 0 0 0 0 0 0 0 0 0 0 0 0 1 146 0 6
 .790067 0 0 0 0 0 0 0 0 0 0 0 0 0 1 147 0 6
.7887953 0 0 0 0 0 0 0 0 0 0 0 0 0 1 148 0 6
 .804588 0 0 0 0 0 0 0 0 0 0 0 0 0 1 149 0 6
.7865897 0 0 0 0 0 0 0 0 0 0 0 0 0 1 150 0 6
.7946309 0 0 0 0 0 0 0 0 0 0 0 0 0 1 151 0 6
.7958547 0 0 0 0 0 0 0 0 0 0 0 0 0 1 152 0 6
.7819211 0 0 0 0 0 0 0 0 0 0 0 0 0 1 153 0 6
.7804796 0 0 0 0 0 0 0 0 0 0 0 0 0 1 154 0 6
.7998914 0 0 0 0 0 0 0 0 0 0 0 0 0 1 155 0 6
 .808333 0 0 0 0 0 0 0 0 0 0 0 0 0 1 156 0 6
.8037804 0 0 0 0 0 0 0 0 0 0 0 0 0 1 157 0 6
.8006392 0 0 0 0 0 0 0 0 0 0 0 0 0 1 158 0 6
.7885816 0 0 0 0 0 0 0 0 0 0 0 0 0 1 159 0 6
.8076813 0 0 0 0 0 0 0 0 0 0 0 0 0 1 160 0 6
.7850609 0 0 0 0 0 0 0 0 0 0 0 0 0 1 161 0 6
.7776859 0 0 0 0 0 0 0 0 0 0 0 0 0 1 162 0 6
.7701354 0 0 0 0 0 0 0 0 0 0 0 0 0 1 163 0 6
.7665312 0 0 0 0 0 0 0 0 0 0 0 0 0 1 164 0 6
.7779756 0 0 0 0 0 0 0 0 0 0 0 0 0 1 165 0 6
.7685159 0 0 0 0 0 0 0 0 0 0 0 0 0 1 166 0 6
.7648834 0 0 0 0 0 0 0 0 0 0 0 0 0 1 167 0 6
.7512071 0 0 0 0 0 0 0 0 0 0 0 0 0 1 168 0 6
.7527735 0 0 0 0 0 0 0 0 0 0 0 0 0 1 169 0 6
.7590218 0 0 0 0 0 0 0 0 0 0 0 0 0 1 170 0 6
.7660882 0 0 0 0 0 0 0 0 0 0 0 0 0 1 171 0 6
.7786498 0 0 0 0 0 0 0 0 0 0 0 0 0 1 172 0 6
.7688466 0 0 0 0 0 0 0 0 0 0 0 0 0 1 173 0 6
.7708621 0 0 0 0 0 0 0 0 0 0 0 0 0 1 174 0 6
 .764181 0 0 0 0 0 0 0 0 0 0 0 0 0 1 175 0 6
.7689334 0 0 0 0 0 0 0 0 0 0 0 0 0 1 176 0 6
.7696679 0 0 0 0 0 0 0 0 0 0 0 0 0 1 177 0 6
.7451332 0 0 0 0 0 0 0 0 0 0 0 0 0 1 178 0 6
 .755814 0 0 0 0 0 0 0 0 0 0 0 0 0 1 179 0 6
.7509971 0 0 0 0 0 0 0 0 0 0 0 0 0 1 180 0 6
.7730986 0 0 0 0 0 0 0 0 0 0 0 0 0 1 181 0 6
.7907968 0 0 0 0 0 0 0 0 0 0 0 0 0 1 182 0 6
.7697361 0 0 0 0 0 0 0 0 0 0 0 0 0 1 183 0 6
.7392074 0 0 0 0 0 0 0 0 0 0 0 0 0 1 184 0 6
.7659248 0 0 0 0 0 0 0 0 0 0 0 0 0 1 185 0 6
 .767098 0 0 0 0 0 0 0 0 0 0 0 0 0 1 186 0 6
.7616657 0 0 0 0 0 0 0 0 0 0 0 0 0 1 187 0 6
.7849947 0 0 0 0 0 0 0 0 0 0 0 0 0 1 188 0 6
.7597136 0 0 0 0 0 0 0 0 0 0 0 0 0 1 189 0 6
.7601928 0 0 0 0 0 0 0 0 0 0 0 0 0 1 190 0 6
.7775874 0 0 0 0 0 0 0 0 0 0 0 0 0 1 191 0 6
.7807354 0 0 0 0 0 0 0 0 0 0 0 0 0 1 192 0 6
.7515444 0 0 0 0 0 0 0 0 0 0 0 0 0 1 193 0 6
.7349452 0 0 0 0 0 0 0 0 0 0 0 0 0 1 194 0 6
.7434267 0 0 0 0 0 0 0 0 0 0 0 0 0 1 195 0 6
end
format %tq time

Initially, I successfully ran the DiD and was able to calculate the total effect in each event-time period of the policy in "small geographies" using margins:
Code:
            foreach outcome in `outcomes' {
                
            use "$fin_data/fig5_data_illicit.dta", clear
            
            forval i=2/6 {
                
                reghdfe     `outcome'     i.cq tb_6 tb_5 tb_4 tb_3 tb_2 t_0 ta_1     ///
                                        ta_2 ta_3 ta_4 ta_5 ta_6 smallillicit     ///
                                        intsmall_tb_6 intsmall_tb_5             ///
                                        intsmall_tb_4 intsmall_tb_3             ///
                                        intsmall_tb_2 intsmall_0                 ///
                                        intsmall_ta_1 intsmall_ta_2             ///
                                        intsmall_ta_3 intsmall_ta_4             ///
                                        intsmall_ta_5 intsmall_ta_6             ///
                                        [fweight=civpop], ///
                                        absorb(i.statefip) vce(cluster statefip)
                
                margins,     expression(_b[tb_`i']+_b[intsmall_tb_`i']) post 
                qui         esttab, ci 
                mat         ci_tb`i'    =    r(coefs)
                
            }
}
Now I am trying to do the same with csdid (I am new to the command) as follows, but I am not sure how to specify margins after csdid, and I get the following error:

Code:
.         foreach outcome in `outcomes' {
  2.                     
.                 use "$fin_data/fig5_data_illicit.dta", clear
  3.                 est clear
  4.                 csdid `outcome' intsmall*, ivar(statefip) time(time) gvar(treat_qpdmp) agg(event) 
  5.                 
.                 *store event study statistics/estimates 
.                 estat   event , window(-6, 6) esave(m1) replace // save only the prior 6 and post 6 estimates
  6.                 
.                 **#: MARGINS TEST
.                 margins, expression(_b[e(b)[1,3]])                      // THE PROBLEM HERE IS THAT I AM NOT SURE WHICH COEFFICIENTS IN THE OUTPUT TO USE TO GET THE TOTAL EFFECT 
  7.                         
.                 *load and store event study stats in a matrix and save matrix as .dta file
.                 estimates use   m1
  8.                 mat list                r(table)
  9.                 matrix                  table = r(table)
 10.                 matsave                 table, replace saving path("$fin_data")
 11.                         
.                 *load event study stats .dta file for formatting. Goal is to create 2way graph
.                 use "$fin_data/table.dta", clear 
 12.                         
.                 * drop extraneous rows and vars
.                 drop if                 _rowname == "eform" | _rowname == "df"
 13.                 drop                    Pre_avg Post_avg
 14.                         
.                 * rename and reshape for formatting reasons 
.                 rename  (Tm6 Tm5 Tm4 Tm3 Tm2 Tm1 Tp0 Tp1 Tp2 Tp3 Tp4 Tp5 Tp6)                           ///
>                                 (T0  T1  T2  T3  T4  T5  T6  T7  T8  T9  T10 T11 T12)
 15.                 reshape long T, i(_rowname) j(time)
 16.                         
.                 * keep only coef and CI bands estimates. merge into one .dta file
.                 keep if _rowname == "b" | _rowname == "ll" | _rowname == "ul"
 17.                 preserve 
 18.                         keep if _rowname == "ll" 
 19.                         rename T ci_lower
 20.                         tempfile ll 
 21.                         save `ll'
 22.                 restore 
 23.                 preserve        
 24.                         keep if _rowname == "ul"
 25.                         rename T ci_upper
 26.                         tempfile ul 
 27.                         save `ul'
 28.                 restore
 29.                         
.                 keep if         _rowname == "b"
 30.                 rename          T coef
 31.                         
.                 * merge in the lower CI estimate column 
.                 merge 1:1       time using `ll'
 32.                 drop _merge 
 33.                         
.                 * merge in the upper CI estimate column 
.                 merge 1:1 time using `ul'
 34.                 drop _merge _rowname 
 35.                         
.                 * define time value labels and apply 
.                 label define time_csdid 0 "-6" 1 "-5" 2"-4" 3 "-3" 4 "-2" 5 "-1"        ///
>                                                                 6 "0" 7 "+1" 8 "+2" 9 "+3" 10 "+4" 11 "+5" 12 "+6"
 36.                 label values time time_csdid
 37.                         
.                 
.                 graph twoway                                                                                                                    ///
>                                 (rcap ci_upper ci_lower time,                                                                   ///
>                                 lstyle(thin) lcolor(gs13) lwidth(*2) msize(vtiny))                              ///
>                                 (scatter coef time,                                                             ///
>                                 graphregion(color(white)) bgcolor(white)                                                ///
>                                 msymbol(circle) msize(small) mcolor(purple)                                     ///
>                                 title("Difference: `cpsquarterly_`outcome''")                                   ///
>                                 xtitle("`xtitle'", size(small) height(5))                                               ///
>                                 ytitle("`c_`outcome''")                                                                                 ///
>                                 xlabel(0(1)12,labsize(vsmall) valuelabel)                                               ///
>                                 ylabel(,labsize(vsmall) nogrid angle(0))                                                ///
>                                 yline(0, lpattern(dot) lwidth(thin) lcolor(black))                              ///
>                                 xline(5, lwidth(vthin) lcolor(black))                                                   ///
>                                 plotregion(lstyle(none)) graphregion(margin(zero))                              ///
>                                 legend(off)),                                                                                                   ///
>                                 name("csdid_`outcome'_g3", replace)     
 38.         
.                 graph export "$graphs/Figure 5/`date'/csdid_`outcome'_diff.png", replace
 39.         }
Difference-in-difference with Multiple Time Periods

                                                         Number of obs = 7,344
Outcome model  : weighted least squares
Treatment model: inverse probability tilting
------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       T-128 |  -.0038994    .002398    -1.63   0.104    -.0085994    .0008007
       T-127 |    .011263   .0069984     1.61   0.108    -.0024537    .0249797
       T-126 |  -.0029164   .0060814    -0.48   0.632    -.0148358    .0090029
       T-125 |  -.0071741   .0041964    -1.71   0.087    -.0153989    .0010508
       T-124 |   .0086744   .0026999     3.21   0.001     .0033827    .0139661
       T-123 |  -.0029752   .0028089    -1.06   0.290    -.0084805    .0025302
       T-122 |  -.0049747   .0044493    -1.12   0.264    -.0136952    .0037457
       T-121 |   .0024769   .0040247     0.62   0.538    -.0054114    .0103653
       T-120 |   .0024723    .003154     0.78   0.433    -.0037093    .0086539
       T-119 |  -.0035079   .0033814    -1.04   0.300    -.0101354    .0031195
       T-118 |   .0034128   .0032696     1.04   0.297    -.0029954    .0098211
       T-117 |    .004966    .001937     2.56   0.010     .0011696    .0087624
       T-116 |  -.0013375   .0029865    -0.45   0.654    -.0071908    .0045159
       T-115 |  -.0001271   .0021956    -0.06   0.954    -.0044305    .0041763
       T-114 |    .001878   .0031095     0.60   0.546    -.0042166    .0079726
       T-113 |  -.0015993   .0028235    -0.57   0.571    -.0071332    .0039346
       T-112 |   .0035643   .0045386     0.79   0.432    -.0053312    .0124598
       T-111 |   .0022065   .0032561     0.68   0.498    -.0041752    .0085883
       T-110 |  -.0061921   .0021869    -2.83   0.005    -.0104784   -.0019058
       T-109 |  -.0028732   .0023366    -1.23   0.219     -.007453    .0017065
       T-108 |  -.0017538   .0016822    -1.04   0.297    -.0050508    .0015433
       T-107 |   .0002055   .0024355     0.08   0.933     -.004568    .0049791
       T-106 |   .0007294   .0025281     0.29   0.773    -.0042256    .0056844
       T-105 |   -.002054   .0025307    -0.81   0.417    -.0070141    .0029061
       T-104 |  -.0023996    .002338    -1.03   0.305    -.0069819    .0021827
       T-103 |  -.0011283   .0031629    -0.36   0.721    -.0073274    .0050708
       T-102 |   .0029173   .0022172     1.32   0.188    -.0014284     .007263
       T-101 |   -.001756   .0030409    -0.58   0.564     -.007716     .004204
       T-100 |  -.0033694   .0027438    -1.23   0.219    -.0087472    .0020085
        T-99 |  -.0029343   .0027502    -1.07   0.286    -.0083247     .002456
        T-98 |  -.0005361   .0028091    -0.19   0.849    -.0060418    .0049695
        T-97 |  -.0021415   .0021108    -1.01   0.310    -.0062787    .0019956
        T-96 |  -.0000833   .0033721    -0.02   0.980    -.0066924    .0065259
        T-95 |   .0018972   .0028361     0.67   0.504    -.0036615    .0074558
        T-94 |  -.0038235   .0033355    -1.15   0.252    -.0103609     .002714
        T-93 |  -.0012526   .0043483    -0.29   0.773    -.0097751    .0072699
        T-92 |   .0027288   .0027234     1.00   0.316     -.002609    .0080666
        T-91 |  -.0012465   .0031355    -0.40   0.691    -.0073919    .0048989
        T-90 |  -.0013619   .0039355    -0.35   0.729    -.0090753    .0063515
        T-89 |   .0051595   .0029894     1.73   0.084    -.0006996    .0110186
        T-88 |  -.0019034   .0037745    -0.50   0.614    -.0093013    .0054945
        T-87 |  -.0045874     .00252    -1.82   0.069    -.0095266    .0003518
        T-86 |  -.0012883   .0033857    -0.38   0.704    -.0079241    .0053475
        T-85 |   .0075664   .0024828     3.05   0.002     .0027002    .0124325
        T-84 |   .0024097   .0038059     0.63   0.527    -.0050498    .0098692
        T-83 |  -.0004293    .002227    -0.19   0.847    -.0047942    .0039355
        T-82 |  -.0010582   .0019803    -0.53   0.593    -.0049394    .0028231
        T-81 |  -.0023748   .0019575    -1.21   0.225    -.0062115    .0014619
        T-80 |   .0024216    .003036     0.80   0.425    -.0035287     .008372
        T-79 |  -.0001846   .0035462    -0.05   0.958     -.007135    .0067657
        T-78 |    .001593   .0030387     0.52   0.600    -.0043628    .0075488
        T-77 |   -.003404   .0029436    -1.16   0.248    -.0091733    .0023653
        T-76 |   .0000364   .0030397     0.01   0.990    -.0059213    .0059941
        T-75 |   .0065255   .0028417     2.30   0.022     .0009559    .0120951
        T-74 |  -.0028602    .002865    -1.00   0.318    -.0084755    .0027552
        T-73 |   .0011511   .0028124     0.41   0.682    -.0043611    .0066632
        T-72 |  -.0033078   .0035737    -0.93   0.355    -.0103121    .0036965
        T-71 |   .0018136   .0028645     0.63   0.527    -.0038006    .0074279
        T-70 |   .0003638   .0030569     0.12   0.905    -.0056276    .0063552
        T-69 |   .0014643    .002361     0.62   0.535    -.0031631    .0060917
        T-68 |   .0008133    .003283     0.25   0.804    -.0056213    .0072478
        T-67 |  -.0038067   .0027904    -1.36   0.172    -.0092757    .0016623
        T-66 |  -.0009422    .003207    -0.29   0.769    -.0072279    .0053434
        T-65 |   .0017713   .0029168     0.61   0.544    -.0039454    .0074881
        T-64 |   .0027834   .0033562     0.83   0.407    -.0037945    .0093614
        T-63 |   .0003457   .0030793     0.11   0.911    -.0056897    .0063811
        T-62 |  -.0002905    .003587    -0.08   0.935    -.0073209    .0067399
        T-61 |   .0035518   .0022845     1.55   0.120    -.0009257    .0080293
        T-60 |  -.0007902   .0029157    -0.27   0.786    -.0065049    .0049244
        T-59 |   .0040283   .0032535     1.24   0.216    -.0023484    .0104049
        T-58 |  -.0063999   .0026296    -2.43   0.015    -.0115538   -.0012461
        T-57 |   .0028872   .0026871     1.07   0.283    -.0023793    .0081538
        T-56 |  -.0014128   .0023458    -0.60   0.547    -.0060105    .0031849
        T-55 |  -.0011947   .0024568    -0.49   0.627      -.00601    .0036205
        T-54 |   .0020811   .0020269     1.03   0.305    -.0018915    .0060537
        T-53 |  -.0004961   .0032295    -0.15   0.878    -.0068257    .0058336
        T-52 |  -.0006216   .0027893    -0.22   0.824    -.0060886    .0048454
        T-51 |  -.0005572   .0031187    -0.18   0.858    -.0066697    .0055553
        T-50 |   .0047685   .0021391     2.23   0.026     .0005759    .0089611
        T-49 |  -.0052805   .0030655    -1.72   0.085    -.0112887    .0007277
        T-48 |   .0025296   .0026271     0.96   0.336    -.0026195    .0076787
        T-47 |  -.0020436   .0025257    -0.81   0.418    -.0069939    .0029068
        T-46 |  -.0003554   .0023396    -0.15   0.879     -.004941    .0042301
        T-45 |   .0019743    .002164     0.91   0.362    -.0022671    .0062158
        T-44 |  -.0010374     .00281    -0.37   0.712    -.0065449    .0044701
        T-43 |   .0058304   .0024319     2.40   0.017      .001064    .0105968
        T-42 |  -.0026484   .0036348    -0.73   0.466    -.0097725    .0044757
        T-41 |   .0001173   .0036966     0.03   0.975     -.007128    .0073626
        T-40 |  -.0024488   .0032077    -0.76   0.445    -.0087357    .0038381
        T-39 |  -.0010829   .0017788    -0.61   0.543    -.0045693    .0024034
        T-38 |  -.0012993   .0026665    -0.49   0.626    -.0065255    .0039268
        T-37 |   .0032848   .0025825     1.27   0.203    -.0017767    .0083464
        T-36 |  -.0053405   .0025026    -2.13   0.033    -.0102455   -.0004355
        T-35 |   .0006545   .0031029     0.21   0.833    -.0054271    .0067361
        T-34 |   .0023185   .0017974     1.29   0.197    -.0012044    .0058414
        T-33 |   -.001526   .0030538    -0.50   0.617    -.0075114    .0044594
        T-32 |   .0003437   .0031169     0.11   0.912    -.0057652    .0064527
        T-31 |  -.0047286   .0026527    -1.78   0.075    -.0099279    .0004707
        T-30 |   .0039166   .0033308     1.18   0.240    -.0026117    .0104449
        T-29 |   .0061783   .0031426     1.97   0.049     .0000189    .0123377
        T-28 |  -.0004335   .0031523    -0.14   0.891    -.0066119     .005745
        T-27 |  -.0020271   .0028577    -0.71   0.478     -.007628    .0035738
        T-26 |   .0018796   .0025069     0.75   0.453    -.0030339    .0067931
        T-25 |  -.0001574   .0029059    -0.05   0.957    -.0058528     .005538
        T-24 |    .001797   .0031501     0.57   0.568     -.004377    .0079711
        T-23 |   .0010855   .0034263     0.32   0.751    -.0056299     .007801
        T-22 |  -.0084413   .0043402    -1.94   0.052    -.0169479    .0000653
        T-21 |   .0039316   .0031939     1.23   0.218    -.0023282    .0101915
        T-20 |   .0053024   .0023553     2.25   0.024      .000686    .0099187
        T-19 |   -.000722   .0038188    -0.19   0.850    -.0082068    .0067627
        T-18 |  -.0041015   .0040248    -1.02   0.308    -.0119899    .0037869
        T-17 |  -.0012231   .0024455    -0.50   0.617    -.0060163      .00357
        T-16 |   -.000033   .0033951    -0.01   0.992    -.0066873    .0066213
        T-15 |   .0072438   .0025245     2.87   0.004     .0022958    .0121917
        T-14 |  -.0006471   .0024552    -0.26   0.792    -.0054593    .0041651
        T-13 |  -.0045405   .0022248    -2.04   0.041    -.0089009     -.00018
        T-12 |  -.0057636   .0028684    -2.01   0.045    -.0113856   -.0001417
        T-11 |   .0059804   .0039886     1.50   0.134    -.0018371    .0137979
        T-10 |  -.0030996    .003193    -0.97   0.332    -.0093579    .0031586
         T-9 |   .0007201   .0032885     0.22   0.827    -.0057252    .0071653
         T-8 |   .0023677   .0023576     1.00   0.315    -.0022531    .0069884
         T-7 |  -.0003027   .0029223    -0.10   0.917    -.0060304     .005425
         T-6 |  -.0023616   .0026732    -0.88   0.377    -.0076009    .0028778
         T-5 |   .0018408   .0033694     0.55   0.585     -.004763    .0084447
         T-4 |  -.0006476   .0029742    -0.22   0.828     -.006477    .0051818
         T-3 |   .0010017   .0033283     0.30   0.763    -.0055217    .0075251
         T-2 |   .0029037   .0037752     0.77   0.442    -.0044956     .010303
         T-1 |  -.0061838   .0032727    -1.89   0.059    -.0125982    .0002306
         T+0 |   .0081809   .0028068     2.91   0.004     .0026796    .0136822
         T+1 |  -.0023625    .003672    -0.64   0.520    -.0095594    .0048344
         T+2 |  -.0040996   .0034869    -1.18   0.240    -.0109339    .0027346
         T+3 |   .0019913   .0044057     0.45   0.651    -.0066437    .0106263
         T+4 |  -.0039832   .0044561    -0.89   0.371     -.012717    .0047507
         T+5 |  -.0011873   .0050038    -0.24   0.812    -.0109946    .0086199
         T+6 |   .0015952    .005285     0.30   0.763    -.0087632    .0119535
         T+7 |   .0014604   .0065849     0.22   0.824    -.0114458    .0143665
         T+8 |  -.0045589   .0066347    -0.69   0.492    -.0175626    .0084449
         T+9 |   .0008087   .0055703     0.15   0.885     -.010109    .0117264
        T+10 |   .0002163   .0048391     0.04   0.964    -.0092682    .0097008
        T+11 |    .000389   .0059787     0.07   0.948     -.011329    .0121071
        T+12 |  -.0005101   .0062377    -0.08   0.935    -.0127358    .0117156
        T+13 |   .0041018    .005362     0.76   0.444    -.0064074    .0146111
        T+14 |    .010907   .0060666     1.80   0.072    -.0009833    .0227973
        T+15 |   .0049799   .0047421     1.05   0.294    -.0043144    .0142742
        T+16 |   .0057155     .00659     0.87   0.386    -.0072006    .0186317
        T+17 |    .003413   .0079198     0.43   0.667    -.0121096    .0189356
        T+18 |  -.0016702   .0066542    -0.25   0.802    -.0147123    .0113719
        T+19 |  -.0003214    .005816    -0.06   0.956    -.0117204    .0110777
        T+20 |   .0014124   .0077852     0.18   0.856    -.0138463    .0166712
        T+21 |  -.0028021    .008095    -0.35   0.729     -.018668    .0130638
        T+22 |  -.0016207   .0081708    -0.20   0.843    -.0176351    .0143938
        T+23 |   .0019917   .0077358     0.26   0.797    -.0131703    .0171536
        T+24 |   .0019442   .0064272     0.30   0.762    -.0106528    .0145412
        T+25 |   .0024827   .0071242     0.35   0.727    -.0114805    .0164459
        T+26 |  -.0179738   .0159027    -1.13   0.258    -.0491425    .0131949
        T+27 |  -.0074342   .0139019    -0.53   0.593    -.0346814    .0198129
        T+28 |   .0050378   .0118902     0.42   0.672    -.0182667    .0283422
        T+29 |   .0113423   .0182562     0.62   0.534    -.0244392    .0471238
        T+30 |   .0302177   .0040121     7.53   0.000     .0223541    .0380814
------------------------------------------------------------------------------
Control: Never Treated

See Callaway and Sant'Anna (2021) for details
ATT by Periods Before and After treatment
Event Study:Dynamic effects
file m1.ster saved
------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     Pre_avg |  -.0005745   .0006988    -0.82   0.411     -.001944    .0007951
    Post_avg |   .0000192    .003332     0.01   0.995    -.0065113    .0065498
         Tm6 |  -.0023616   .0026732    -0.88   0.377    -.0076009    .0028778
         Tm5 |   .0018408   .0033694     0.55   0.585     -.004763    .0084447
         Tm4 |  -.0006476   .0029742    -0.22   0.828     -.006477    .0051818
         Tm3 |   .0010017   .0033283     0.30   0.763    -.0055217    .0075251
         Tm2 |   .0029037   .0037752     0.77   0.442    -.0044956     .010303
         Tm1 |  -.0061838   .0032727    -1.89   0.059    -.0125982    .0002306
         Tp0 |   .0081809   .0028068     2.91   0.004     .0026796    .0136822
         Tp1 |  -.0023625    .003672    -0.64   0.520    -.0095594    .0048344
         Tp2 |  -.0040996   .0034869    -1.18   0.240    -.0109339    .0027346
         Tp3 |   .0019913   .0044057     0.45   0.651    -.0066437    .0106263
         Tp4 |  -.0039832   .0044561    -0.89   0.371     -.012717    .0047507
         Tp5 |  -.0011873   .0050038    -0.24   0.812    -.0109946    .0086199
         Tp6 |   .0015952    .005285     0.30   0.763    -.0087632    .0119535
------------------------------------------------------------------------------
warning: option expression() does not contain option predict() or xb().
variable T not found
r(111);

end of do-file

r(111);
So I have 2 basic questions:
(1) Am I correct in retaining only the 6 pre- and 6 post-period estimates if I am interested in the trends in the outcome 6 periods before and 6 periods after the policy (see line 5 of my code)? My suspicion is that this is incorrect, since the event-time estimates for the 6 periods pre and post must be a weighted average of the 159 coefficients in the output table?
(2) How can I run margins after csdid, similar to "margins, expression(_b[tb_`i']+_b[intsmall_tb_`i']) post", to calculate the total effect at each event time? (One possible approach is sketched below.)
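
A hedged sketch for (2), assuming the saved estat event results use the coefficient names shown in the output above (Tm6 … Tm1, Tp0 … Tp6); it uses lincom rather than margins, since margins would also need the estimation sample:

Code:
* minimal sketch, not a verified csdid workflow
estimates use m1              // reload the estat event results saved above
forvalues i = 0/6 {
    lincom Tp`i'              // aggregated effect at event time +`i'
}
forvalues i = 1/6 {
    lincom Tm`i'              // effect `i' periods before treatment
}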

Many thanks in advance for your help. And apologies for some basic questions.
Sincerely,
Sumedha

test - xsmle and interpretation

Hey listers,

I am using the "xsmle" command to run fixed effects models on panel data. I am particularly interested in computing the indirect effects of my covarietes so I added the "effects" option. All my variables are in logarithms - dependent ln(x) and independent ln(y)-, but I'm not sure about the interpretation of the coefficients the "effects" option retrieves (LR Direct, Indirect and Total Effects).

For example, if LR Direct reports a coefficient of 0.312331, should I say that a 1% change in v1 is associated with a 0.312331% change in y, or with a 31.2331% change in y?

Best regards,

Marcos
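
For what it's worth, a hedged note on the usual log-log reading (an assumption; whether xsmle's LR direct effect is exactly an elasticity should be checked in its documentation):

\[
\frac{\partial \ln y}{\partial \ln x_1} \approx 0.312331
\quad\Longrightarrow\quad
\text{a 1\% change in } x_1 \text{ goes with about a } 0.312331\% \text{ change in } y,\ \text{not } 31.2331\%.
\]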


Is -python script- much slower in Stata 17 than in Stata 16?

I use the tuples command (from SSC). The tuples command is implemented in terms of python script and, optionally, uses Mata code.

I run the following script:

Code:
cls 
about
python query 

clear all
macro drop _all

numlist "1/17"

timer clear

// timing python script
timer on 1
tuples `r(numlist)'
timer off 1

macro drop _tuple*

// timing mata code
timer on 2
tuples `r(numlist)' , nopython
timer off 2

timer list

Here are results from Stata 16

Code:
Stata/IC 16.1 for Windows (64-bit x86-64)
Revision 14 Jun 2022
Copyright 1985-2019 StataCorp LLC

Total physical memory:       16.00 GB
Available physical memory:   10.67 GB

Stata license: Single-user  perpetual
Serial number: omitted
  Licensed to: omitted

. python query 
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    omitted

    Python system information
      initialized          yes
      version              3.10.0
      architecture         64-bit
      library path         omitted\Python\Python310\python310.dll

omitted

. timer list
   1:      2.47 /        1 =       2.4660
   2:      5.35 /        1 =       5.3530

Python does the job in about 2.5 seconds, roughly half the time it takes Mata.


Here are the results from the identical script in Stata 17:

Code:
. about

Stata/BE 17.0 for Windows (64-bit x86-64)
Revision 23 Aug 2022
Copyright 1985-2021 StataCorp LLC

Total physical memory:       16.00 GB
Available physical memory:   10.65 GB

Stata license: Single-user  perpetual
Serial number: omitted
  Licensed to: omitted

. python query 
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    omitted
    
    Python system information
      initialized          yes
      version              3.10.4
      architecture         64-bit
      library path         omitted\Python\Python310\python310.dll

omitted

. timer list
   1:      6.54 /        1 =       6.5350
   2:      4.90 /        1 =       4.8980

Good news: Mata is faster now. Bad news: Python is much slower.

What is going on?

How to export polychoric matrix results

Hello everyone

I have now searched far and wide for an answer to this question. I am using polychoric correlation matrices in a paper that I am working on and want to include the results of the matrices in my paper (in matrix format). I cannot find a way to export the matrices in a form that is easily formatted in a Word or Excel document (or even a text document, for that matter).

Are there any suggestions on how I can do this, please? I usually use the estpost/estout commands with ordinary correlation matrices, but they do not seem to work with polychoric correlation matrices.
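
A hedged sketch (assumption: polychoric leaves its correlation matrix in r(R); run return list to confirm). The matrix can then be written to Excel with putexcel or tabulated with esttab's matrix() mode (estout from SSC):

Code:
polychoric v1 v2 v3            // v1-v3 are placeholder variable names
return list                    // confirm where the matrix is stored
matrix R = r(R)

* Excel route
putexcel set "polychoric_matrix.xlsx", replace
putexcel A1 = matrix(R), names

* esttab route
esttab matrix(R) using "polychoric_matrix.rtf", replace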

Kind regards
Odile

Expanding obs depending on several variables

Dear Statalist, I have a dataset of several patenting firms. However, there are patents with more than one firm, which is captured by different variables (citing_firm_id1, citing_firm_id2, citing_firm_id3…). My aim is to end up with a single obs for each firm-year (a panel of firms). For this, I use the main firm variable, "citing_firm_id1". The problem is that some firms never appear in this main variable, but they might appear in the other variables (citing_firm_id2, citing_firm_id3…), something I capture using the variables match2, match3, ..., which equal 1 if the firm does not appear in the main variable citing_firm_id1.

I would like to move those obs in which a given firm does not appear in the main variable (i.e., where match2, match3, … equal 1) into the main variable. I think that expanding only those obs and replacing their ids in citing_firm_id1 with their ids in citing_firm_id2 or citing_firm_id3 might be a solution. Is it possible to do this only for those ids reporting a 1 in their matchx variables?

For instance, firm 993 is in the second firm variable (citing_firm_id2) and has match2==1. So this firm never appears in citing_firm_id1, and thus I would be losing it. The idea would be to expand these 7 obs and put its id (993) in citing_firm_id1 only for the expanded obs. Notice, however, that this firm is also in variable citing_firm_id3, so I would need the same for those obs. The firm may also appear in several other variables (citing_firm_id4, citing_firm_id5… up to citing_firm_id49) with its match4, match5… variable equal to 1.

Also notice that in the case of citing_firm_id1 == 2422461, the firm ids in citing_firm_id2 and citing_firm_id3 are reported as never showing up in citing_firm_id1. Thus, these 5 obs should be expanded 10 times (5 for id 3811035 and 5 for id 2754) while putting those ids into the expanded citing_firm_id1.

If you think there is a better way of dealing with this, suggestions are more than welcome.
Here is a small dataex example in case you can help me with this problem.

Thanks a lot for your help.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long(citing_firm_id1 citing_firm_id2) byte match2 int citing_firm_id3 byte match3 long(cited_pub_date citing_appln_id cited_appln_id) byte cit_total long citing_pub_date
1170342     993 1    . . 20110427 471259150 323675857 9 20180314
1170342     993 1    . . 20120613 470647463 339614333 8 20180221
1170342     993 1    . . 20100203 330934545 266772412 4 20120523
1170342     993 1    . . 20040114 448427401    340722 9 20170517
1170342     993 1    . . 20100203 420528862 266772412 4 20151202
1170342     993 1    . . 19960925 448427401  17174701 9 20170517
1170342     993 1    . . 20080716 496107987      7659 5 20191023
 114959    9472 1    . . 19810107  16716120  16457427 2 19890412
 114959    9472 1    . . 19851002  16716119  16605862 5 19890412
 114959    9472 1    . . 19810107  16716119  16457427 5 19890412
3083727    9429 .    . . 19951018  15747335  17106735 4 20020206
3083727    9429 .    . . 19910807  15747335  16888971 4 20020206
3083727    9429 .    . . 19941005  15747335  17032091 4 20020206
 520374   12024 .  993 1 20120613 451180603 339733097 7 20161019
2422461 3811035 1 2754 1 20140521 457930913 412195248 4 20171108
2422461 3811035 1 2754 1 20121212 457930913 363965896 4 20171108
2422461 3811035 1 2754 1 20121212 457930906 363965896 5 20171115
2422461 3811035 1 2754 1 19980617 457930906  17236097 5 20171115
2422461 3811035 1 2754 1 20140521 457930906 412195248 5 20171115
end
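
A hedged sketch for the two secondary id variables in the example (in the full data the loop would run over 2/49, assuming matchk and citing_firm_idk exist for each k); this is an assumption about the intended result, not a verified solution:

Code:
* add one duplicate row per secondary firm that never appears in
* citing_firm_id1, carrying that firm's id as the main id
gen byte original = 1
forvalues k = 2/3 {                       // 2/49 in the full data
    expand 2 if match`k' == 1 & original == 1, gen(dup`k')
    replace citing_firm_id1 = citing_firm_id`k' if dup`k' == 1
    replace original = 0 if dup`k' == 1
}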

Tuesday, September 27, 2022

Not show "Standard errors in parenthesis *p <0.10, **p<0.05, ***p<0.001" below table, while keeping the significance stars with esttab

Hey everyone,

As in the title, I'd like to save my regression results in a table, but I can't seem to find a way to keep the significance stars while not showing "Standard errors in parentheses *p<0.10, **p<0.05, ***p<0.001" under the table with esttab. I want this because below each table I add a lot of notes in LaTeX, and I've seen many people put the "Standard errors in parentheses *p<0.10, **p<0.05, ***p<0.001" line at the end of the notes instead of immediately below the table. Since I'm going to put "Standard errors..." manually in the LaTeX caption of the table, I want to produce just the table.

This is the code I have so far:

esttab m1 m2 m3 m4 using "results/tables/regression1.tex", replace label se star(* 0.10 ** 0.05 *** 0.01)


Thanks in advance!
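
A hedged pointer, worth checking against the estout help file: esttab's nonotes option should suppress the "Standard errors in parentheses ..." footer while star() keeps the stars:

Code:
esttab m1 m2 m3 m4 using "results/tables/regression1.tex", replace label se ///
    star(* 0.10 ** 0.05 *** 0.01) nonotes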



xtabond2 for system GMM: please help me with the coding

Dear all,

I am working with the xtabond2 command in Stata to address the endogeneity problem in my estimation. I have read David Roodman's guidance on xtabond2, but I am still not sure whether my coding is right or wrong. I would appreciate your advice:

Code:
*************:
xtabond2 ltotalfertility l.ltotalfertility l(0/2).(pand_res_pt) i.year, ///
iv(i.year, eq(both)) ///
iv(l(0/2).(pand_res_pt), eq(both)) ///
gmm(ltotalfertility, lag(2 .) collapse eq(both))  ///
h(1) ar(3) two cluster(ifscode)
Here is the output I get from this command:
Code:
. xtabond2 ltotalfertility l.ltotalfertility l(0/2).(pand_res_pt) i.year, ///
> iv(i.year, eq(both)) ///
> iv(l(0/2).(pand_res_pt), eq(both)) ///
> gmm(ltotalfertility, lag(2 .) collapse eq(both))  ///
> h(1) ar(3) two cluster(ifscode)
Favoring speed over space. To switch, type or click on mata: mata set matafavor space, perm.
1996b.year dropped due to collinearity
1997.year dropped due to collinearity
2009.year dropped due to collinearity
2018.year dropped due to collinearity
2019.year dropped due to collinearity
Warning: Two-step estimated covariance matrix of moments is singular.
  Using a generalized inverse to calculate optimal weighting matrix for two-step estimation.
  Difference-in-Sargan/Hansen statistics may be negative.

Dynamic panel-data estimation, two-step system GMM
------------------------------------------------------------------------------
Group variable: ifscode                         Number of obs      =      3622
Time variable : year                            Number of groups   =       183
Number of instruments = 44                      Obs per group: min =         1
Wald chi2(23) =  69569.39                                      avg =     19.79
Prob > chi2   =     0.000                                      max =        20
                                   (Std. err. adjusted for clustering on ifscode)
---------------------------------------------------------------------------------
                |              Corrected
ltotalfertility | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
----------------+----------------------------------------------------------------
ltotalfertility |
            L1. |   .8861639   .0218916    40.48   0.000     .8432571    .9290708
                |
    pand_res_pt |
            --. |  -.0253932   .0059697    -4.25   0.000    -.0370936   -.0136929
            L1. |  -.0309516    .008387    -3.69   0.000    -.0473897   -.0145134
            L2. |  -.0282224   .0065436    -4.31   0.000    -.0410476   -.0153972
                |
           year |
          1998  |  -.0060163   .0048398    -1.24   0.214    -.0155022    .0034696
          1999  |  -.0054971    .004538    -1.21   0.226    -.0143914    .0033972
          2000  |  -.0053192   .0040945    -1.30   0.194    -.0133442    .0027059
          2001  |  -.0116792    .004663    -2.50   0.012    -.0208184   -.0025399
          2002  |  -.0085614   .0040365    -2.12   0.034    -.0164728     -.00065
          2003  |  -.0082516   .0035831    -2.30   0.021    -.0152744   -.0012288
          2004  |  -.0066783   .0034789    -1.92   0.055    -.0134968    .0001401
          2005  |  -.0094819   .0034326    -2.76   0.006    -.0162097   -.0027541
          2006  |   -.006441   .0031204    -2.06   0.039    -.0125569   -.0003251
          2007  |  -.0062036   .0032063    -1.93   0.053    -.0124879    .0000806
          2008  |   -.006044   .0029945    -2.02   0.044    -.0119131   -.0001749
          2010  |   .0023753   .0019727     1.20   0.229    -.0014911    .0062418
          2011  |  -.0024552   .0013919    -1.76   0.078    -.0051833    .0002729
          2012  |  -.0121593   .0034885    -3.49   0.000    -.0189966    -.005322
          2013  |  -.0177037   .0037538    -4.72   0.000    -.0250609   -.0103465
          2014  |  -.0139368   .0039577    -3.52   0.000    -.0216938   -.0061797
          2015  |  -.0176519   .0041448    -4.26   0.000    -.0257756   -.0095281
          2016  |   -.018251   .0041753    -4.37   0.000    -.0264344   -.0100676
          2017  |  -.0233959   .0044382    -5.27   0.000    -.0320946   -.0146973
                |
          _cons |   .1172786   .0240108     4.88   0.000     .0702183    .1643389
---------------------------------------------------------------------------------
Instruments for first differences equation
  Standard
    D.(pand_res_pt L.pand_res_pt L2.pand_res_pt)
    D.(1996b.year 1997.year 1998.year 1999.year 2000.year 2001.year 2002.year
    2003.year 2004.year 2005.year 2006.year 2007.year 2008.year 2009.year
    2010.year 2011.year 2012.year 2013.year 2014.year 2015.year 2016.year
    2017.year 2018.year 2019.year)
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(2/23).ltotalfertility collapsed
Instruments for levels equation
  Standard
    pand_res_pt L.pand_res_pt L2.pand_res_pt
    1996b.year 1997.year 1998.year 1999.year 2000.year 2001.year 2002.year
    2003.year 2004.year 2005.year 2006.year 2007.year 2008.year 2009.year
    2010.year 2011.year 2012.year 2013.year 2014.year 2015.year 2016.year
    2017.year 2018.year 2019.year
    _cons
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    DL.ltotalfertility collapsed
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z =  -4.49  Pr > z =  0.000
Arellano-Bond test for AR(2) in first differences: z =   0.67  Pr > z =  0.501
Arellano-Bond test for AR(3) in first differences: z =  -1.82  Pr > z =  0.069
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(20)   =  17.39  Prob > chi2 =  0.628
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(20)   =  96.22  Prob > chi2 =  0.000
  (Robust, but weakened by many instruments.)

Difference-in-Hansen tests of exogeneity of instrument subsets:
  GMM instruments for levels
    Hansen test excluding group:     chi2(19)   =  70.91  Prob > chi2 =  0.000
    Difference (null H = exogenous): chi2(1)    =  25.31  Prob > chi2 =  0.000
  iv(1996b.year 1997.year 1998.year 1999.year 2000.year 2001.year 2002.year 2003.year 2004.year 2005.year 2006.year 2007.year 2008.year 2009
> .year 2010.year 2011.year 2012.year 2013.year 2014.year 2015.year 2016.year 2017.year 2018.year 2019.year)
    Hansen test excluding group:     chi2(1)    =   0.10  Prob > chi2 =  0.757
    Difference (null H = exogenous): chi2(19)   =  96.12  Prob > chi2 =  0.000
  iv(pand_res_pt L.pand_res_pt L2.pand_res_pt)
    Hansen test excluding group:     chi2(17)   =  90.00  Prob > chi2 =  0.000
    Difference (null H = exogenous): chi2(3)    =   6.21  Prob > chi2 =  0.102


.
end of do-file
I chose the lag(2 .) option, but when I use collapse, the Hansen test does not come out well (Prob > chi2 = 0.000).


***Also, I used xtdpdgmm to try to get better results. I did this:
Code:
xtdpdgmm L(0/1).ltotalfertility l(0/2).(pand_res_pt) i.year, noserial gmmiv(L.ltotalfertility, collapse model(difference)) iv(l(0/2).(pand_res_pt) i.year, difference model(difference)) twostep vce(robust)
and the results:
Code:
. xtdpdgmm L(0/1).ltotalfertility l(0/2).(pand_res_pt) i.year, noserial gmmiv(L.ltotalfertility, collapse model(difference)) iv(l(0/2).(pand
> _res_pt) i.year, difference model(difference)) twostep vce(robust)
note: 1996.year identifies no observations in the sample.
note: 1997.year identifies no observations in the sample.
note: 2017.year omitted because of collinearity.
note: 2018.year identifies no observations in the sample.
note: 2019.year identifies no observations in the sample.

Generalized method of moments estimation

Fitting full model:

Step 1:
initial:       f(b) =  19.115759
alternative:   f(b) =  4.6514905
rescale:       f(b) =  .08442007
Iteration 0:   f(b) =  .08442007  
Iteration 1:   f(b) =  .00083005  
Iteration 2:   f(b) =  .00082701  
Iteration 3:   f(b) =  .00082701  

Step 2:
Iteration 0:   f(b) =  .97149141  
Iteration 1:   f(b) =  .57064682  
Iteration 2:   f(b) =  .56335002  
Iteration 3:   f(b) =   .5633197  
Iteration 4:   f(b) =  .56331884  
Iteration 5:   f(b) =  .56331881  

Group variable: ifscode                      Number of obs         =      3622
Time variable: year                          Number of groups      =       183

Moment conditions:     linear =      43      Obs per group:    min =         1
                    nonlinear =      18                        avg =  19.79235
                        total =      61                        max =        20

                                 (Std. err. adjusted for 183 clusters in ifscode)
---------------------------------------------------------------------------------
                |              WC-Robust
ltotalfertility | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
----------------+----------------------------------------------------------------
ltotalfertility |
            L1. |   1.025479   .0123817    82.82   0.000     1.001211    1.049746
                |
    pand_res_pt |
            --. |   .0027941   .0027985     1.00   0.318    -.0026909    .0082791
            L1. |   .0014515   .0040002     0.36   0.717    -.0063887    .0092917
            L2. |   .0011745   .0039944     0.29   0.769    -.0066543    .0090033
                |
           year |
          1996  |          0  (empty)
          1997  |          0  (empty)
          1998  |  -.0134633   .0039956    -3.37   0.001    -.0212946    -.005632
          1999  |  -.0099362   .0039732    -2.50   0.012    -.0177235   -.0021489
          2000  |  -.0072415   .0039882    -1.82   0.069    -.0150582    .0005752
          2001  |  -.0112745   .0031739    -3.55   0.000    -.0174952   -.0050537
          2002  |  -.0051081   .0031922    -1.60   0.110    -.0113647    .0011486
          2003  |  -.0028348   .0032195    -0.88   0.379    -.0091448    .0034752
          2004  |   .0006642   .0032341     0.21   0.837    -.0056745     .007003
          2005  |  -.0004954    .002737    -0.18   0.856    -.0058598    .0048691
          2006  |    .004047   .0031755     1.27   0.202    -.0021767    .0102708
          2007  |   .0055304   .0032534     1.70   0.089    -.0008462    .0119071
          2008  |   .0068074   .0032675     2.08   0.037     .0004032    .0132115
          2009  |   .0009863   .0024508     0.40   0.687    -.0038171    .0057897
          2010  |   .0030281   .0027427     1.10   0.270    -.0023475    .0084037
          2011  |   .0013373   .0022822     0.59   0.558    -.0031357    .0058103
          2012  |   .0048407   .0023568     2.05   0.040     .0002214    .0094599
          2013  |     .00075   .0015893     0.47   0.637    -.0023651     .003865
          2014  |   .0057334   .0019981     2.87   0.004     .0018172    .0096496
          2015  |   .0034253   .0013914     2.46   0.014     .0006982    .0061525
          2016  |    .003241   .0012295     2.64   0.008     .0008312    .0056508
          2017  |          0  (omitted)
          2018  |          0  (empty)
          2019  |          0  (empty)
                |
          _cons |   -.037772   .0117266    -3.22   0.001    -.0607558   -.0147882
---------------------------------------------------------------------------------
Instruments corresponding to the linear moment conditions:
 1, model(diff):
   L1.L.ltotalfertility L2.L.ltotalfertility L3.L.ltotalfertility
   L4.L.ltotalfertility L5.L.ltotalfertility L6.L.ltotalfertility
   L7.L.ltotalfertility L8.L.ltotalfertility L9.L.ltotalfertility
   L10.L.ltotalfertility L11.L.ltotalfertility L12.L.ltotalfertility
   L13.L.ltotalfertility L14.L.ltotalfertility L15.L.ltotalfertility
   L16.L.ltotalfertility L17.L.ltotalfertility L18.L.ltotalfertility
   L19.L.ltotalfertility L20.L.ltotalfertility
 2, model(diff):
   D.pand_res_pt D.L.pand_res_pt D.L2.pand_res_pt D.1999bn.year D.2000.year
   D.2001.year D.2002.year D.2003.year D.2004.year D.2005.year D.2006.year
   D.2007.year D.2008.year D.2009.year D.2010.year D.2011.year D.2012.year
   D.2013.year D.2014.year D.2015.year D.2016.year D.2017.year
 3, model(level):
   _cons

.
end of do-file

The results are not the same. Please guide me on whether this is the right command.

Many thanks in advance for your valuable time and advice.

Regards,
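For what it's worth, a minimal sketch of how to put the two estimators on comparable footing: xtdpdgmm's own postestimation commands report the corresponding specification tests (run them immediately after the xtdpdgmm call above; the ar() range is an assumption).

Code:
estat serial, ar(1/3)   // Arellano-Bond-type tests for AR(1)-AR(3) in first differences
estat overid            // Hansen/Sargan-type overidentification tests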

Do gen, sum() and egen, total() work differently when combined with bysort?

Hi,

Please consider the following data:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(distid hospid number)
1 1 10
1 2 10
1 3 10
2 2 12
2 1 12
2 3 12
3 2 13
3 3 13
3 1 13
end
I used sum() and total() combined with bysort but got different results. Do sum() and total() work differently when combined with bysort?

Code:
sort distid hospid

.
. by distid: gen sum=sum( number )

.
. by distid: egen total=total( number )
and it produced the following:

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(distid hospid number sum total)
1 1 10 10 30
1 2 10 20 30
1 3 10 30 30
2 1 12 12 36
2 2 12 24 36
2 3 12 36 36
3 1 13 13 39
3 2 13 26 39
3 3 13 39 39
end
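As an aside, the two functions are defined differently: sum() inside generate produces a running (cumulative) sum within each by-group, while egen's total() puts the group total on every row. A minimal sketch with the data above:

Code:
sort distid hospid
by distid: gen running     = sum(number)    // cumulative sum within distid: 10, 20, 30, ...
by distid: egen grouptotal = total(number)  // group total repeated on every row: 30, 30, 30
* to reproduce total() with gen, carry the last running value back over the group
by distid: gen grouptotal2 = sum(number)
by distid: replace grouptotal2 = grouptotal2[_N]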

Monday, September 26, 2022

How to make a graph that shows reliable and clinically significant change in particular variable?

Dear Statalisters,

I would appreciate it if anyone could help with how to make a graph showing reliable and clinically significant change in a particular variable, something like Panel A or B of Figure 2 below, taken from a paper. I have searched Stata forum questions and Stata materials related to this plot, including the book A Visual Guide to Stata Graphics (Third Edition) by Michael N. Mitchell, yet nothing matched my question.

[attached image: Figure 2 (Panels A and B) from the cited paper]

I also have quite similar data to the above study, with the following variables as pasted below:



- ID is participant ID, with 99 participants (1-99).
- GAD_0 is depression score at baseline (0-21)
- GAD_8 is depression score at week 8 (post-intervention time) after clinical intervention (0-20)
- Dif_GAD_0_8 is difference score between pre- and post-intervention (-7-20)
- RCCriterion_0_8=4.45312 is the cutoff for assessing reliable change or reliable improvement. This parameter was estimated from the formula 1.96*SEdiff = 1.96*SD*sqrt(2-r1-r2), where SEdiff = SE of the difference, SD = SD of test 1, and r1 and r2 are the reliabilities of tests 1 and 2. So, if Dif_GAD_0_8 > 1.96*SEdiff, i.e. > 4.45312, then we can identify him/her as a reliable improver.
- Actual_Dif_GAD_0_8 is a dichotomous variable classifying a participant as a reliable changer or not based on the above criterion.
- A clinical cutoff for GAD defined from literature is a score of 13.

So, with the above data, how can I make a plot like Figure 2 (Panel A) above? If possible, can you suggest Stata code based on the data below?

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float ID double(GAD_0 GAD_8) float(Dif_GAD_0_8 RCCriterion_GAD_0_8 Actual_Dif_GAD_0_8)
 1 20  4 16 4.45312 1
 2  4  7 -3 4.45312 0
 3  9  4  5 4.45312 1
 4  7  6  1 4.45312 0
 5 17  6 11 4.45312 1
 6 18  7 11 4.45312 1
 7  7  4  3 4.45312 0
 8 13  5  8 4.45312 1
 9 12  4  8 4.45312 1
10 18  6 12 4.45312 1
11  7  1  6 4.45312 1
12 14  0 14 4.45312 1
13 12  4  8 4.45312 1
14 17  5 12 4.45312 1
15 12  7  5 4.45312 1
16 21 10 11 4.45312 1
17  3  2  1 4.45312 0
18  4  5 -1 4.45312 0
19  7  5  2 4.45312 0
20 17  4 13 4.45312 1
21  5  4  1 4.45312 0
22 14  5  9 4.45312 1
23  0  3 -3 4.45312 0
24  8 10 -2 4.45312 0
25 12  3  9 4.45312 1
26 11  9  2 4.45312 0
27 15  3 12 4.45312 1
28 21 14  7 4.45312 1
29 16  4 12 4.45312 1
30  4  2  2 4.45312 0
31  8  4  4 4.45312 0
32  7  5  2 4.45312 0
33 10  6  4 4.45312 0
34 16 10  6 4.45312 1
35 12  3  9 4.45312 1
36 14  6  8 4.45312 1
37  4  4  0 4.45312 0
38  6  5  1 4.45312 0
39 20  8 12 4.45312 1
40 15  4 11 4.45312 1
41 10  5  5 4.45312 1
42 16  4 12 4.45312 1
43  4  1  3 4.45312 0
44 19  5 14 4.45312 1
45  9  2  7 4.45312 1
46  6  4  2 4.45312 0
47 12  7  5 4.45312 1
48 14  5  9 4.45312 1
49  7  7  0 4.45312 0
50 16  7  9 4.45312 1
51  9  3  6 4.45312 1
52  1  3 -2 4.45312 0
53  7  4  3 4.45312 0
54 20  8 12 4.45312 1
55 15  4 11 4.45312 1
56  8  6  2 4.45312 0
57 13  6  7 4.45312 1
58 18  7 11 4.45312 1
59 12  4  8 4.45312 1
60 16  6 10 4.45312 1
61  5  4  1 4.45312 0
62 20  0 20 4.45312 1
63  7  5  2 4.45312 0
64 18  6 12 4.45312 1
65 16 12  4 4.45312 0
66 19  0 19 4.45312 1
67 15  8  7 4.45312 1
68  7  4  3 4.45312 0
69 20  7 13 4.45312 1
70 10  3  7 4.45312 1
71 17 20 -3 4.45312 0
72  7  9 -2 4.45312 0
73 14  6  8 4.45312 1
74 11  7  4 4.45312 0
75  9  2  7 4.45312 1
76  7 11 -4 4.45312 0
77 21  9 12 4.45312 1
78  3  3  0 4.45312 0
79 14  9  5 4.45312 1
80 19  0 19 4.45312 1
81 16 13  3 4.45312 0
82  9 16 -7 4.45312 0
83  8  3  5 4.45312 1
84  6  4  2 4.45312 0
85 12  8  4 4.45312 0
86  6  5  1 4.45312 0
87 18  4 14 4.45312 1
88 21  9 12 4.45312 1
89  7  5  2 4.45312 0
90  5  4  1 4.45312 0
91  7  5  2 4.45312 0
92  4  5 -1 4.45312 0
93 20 19  1 4.45312 0
94  1  8 -7 4.45312 0
95 21  6 15 4.45312 1
96 13  8  5 4.45312 1
97  3  4 -1 4.45312 0
98 11  5  6 4.45312 1
99 12  5  7 4.45312 1
end
I thank you for your advice and help.
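A minimal sketch of a Jacobson-Truax-style panel built from the variables above (the axis titles and colour choices are assumptions): post-treatment score against baseline score, the line of no change, a reliable-change band of +/- 4.45312 around it, and the clinical cutoff of 13 marked on both axes.

Code:
twoway (function y = x, range(0 21) lcolor(gs8))                                  ///
       (function y = x - 4.45312, range(4.45312 21) lpattern(dash) lcolor(gs10))  ///
       (function y = x + 4.45312, range(0 16.5) lpattern(dash) lcolor(gs10))      ///
       (scatter GAD_8 GAD_0 if Actual_Dif_GAD_0_8 == 1, mcolor(blue))             ///
       (scatter GAD_8 GAD_0 if Actual_Dif_GAD_0_8 == 0, mcolor(red)),             ///
       xline(13, lpattern(shortdash)) yline(13, lpattern(shortdash))              ///
       xtitle("GAD score at baseline") ytitle("GAD score at week 8")              ///
       legend(order(4 "Reliable improvement" 5 "No reliable change"))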

how to use stcurve for Cox model after multiple imputation?

Hello, Statalist!

I have a question about the stcurve after a multiple imputation.

So, after the following command:
mi estimate:stcox var1 var2 var3

I wanted to plot the survival probability, and used stcurve command:

stcurve, survival at1(var1=1) at2(var1=2) yla(,nogrid)

but I got an error code:
last estimates not found
r(301);


My question is: how can I run stcurve after multiple imputation?
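stcurve needs the e() results of a single stcox fit, which mi estimate does not leave behind. A common workaround (a sketch only; it plots from one completed data set rather than an MI-pooled curve) is to refit the model within a chosen imputation:

Code:
* plot from the model fitted on imputation 1 (an approximation, not a pooled curve)
mi xeq 1: stcox var1 var2 var3 ; stcurve, survival at1(var1=1) at2(var1=2) yla(, nogrid)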

Interpretation for Vector Error Correction Model with 4 variables and 2 cointegration vectors

I have 4 time series variables, y, x, z, and w, which are filtered by the tssmooth command.
I ran a cointegration test for these 4 variables using the vecrank command and found that there are 2 cointegrating relationships. I then fitted a vector error correction model using the vec command.
I got the two cointegrating vectors below. One cointegrating vector is y-2.888z-9.519w+constant=0 and the other is x+1.231z+1.467w+constant=0.

[attached image: vec output showing the two cointegrating equations]

My question is how to calculate the long-run relationship between y and x. I can think of three methods. The first is to solve the second cointegrating equation (_ce2) for z, i.e. z=-(x+1.467w+constant)/1.231, insert it into the first cointegrating equation (_ce1), and obtain a relationship like y=-2.345x+6.077w+constant.

The second is to solve _ce2 for w, i.e. w=-(x+1.231z+constant)/1.467, insert it into _ce1, and obtain y=-6.487x-5.099z+constant.

The third is to insert both z=-(x+1.467w+constant)/1.231 and w=-(x+1.231z+constant)/1.467 from _ce2 into _ce1 and obtain y=-8.833x-7.987z-3.441w+constant.

Which one is correct?
In the first cointegrating vector, y=2.888z+9.519w+constant, there is a positive relationship between y and z and between y and w. However, in the third method, the signs of these relationships change. How do I interpret them?

Please explain the long-run relationships among the four variables when there are 2 or more cointegrating relationships.

Jaimin Lee

Selectively list date variable as matrix row name using macro

Greetings, I'm hoping for some aid here. Having searched the forum for a solution, I've yet to find one that works. Here is the code I'm working with
Code:
clear
input str20 DATESTR double SALES GDP byte SELECT
"2019m1" 1243 209 0
"2019m2" 889 209 0
"2019m3" 1220 210 0
"2019m4" 1594 211 0
"2019m5" 1458 212 0
"2019m6" 1178 213 0
"2019m7" 1452 214 0
"2019m8" 1453 214 1
"2019m9" 1281 215 1
"2019m10" 1444 216 1
"2019m11" 1185 216 1
"2019m12" 1194 217 1
"2020m1" 1255 216 1
"2020m2" 1029 216 1
"2020m3" 1290 215 1
"2020m4" 674 208 1
"2020m5" 1014 201 1
"2020m6" 1757 195 1
"2020m7" 1826 200 1
"2020m8" 1563 206 1
"2020m9" 1483 211 1
"2020m10" 1388 213 1
"2020m11" 1288 214 0
"2020m12" 1321 215 0
"2021m1" 1301 217 0
"2021m2" 1211 219 0
"2021m3" 1698 220 0
"2021m4" 2027 223 0
"2021m5" 1715 225 0
"2021m6" 1746 227 0
"2021m7" 1790 229 0
"2021m8" 1803 230 0
"2021m9" 1535 232 0
"2021m10" 1503 235 0
"2021m11" 1549 237 0
"2021m12" 1401 240 0
"2022m1" 1350 241 0
"2022m2" 1238 243 0
"2022m3" 1809 244 0
"2022m4" 1929 246 0
"2022m5" 1612 247 0
"2022m6" 1690 249 0
end
gen date=monthly(DATESTR,"YM")
tsset date, m
mkmat SALES GDP if SELECT==1, matrix(goodmatrix) rown(date)
The final command here creates a matrix with the specified columns; however, the row names appear in their numeric form. Is there a way I can specify that the row names be listed as written?

NOTE: I must use the time variable 'date' to accomplish this; I cannot use 'DATESTR'.
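A minimal sketch of one possible route, assuming mkmat's rownames() option accepts a string variable built from the formatted monthly date:

Code:
gen datelab = strofreal(date, "%tm")    // "2019m8", "2019m9", ...
mkmat SALES GDP if SELECT==1, matrix(goodmatrix) rownames(datelab)
matrix list goodmatrix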

Code help: Monte Carlo simulation

I've attached a small do-file for a Monte Carlo simulation I am running for an economics class. For some reason the code produces errors. Can anyone explain what may be causing the issue?

Cluster based on string similarity

Hey Community,

I'm quite new to working with Stata and therefore desperately looking for help! I have a dataset consisting of >200 firms and different characteristics of these firms such as their industry affiliation (see example below). However, each firm has multiple industry group affiliations. My goal is to cluster these firms based on the similarity of industry group affiliation and to create a new categorical variable consisting of those 3 clusters. Has anyone experience with this kind of problem or can help me on how to ideally approach this? Thank you so much in advance!!

Data:
firm_id industry_groups
1 Advertising, Commerce and Shopping, Sales and Marketing
2 Advertising, Media and Entertainment, Mobile, Sales and Marketing, Software
3 Energy, Natural Resources, Sustainability
... ...
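One possible approach, sketched below: turn the comma-separated list into one indicator per industry term and run k-means on the indicator profile. The list of terms and the choice of 3 clusters are taken from the post; everything else is an assumption.

Code:
local terms `" "Advertising" "Commerce and Shopping" "Sales and Marketing" "Media and Entertainment" "Mobile" "Software" "Energy" "Natural Resources" "Sustainability" "'
local i = 0
foreach t of local terms {
    local ++i
    gen byte ind`i' = strpos(industry_groups, "`t'") > 0   // firm lists this term?
}
cluster kmeans ind*, k(3) name(industry_cluster)            // 3-group k-means
tab industry_cluster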

Obtaining main results after jwdid

Hello, I have just started to use the jwdid command; however, I cannot see the main coefficients because the output list is too long. How can I retrieve the main coefficients? Please see the attachment. [attached screenshot]
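If it helps, jwdid ships with postestimation commands that aggregate the many interaction coefficients into a few headline numbers (a sketch; the variable names are placeholders for whatever was used in the original call):

Code:
jwdid outcome, ivar(id) tvar(year) gvar(first_treat)
estat simple     // overall average treatment effect on the treated
estat group      // effects by treatment cohort
estat event      // event-study-style aggregation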


Sunday, September 25, 2022

How to get the predicted value of each group?

Dear all,

I want to get the predicted values from a separate regression for each group. Here is an example.

Code:
webuse grunfeld,clear

forval i = 1/10{
    reg invest kstock if company == `i'
    predict y`i' if company == `i'
    predict e`i' if company == `i',res

}
egen y = rowtotal(y1-y10)
egen e = rowtotal(e1-e10)
y and e are what I want. But this is just a simple example; in my real data, I have 12,000 companies.

Does anyone have a simpler method to get the predicted y and e for each group?
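One sketch that avoids creating 24,000 new variables: keep the loop over groups but replace the results into two fixed variables (community-contributed commands such as runby or asreg may also be worth a look, but the version below uses only official Stata).

Code:
webuse grunfeld, clear
gen double y = .
gen double e = .
levelsof company, local(ids)
foreach i of local ids {
    quietly regress invest kstock if company == `i'
    quietly predict double yhat if company == `i'
    quietly predict double ehat if company == `i', residuals
    quietly replace y = yhat if company == `i'
    quietly replace e = ehat if company == `i'
    drop yhat ehat
}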





Dropping observations if entity does not exist in both waves

Hi,

I am new to Stata and trying to undertake a task for a stat class.

I have a dataset containing two waves of a survey, and I want to keep only observations available in both wave1 and wave2.

Please see below for an example of the data.


Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long hhid double hhpid byte indiv float wave byte sex
10001 1000103  3 2 1
10001 1000108  8 2 0
10002 1000203  3 1 0
10002 1000208  8 2 1
10002 1000209  9 2 1
10003 1000303  3 1 0
10003 1000304  4 1 1
10003 1000305  5 1 0
10003 1000306  6 1 0
10003 1000306  6 2 0
10004 1000405  5 2 1
10005 1000503  3 2 0
10005 1000504  4 2 1
10005 1000505  5 2 1
10008 1000804  4 1 1
10008 1000805  5 1 1
10008 1000806  6 1 0
10008 1000807  7 1 1
10008 1000807  7 2 1
10008 1000808  8 2 0
10009 1000903  3 2 1
10009 1000905  5 2 1
10009 1000906  6 2 1
10010 1001004  4 1 1
10011 1001105  5 1 0
10011 1001106  6 1 1
10013 1001308  8 1 1
10016 1001603  3 1 1
end
label values sex sex
label def sex 0 "0. Male", modify
label def sex 1 "1. Female", modify
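A minimal sketch of one way to keep only the people seen in both waves, counting each person's distinct waves first:

Code:
bysort hhpid wave: gen byte firstrec = _n == 1
bysort hhpid: egen byte nwaves = total(firstrec)
keep if nwaves == 2          // person appears in wave 1 and wave 2
drop firstrec nwaves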

Making balanced panel by dropping observation based on a certain variable

I want to make a balanced panel based on my ln_wage outcome variable. In my data, most counties have 22 observations for 22 years, but some counties do not have 22 observations, and I need to drop them to make a balanced panel. Can you kindly advise how I can do that in code? If a county does not have 22 non-missing observations of ln_wage, then I need to drop that county from the whole sample. How can I do that?

Code:
 tab county if ln_wage !=.

     county |      Freq.     Percent        Cum.
------------+-----------------------------------
       1003 |         18        0.12        0.12
       1005 |          3        0.02        0.13
       1015 |         21        0.13        0.27
       1017 |          4        0.03        0.29
       1049 |          3        0.02        0.31
       1051 |          1        0.01        0.32
       1055 |         22        0.14        0.46
       1069 |         22        0.14        0.60
       1073 |         22        0.14        0.74
       1077 |          6        0.04        0.78
       1081 |         12        0.08        0.86
       1083 |          2        0.01        0.87
       1089 |         22        0.14        1.01
       1093 |          6        0.04        1.05
       1095 |         22        0.14        1.19
       1097 |         22        0.14        1.33
       1101 |         22        0.14        1.47
       1103 |         22        0.14        1.61
end

I've given a sample of my data in the following section

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input double county float ln_wage int year
1001         . 2001
1001         . 2002
1001         . 2006
1001         . 2007
1001         . 2008
1001         . 2009
1001         . 2010
1001         . 2011
1001         . 2012
1001         . 2013
1001         . 2014
1001         . 2015
1001         . 2016
1003         . 2001
1003         . 2002
1003         . 2003
1003  8.852218 2004
1003  9.299419 2005
1003  9.502734 2006
1003   9.36311 2007
1003  9.166079 2008
1003  9.166397 2009
1003  9.187581 2010
1003  9.142589 2011
1003  9.294462 2012
1003  9.279741 2013
1003 9.3235655 2014
1003  8.929891 2015
1003  8.941599 2016
1003  8.965699 2017
1003  8.934246 2018
1003  8.908757 2019
1003  8.950823 2020
1003  8.973106 2021
1005         . 2003
1005   8.66577 2004
1005  8.836918 2005
1005  8.686667 2006
1005         . 2007
1005         . 2008
1005         . 2009
1005         . 2010
1005         . 2011
1005         . 2012

end
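A minimal sketch of the drop: count the non-missing ln_wage observations per county and keep only counties with all 22.

Code:
bysort county: egen nobs = total(!missing(ln_wage))
keep if nobs == 22
drop nobs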

IF statement using a local string variable

I have code that looks like the below (a minimum working example): a loop that runs through 2 scenarios, each of which uses a different sample.
For some samples I want to run some Stata commands, whereas for others I don't. The if statement, if `data_sample' == "sample1", does not seem to work, and I cannot figure out why.

Any help would be appreciated.

Nathan



Code:
local scenario_list 1 2

foreach scenario of local scenario_list {    

use "${tempdata}/final_dataset2.dta", clear

if `scenario' == 1 {
    
    local data_sample = "sample1"
    *some drop keep statements here
    }
    else if `scenario' == 2 {
    local data_sample = "sample2"
   * some drop keep statements here
    }

if `data_sample' == "sample1"
   *do some stuff here
}

}
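A sketch of the likely fix: the local holds text, so the comparison needs quotes, and the if block needs braces (the stray extra closing brace at the end of the loop would also have to go).

Code:
if "`data_sample'" == "sample1" {
    * do some stuff here
}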

How can we set Stata so that potential commands would pop up downward whenever we type in a string or word?

Hey colleagues,

Today I was asked a question by a student in my introductory statistics class that uses Stata. Please refer to the attached picture clip.
[attached screenshot]
How can we set up Stata so that, whenever we type in a string or word, potential commands pop up and we can choose a command instead of typing it out in full?

For instance, when we type in "tabu" in the command box, "tabulate oneway" or "tabulate twoway" would pop up downward so that we could choose one according to what we want to do.

I searched online for a solution, but in vain.

How can we do it?

Thanks a lot!

keep if

Hi,
I want to use "keep if" to keep a subset of observations in my dataset, but I get an error: type mismatch

This is the instruction I'm giving: keep if P1_DEPARTAMENTO==11

What am I doing wrong?
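A sketch of the usual cause: a type mismatch here generally means P1_DEPARTAMENTO is stored as a string, so either compare against a string or convert it to numeric first.

Code:
* Alternative 1: compare against a string literal
keep if P1_DEPARTAMENTO == "11"

* Alternative 2: convert to numeric first, then compare to a number
* destring P1_DEPARTAMENTO, replace
* keep if P1_DEPARTAMENTO == 11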

Saving matrices of results

Hello all,
I wonder if someone could help me.
I'm using this code to save the results from quantile regressions.
The strange thing is that I've done this already many times and had no problem. It must be something very simple. I've recently started using Stata 17.



Code:
forvalues e =  10(10)90 {
        est restore rifQ`e'
        qui est replay rifQ`e'
        matrix m=r(table)
        matrix bq=nullmat(bq)\m["b",...]
       
 
}

I get the error

"invalid syntax"
or 
"conformability error"
What am I doing wrong? Thank you so much!
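One thing worth checking (a guess based on the error messages): if a matrix bq is left over from an earlier run with a different number of columns, the nullmat() append fails with a conformability error. A sketch that clears it first and stacks e(b) directly:

Code:
capture matrix drop bq                // a stale bq with a different width breaks the append
forvalues e = 10(10)90 {
    est restore rifQ`e'
    matrix bq = nullmat(bq) \ e(b)    // e(b) is the coefficient row vector
}
matrix list bq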



mitigate against the potential problem of life-cycle effects

Dear Stata community,

My data is panel data, and the independent variable is available only in the last year, so it is time invariant. To mitigate the potential problem of life-cycle effects influencing the independent variable, I have to condition it, Tj, on a polynomial in age A, i.e. Tj = hj*A + ej. The resulting residuals are standardised and used as indicators of the independent variable net of life-cycle influences. The following is my code; is it right? I would greatly appreciate any details you can give.

Code:
reg Openness age
predict r_O, res
norm r_O,method(zee)
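A sketch of the same idea in official Stata syntax (the quadratic in age is an assumption; the post only says "a polynomial in age"):

Code:
reg Openness c.age##c.age          // age and age squared; add more terms for a higher-order polynomial
predict double r_O, residuals
egen z_O = std(r_O)                // standardised residual: mean 0, sd 1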

Saturday, September 24, 2022

repeated time values within panel: combining rows with the same date

Hi all,

My question is about a panel data set for which I would like to regress a certain monthly score on holding-period returns of equities. I'm getting to know Stata step by step and have done some reshaping and cleaning already, but I can't seem to find the right way to handle this last part. I'm dealing with 4 different monthly scores (0-100) and monthly holding-period returns for 1000 equities over a time frame of 10 years.
The problem is that the current dataset has a separate row for each combination of date, firm, and one of the 4 scores, which produces 4 identical date observations per company-return combination. I would like a way to remove the missing values and combine the 4 score variables into a single row per firm and date.
The data currently looks like this:

Date TICKER CUSIP PERMNO RET combined environment social governance
2019m10 A 00846U10 87432 -.011483718 . . . 84.38
2019m10 A 00846U10 87432 -.011483718 79.34
2019m10 A 00846U10 87432 -.011483718 . . 94.51 .
2019m10 A 00846U10 87432 -.011483718 88.32
2019m11 A 00846U10 87432 .066270582 . 79.96 . .
2019m11 A 00846U10 87432 .066270582 83.22
2019m11 A 00846U10 87432 .066270582 87.58 .
2019m11 A 00846U10 87432 .066270582 93.6 .
2019m12 A 00846U10 87432 .058437552 83.22
2019m12 A 00846U10 87432 .058437552 79.96
2019m12 A 00846U10 87432 .058437552 87.58
2019m12 A 00846U10 87432 .058437552 93.6
2017m1 AAIC 04135620 85653 .010121496 15.75
2017m1 AAIC 04135620 85653 .010121496 0
2017m1 AAIC 04135620 85653 .010121496 25.17
2017m1 AAIC 04135620 85653 .010121496 10.18
2017m2 AAIC 04135620 85653 -.015364095 10.18
2017m2 AAIC 04135620 85653 -.015364095 25.17
2017m2 AAIC 04135620 85653 -.015364095 15.75
2017m2 AAIC 04135620 85653 -.015364095 0
2017m3 AAIC 04135620 85653 .001017662 25.17
2017m3 AAIC 04135620 85653 .001017662 15.75
2017m3 AAIC 04135620 85653 .001017662 10.18
2017m3 AAIC 04135620 85653 .001017662 0
2017m4 AAIC 04135620 85653 .030431727 15.75

I think it might be important to mention that the data are not complete for all dates or scores for some firms; also, the order of the (non-missing) scores is not consistent.

Appreciate any help or tips, thank you very much in advance.
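A sketch of one way to fold the four score rows into a single row per firm-month, assuming the non-missing score really is unique within each Date-TICKER cell:

Code:
collapse (firstnm) combined environment social governance, by(Date TICKER CUSIP PERMNO RET)
duplicates report Date TICKER     // check: should now be one row per firm-month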





Generating mean by interacting categorical variables

Hi all
I would like to generate a new variable newvar that is the mean of the dummy variable dummy by the interaction of year and place (both categorical variables).

I have tried using
HTML Code:
mean bad, over(season)
however, this only works for one variable; when I try the two together, I get "interactions not allowed".
When I use just one of season or place, the command works fine; however, this is not what I'm looking for.

Any solution would be much appreciated
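A minimal sketch: egen computes the group mean directly over both categorical variables, with no interaction syntax needed.

Code:
egen newvar = mean(dummy), by(year place)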

remove * from string

Hello
I have a variable Value, which is the value of some assets. The data are imported from Excel and are in string format. For some values there is a * at the end, and I wonder how I can remove this * while preserving 2 decimal places for each value. I tried substr, but the length of the string differs across cases, so that doesn't work.

Here are some example data
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str17 Value
"82,291.39*"       
"39570.6"          
"24074.64"         
"15,845.35*"       
"15,774.22*"       
"11,649.53*"       
"11246.64"         
"11,028.59*"       
"11000"            
"9915.68"          
"8518.709999999999"
"7997.74"          
"7896"             
"7,734.59*"        
"7650"             
"7637.72"          
"7255.6"           
"6,681.68*"        
"6218.47"          
"6,199.11*"        
"6064.65"          
"6060"
end
Any idea how to solve this problem? Thanks a lot for any help
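A minimal sketch: strip the trailing * and the thousands separators, then convert to numeric and control the display format.

Code:
gen Value_clean = subinstr(subinstr(Value, "*", "", .), ",", "", .)
destring Value_clean, replace
format Value_clean %12.2f      // show 2 decimal places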

Fitstat gives error message

Hi all,

When I use fitstat after a ologit regression, I get this message:
Code:
. fitstat

Measures of Fit for ologit of A1

variable _cons not found
r(111);

My full output is this:
Code:
. ologit A1 i.G1 i.SQ3_2_1 i.SQ3_3_range_1 i.SQ3_4_1 i.SQ3_7_1 i.SQ4 i.BA2, nolog

Ordered logistic regression                     Number of obs     =      2,593
                                                LR chi2(17)       =     148.84
                                                Prob > chi2       =     0.0000
Log likelihood =  -2800.086                     Pseudo R2         =     0.0259

-------------------------------------------------------------------------------
           A1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
         1.G1 |  -.1382861   .0897025    -1.54   0.123    -.3140998    .0375277
    1.SQ3_2_1 |   .4094172   .1316978     3.11   0.002     .1512942    .6675402
              |
SQ3_3_range_1 |
           1  |  -.6737818   .2910436    -2.32   0.021    -1.244217   -.1033469
           2  |  -.9890188   .2823736    -3.50   0.000    -1.542461   -.4355767
           3  |  -1.057691   .2835609    -3.73   0.000     -1.61346   -.5019221
           4  |  -.7718496   .2936067    -2.63   0.009    -1.347308   -.1963911
              |
      SQ3_4_1 |
           1  |  -1.040826   .3424114    -3.04   0.002     -1.71194   -.3697122
           2  |   -1.49965   .3464901    -4.33   0.000    -2.178758   -.8205415
           3  |  -1.817044   .3596155    -5.05   0.000    -2.521877    -1.11221
           4  |  -1.035033   .5379609    -1.92   0.054    -2.089417    .0193508
              |
    1.SQ3_7_1 |  -.0076205   .1417316    -0.05   0.957    -.2854094    .2701684
        1.SQ4 |   .2350615   .1456459     1.61   0.107    -.0503993    .5205223
              |
          BA2 |
           1  |   .2381254   .1349751     1.76   0.078     -.026421    .5026717
           2  |  -.0809185   .1510311    -0.54   0.592    -.3769339     .215097
           3  |   .0645342   .1473375     0.44   0.661     -.224242    .3533104
           4  |   .4530845    .173775     2.61   0.009     .1124917    .7936773
           5  |   .3805272   .2506331     1.52   0.129    -.1107047    .8717591
--------------+----------------------------------------------------------------
        /cut1 |  -6.746417   .4951856                     -7.716963   -5.775871
        /cut2 |  -2.589905   .4630787                     -3.497523   -1.682287
        /cut3 |   -.410473   .4591015                     -1.310295    .4893494
        /cut4 |    2.07468   .4812484                       1.13145    3.017909
        /cut5 |   4.288335   .6735743                      2.968153    5.608516
-------------------------------------------------------------------------------

. fitstat

Measures of Fit for ologit of A1

variable _cons not found
r(111);
Am I doing something wrong?

Thanks in advance.

Count how often an ID occurs in a column

Hi,

I have a question which should be fairly easy to answer. I have a variable in my dataset called ind_id (individual ids). I want to get an overview of how often each ID occurs in the column. Ideally, I could also perform some basic descriptive statistics on the result (i.e. max/min, average, percentile). Thanks for your help.

Best, Valentin
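A minimal sketch: attach the count to each row, then summarise one row per ID.

Code:
bysort ind_id: gen n_obs = _N
egen byte one_per_id = tag(ind_id)
summarize n_obs if one_per_id, detail   // min/max, mean, percentiles of the counts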

how to choose one variable with non-missing value among 3 variables

Dear all,

I have a set of data on firm deals. For each deal, there are 3 completion dates: Completeddate, Expectedcompletiondate, and Assumedcompletiondate. I have thoroughly inspected the data and found that in most cases only 1 of the 3 dates has a value and the other 2 are missing. In some cases, 2 of the 3 dates have values, but the 2 dates are the same. So I want to create a new variable, complete_date, which takes the non-missing value of the 3 dates, or picks one value if there are 2 non-missing dates (given that the 2 dates are the same, it does not matter which one is chosen). How can I achieve this? Thanks a lot for any help!
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long DealNumber int(Completeddate Expectedcompletiondate Assumedcompletiondate)
1907206228     .     .     .
1907063377 20087     .     .
1943130728 21643     .     .
1633010357     .     . 19288
1907140101     .     . 21220
1907243411 21155     .     .
1943102519 21909     .     .
1907094570     .     . 20910
1907203574 20884     .     .
1943042795 21543     .     .
1943040193     .     . 22089
1943152252 22071     .     .
1907196308 20948     .     .
1907000052     . 19900 19900
end
format %tdnn/dd/CCYY Completeddate
format %tdnn/dd/CCYY Expectedcompletiondate
format %tdnn/dd/CCYY Assumedcompletiondate
The dates look like random numbers, but if you load the data into Stata they display in the normal mm/dd/yyyy format. The last row of the data is an example where there are 2 non-missing dates and they are the same.
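A minimal sketch: because max() ignores missing values, it picks whichever date is available (and when two are present and equal, the choice is irrelevant).

Code:
gen complete_date = max(Completeddate, Expectedcompletiondate, Assumedcompletiondate)
format %tdnn/dd/CCYY complete_date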

How to increase the category size in STATA

Please tell me how to increase the width of the category labels in the Stata output table.
Here I am attaching the Stata output table:


orga11                   1   3   4   Total

ACINETOBACTER BARMA..    0   1   0       1
ACINOBACTER RADIORE..    0   1   0       1
AEROMONAS HYDROPHILA     1   0   0       1
Acinetobacter bauma..    1   1   0       2

In the data file these names appear in full, but in the output-window table they are truncated and do not display properly.


Friday, September 23, 2022

stratifying parametric lognormal regression model

I wanted to find out how I can stratify a parametric regression model by a variable. So, assuming I fit a lognormal model on ethnicity (streg i.ethnicity, dist(lognorm) vce(robust) tr) but want to stratify this by a categorical variable, birthweight, how would I go about this?
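streg has a strata() option for exactly this (a sketch; birthweight_cat stands in for however the categorical birthweight variable is named):

Code:
streg i.ethnicity, dist(lognormal) strata(birthweight_cat) vce(robust) tr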

Variables selection methods , minimum number of variables - max information

Good morning to everyone,
I am trying to run an SEM with several levels: the fact is that it is very heavy, and I would like to reduce the number of variables in the best possible way.
For now, since I have many variables that explain one dimension, I simply look at the significance of the coefficients and the increase in R2 (in an ordinary regression) contributed by each individual regressor.
I was wondering whether there is a more precise/sophisticated methodology (inside or outside SEM)?

Many thanks in advance for your time,
wishing you all a great weekend ahead!

Is it prudent to use both logistic regression and parametric survival models on same dataset?

I am doing a study which has two objectives:
  1. effects of variables on timing to vaccination
  2. factors associated with delays to vaccination
I have fitted a parametric survival model to assess the effect of those variables on the timing of vaccination (after the Cox proportional-hazards assumption was violated).
I now want to define my delay variable and then run a logistic regression to determine the factors associated with delay.

Is this a prudent way to go about my analysis?

Thursday, September 22, 2022

Dataset for Health Econometrics

Hello,
I am reading a book on "Applied Econometrics for Health", and it recommends a dataset called the "Health and Lifestyle Survey (HALS)", from a survey conducted in Britain. Could someone help me get the dataset?

Grouping with a condition

Hello all,

Below is the data I am using:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double(start end) byte ID
16071 16454 1
16455 16620 2
16819 17166 2
17167 17911 2
end
format %td start
format %td end
I want to group these observations based on whether the end date is one day before the next start date. If the end date is not one day before the next start date, then it is its own group. For example, the first two observations should be group 1 and the last two observations should be groups 2 and 3.

I would appreciate any help.

Anoush K.
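A sketch of the rule as stated (it ignores ID; if a new group should also start whenever ID changes, add that condition to the comparison):

Code:
sort start
gen long grp = sum(_n == 1 | start != end[_n-1] + 1)   // new group when dates are not consecutive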

Bar graph: switching grouping category

Hi everyone,

Seems like a simple thing to do, but I can't figure it out.

Code:
graph bar innovation_faith innovation_reforms innovation_critiques, over(gender)
I get this, but I want the bars grouped so that the men's and women's bars are next to each other for easy comparison within each outcome variable. Using
Code:
by(gender)
or
Code:
asyvars
don't work either.

Best,
Jason

[attached image: current bar graph output]
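A sketch of one way to get the gender bars side by side within each outcome: reshape the three outcomes into one long variable and nest the over() options (obsno is a hypothetical row identifier added only for the reshape).

Code:
preserve
gen long obsno = _n
reshape long innovation_, i(obsno) j(outcome) string
graph bar (mean) innovation_, over(gender) over(outcome) asyvars
restore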



merge and reshape survey data (with ordinal and numerical variables) to long data

Hello and thank you for reading -

I am dealing with survey data from 7 years, and I need to merge them using the participant ID (nomem_encr) as the key variable. But doing so gives me a wide data set, and for my fixed-effects analysis I need long data. Is it possible to add the other data sets, which have the same questions just named differently, so that the result is a long data set?

The variables for 2021 are ch21n001, ch21n002, and so on, and for 2020 they are ch20m001, ch20m002, following the same logic and order.

When I appended the data and tried -reshape long ch, i(nomem_encr) j(year) string-, it showed an error saying that the numbers of observations are not identical.

Excuse me if I am not using the right format here in this forum; this is my first question.

Thank you so much in advance

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double(nomem_encr ch21n_m ch21n001 ch21n002 ch21n003 ch21n004 ch21n005 ch21n006 ch21n007 ch21n008 ch21n009 ch21n010 ch21n011 ch21n012 ch21n013 ch21n014 ch21n015 ch21n016 ch21n017 ch21n018 ch21n020 ch21n021 ch21n022 ch21n023)
800009 202111 1 66 . 2 2 . 7 7 . . 1 1 5 1 4 177  97 1 1 2 1 1
800015 202111 1 59 1 3 3 7 5 . . . 1 1 4 1 4 169  77 1 1 1 1 1
800058 202111 2 24 0 3 4 8 8 . . . 3 4 4 4 2 168  71 2 1 3 4 1
800085 202112 1 44 1 3 4 6 5 . . . . . . . .   .   . . . . . .
800100 202111 2 30 1 5 5 5 5 . . . 2 2 4 1 6 160  65 2 1 1 1 1
800119 202111 2 71 0 3 3 . . 7 7 . 3 3 5 3 4 170  90 2 3 3 3 1
800127 202111 2 38 1 1 1 0 0 . . . 6 5 2 5 1 163 115 1 5 5 5 1
800131 202111 2 67 0 3 5 . 7 7 . . 2 2 5 2 4 166  62 1 1 1 1 1
800161 202112 1 51 1 4 4 9 8 . . . 1 1 4 2 5 185  89 1 1 1 1 1
end
label values ch21n001 ch21n001
label def ch21n001 1 "male", modify
label def ch21n001 2 "female", modify
label values ch21n003 ch21n003
label def ch21n003 0 "has no paid job", modify
label def ch21n003 1 "has paid job", modify
label values ch21n004 ch21n004
label def ch21n004 1 "poor", modify
label def ch21n004 2 "moderate", modify
label def ch21n004 3 "good", modify
label def ch21n004 4 "very good", modify
label def ch21n004 5 "excellent", modify
label values ch21n005 ch21n005
label def ch21n005 1 "considerably poorer", modify
label def ch21n005 2 "somewhat poorer", modify
label def ch21n005 3 "the same", modify
label def ch21n005 4 "somewhat better", modify
label def ch21n005 5 "considerably better", modify
label values ch21n006 ch21n006
label def ch21n006 0 "no chance at all", modify
label values ch21n007 ch21n007
label def ch21n007 0 "no chance at all", modify
label values ch21n008 ch21n008
label values ch21n009 ch21n009
label values ch21n010 ch21n010
label values ch21n011 ch21n011
label def ch21n011 1 "never", modify
label def ch21n011 2 "seldom", modify
label def ch21n011 3 "sometimes", modify
label def ch21n011 6 "continuously", modify
label values ch21n012 ch21n012
label def ch21n012 1 "never", modify
label def ch21n012 2 "seldom", modify
label def ch21n012 3 "sometimes", modify
label def ch21n012 4 "often", modify
label def ch21n012 5 "mostly", modify
label values ch21n013 ch21n013
label def ch21n013 2 "seldom", modify
label def ch21n013 4 "often", modify
label def ch21n013 5 "mostly", modify
label values ch21n014 ch21n014
label def ch21n014 1 "never", modify
label def ch21n014 2 "seldom", modify
label def ch21n014 3 "sometimes", modify
label def ch21n014 4 "often", modify
label def ch21n014 5 "mostly", modify
label values ch21n015 ch21n015
label def ch21n015 1 "never", modify
label def ch21n015 2 "seldom", modify
label def ch21n015 4 "often", modify
label def ch21n015 5 "mostly", modify
label def ch21n015 6 "continuously", modify
label values ch21n018 ch21n018
label def ch21n018 1 "yes", modify
label def ch21n018 2 "no", modify
label values ch21n020 ch21n020
label def ch21n020 1 "not at all", modify
label def ch21n020 3 "a bit", modify
label def ch21n020 5 "very much", modify
label values ch21n021 ch21n021
label def ch21n021 1 "not at all", modify
label def ch21n021 2 "hardly", modify
label def ch21n021 3 "a bit", modify
label def ch21n021 5 "very much", modify
label values ch21n022 ch21n022
label def ch21n022 1 "not at all", modify
label def ch21n022 3 "a bit", modify
label def ch21n022 4 "quite a lot", modify
label def ch21n022 5 "very much", modify
label values ch21n023 ch21n023
label def ch21n023 1 "without any trouble", modify




The data for 2020 looks like this

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double(nomem_encr ch20m_m ch20m001 ch20m002 ch20m003 ch20m004 ch20m005 ch20m006 ch20m007 ch20m008 ch20m009 ch20m011 ch20m012 ch20m013 ch20m014 ch20m015 ch20m016 ch20m017 ch20m018) str244 ch20m019 double(ch20m020 ch20m021 ch20m022 ch20m023 ch20m024)
800009 202011 1 65 1 3 3  8  7 7 . 1 1 6 1 3 177  95 1 "Slokdarmproblemen"                              1 1 1 1 1
800015 202011 1 58 1 3 3  8  5 . . 1 1 5 1 4 169  78 1 "Hoge bloeddruk"                                 1 1 1 1 1
800057 202012 1 45 1 4 2  5  5 . . 1 1 5 3 4 198 105 2 " "                                              1 1 1 1 1
800058 202011 2 23 0 3 4  8  8 . . 3 3 5 4 3 168  72 2 " "                                              4 2 5 1 1
800100 202012 2 29 1 5 5 10 10 . . 1 1 6 1 6 160  70 2 " "                                              1 1 1 1 1
800119 202012 2 70 0 3 2  .  7 6 5 1 1 4 1 4 170  90 2 " "                                              2 2 2 1 1
800127 202011 2 37 1 1 2  1  0 . . 4 5 2 5 2 163   1 1 "Bloedziekte"                                    5 5 3 1 1
800131 202011 2 66 0 3 3  . 10 7 . 3 1 4 1 5 166  62 1 "astma"                                          1 1 1 1 2
800161 202011 1 50 1 4 4  8  7 . . 1 1 5 1 5 185  86 1 "artrose linker knie na kruisbandletsel in 1988" 2 1 1 1 1
end
label values ch20m004 ch20m004
label def ch20m004 1 "poor", modify
label def ch20m004 3 "good", modify
label def ch20m004 4 "very good", modify
label def ch20m004 5 "excellent", modify
label values ch20m005 ch20m005
label def ch20m005 2 "somewhat poorer", modify
label def ch20m005 3 "the same", modify
label def ch20m005 4 "somewhat better", modify
label def ch20m005 5 "considerably better", modify
label values ch20m006 ch20m006
label def ch20m006 10 "absolutely certain", modify
label values ch20m007 ch20m007
label def ch20m007 0 "no chance at all", modify
label def ch20m007 10 "absolutely certain", modify
label values ch20m008 ch20m008
label values ch20m009 ch20m009
label values ch20m011 ch20m011
label def ch20m011 1 "never", modify
label def ch20m011 3 "sometimes", modify
label def ch20m011 4 "often", modify
label values ch20m012 ch20m012
label def ch20m012 1 "never", modify
label def ch20m012 3 "sometimes", modify
label def ch20m012 5 "mostly", modify
label values ch20m013 ch20m013
label def ch20m013 2 "seldom", modify
label def ch20m013 4 "often", modify
label def ch20m013 5 "mostly", modify
label def ch20m013 6 "continuously", modify
label values ch20m014 ch20m014
label def ch20m014 1 "never", modify
label def ch20m014 3 "sometimes", modify
label def ch20m014 4 "often", modify
label def ch20m014 5 "mostly", modify
label values ch20m015 ch20m015
label def ch20m015 2 "seldom", modify
label def ch20m015 3 "sometimes", modify
label def ch20m015 4 "often", modify
label def ch20m015 5 "mostly", modify
label def ch20m015 6 "continuously", modify
label values ch20m018 ch20m018
label def ch20m018 1 "yes", modify
label def ch20m018 2 "no", modify
label values ch20m020 ch20m020
label def ch20m020 1 "not at all", modify
label def ch20m020 2 "hardly", modify
label def ch20m020 4 "quite a lot", modify
label def ch20m020 5 "very much", modify
label values ch20m021 ch20m021
label def ch20m021 1 "not at all", modify
label def ch20m021 2 "hardly", modify
label def ch20m021 5 "very much", modify
label values ch20m022 ch20m022
label def ch20m022 1 "not at all", modify
label def ch20m022 2 "hardly", modify
label def ch20m022 3 "a bit", modify
label def ch20m022 5 "very much", modify
label values ch20m023 ch20m023
label def ch20m023 1 "without any trouble", modify
label values ch20m024 ch20m024
label def ch20m024 1 "without any trouble", modify
label def ch20m024 2 "with some trouble", modify

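A sketch of one renaming-then-appending route for the two waves above (the file names are placeholders; the variable stubs follow the post). Stripping the wave-specific part of the names before appending leaves the data long by year, ready for xtset.

Code:
* 2021 wave: strip the wave-specific part of the names, tag the year
use ch21_wave, clear
rename ch21n* ch*          // ch21n001 -> ch001, ch21n_m -> ch_m, ...
gen int year = 2021
tempfile w2021
save `w2021'

* 2020 wave: same treatment, then append
use ch20_wave, clear
rename ch20m* ch*
gen int year = 2020
append using `w2021'
xtset nomem_encr year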