Good morning all from a very gloomy Ireland where we have just reported our highest daily number of C-19 cases.
I have a quick question on standardising variables in panel data. I have two waves of data (pre- and during-Covid), with the variables for each wave suffixed 1 and 2, e.g. jobsat1 and jobsat2. I am using a fixed-effects (fe) regression model to examine the impact of Covid-19 on different outcomes (mental health, job satisfaction, etc.). To be able to compare the effects, I need to standardise the variables. To do that, I started with the data in wide format and generated z-scores using the following code:
egen float zjobsat1 = std(jobsat1), mean(0) std(1)
egen float zjobsat2 = std(jobsat2), mean(0) std(1)
I also tried the zscore command (zscore jobsat1, then zscore jobsat2), and it produced the same results.
To run the fe regression, I reshaped my wide data into long format and ran xtset id wave. The reshape gave me a single variable, zjobsat.
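A minimal sketch of the steps described so far, run on made-up toy data. The id values, the seed, the number of observations and the simulated jobsat scores are illustrative assumptions, not the real survey data:

```stata
* Toy wide-format data (hypothetical values, for illustration only)
clear
set seed 12345
set obs 5
generate id = _n
generate jobsat1 = runiform(1, 4)
generate jobsat2 = runiform(1, 4)

* Standardise each wave separately, as in the code above
egen float zjobsat1 = std(jobsat1), mean(0) std(1)
egen float zjobsat2 = std(jobsat2), mean(0) std(1)

* Reshape wide to long: one row per id and wave
reshape long jobsat zjobsat, i(id) j(wave)

* Declare the panel structure
xtset id wave
```

After the reshape, zjobsat holds the wave-specific z-scores stacked by wave, so each wave still has its own mean of 0 and standard deviation of 1.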
My problem is that when I look at the difference in the z-scores for my jobsat variable (and indeed all my other outcome variables) between waves 1 and 2 (i.e. zscore2 - zscore1), the results are very weird: the sign is wrong and there are no significant changes at all when I run t tests. For example, here is a t test on the raw scores, followed by a t test on the z-scores:
. ttest jobsat2==jobsat1

Paired t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
 jobsat2 |     618    2.425017    .0219672    .5460965    2.381878    2.468157
 jobsat1 |     618    2.535626    .0225456    .5604743     2.49135    2.579901
---------+--------------------------------------------------------------------
    diff |     618   -.1106083    .0163644    .4068135   -.1427451   -.0784716
------------------------------------------------------------------------------
     mean(diff) = mean(jobsat2 - jobsat1)                         t =  -6.7591
 Ho: mean(diff) = 0                              degrees of freedom =      617

 Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

. ttest zjobsat2==zjobsat1

Paired t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
zjobsat2 |     618    3.26e-10    .0402259           1   -.0789963    .0789963
zjobsat1 |     618   -.0005655     .040252    1.000648   -.0796129     .078482
---------+--------------------------------------------------------------------
    diff |     618    .0005655    .0295705    .7351097   -.0575054    .0586364
------------------------------------------------------------------------------
     mean(diff) = mean(zjobsat2 - zjobsat1)                       t =   0.0191
 Ho: mean(diff) = 0                              degrees of freedom =      617

 Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
 Pr(T < t) = 0.5076         Pr(|T| > |t|) = 0.9847          Pr(T > t) = 0.4924
You will see that the t statistic changes sign and magnitude, and the difference between the two waves is no longer significant. I am not very familiar with z-scores in general and am at a loss as to why this is happening. Is it something to do with my giving separate z-scores to jobsat1 and jobsat2? Do I need to standardise across both waves simultaneously and, if so, how do I do this in Stata?
I assume that I am getting something fairly basic wrong, so any advice on how to fix this problem would be greatly appreciated.
Best wishes
Diane