Good morning all from a very gloomy Ireland where we have just reported our highest daily number of C-19 cases.
I have a quick question on standardising variables in panel data. I have two waves of data (pre- and during Covid), with the variables for each wave suffixed 1 and 2, e.g. jobsat1 and jobsat2. I am using a fixed-effects (fe) regression model to examine the impact of Covid-19 on different outcomes (mental health, job satisfaction, etc.). To be able to compare the effects, I need to standardise the variables. To do that, I started with the data in wide format and generated z-scores using the following code:
egen float zjobsat1 = std(jobsat1), mean(0) std(1)
egen float zjobsat2 = std(jobsat2), mean(0) std(1)
I also tried the zscore command (zscore jobsat1, then zscore jobsat2) and it produced the same results.
To run the fe regression, I reshaped my wide data into long format and applied xtset (xtset id wave). This gave me a single variable, zjobsat.
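For reference, the reshape and xtset steps can be sketched roughly as follows. This assumes the person identifier is called id (as in my xtset call) and shows only the jobsat variables; other outcomes would need adding to the stub list. The xtreg line is only an illustrative specification, not necessarily the exact model I ran.

```stata
* Reshape from wide (jobsat1, jobsat2, zjobsat1, zjobsat2) to long,
* creating a wave variable that takes the values 1 and 2
reshape long jobsat zjobsat, i(id) j(wave)

* Declare the panel structure: id is the panel variable, wave the time variable
xtset id wave

* Illustrative fixed-effects regression of the standardised outcome on wave
xtreg zjobsat i.wave, fe
```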
My problem is that when I look at the difference in the z-scores for my jobsat variable (and indeed all my other outcome variables) between waves 1 and 2 (i.e. zscore2 - zscore1), the results are very odd: the sign is wrong and there are no significant changes at all when I run t tests. For example, here is a t test on the raw scores, followed by a t test on the z-scores:
. ttest jobsat2==jobsat1
Paired t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
 jobsat2 |     618    2.425017    .0219672    .5460965    2.381878    2.468157
 jobsat1 |     618    2.535626    .0225456    .5604743     2.49135    2.579901
---------+--------------------------------------------------------------------
    diff |     618   -.1106083    .0163644    .4068135   -.1427451   -.0784716
------------------------------------------------------------------------------
     mean(diff) = mean(jobsat2 - jobsat1)                         t =  -6.7591
 Ho: mean(diff) = 0                              degrees of freedom =      617
 Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
. ttest zjobsat2==zjobsat1
Paired t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
zjobsat2 |     618    3.26e-10    .0402259           1   -.0789963    .0789963
zjobsat1 |     618   -.0005655     .040252    1.000648   -.0796129     .078482
---------+--------------------------------------------------------------------
    diff |     618    .0005655    .0295705    .7351097   -.0575054    .0586364
------------------------------------------------------------------------------
     mean(diff) = mean(zjobsat2 - zjobsat1)                       t =   0.0191
 Ho: mean(diff) = 0                              degrees of freedom =      617
 Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
 Pr(T < t) = 0.5076         Pr(|T| > |t|) = 0.9847          Pr(T > t) = 0.4924
You will see that the t statistic changes sign and magnitude, and the difference between the two waves is no longer significant. I am not very familiar with z-scores in general and am at a loss as to why this is happening. Is it something to do with my giving separate z-scores to jobsat1 and jobsat2? Do I need to standardise across both waves simultaneously and, if so, how do I do this in Stata?
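To make the question concrete, here is a minimal sketch of what I imagine standardising across both waves at once might look like. It runs on a toy dataset of four made-up respondents (not my real data), and the name zjobsat for the pooled z-score is just my choice; I am not sure this is the right approach.

```stata
* Toy data only: four made-up respondents, not my real sample
clear
input id jobsat1 jobsat2
1 3.0 2.5
2 2.0 2.0
3 2.5 2.0
4 3.5 3.0
end

* Stack the two waves so that jobsat is a single variable
reshape long jobsat, i(id) j(wave)

* Standardise once, using the mean and SD pooled over both waves
egen double zjobsat = std(jobsat)

* Return to wide format to compare the waves with a paired t test
reshape wide jobsat zjobsat, i(id) j(wave)
ttest zjobsat2 == zjobsat1
```

If I understand correctly, both waves are then divided by the same SD, so the paired difference is just the raw difference rescaled, and I would expect the sign and the t statistic to match the raw-score test.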
I assume that I am getting something fairly basic wrong, so any advice on how to fix this problem would be greatly appreciated.
Best wishes
Diane