Hi all,
for my bachelor thesis I want to identify individual-level factors that promote (regular) participation in direct democratic votes in Switzerland between 1995 and 2015. For my analysis I use the Swiss national election studies cumulated file 1971-2015 and Stata version 15.1. I have four hypotheses that I want to test independently of each other.
To approach my research question I want to run a bivariate regression for the year.
My dependent variable prdd is the participation rate in direct democratic votes (a scale from 0 to 1 in steps of 0.1: 0, 0.1, 0.2, ..., 1).
In hypothesis 1 I treat overall trust in institutions of the representative system (trustindex) as the explanatory variable.
I combine three variables of the survey (trust in parliament, trust in federal council, trust in parties; each on a scale from 0 = no trust to 10 = full trust) into one overall trust variable by using the command
egen trustindex = rowmean(tparty tparl tcoun) and finally recode the many distinct values of trustindex onto a scale from 0 to 1 with 11 values (equivalent to the scale of the dependent variable).
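The index construction described above could look roughly like the following. This is only a sketch: the rounding rule and the variable name trust01 are assumptions made for illustration, since the exact recoding step is not shown in this post.

```stata
* Row mean of the three 0-10 trust items
* (egen's rowmean() averages over the non-missing items of each respondent)
egen trustindex = rowmean(tparty tparl tcoun)

* Hypothetical rescaling: round to the nearest integer, then map 0-10 onto 0-1,
* which yields the 11 values 0, .1, ..., 1
generate double trust01 = round(trustindex) / 10

* Check the resulting distribution, including missing values
tabulate trust01, missing
```

Keeping the rescaled index in a separate variable leaves the unrounded row mean available for comparison.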
I control for:
- age (18-97)
- education (scale from 1 to 9 with increasing educational degrees: from 1 = primary education up to 9 = university)
- year (1995, 1999, 2003, 2007, 2011 - with 1995 as my base level)
Now, when trying to run the regression, I have some problems and questions:
Since my independent variables (apart from trustindex) have different scales than my dependent variable, I thought it might be necessary to standardise them.
Is it correct that I do not need to standardise trustindex, since it already has the same scale as my dependent variable?
Do I need to standardise my control variable "year" if I treat it as a categorical variable and just want to see the differences in participation compared to my base level 1995?
I found two ways to standardise variables, which are supposed to lead to the same results:
a) a regression using the [, beta] option, or
b) pre-standardising each variable using the "egen [newvarname] = std(varname)" command
After running several regressions with the beta option and with pre-standardised variables, I have come to the conclusion that the beta coefficients simply do not match the coefficients from the regressions with pre-standardised variables:
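For reference, here is a sketch of what route (b) would have to look like to reproduce the Beta column of route (a). It relies on the fact that the beta option reports coefficients as if the outcome and the regressors had all been standardised on the estimation sample. The z_ variable names are made up for illustration.

```stata
* Fit the model once so that e(sample) marks the observations actually used
quietly regress prdd trustindex age educ i.year

* Standardise the outcome and the continuous regressors on that sample only
foreach v of varlist prdd trustindex age educ {
    quietly summarize `v' if e(sample)
    generate double z_`v' = (`v' - r(mean)) / r(sd) if e(sample)
}

* Route (b): regression on the standardised variables
regress z_prdd z_trustindex z_age z_educ i.year

* Route (a): compare the Beta column for trustindex, age and educ
regress prdd trustindex age educ i.year, beta
```

The year coefficients are not expected to agree between the two outputs, because the beta option also standardises the year dummies, while the dummies stay on their 0/1 scale in the first regression.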
Using the beta option
reg prdd trustindex age educ i.year, beta
      Source |       SS           df       MS      Number of obs   =    20,431
-------------+----------------------------------   F(7, 20423)     =    355.58
       Model |  193.601785         7  27.6573979   Prob > F        =    0.0000
    Residual |  1588.51339    20,423   .07778061   R-squared       =    0.1086
-------------+----------------------------------   Adj R-squared   =    0.1083
       Total |  1782.11518    20,430  .087230307   Root MSE        =    .27889
------------------------------------------------------------------------------
        prdd |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
  trustindex |   .2983321   .0107972    27.63   0.000                 .1848966
         age |   .0034016   .0001179    28.85   0.000                  .194637
        educ |   .0232571   .0009122    25.50   0.000                 .1708613
        year |
        1999 |   -.049819   .0062391    -7.99   0.000                -.0586392
        2003 |   -.031216   .0052029    -6.00   0.000                 -.046019
        2007 |  -.0382453   .0056663    -6.75   0.000                -.0509464
        2011 |   .0332743    .007754     4.29   0.000                 .0306206
             |
       _cons |   .2985779   .0103352    28.89   0.000                        .
With pre-standardised variables
reg prdd trustindex agestd educstd i.year
      Source |       SS           df       MS      Number of obs   =    20,431
-------------+----------------------------------   F(7, 20423)     =    355.58
       Model |  193.601786         7  27.6573979   Prob > F        =    0.0000
    Residual |  1588.51339    20,423   .07778061   R-squared       =    0.1086
-------------+----------------------------------   Adj R-squared   =    0.1083
       Total |  1782.11518    20,430  .087230307   Root MSE        =    .27889
------------------------------------------------------------------------------
        prdd |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  trustindex |   .2983321   .0107972    27.63   0.000     .2771688    .3194954
      agestd |   .0592009   .0020523    28.85   0.000     .0551782    .0632237
     educstd |    .052088    .002043    25.50   0.000     .0480836    .0560923
             |
        year |
        1999 |   -.049819   .0062391    -7.99   0.000    -.0620481     -.03759
        2003 |   -.031216   .0052029    -6.00   0.000    -.0414142   -.0210178
        2007 |  -.0382453   .0056663    -6.75   0.000    -.0493518   -.0271388
        2011 |   .0332743    .007754     4.29   0.000     .0180758    .0484728
             |
       _cons |   .5886924   .0076708    76.74   0.000      .573657    .6037279
And even if I standardise trustindex, which I thought might be the reason for the differences, the standardised coefficients still differ:
reg prdd trustindexstd agestd educstd i.year
      Source |       SS           df       MS      Number of obs   =    20,431
-------------+----------------------------------   F(7, 20423)     =    355.58
       Model |  193.601786         7   27.657398   Prob > F        =    0.0000
    Residual |  1588.51339    20,423   .07778061   R-squared       =    0.1086
-------------+----------------------------------   Adj R-squared   =    0.1083
       Total |  1782.11518    20,430  .087230307   Root MSE        =    .27889
-------------------------------------------------------------------------------
         prdd |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
trustindexstd |   .0546088   .0019764    27.63   0.000     .0507349    .0584827
       agestd |   .0592009   .0020523    28.85   0.000     .0551782    .0632237
      educstd |    .052088    .002043    25.50   0.000     .0480836    .0560923
              |
         year |
         1999 |   -.049819   .0062391    -7.99   0.000    -.0620481     -.03759
         2003 |   -.031216   .0052029    -6.00   0.000    -.0414142   -.0210178
         2007 |  -.0382453   .0056663    -6.75   0.000    -.0493518   -.0271388
         2011 |   .0332743    .007754     4.29   0.000     .0180758    .0484728
              |
        _cons |   .7672332   .0034341   223.42   0.000     .7605021    .7739643
How is it possible that my standardised beta coefficient of age is .194637 while it is .0592009 when pre-standardising the variable? Which results are more precise and reliable? Or what did I do wrong?
If anything remains unclear, please ask. I hope someone can help. Thanks in advance
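A quick arithmetic check on the output above may help narrow this down. It is only a sketch: it uses the fact that Beta equals the unstandardised coefficient times sd(x)/sd(y), so the coefficient of a regressor with standard deviation 1 in the estimation sample, divided by the standard deviation of prdd, should reproduce Beta. The standard deviation of prdd is taken as the square root of the Total MS entry (.087230307). The scalar names are made up for illustration.

```stata
* sd of prdd in the estimation sample = sqrt(Total MS) from the ANOVA header
scalar sd_prdd = sqrt(.087230307)

* Coefficient of each pre-standardised regressor divided by sd(prdd)
* compare with Beta = .1848966
scalar beta_trust = .0546088 / sd_prdd
* compare with Beta = .194637
scalar beta_age = .0592009 / sd_prdd
* compare with Beta = .1708613
scalar beta_educ = .052088 / sd_prdd

display %9.6f sd_prdd
display %9.6f beta_trust
display %9.6f beta_age
display %9.6f beta_educ
```

If the trust figure reproduces Beta while the age and educ figures come out somewhat larger, one possible reading is that agestd and educstd were standardised over more observations than the 20,431 used in the regression, so that their standard deviation within the estimation sample is not exactly 1. That is a guess that cannot be verified from the output shown here; running summarize agestd educstd if e(sample) after the regression would show it.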
Rebecca