Hi all,

For my bachelor thesis I want to identify individual-level factors that promote (regular) participation in direct democratic votes in Switzerland between 1995 and 2015. For my analysis, I use the Swiss national election studies cumulated file 1971-2015 and Stata version 15.1. I have four hypotheses that I want to test independently of each other.

To approach my research question, I want to run a regression with the survey year as a control.
My dependent variable prdd is the participation rate in direct democratic votes (a scale from 0 to 1 in steps of 0.1: 0, 0.1, 0.2, ..., 1).

In hypothesis 1, I treat overall trust in institutions of the representative system (trustindex) as the explanatory variable.

I combine three survey variables (trust in parliament, trust in the federal council, trust in parties, each on a scale from 0 = no trust to 10 = full trust) into one overall trust variable using the command

egen trustindex = rowmean(tparty tparl tcoun)

and then recode the many resulting values of trustindex onto a scale from 0 to 1 with 11 values (equivalent to the scale of the dependent variable).
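
A minimal sketch of the index construction described above. The rescaling line is an assumption about how the 0-10 row mean is mapped onto eleven values between 0 and 1 (rounding to the nearest integer, then dividing by 10); the exact recoding is not shown in this post.

```stata
* Row mean of the three trust items (each 0-10); rowmean() ignores missing items
egen trustindex = rowmean(tparty tparl tcoun)

* Assumed rescaling: round to the nearest integer, then map 0-10 onto 0, 0.1, ..., 1
replace trustindex = round(trustindex) / 10

* Check that only the eleven intended values remain
tabulate trustindex, missing
```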

I control for:
- age (18-97)
- education (scale from 1 to 9 with increasing educational degrees: from 1 = primary education up to 9 = university)
- year (1995, 1999, 2003, 2007, 2011 - with 1995 as my base level)


Now, when trying to run the regression, I have some problems and questions:

Since my independent variables (apart from trustindex) have different scales than my dependent variable, I thought it might be necessary to standardise them.

Is it correct that I do not need to standardise trustindex, since it already has the same scale as my dependent variable?
Do I need to standardise my control variable "year" if I treat it as a categorical variable and just want to see the differences in participation compared to my base level 1995?


I found two ways to standardise variables, which are supposed to lead to the same results:
a) running the regression with the "beta" option (regress ..., beta), or
b) pre-standardising each variable using the "egen [newvarname] = std(varname)" command
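
For reference, a sketch of the pre-standardisation step in b), using the variable names that appear in the regressions further down. Note that egen's std() computes the mean and standard deviation over every non-missing observation of each variable, which is not necessarily the same set of observations that ends up in the regression.

```stata
* z-scores computed over all non-missing observations of each variable
egen agestd        = std(age)
egen educstd       = std(educ)
egen trustindexstd = std(trustindex)

* Each new variable should have mean 0 and SD 1 in its own non-missing sample
summarize agestd educstd trustindexstd
```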

After running several regressions with the beta option and with pre-standardised variables, I come to the conclusion that the beta coefficients simply do not match the coefficients from the regressions with pre-standardised variables:


Using the beta option
reg prdd trustindex age educ i.year, beta


      Source |       SS           df       MS      Number of obs   =    20,431
-------------+----------------------------------   F(7, 20423)     =    355.58
       Model |  193.601785         7  27.6573979   Prob > F        =    0.0000
    Residual |  1588.51339    20,423   .07778061   R-squared       =    0.1086
-------------+----------------------------------   Adj R-squared   =    0.1083
       Total |  1782.11518    20,430  .087230307   Root MSE        =    .27889

------------------------------------------------------------------------------
        prdd |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
  trustindex |   .2983321   .0107972    27.63   0.000                 .1848966
         age |   .0034016   .0001179    28.85   0.000                  .194637
        educ |   .0232571   .0009122    25.50   0.000                 .1708613
             |
        year |
       1999  |   -.049819   .0062391    -7.99   0.000                -.0586392
       2003  |   -.031216   .0052029    -6.00   0.000                 -.046019
       2007  |  -.0382453   .0056663    -6.75   0.000                -.0509464
       2011  |   .0332743    .007754     4.29   0.000                 .0306206
             |
       _cons |   .2985779   .0103352    28.89   0.000                        .




With pre-standardised variables
reg prdd trustindex agestd educstd i.year


      Source |       SS           df       MS      Number of obs   =    20,431
-------------+----------------------------------   F(7, 20423)     =    355.58
       Model |  193.601786         7  27.6573979   Prob > F        =    0.0000
    Residual |  1588.51339    20,423   .07778061   R-squared       =    0.1086
-------------+----------------------------------   Adj R-squared   =    0.1083
       Total |  1782.11518    20,430  .087230307   Root MSE        =    .27889

------------------------------------------------------------------------------
        prdd |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  trustindex |   .2983321   .0107972    27.63   0.000     .2771688    .3194954
      agestd |   .0592009   .0020523    28.85   0.000     .0551782    .0632237
     educstd |    .052088    .002043    25.50   0.000     .0480836    .0560923
             |
        year |
       1999  |   -.049819   .0062391    -7.99   0.000    -.0620481     -.03759
       2003  |   -.031216   .0052029    -6.00   0.000    -.0414142   -.0210178
       2007  |  -.0382453   .0056663    -6.75   0.000    -.0493518   -.0271388
       2011  |   .0332743    .007754     4.29   0.000     .0180758    .0484728
             |
       _cons |   .5886924   .0076708    76.74   0.000      .573657    .6037279



And even if I standardise trustindex, which I thought might be the reason for the differences, the standardised coefficients still differ:
reg prdd trustindexstd agestd educstd i.year


      Source |       SS           df       MS      Number of obs   =    20,431
-------------+----------------------------------   F(7, 20423)     =    355.58
       Model |  193.601786         7   27.657398   Prob > F        =    0.0000
    Residual |  1588.51339    20,423   .07778061   R-squared       =    0.1086
-------------+----------------------------------   Adj R-squared   =    0.1083
       Total |  1782.11518    20,430  .087230307   Root MSE        =    .27889

-------------------------------------------------------------------------------
         prdd |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
trustindexstd |   .0546088   .0019764    27.63   0.000     .0507349    .0584827
       agestd |   .0592009   .0020523    28.85   0.000     .0551782    .0632237
      educstd |    .052088    .002043    25.50   0.000     .0480836    .0560923
              |
         year |
        1999  |   -.049819   .0062391    -7.99   0.000    -.0620481     -.03759
        2003  |   -.031216   .0052029    -6.00   0.000    -.0414142   -.0210178
        2007  |  -.0382453   .0056663    -6.75   0.000    -.0493518   -.0271388
        2011  |   .0332743    .007754     4.29   0.000     .0180758    .0484728
              |
        _cons |   .7672332   .0034341   223.42   0.000     .7605021    .7739643



How is it possible that the standardised beta coefficient of age is .194637, while the coefficient is .0592009 when I pre-standardise the variable? Which results are more precise and reliable? Or what did I do wrong?
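
One check that might narrow this down, offered as a sketch rather than a confirmed diagnosis. It rests on two points that have not been verified against this dataset: beta coefficients correspond to a regression in which the dependent variable is standardised as well as the regressors, and they are computed on the estimation sample only (20,431 observations here), whereas egen's std() without an if qualifier uses all non-missing values. The names insample and z_* are hypothetical.

```stata
* Flag the observations actually used in the original regression
quietly regress prdd trustindex age educ i.year
generate byte insample = e(sample)

* Standardise the dependent variable and the continuous regressors
* using only the estimation sample
foreach v of varlist prdd trustindex age educ {
    egen z_`v' = std(`v') if insample
}

* Coefficients on z_trustindex, z_age and z_educ can now be compared with the
* Beta column; the year rows are not expected to match, because the beta
* option also rescales the year dummies and this regression does not
regress z_prdd z_trustindex z_age z_educ i.year if insample
```

If the coefficients on z_trustindex, z_age and z_educ reproduce the Beta column, the mismatch would come from what gets standardised and on which sample, not from a mistake in either regression.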

If anything is unclear, please ask. I hope someone can help. Thanks in advance,
Rebecca