This led my school statistician and me to consider whether PCA might be an interesting option for better explaining the level 2 variables, by including the components as predictors in the regression model.
My level 2 data are all binary, 0 (No) and 1 (Yes), and I've been a little unsure whether it is methodologically sound to use PCA with this kind of data. I have given it a go, and my code and output are below. I have retained only the first 5 components, as these are the ones with eigenvalues >1.
First question - should I be doing a PCA with binary data?
Second question - what is the lowest cut-off for the eigenvector coefficients if I want to interpret my components meaningfully? I've read that it should be 0.4, but nothing in my 1st component exceeds that.
So assuming that PCA should be done on the data I have, with the results I'm getting, would it even give a meaningful contribution to my analysis?
Many thanks!!
Code:
pca form_b1 form_b2 form_b3 form_c1 form_c2 form_c3 form_d1 form_d2 form_d3 form_e1 form_e3 form_f1 form_f2 form_f3, comp(5)
Code:
Principal components/correlation                 Number of obs    =      1,135
                                                 Number of comp.  =          5
                                                 Trace            =         14
Rotation: (unrotated = principal)                Rho              =     0.6257
------------------------------------------------------------------
   Component |   Eigenvalue   Difference   Proportion   Cumulative
-------------+----------------------------------------------------
       Comp1 |      3.50087      2.01768       0.2501       0.2501
       Comp2 |      1.48319      .084154       0.1059       0.3560
       Comp3 |      1.39904      .156531       0.0999       0.4559
       Comp4 |      1.24251      .107735       0.0888       0.5447
       Comp5 |      1.13477      .347163       0.0811       0.6257
       Comp6 |      .787612     .0950305       0.0563       0.6820
       Comp7 |      .692581    .00607674       0.0495       0.7315
       Comp8 |      .686504     .0411919       0.0490       0.7805
       Comp9 |      .645312     .0614412       0.0461       0.8266
      Comp10 |      .583871     .0308468       0.0417       0.8683
      Comp11 |      .553024     .0660023       0.0395       0.9078
      Comp12 |      .487022     .0411471       0.0348       0.9426
      Comp13 |      .445875     .0880659       0.0318       0.9744
      Comp14 |      .357809            .       0.0256       1.0000
------------------------------------------------------------------
Principal components (eigenvectors)
-------------------------------------------------------------------------
    Variable |    Comp1    Comp2    Comp3    Comp4    Comp5 | Unexplained
-------------+----------------------------------------------+------------
     form_b1 |   0.2131   0.2332  -0.0044  -0.6302   0.0905 |       .2576
     form_b2 |   0.0620   0.3261  -0.1045   0.1312   0.7055 |       .2274
     form_b3 |   0.2687  -0.1544  -0.2096  -0.0530   0.3434 |       .5132
     form_c1 |   0.3268   0.2693  -0.2007   0.1626  -0.1420 |       .4064
     form_c2 |   0.3726   0.1053   0.0576   0.0959  -0.1454 |       .4574
     form_c3 |   0.3104  -0.1792   0.0396  -0.1310  -0.3348 |       .4645
     form_d1 |   0.1625   0.3111   0.3685   0.2940  -0.2593 |       .3904
     form_d2 |   0.2091   0.1813   0.4271   0.3283   0.1697 |       .3762
     form_d3 |   0.1524  -0.2816  -0.4357   0.4589  -0.0123 |       .2737
     form_e1 |   0.2759  -0.4481   0.1188   0.1738   0.2267 |       .3201
     form_e3 |   0.1924  -0.2250   0.5397  -0.1565   0.2336 |       .2955
     form_f1 |   0.3271   0.3670  -0.1767   0.0206  -0.0982 |       .3706
     form_f2 |   0.3728  -0.0133  -0.2265  -0.2494   0.0309 |        .363
     form_f3 |   0.2901  -0.3232   0.0632  -0.0887  -0.1005 |       .5236
-------------------------------------------------------------------------
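A note on the 0.4 rule of thumb from the second question: it is usually quoted for component loadings (variable-component correlations), whereas the table above lists unit-length eigenvectors. The following is only a minimal Python sketch, assuming the common rescaling loading = eigenvector coefficient × sqrt(eigenvalue) and using nothing but the Comp1 figures printed above. The names `eigenvalue_1`, `eigvec_1` and `cutoff` are illustrative, and the sketch says nothing about whether PCA on binary items is sound in the first place.

```python
import math

# Comp1 eigenvalue and eigenvector coefficients, copied from the output above.
eigenvalue_1 = 3.50087
eigvec_1 = {
    "form_b1": 0.2131, "form_b2": 0.0620, "form_b3": 0.2687,
    "form_c1": 0.3268, "form_c2": 0.3726, "form_c3": 0.3104,
    "form_d1": 0.1625, "form_d2": 0.2091, "form_d3": 0.1524,
    "form_e1": 0.2759, "form_e3": 0.1924,
    "form_f1": 0.3271, "form_f2": 0.3728, "form_f3": 0.2901,
}

# Assumed rescaling: loading = eigenvector coefficient * sqrt(eigenvalue).
scale = math.sqrt(eigenvalue_1)
loadings_1 = {var: coef * scale for var, coef in eigvec_1.items()}

# Illustrative cut-off taken from the question; not a recommendation.
cutoff = 0.4
above_cutoff = sorted(var for var, load in loadings_1.items() if abs(load) >= cutoff)

for var, load in loadings_1.items():
    print(f"{var}: {load:.4f}")
print("Items at or above the cut-off:", above_cutoff)
```

On the loading scale, eight of the fourteen items reach 0.4 on Comp1, so the eigenvector table by itself probably understates how strongly the items relate to the first component.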