I am working with data from 126 schools in rural Angola. I want to create an index of school infrastructure and use it in my regressions. My data look as follows:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long school_id float pc15 byte chalkboards15 float bathrooms15 int I_desks15 byte I_classrooms15
1 0 17 1 21 9
2 0 16 2 18 8
3 0 21 0 5 13
4 0 9 9 5 0
5 1 10 0 6 5
6 1 12 8 16 12
7 1 11 4 15 9
8 0 7 2 120 3
9 1 2 2 2 2
10 1 28 4 12 23
11 0 6 1 9 6
12 0 36 2 7 7
13 1 13 4 4 11
14 1 7 3 3 3
15 0 13 1 6 2
16 0 10 1 8 10
17 0 8 0 5 3
18 0 10 0 5 3
19 0 34 2 17 13
20 0 4 2 6 2
21 0 25 0 3 3
22 1 16 2 14 11
23 0 5 0 100 3
24 1 9 2 8 5
25 0 5 0 119 3
26 0 14 0 1 2
27 0 4 0 3 2
28 0 3 2 120 3
29 0 12 4 20 9
30 0 0 0 0 0
31 0 2 0 82 2
32 0 3 2 3 3
33 0 20 1 14 9
34 0 10 2 14 6
35 0 8 2 15 8
36 0 6 4 6 6
37 0 6 2 3 5
38 1 13 0 3 10
39 0 7 2 5 5
40 0 14 0 139 4
41 0 8 0 2 1
42 0 6 0 1 0
43 0 15 2 2 3
44 0 3 2 3 3
45 0 13 2 300 13
46 0 6 0 4 3
47 0 7 2 5 6
48 0 3 2 0 3
49 0 3 2 4 3
50 0 6 0 2 3
51 0 6 2 3 3
52 0 8 2 1 3
53 0 5 2 4 3
54 0 3 2 0 3
55 0 9 0 3 3
56 0 4 2 0 2
57 0 7 2 3 3
58 0 4 2 62 2
59 0 5 2 2 3
60 0 4 2 4 2
61 0 12 2 404 11
62 0 5 0 3 3
63 0 2 2 1 2
64 0 2 2 0 2
65 0 8 2 4 2
66 0 6 0 4 0
67 0 6 1 3 2
68 0 6 1 5 3
69 0 12 2 9 3
70 0 3 2 1 2
71 0 10 2 6 5
72 0 3 0 1 0
73 0 5 2 4 3
74 0 8 2 2 6
75 0 6 0 3 0
76 0 6 2 270 6
77 0 6 0 2 0
78 0 7 0 5 0
79 0 9 0 0 1
80 0 4 1 4 4
81 1 8 2 25 7
82 0 8 0 1 7
83 0 16 2 8 3
84 0 4 2 5 3
85 0 6 2 1 5
86 0 4 0 120 3
87 0 18 4 8 18
88 0 11 0 5 3
89 0 9 2 9 7
90 0 8 1 7 6
91 0 6 2 6 6
92 0 13 4 13 11
93 0 14 4 7 4
94 0 2 0 75 2
95 0 7 0 3 2
96 0 3 2 5 3
97 0 10 2 6 8
98 0 3 0 120 3
99 0 8 2 7 4
100 0 2 0 0 1
end
local measures "std_I_water15 std_I_electricity15 std_bathrooms15 std_I_chairs15 std_I_classrooms15"
pca `measures'
predict indexpca15
pca std_I_water15 std_I_electricity15 std_bathrooms15 std_I_chairs15 std_I_classrooms15
Principal components/correlation                 Number of obs    =        126
                                                 Number of comp.  =          5
                                                 Trace            =          5
    Rotation: (unrotated = principal)            Rho              =     1.0000

    --------------------------------------------------------------------------
       Component |   Eigenvalue   Difference         Proportion   Cumulative
    -------------+------------------------------------------------------------
           Comp1 |       1.6917      .398078             0.3383       0.3383
           Comp2 |      1.29363      .479866             0.2587       0.5971
           Comp3 |      .813761       .12548             0.1628       0.7598
           Comp4 |      .688281      .175654             0.1377       0.8975
           Comp5 |      .512627            .             0.1025       1.0000
    --------------------------------------------------------------------------

Principal components (eigenvectors)

    ------------------------------------------------------------------------------
        Variable |    Comp1     Comp2     Comp3     Comp4     Comp5 | Unexplained
    -------------+--------------------------------------------------+-------------
    std_I_wat~15 |   0.2529    0.6278    0.4259    0.5311   -0.2802 |           0
    std_I_ele~15 |   0.5307    0.2801    0.3070   -0.6114    0.4146 |           0
    std_bathr~15 |   0.5189    0.0835   -0.6795    0.3800    0.3430 |           0
    std_I_cha~15 |   0.2406   -0.6324    0.5076    0.4074    0.3442 |           0
    std_I_cla~15 |   0.5720   -0.3472   -0.0702   -0.1837   -0.7166 |           0
    ------------------------------------------------------------------------------

Q1. Do I need to -rotate- the PCA? If so, how should the rotated components be interpreted?
Q2. When I run the code, the unexplained variance is always equal to zero. Does this make sense, or am I doing something incorrect?
Q3. Would you suggest a different procedure for obtaining the PCA?
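
On Q2, a minimal sketch rather than a definitive answer: by default -pca- retains all components (here 5 of 5), and together they reproduce each variable exactly, so the Unexplained column is necessarily 0. Restricting the number of retained components with the components() option should make that column informative. The sketch assumes the five std_* variables from the post are already in memory; the name indexpca15_c1 is hypothetical and chosen only for illustration.

```stata
* Assumes the five std_* variables from the post are already in memory.
local measures "std_I_water15 std_I_electricity15 std_bathrooms15 std_I_chairs15 std_I_classrooms15"

* Retain only the first component. The Unexplained column is then no longer
* zero, because the variance carried by the other four components is left out.
pca `measures', components(1)

* Score on the first retained component (hypothetical variable name).
predict indexpca15_c1, score

* Scree plot of the eigenvalues, to help judge how many components to keep.
screeplot
```

With a single retained component, the Unexplained column reports the share of each variable's variance that Comp1 does not capture, which is one way to judge whether a one-component index is adequate.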