Hi everyone.

I am working with data from 126 schools in rural Angola. I want to create an index of school infrastructure and use it in my regressions. My data look as follows:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long school_id float pc15 byte chalkboards15 float bathrooms15 int I_desks15 byte I_classrooms15
  1 0 17 1  21  9
  2 0 16 2  18  8
  3 0 21 0   5 13
  4 0  9 9   5  0
  5 1 10 0   6  5
  6 1 12 8  16 12
  7 1 11 4  15  9
  8 0  7 2 120  3
  9 1  2 2   2  2
 10 1 28 4  12 23
 11 0  6 1   9  6
 12 0 36 2   7  7
 13 1 13 4   4 11
 14 1  7 3   3  3
 15 0 13 1   6  2
 16 0 10 1   8 10
 17 0  8 0   5  3
 18 0 10 0   5  3
 19 0 34 2  17 13
 20 0  4 2   6  2
 21 0 25 0   3  3
 22 1 16 2  14 11
 23 0  5 0 100  3
 24 1  9 2   8  5
 25 0  5 0 119  3
 26 0 14 0   1  2
 27 0  4 0   3  2
 28 0  3 2 120  3
 29 0 12 4  20  9
 30 0  0 0   0  0
 31 0  2 0  82  2
 32 0  3 2   3  3
 33 0 20 1  14  9
 34 0 10 2  14  6
 35 0  8 2  15  8
 36 0  6 4   6  6
 37 0  6 2   3  5
 38 1 13 0   3 10
 39 0  7 2   5  5
 40 0 14 0 139  4
 41 0  8 0   2  1
 42 0  6 0   1  0
 43 0 15 2   2  3
 44 0  3 2   3  3
 45 0 13 2 300 13
 46 0  6 0   4  3
 47 0  7 2   5  6
 48 0  3 2   0  3
 49 0  3 2   4  3
 50 0  6 0   2  3
 51 0  6 2   3  3
 52 0  8 2   1  3
 53 0  5 2   4  3
 54 0  3 2   0  3
 55 0  9 0   3  3
 56 0  4 2   0  2
 57 0  7 2   3  3
 58 0  4 2  62  2
 59 0  5 2   2  3
 60 0  4 2   4  2
 61 0 12 2 404 11
 62 0  5 0   3  3
 63 0  2 2   1  2
 64 0  2 2   0  2
 65 0  8 2   4  2
 66 0  6 0   4  0
 67 0  6 1   3  2
 68 0  6 1   5  3
 69 0 12 2   9  3
 70 0  3 2   1  2
 71 0 10 2   6  5
 72 0  3 0   1  0
 73 0  5 2   4  3
 74 0  8 2   2  6
 75 0  6 0   3  0
 76 0  6 2 270  6
 77 0  6 0   2  0
 78 0  7 0   5  0
 79 0  9 0   0  1
 80 0  4 1   4  4
 81 1  8 2  25  7
 82 0  8 0   1  7
 83 0 16 2   8  3
 84 0  4 2   5  3
 85 0  6 2   1  5
 86 0  4 0 120  3
 87 0 18 4   8 18
 88 0 11 0   5  3
 89 0  9 2   9  7
 90 0  8 1   7  6
 91 0  6 2   6  6
 92 0 13 4  13 11
 93 0 14 4   7  4
 94 0  2 0  75  2
 95 0  7 0   3  2
 96 0  3 2   5  3
 97 0 10 2   6  8
 98 0  3 0 120  3
 99 0  8 2   7  4
100 0  2 0   0  1
end
I standardized all the measures of school infrastructure that I want to include, and I used -predict- to create my index. Some of the variables are dummy variables, but since I standardized them all, they are all centered at zero. However, I am new to PCA and I am not sure whether what I am doing in Stata is correct. I am using the following code:

local measures "std_I_water15 std_I_electricity15 std_bathrooms15 std_I_chairs15 std_I_classrooms15"
pca `measures'
predict indexpca15
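
To make the steps concrete, here is a minimal sketch of the standardize, -pca-, -predict- sequence. It uses only the first eight schools and variables from the data example above, not the variables in my actual index. The std_* names and index_sketch are illustrative names invented for this sketch, and using -egen, std()- for the standardization is an assumption about one way to do it.

```stata
* Sketch only: first 8 schools copied from the data example above
clear
input long school_id float pc15 byte chalkboards15 float bathrooms15 int I_desks15 byte I_classrooms15
  1 0 17 1  21  9
  2 0 16 2  18  8
  3 0 21 0   5 13
  4 0  9 9   5  0
  5 1 10 0   6  5
  6 1 12 8  16 12
  7 1 11 4  15  9
  8 0  7 2 120  3
end

* Standardize each measure to mean 0 and sd 1 (std_* names are illustrative)
foreach v of varlist chalkboards15 bathrooms15 I_desks15 I_classrooms15 {
    egen double std_`v' = std(`v')
}

* PCA on the standardized measures
pca std_chalkboards15 std_bathrooms15 std_I_desks15 std_I_classrooms15

* With a single new variable, -predict- stores the score on the first component
predict index_sketch, score
```

After this runs, index_sketch holds one score per school and could be summarized or plotted before being used as a regressor.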


pca std_I_water15 std_I_electricity15 std_bathrooms15 std_I_chairs15 std_I_classrooms15

Principal components/correlation                 Number of obs    =        126
                                                 Number of comp.  =          5
                                                 Trace            =          5
    Rotation: (unrotated = principal)            Rho              =     1.0000

    --------------------------------------------------------------------------
       Component |   Eigenvalue   Difference         Proportion   Cumulative
    -------------+------------------------------------------------------------
           Comp1 |       1.6917      .398078             0.3383       0.3383
           Comp2 |      1.29363      .479866             0.2587       0.5971
           Comp3 |      .813761       .12548             0.1628       0.7598
           Comp4 |      .688281      .175654             0.1377       0.8975
           Comp5 |      .512627            .             0.1025       1.0000
    --------------------------------------------------------------------------

Principal components (eigenvectors)

    ------------------------------------------------------------------------------
        Variable |    Comp1     Comp2     Comp3     Comp4     Comp5 | Unexplained
    -------------+--------------------------------------------------+-------------
    std_I_wat~15 |   0.2529    0.6278    0.4259    0.5311   -0.2802 |           0
    std_I_ele~15 |   0.5307    0.2801    0.3070   -0.6114    0.4146 |           0
    std_bathr~15 |   0.5189    0.0835   -0.6795    0.3800    0.3430 |           0
    std_I_cha~15 |   0.2406   -0.6324    0.5076    0.4074    0.3442 |           0
    std_I_cla~15 |   0.5720   -0.3472   -0.0702   -0.1837   -0.7166 |           0
    ------------------------------------------------------------------------------


Q1. Do I need to -rotate- the PCA? If yes, how should the rotation be interpreted?

Q2. Once I run the code, the unexplained variance is always equal to zero. Does this make sense, or am I doing something incorrectly?

Q3. Would you suggest a different procedure to obtain the PCA?