I am working with data on Indian schools. I want to build an index that combines several variables (availability of a library, number of computers in the school, number of toilets in the school) to reflect the overall infrastructure available in each school. I plan to use this index later in my regression specification.

I ran a PCA to construct this index. However, the first principal component explains only about 23% of the variation in the data, and the second explains about 20%. I have read that 23% is quite low, and that in such cases the first component cannot be used directly as the index.
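
For concreteness, here is a minimal sketch of the kind of PCA step described above. It uses synthetic data with hypothetical column names (`has_library`, `n_computers`, `n_toilets`) rather than the actual school data, and it assumes the variables are standardized before the decomposition, so the numbers it prints will not match the 23% and 20% reported here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the school data (hypothetical columns, not real values).
n_schools = 500
has_library = rng.integers(0, 2, n_schools)   # 0/1 indicator
n_computers = rng.poisson(5, n_schools)       # count variable
n_toilets = rng.poisson(3, n_schools)         # count variable
X = np.column_stack([has_library, n_computers, n_toilets]).astype(float)

# Standardize so that variables measured on different scales contribute comparably.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via eigen-decomposition of the correlation matrix.
corr = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)       # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]             # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Share of total variance captured by each component.
explained_ratio = eigvals / eigvals.sum()

# Component scores; the first column is the usual candidate for a PCA-based index.
scores = Z @ eigvecs
pc1_index = scores[:, 0]

print("explained variance ratio:", explained_ratio)
```

The `explained_ratio` values are the quantities I am referring to when I say the first component explains about 23% of the variation.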

Can somebody recommend what I should do in such a case? Is there any other way to construct the index? Is it sensible to combine the first two components? If so, how can I do it?
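
To make the last question concrete, the sketch below shows one thing "combining the first two components" could mean: a sum of the first two component scores, each weighted by its share of the variance those two components explain. The helper name `weighted_component_index`, the synthetic data, and the sign convention are all assumptions for illustration; whether such an index is statistically sensible is exactly what I am asking.

```python
import numpy as np

def weighted_component_index(X, k=2):
    """Hypothetical helper: variance-weighted sum of the first k PC scores."""
    # Standardize the columns.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)

    # Eigen-decomposition of the correlation matrix, keeping the top k components.
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1][:k]
    vals, vecs = eigvals[order], eigvecs[:, order]

    # The sign of each component is arbitrary. Assumed convention: flip a
    # component when its loadings sum to a negative number, so that higher
    # scores tend to mean more infrastructure.
    signs = np.where(vecs.sum(axis=0) < 0, -1.0, 1.0)
    vecs = vecs * signs

    # Weights proportional to each retained component's eigenvalue.
    weights = vals / vals.sum()
    return (Z @ vecs) @ weights

# Synthetic example with five made-up infrastructure variables.
rng = np.random.default_rng(1)
X = rng.poisson(4, size=(300, 5)).astype(float)
index = weighted_component_index(X, k=2)
print(index[:5])
```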