Dear Statalist users,
I'm a PhD student and a Stata beginner. My question will probably seem silly to many of you, but I have spent two days on this issue without solving it. I am using a survey that records the income decile and the electoral choice of every respondent. From a previous analysis I have a score for each party, which I need to use to "weight" the actual percentages of votes for each party: a respondent's vote no longer counts as 1 but as the score of the party he/she voted for.
My objective is to draw a line graph with the weighted percentage of votes for the parties combined on the y axis and the income deciles on the x axis. My code does not work once the score is applied.
To draw the unweighted graph for, say, two parties together (M5S and LN), I wrote the following code, where the vote for each party is a dummy variable (1 = yes; 0 = no):
**** LN
egen LN_income = total (LN), by(income)
egen LN_income_norm = count (LN), by(income)
gen LN_new = (LN_income/ LN_income_norm)
sort income
** error bar
gen LN_new_error = sqrt(LN_new*(1 - LN_new)/LN_income_norm)
** to plot
gen LN_new_low = LN_new - LN_new_error
gen LN_new_high = LN_new + LN_new_error
********* M5S
egen M5S_income = total (M5S), by(income)
egen M5S_income_norm = count (M5S), by(income)
gen M5S_new = (M5S_income/ M5S_income_norm)
sort income
** error bar
gen M5S_new_error = sqrt(M5S_new*(1 - M5S_new)/M5S_income_norm)
** to plot
gen M5S_new_low = M5S_new - M5S_new_error
gen M5S_new_high = M5S_new + M5S_new_error
**** sum of the two parties
gen sum_M5S_LN = (M5S_income/ M5S_income_norm) + (LN_income/ LN_income_norm)
gen sum_M5S_LN_income_norm = M5S_income_norm + LN_income_norm
sort income
** error bar
gen sum_M5S_LN_error = sqrt(sum_M5S_LN*(1 - sum_M5S_LN)/sum_M5S_LN_income_norm)
** to plot
gen sum_M5S_LN_low = sum_M5S_LN - sum_M5S_LN_error
gen sum_M5S_LN_high = sum_M5S_LN + sum_M5S_LN_error
*** final plot (basic form with no options)
line sum_M5S_LN income || rcap sum_M5S_LN_low sum_M5S_LN_high income
When I apply the score, the code becomes:
******************************** LN
egen LN_income = total (LN), by(income)
egen LN_income_norm = count (LN), by(income)
gen LN_new = (LN_income/ LN_income_norm)* 0.78
sort income
** error bar
gen LN_new_error = sqrt(LN_new*(1 - LN_new)/LN_income_norm)
** to plot
gen LN_new_low = LN_new - LN_new_error
gen LN_new_high = LN_new + LN_new_error
***************************** M5S
egen M5S_income = total (M5S), by(income)
egen M5S_income_norm = count (M5S), by(income)
gen M5S_new = (M5S_income/ M5S_income_norm) *0.56
sort income
** error bar
gen M5S_new_error = sqrt(M5S_new*(1 - M5S_new)/M5S_income_norm)
** to plot
gen M5S_new_low = M5S_new - M5S_new_error
gen M5S_new_high = M5S_new + M5S_new_error
**************** sum of the two parties
gen sum_M5S_LN = (M5S_income/ M5S_income_norm) + (LN_income/ LN_income_norm)
gen sum_M5S_LN_income_norm = M5S_income_norm + LN_income_norm
sort income
** error bar
gen sum_M5S_LN_error = sqrt(sum_M5S_LN*(1 - sum_M5S_LN)/sum_M5S_LN_income_norm)
** to plot
gen sum_M5S_LN_low = sum_M5S_LN - sum_M5S_LN_error
gen sum_M5S_LN_high = sum_M5S_LN + sum_M5S_LN_error
*** final plot (basic form)
line sum_M5S_LN income || rcap sum_M5S_LN_low sum_M5S_LN_high income
The problem is with the normalisation: although the graph has a credible shape, the normalised values are wrong. My reasoning is that I am missing an extra step, after computing percentage*score, that normalises by the sum of the two parties*mean_score, but I got lost on how to do this (assuming I'm right).
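To make the intended calculation concrete, here is a minimal sketch on toy data of the weighted share as described above, where each vote counts as the score of the party voted for. The scores 0.78 (LN) and 0.56 (M5S) come from the code above. The toy dataset and the variable names w_vote, w_total, n_resp, sum_w and tag are hypothetical. Dividing by the number of respondents in the decile is an assumption and may not be the normalisation that is actually needed. Note also that the weighted code above builds sum_M5S_LN from the raw ratios rather than from LN_new and M5S_new, so the scores never reach the plotted sum.

```stata
* Toy data (hypothetical): two income deciles, one row per respondent
clear
input income LN M5S
1 1 0
1 0 1
1 0 0
1 0 0
2 1 0
2 1 0
2 0 1
2 0 0
end

* Scores from the post: 0.78 for LN, 0.56 for M5S.
* Each respondent contributes the score of the party voted for, 0 otherwise.
gen double w_vote = 0.78*LN + 0.56*M5S

* Weighted numerator and number of respondents per decile
bysort income: egen double w_total = total(w_vote)
bysort income: egen n_resp = count(w_vote)

* Weighted share, normalised by the respondent count (an assumption)
gen double sum_w = w_total / n_resp

* Show one row per decile
egen byte tag = tag(income)
list income w_total n_resp sum_w if tag, noobs
```

Error bars are left out of this sketch: the formula sqrt(p*(1 - p)/n) used above is for a proportion, and the weighted quantity is no longer a simple proportion.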
I thank you in advance for your help,
J.