Dear Statalist users,
I'm a PhD student and a Stata beginner. My question may seem silly to many of you, but I have spent two days on this issue without solving it. I use a survey that records the income decile and electoral choice of every respondent. I have a score from a previous analysis that I need to use to "weight" the vote percentages for each party: a respondent's vote no longer counts as 1 but as the score of the party he/she voted for.
My objective is to draw a line graph with the weighted percentage of votes for several parties combined on the y axis and the income deciles on the x axis. My code does not work once the score is applied.
For drawing the unweighted graph for, let's say, the following two parties (M5S and LN) together, I wrote this code, where votes for parties are expressed in dummy variables (1 = yes; 0 = no):
**** LN
egen LN_income = total(LN), by(income)
egen LN_income_norm = count(LN), by(income)
gen LN_new = LN_income / LN_income_norm
sort income
** error bar
gen LN_new_error = sqrt(LN_new*(1 - LN_new)/LN_income_norm)
** to plot
gen LN_new_low = LN_new - LN_new_error
gen LN_new_high = LN_new + LN_new_error
********* M5S
egen M5S_income = total(M5S), by(income)
egen M5S_income_norm = count(M5S), by(income)
gen M5S_new = M5S_income / M5S_income_norm
sort income
** error bar
gen M5S_new_error = sqrt(M5S_new*(1 - M5S_new)/M5S_income_norm)
** to plot
gen M5S_new_low = M5S_new - M5S_new_error
gen M5S_new_high = M5S_new + M5S_new_error
**** sum of the two parties
gen sum_M5S_LN = (M5S_income / M5S_income_norm) + (LN_income / LN_income_norm)
gen sum_M5S_LN_income_norm = M5S_income_norm + LN_income_norm
sort income
** error bar
gen sum_M5S_LN_error = sqrt(sum_M5S_LN*(1 - sum_M5S_LN)/sum_M5S_LN_income_norm)
** to plot
gen sum_M5S_LN_low = sum_M5S_LN - sum_M5S_LN_error
gen sum_M5S_LN_high = sum_M5S_LN + sum_M5S_LN_error
*** final plot (basic form with no options)
line sum_M5S_LN income || rcap sum_M5S_LN_low sum_M5S_LN_high income
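As a cross-check on the unweighted construction above, here is a minimal sketch on made-up toy data (the eight observations are hypothetical, not from the survey). It shows that total/count of a 0/1 dummy within a decile is the same as its mean, so `egen ... = mean()` gives the share in one step; `LN_share` and `LN_share_se` are illustrative names.

```stata
* Toy data (hypothetical): two income deciles, LN is a 0/1 vote dummy
clear
input income LN
1 1
1 0
1 0
1 0
2 1
2 1
2 0
2 0
end

* Share of LN voters per decile, built as in the listing above
egen LN_income      = total(LN), by(income)
egen LN_income_norm = count(LN), by(income)
gen  LN_new         = LN_income / LN_income_norm

* Same quantity in one step: the mean of a 0/1 dummy is the share
egen LN_share = mean(LN), by(income)

* Binomial standard error of the share within each decile
gen LN_share_se = sqrt(LN_share * (1 - LN_share) / LN_income_norm)

list income LN_new LN_share LN_share_se, sepby(income)
```

The two constructions agree row by row, so the shorter form can replace the three-line version if preferred.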
When I apply the score, the code becomes:
******************************** LN
egen LN_income = total(LN), by(income)
egen LN_income_norm = count(LN), by(income)
gen LN_new = (LN_income / LN_income_norm) * 0.78
sort income
** error bar
gen LN_new_error = sqrt(LN_new*(1 - LN_new)/LN_income_norm)
** to plot
gen LN_new_low = LN_new - LN_new_error
gen LN_new_high = LN_new + LN_new_error
***************************** M5S
egen M5S_income = total(M5S), by(income)
egen M5S_income_norm = count(M5S), by(income)
gen M5S_new = (M5S_income / M5S_income_norm) * 0.56
sort income
** error bar
gen M5S_new_error = sqrt(M5S_new*(1 - M5S_new)/M5S_income_norm)
** to plot
gen M5S_new_low = M5S_new - M5S_new_error
gen M5S_new_high = M5S_new + M5S_new_error
**************** sum of the two parties
gen sum_M5S_LN = (M5S_income / M5S_income_norm) + (LN_income / LN_income_norm)
gen sum_M5S_LN_income_norm = M5S_income_norm + LN_income_norm
sort income
** error bar
gen sum_M5S_LN_error = sqrt(sum_M5S_LN*(1 - sum_M5S_LN)/sum_M5S_LN_income_norm)
** to plot
gen sum_M5S_LN_low = sum_M5S_LN - sum_M5S_LN_error
gen sum_M5S_LN_high = sum_M5S_LN + sum_M5S_LN_error
*** final plot (basic form)
line sum_M5S_LN income || rcap sum_M5S_LN_low sum_M5S_LN_high income
The problem is with the normalization: although the graph has a credible shape, the normalised values are wrong. I concluded that I must be missing an extra step, namely normalising by the sum of the two parties multiplied by the mean score after computing percentage*score, but I got lost on how to do this (provided I'm right).
I thank you in advance for your help,
J.
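One possible reading of the weighting described in the post, sketched on made-up toy data (the observations and the names `w_vote`, `sum_w`, `sum_w_check` and `mean_score` are hypothetical). Each respondent's vote counts as the score of the party voted for (0.78 for LN and 0.56 for M5S, the values used in the post), and the combined value per decile is the mean of that per-respondent weight. Note that in the scored listing `sum_M5S_LN` is still built from the unscored ratios rather than from `LN_new` and `M5S_new`, which may be part of the discrepancy. The final division is only one candidate normalisation and is an assumption, not a confirmed definition of what the graph should show.

```stata
* Toy data (hypothetical): each respondent votes for at most one party
clear
input income LN M5S
1 1 0
1 0 1
1 0 0
1 0 0
2 1 0
2 1 0
2 0 1
2 0 0
end

* Per-respondent weight: the score of the party voted for, 0 otherwise
* (0.78 and 0.56 are the scores used in the post)
gen double w_vote = 0.78 * LN + 0.56 * M5S

* Score-weighted combined share per decile
egen double sum_w = mean(w_vote), by(income)

* Equivalent construction from the two shares, each scaled by its score
egen double LN_share  = mean(LN),  by(income)
egen double M5S_share = mean(M5S), by(income)
gen  double sum_w_check = 0.78 * LN_share + 0.56 * M5S_share

* Assumption: one candidate normalisation divides by the unweighted
* combined share, giving the mean score among LN/M5S voters per decile
gen double mean_score = sum_w / (LN_share + M5S_share)

list income sum_w sum_w_check mean_score, sepby(income)
```

Working from a single per-respondent weight keeps the denominator equal to the number of respondents in the decile, rather than the doubled count that `sum_M5S_LN_income_norm` produces.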