Dear Statalist users,
I'm a PhD student and a Stata beginner. My question will probably seem silly to many of you, but I have spent two days on this issue without solving it. I use a survey that records the income decile and the electoral choice of every respondent. I have a score, built in a previous analysis, that I need to use to "weight" the actual percentages of votes for each party: a respondent's vote no longer counts as 1 but as the score of the party he/she voted for.
My objective is to make a line graph with the weighted percentage of votes for several parties combined on the y axis and the income deciles on the x axis. My code isn't working once the score is applied.
To draw the unweighted graph for, say, two parties (M5S and LN) combined, I wrote the following code, where votes for each party are stored in dummy variables (1 = yes; 0 = no):
**** LN
* votes for LN and number of non-missing answers in each income decile
egen LN_income = total(LN), by(income)
egen LN_income_norm = count(LN), by(income)
* share of LN votes in each decile
gen LN_new = LN_income / LN_income_norm
sort income
** error bar (standard error of a proportion)
gen LN_new_error = sqrt(LN_new * (1 - LN_new) / LN_income_norm)
** to plot
gen LN_new_low = LN_new - LN_new_error
gen LN_new_high = LN_new + LN_new_error
********* M5S
egen M5S_income = total(M5S), by(income)
egen M5S_income_norm = count(M5S), by(income)
gen M5S_new = M5S_income / M5S_income_norm
sort income
** error bar
gen M5S_new_error = sqrt(M5S_new * (1 - M5S_new) / M5S_income_norm)
** to plot
gen M5S_new_low = M5S_new - M5S_new_error
gen M5S_new_high = M5S_new + M5S_new_error
**** sum of the two parties
gen sum_M5S_LN = (M5S_income / M5S_income_norm) + (LN_income / LN_income_norm)
gen sum_M5S_LN_income_norm = M5S_income_norm + LN_income_norm
sort income
** error bar
gen sum_M5S_LN_error = sqrt(sum_M5S_LN * (1 - sum_M5S_LN) / sum_M5S_LN_income_norm)
** to plot
gen sum_M5S_LN_low = sum_M5S_LN - sum_M5S_LN_error
gen sum_M5S_LN_high = sum_M5S_LN + sum_M5S_LN_error
*** final plot (basic form with no options)
line sum_M5S_LN income || rcap sum_M5S_LN_low sum_M5S_LN_high income
When I apply the score (0.78 for LN and 0.56 for M5S), the code becomes:
**** LN
egen LN_income = total(LN), by(income)
egen LN_income_norm = count(LN), by(income)
gen LN_new = (LN_income / LN_income_norm) * 0.78
sort income
** error bar
gen LN_new_error = sqrt(LN_new * (1 - LN_new) / LN_income_norm)
** to plot
gen LN_new_low = LN_new - LN_new_error
gen LN_new_high = LN_new + LN_new_error
********* M5S
egen M5S_income = total(M5S), by(income)
egen M5S_income_norm = count(M5S), by(income)
gen M5S_new = (M5S_income / M5S_income_norm) * 0.56
sort income
** error bar
gen M5S_new_error = sqrt(M5S_new * (1 - M5S_new) / M5S_income_norm)
** to plot
gen M5S_new_low = M5S_new - M5S_new_error
gen M5S_new_high = M5S_new + M5S_new_error
**** sum of the two parties
gen sum_M5S_LN = (M5S_income / M5S_income_norm) + (LN_income / LN_income_norm)
gen sum_M5S_LN_income_norm = M5S_income_norm + LN_income_norm
sort income
** error bar
gen sum_M5S_LN_error = sqrt(sum_M5S_LN * (1 - sum_M5S_LN) / sum_M5S_LN_income_norm)
** to plot
gen sum_M5S_LN_low = sum_M5S_LN - sum_M5S_LN_error
gen sum_M5S_LN_high = sum_M5S_LN + sum_M5S_LN_error
*** final plot (basic form)
line sum_M5S_LN income || rcap sum_M5S_LN_low sum_M5S_LN_high income
The problem is with the normalisation: although the graph has a credible shape, the normalised values are wrong. My reasoning is that I am missing an extra step, namely normalising by the sum of the two parties*mean_score after obtaining the percentage*score, but I got lost on how to do this (provided I'm right).
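To make the idea concrete, here is a minimal sketch of one possible normalisation, run on simulated data rather than on the survey. It assumes that every respondent's vote gets a weight equal to the score of the party voted for, and that the weighted share is the weighted votes for M5S and LN divided by the weighted votes for all parties in the same decile. The scores 0.78 and 0.56 are the ones used above; the score of 0.50 for all other parties and the names w, w_two, num, den, share_M5S_LN and first are placeholders made up for the illustration. Note also that in the weighted code as posted, sum_M5S_LN still adds the unweighted shares, so the scores never reach the plotted line.

```stata
* Toy data only: 1,000 simulated respondents (not the real survey)
clear
set seed 12345
set obs 1000
* income decile, 1 to 10
gen income = ceil(10 * runiform())
* mutually exclusive vote dummies (1 = yes; 0 = no)
gen u = runiform()
gen LN = u < 0.20
gen M5S = u >= 0.20 & u < 0.50

* weight of each vote = score of the party voted for
* (0.50 for every other party is a made-up placeholder)
gen w = 0.50
replace w = 0.78 if LN == 1
replace w = 0.56 if M5S == 1

* weighted votes for the two parties and for all parties, by decile
gen w_two = w * (LN == 1 | M5S == 1)
egen num = total(w_two), by(income)
egen den = total(w), by(income)

* weighted share: all parties together sum to 1 within each decile
gen share_M5S_LN = num / den

* one observation per decile is enough for the plot
egen first = tag(income)
sort income
line share_M5S_LN income if first
```

Under this definition the share always stays between 0 and 1, because the denominator contains the same scores as the numerator. Whether this is the right normalisation depends on what the score is meant to represent, and the error bars would need a formula suited to weighted shares, which this sketch does not attempt.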
I thank you in advance for your help,
J.