Hello,

I have just started using Stata for my econometrics course. I am trying to build an OLS model examining the determinants of income. As explanatory variables I chose education (recoded into 4 dummy variables according to the level of education obtained), place of living (recoded into 3 dummy variables according to the size of the place, i.e. 1=countryside, 2=town, 3=big city), gender (0=male, 1=female), marital status (0=not married, 1=married) and age (I only include respondents with age>18). Also, from the distribution of income (in thousands) I know it is worth considering the log of income, but using it barely changed my analysis (if it changed anything at all). All of the chosen explanatory variables seem reasonable; however, when I try to plot, for instance, income against education, I get the following result:
[Attached image: plot of income against education]
When I try to plot education and place of living, I obtain more or less the same graph. However, when I examine education or any other variable with the tab command, all the data look fine. I am overwhelmed by the task since I have just started using this software. I would be extremely grateful for any advice on how to fix this issue, and for your patience: I am very eager to learn, but I clearly have some trouble understanding what I am doing. I hope it is easier to solve than it seems.
If needed, here is how I created my dummy variables:

**THE PLACE OF LIVING**
recode domicil 1=5 2=4 4=2 5=1
label variable domicil "pl_living"
label define pl_living 1 "Farm or home in countryside" 2 "Country village" 3 "Town or small city" 4 "Suburbs or outskirts of big city" 5 "big city"
label values domicil pl_living
codebook domicil
replace domicil=1 if domicil>=1 & domicil<=2 //countryside
replace domicil=2 if domicil>=3 & domicil<=4 //town
replace domicil=3 if domicil==5 //big city
tab domicil, gen(domicil)


**THE LEVEL OF EDUCATION BASED ON POLISH SCHOOLING SYSTEM, NO EDUCATION DROPPED**
drop if edlvgpl==1

replace edlvgpl = 1 if inlist(edlvgpl, 2, 3, 4)
replace edlvgpl = 2 if inlist(edlvgpl, 5, 6)
replace edlvgpl = 3 if inrange(edlvgpl, 7, 12)
replace edlvgpl = 4 if inrange(edlvgpl, 13, 15)
tab edlvgpl, gen(edlvgpl)
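
In case it helps anyone check my grouping: below is a minimal sketch of how the same categories could presumably be built with recode and generate(), so that the original codes are not overwritten and can be cross-tabulated against the new ones. The toy data, the new variable names place3 and educ4, the label name place3_lbl and the short category labels are all my own assumptions for illustration, not part of my real dataset. The sketch also assumes domicil has already been reversed as in the first recode above and that the "no education" category has already been dropped. It is meant to be run from a do-file, because of the /// line continuation.

```stata
* Toy data standing in for the survey file (values are made up for illustration)
clear
set obs 10
generate domicil = mod(_n, 5) + 1      // codes 1..5, assumed already reversed
generate edlvgpl = mod(_n, 14) + 2     // codes from 2 upward, "no education" assumed dropped

* Group into NEW variables so the original codes stay available for checking
recode domicil (1 2 = 1 "Countryside") (3 4 = 2 "Town") (5 = 3 "Big city"), ///
    generate(place3) label(place3_lbl)
recode edlvgpl (2/4 = 1) (5 6 = 2) (7/12 = 3) (13/15 = 4), generate(educ4)

* Cross-check the grouping against the original codes
tabulate domicil place3
tabulate edlvgpl educ4
```

If the cross-tabulations show each original code falling into exactly one new group, the grouping itself is probably not what is causing the strange plots.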