I have survey data on 10,000 delivery persons. For each person I have the number of deliveries they made in each of 49 neighborhoods (that is 49 columns, plus 1 column for "other neighborhoods") and one column with their earnings per hour (EPH). I am trying to show that a person's EPH depends on their choice of neighborhoods to work in.
For initial exploration, I divided the 10,000 persons into 10 groups based on EPH (low to high) and plotted the fraction of deliveries each group made in different neighborhoods. The graph shows that the high-earning groups pick the rich neighborhoods more often. (For the plot I arbitrarily picked one neighborhood that I know to be rich and one that I know to be poor. I also used "fraction of deliveries made in neighborhood X" instead of "total deliveries in neighborhood X" because the delivery persons have different years of experience.)
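The grouping step described above can be sketched as follows. This is only an illustration on synthetic data, not my actual survey: the function name `decile_shares`, the three made-up neighborhoods (rich, poor, other) and the simulated EPH are all assumptions for the example.

```python
import random

def decile_shares(counts, eph, n_groups=10):
    """Mean delivery share per neighborhood within each EPH group (low to high).

    counts: one list of per-neighborhood delivery counts per person.
    eph:    earnings per hour, one value per person.
    """
    n = len(eph)
    if n < n_groups:
        raise ValueError("need at least as many persons as groups")
    # Convert counts to within-person shares so that total experience cancels out.
    shares = []
    for row in counts:
        total = sum(row)
        shares.append([c / total for c in row] if total > 0 else [0.0] * len(row))
    # Rank persons by EPH and cut the ranking into equal-sized groups.
    order = sorted(range(n), key=lambda i: eph[i])
    groups = [[] for _ in range(n_groups)]
    for rank, i in enumerate(order):
        groups[rank * n_groups // n].append(i)
    k = len(counts[0])
    return [[sum(shares[i][j] for i in g) / len(g) for j in range(k)] for g in groups]

# Synthetic data (hypothetical): columns are rich, poor, other neighborhoods.
random.seed(0)
counts, eph = [], []
for _ in range(1000):
    total = random.randint(50, 500)           # experience differs across persons
    rich = round(total * random.random() * 0.6)
    poor = round((total - rich) * 0.5)
    counts.append([rich, poor, total - rich - poor])
    # Assumed relationship, for the simulation only: EPH rises with the rich share.
    eph.append(10 + 10 * rich / total + random.gauss(0, 0.5))

result = decile_shares(counts, eph)
for g, row in enumerate(result, start=1):
    print(g, [round(s, 3) for s in row])
```

Each printed row is one EPH group, and plotting a single column across the 10 rows gives the kind of graph I described. The groups are built from ranks, so ties in EPH are split arbitrarily.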
My question is how do I statistically model the variation in EPH due to choice of neighborhoods? What kind of regression model should I be studying? How do I deal with the fact that neighborhoods are spatially correlated?
Additionally,
1) I have data from the delivery company that gives the true EPH per neighborhood, aggregated over ALL deliveries by ALL delivery persons. That is 125 rows for 125 neighborhoods and 2 columns, EPH and Total Deliveries. How do I incorporate this information in my modelling?
2) I have the Well-Known Text (WKT) representation of the neighborhood boundaries.
Thanks!