Hi Statalist,

I’m new to spatial regression and am using the xsmle command to regress sex ratios at birth across Indian states. My goal is to measure cultural spillover effects regarding son preference and sex selection throughout the country.

I’m estimating both SDM and SAR models with a spatial lag of y, a temporal lag of y, and a spatial-temporal lag of y. I began with a simple inverse-distance weights matrix, which seemed to work fine. When I instead create my own weights matrix, the regression results vary wildly depending on whether and how I normalize it (row-normalized versus spectral versus not normalized). The weights take the form w_ij = [Pop_i^a * Pop_j^b] / Dist_ij^c, where Pop_i is the population of state i, Pop_j is the population of state j, Dist_ij is the distance between the centroids of states i and j based on latitude and longitude coordinates, and the parameters a, b, and c are estimated from a gravity equation so that w_ij represents migrants coming from state j to state i. I use the ppml command to estimate the gravity equation, then spmat to create the matrix. Unlike the idistance matrix, this one is asymmetric: migration from state i to state j differs from migration from state j to state i, so a and b are not equal.
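A minimal sketch of the weight construction and the three normalization choices described above, written in Python with numpy rather than Stata. The populations, distances, and exponents a, b, c are made-up placeholders (not estimates from the gravity equation), and the function names are hypothetical. The spectral version divides by the largest eigenvalue modulus, which is my understanding of what spmat's spectral normalization does.

```python
import numpy as np

def gravity_weights(pop, dist, a, b, c):
    """w_ij = pop_i^a * pop_j^b / dist_ij^c, with zeros on the diagonal."""
    pop = np.asarray(pop, dtype=float)
    dist = np.asarray(dist, dtype=float)
    w = np.zeros_like(dist)
    off = ~np.eye(len(pop), dtype=bool)      # mask for off-diagonal cells
    num = np.outer(pop ** a, pop ** b)       # num[i, j] = pop_i^a * pop_j^b
    w[off] = num[off] / dist[off] ** c       # skip the diagonal (dist = 0)
    return w

def row_normalize(w):
    """Scale each row so that it sums to one."""
    return w / w.sum(axis=1, keepdims=True)

def spectral_normalize(w):
    """Scale the whole matrix by its largest eigenvalue modulus."""
    return w / np.max(np.abs(np.linalg.eigvals(w)))

# Hypothetical inputs: three states, placeholder exponents.
pop = np.array([10.0, 50.0, 200.0])
dist = np.array([[0.0, 300.0, 900.0],
                 [300.0, 0.0, 700.0],
                 [900.0, 700.0, 0.0]])

W = gravity_weights(pop, dist, a=0.8, b=0.5, c=1.2)
W_row = row_normalize(W)
W_spec = spectral_normalize(W)

print("asymmetric:", not np.allclose(W, W.T))
print("row sums after row normalization:", W_row.sum(axis=1))
```

Note that row normalization rescales each row by a different factor, so it changes the relative weights across rows, whereas spectral normalization rescales every entry by the same constant.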

Specifically, the problem is that depending on whether and how I normalize my weights matrix, the regression produces autoregressive parameter estimates that raise red flags:

1) When I normalize the weights matrix, the parameter on the spatial lag is much larger than 1 in the SDM but not in the SAR model. Why/how can the spatial lag parameter be greater than 1? I don’t believe that there is an explosive process here, and this outcome is also inconsistent. When I don’t normalize, the parameter is less than 1, unless I include a state-time trend in the model, so again, the results are not consistent.

2) The parameters on the temporal lag and the spatial-temporal lag switch between positive and negative depending on how the matrix is normalized. While it is theoretically possible for these parameters to be negative, the sensitivity of these outcomes to the weights matrix is concerning.
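One check related to point 1 can be sketched under stated assumptions. For a pure spatial lag, I - rho*W is guaranteed invertible when |rho| times the spectral radius of W is below 1, so the scale on which rho should be read depends on the normalization. The Python snippet below uses a small made-up asymmetric matrix (not my actual weights) and a hypothetical helper name. It ignores the temporal and spatial-temporal lags, which add further stationarity conditions.

```python
import numpy as np

def spectral_radius(w):
    """Largest eigenvalue modulus (eigenvalues may be complex if w is asymmetric)."""
    return float(np.max(np.abs(np.linalg.eigvals(w))))

# Hypothetical asymmetric, nonnegative weights with a zero diagonal.
W = np.array([[0.0, 2.0, 0.5],
              [4.0, 0.0, 1.0],
              [3.0, 6.0, 0.0]])

variants = {
    "raw": W,
    "row": W / W.sum(axis=1, keepdims=True),
    "spectral": W / spectral_radius(W),
}

# Sufficient bound: |rho| < 1 / spectral_radius(W) keeps I - rho*W invertible.
bounds = {}
for name, w in variants.items():
    bounds[name] = 1.0 / spectral_radius(w)
    print(f"{name:9s} spectral radius = {spectral_radius(w):.4f}, "
          f"|rho| bound = {bounds[name]:.4f}")
```

In this toy case the bound is 1 for both normalized versions but well below 1 for the raw matrix, so a coefficient estimated on unnormalized weights is not on the same scale as one estimated on normalized weights.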

I really appreciate any insights on this!