I am trying to understand the relationship between two variables with non-parametric regression, using the commands -npregress-, -lpoly-, and -lowess-. Are they all considered kernel regressions?
As far as I understand:
(1) All of them fit a local regression at each point (i.e., observation) using a neighbourhood of points within the chosen bandwidth. The further a data point is from the point in question, the less weight it receives in that regression. Joining the fitted values from these local regressions gives a smooth function.
(2) The main difference is that -lowess- and -npregress- fit local linear regressions or local means, while -lpoly- fits local polynomial regressions (i.e., you can choose the degree of the polynomial). Therefore, -lpoly- seems more general.
(3) In addition, there are some differences in the default bandwidth and in whether more than one explanatory variable can be included. Are there any other (important) differences I am missing?
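To make items (1) and (2) concrete, here is a minimal sketch of a kernel-weighted local polynomial fit. It is written in Python with NumPy purely for illustration and is not how any of the three Stata commands is implemented. The function names `epanechnikov` and `local_poly_fit`, the choice of kernel, the fixed bandwidth, and the simulated data are all assumptions made for the example.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel: positive for |u| < 1, zero outside (assumed kernel)."""
    return np.where(np.abs(u) < 1.0, 0.75 * (1.0 - u**2), 0.0)

def local_poly_fit(x, y, x0, bandwidth, degree=1):
    """Fit a kernel-weighted polynomial around x0 and return the fitted value at x0.

    degree=0 gives a local mean, degree=1 a local linear fit,
    and higher degrees a local polynomial fit.
    """
    # Weights shrink with distance from x0 and are zero outside the bandwidth.
    w = epanechnikov((x - x0) / bandwidth)
    # Design matrix in powers of (x - x0): columns 1, (x - x0), (x - x0)^2, ...
    X = np.vander(x - x0, N=degree + 1, increasing=True)
    # Weighted least squares via ordinary least squares on sqrt-weighted data.
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    # Because x is centred at x0, the intercept is the fitted value at x0.
    return beta[0]

# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 500))
y = np.sin(x) + rng.normal(0, 0.3, 500)

# Evaluate the smooth on a small grid rather than at every observation.
grid = np.linspace(1, 9, 9)
local_mean = np.array([local_poly_fit(x, y, g, 1.0, degree=0) for g in grid])
local_linear = np.array([local_poly_fit(x, y, g, 1.0, degree=1) for g in grid])
```

In this sketch the only thing that changes between a local mean, a local linear fit, and a higher-order local polynomial is the `degree` argument. The cost grows with the number of evaluation points times the number of observations inside each window.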
I have been trying all three commands with the same regression specification. The -lpoly- regression has proven to be a lot faster with my data, which I do not understand, given that it seems to be the most general estimator (see item (2) above). The specification with -npregress- has been taking forever: it has been 30 hours and the command is still running (I have 14 million observations). I ran the same specification with -lpoly- and got the result in less than 2 hours. I have also been running the same specification with -lowess- for the past 5 hours and still have no result.
Is there any way I could speed up the estimation with -npregress- and -lowess-? I am only interested in the predictions, not in the standard errors.
Many thanks
Paula