Hi,
I am writing to ask whether there is a Stata command (or a manual approach) for applying random forest (or another machine learning algorithm) to time-series data.
I have household-level panel survey data collected from 2,800 households over 10 years (28,000 observations in total), with 130 variables.
My goal is to use machine learning to select features from 100 right-hand-side variables and find the model that best predicts a continuous outcome variable.
The problem is that machine learning requires the training data and the test data to be independent, which is clearly violated in serially correlated time-series data.
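To make the concern concrete, here is a minimal sketch of a rolling-origin split, written in plain Python (Stata 16 can call Python through its built-in integration). It is only an illustration under stated assumptions: the function name `rolling_origin_splits`, the `min_train_years` parameter, and the toy years 2001 to 2010 are all hypothetical and not taken from any Stata package. Each fold trains on earlier years and tests on the following year, so no test observation precedes its training data. It does not remove correlation within a household across years.

```python
from typing import Iterator, List, Sequence, Tuple


def rolling_origin_splits(
    years: Sequence[int], min_train_years: int = 5
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs of row positions.

    Each fold trains on all rows from the first k distinct years and
    tests on the rows from the next year, so the test year always
    comes after every training year.
    """
    unique_years = sorted(set(years))
    for k in range(min_train_years, len(unique_years)):
        test_year = unique_years[k]
        train_years = set(unique_years[:k])
        train_idx = [i for i, y in enumerate(years) if y in train_years]
        test_idx = [i for i, y in enumerate(years) if y == test_year]
        yield train_idx, test_idx


# Toy panel (hypothetical): 3 households observed in each of 10 years.
years = [2001 + t for _household in range(3) for t in range(10)]
splits = list(rolling_origin_splits(years, min_train_years=5))
```

Any learner, random forest included, could then be fit on each `train_idx` and scored on the matching `test_idx`, with the errors averaged across folds.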
I had no problem running lasso: I used "cvlasso", a user-written command that allows users to run lasso with time-series data by cross-validating it properly.
However, I have not found any Stata command that runs random forest with time-series data. I checked "rforest" and "chaidforest", but neither seems to handle autocorrelated time-series data.
Is there any Stata command, or a manual way, to run random forest with time-series data?
More generally, is there any machine learning algorithm other than lasso and random forest that works well with time-series data? I just need one more machine learning algorithm to run so that I can compare it with lasso.
Thank you.
System: Windows 10
Stata Version: 16.1 MP