Hey all,
I wrote a couple of Stata packages to do regression or classification with random forests and neural networks (specifically, multi-layer perceptrons) in Stata 16. These programs are basically wrappers for methods in the popular Python library scikit-learn. The packages automatically load the required Stata variables into Python, fit a scikit-learn model to the data, and return predictions and other information to Stata. This is essentially an expanded version of the example .ado file provided in Stata's release notes for the new Stata Function Interface.
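To give a rough idea of the Python side of that workflow, here is a minimal sketch rather than the actual package code. It assumes scikit-learn is installed and uses synthetic lists in place of Stata variables; the variable names and parameter values are made up for illustration. Inside Stata, the data would instead be read and written through the `sfi` module.

```python
import random

from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for data pulled from Stata (assumption for this sketch).
# Inside Stata 16, the values would come from the sfi module instead,
# e.g. via sfi.Data.get, and predictions would go back via sfi.Data.store.
random.seed(0)
X = [[random.random(), random.random()] for _ in range(50)]  # two predictors
y = [2.0 * x1 + x2 for x1, x2 in X]                          # outcome

# Fit a random forest; the keyword names are scikit-learn's own.
model = RandomForestRegressor(n_estimators=100, max_depth=5, random_state=0)
model.fit(X, y)

# One prediction per observation, ready to be stored in a new variable.
predictions = model.predict(X)
print(len(predictions))
```

Swapping in `RandomForestClassifier`, or `MLPRegressor` / `MLPClassifier` from `sklearn.neural_network`, follows the same fit-then-predict pattern.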
I split these into two separate packages:
1. pyforest.ado - regression and classification with random forests
2. pymlp.ado - regression and classification with multi-layer perceptrons
The syntax for specifying optional arguments is nearly identical to the syntax used in scikit-learn. This means that the scikit-learn documentation is also a readable reference for using these packages. Of course, both of these packages also contain built-in Stata help files.
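As an illustration of how Stata-style options could line up with scikit-learn keyword arguments, here is a small sketch. The `parse_options` helper and the example option string are hypothetical and are not taken from the packages; the help files are the reference for the real syntax. The option names themselves (`n_estimators`, `max_depth`, `max_features`) are scikit-learn's.

```python
import ast
import re


def parse_options(option_string):
    """Turn Stata-style `name(value)` options into a keyword-argument dict.

    Hypothetical helper for illustration only.
    """
    kwargs = {}
    for name, raw in re.findall(r"(\w+)\(([^()]*)\)", option_string):
        try:
            # Numbers, True/False, and None become Python literals.
            kwargs[name] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything else (e.g. sqrt) is kept as a plain string.
            kwargs[name] = raw
    return kwargs


opts = parse_options("n_estimators(200) max_depth(5) max_features(sqrt)")
print(opts)
# {'n_estimators': 200, 'max_depth': 5, 'max_features': 'sqrt'}
```

A dict like this could then be passed straight to a scikit-learn estimator, for example `RandomForestRegressor(**opts)`.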
You can read a bit more about these packages and install them with instructions on GitHub:
https://github.com/mdroste/stata-pyforest
https://github.com/mdroste/stata-pymlp
I am still actively developing both packages and plan to submit them to SSC very soon. I am sure there are some bugs that will need to be fixed before then, since I put both of them together over the last two days or so. There is a lot that I think should still be added, but since both seem to be very much usable right now, I figured it was worth posting what I have.
If you have any issues with these packages, definitely let me know either on this thread or on GitHub.
I hope this is useful!
Mike