Hey all,
I wrote a couple of Stata packages for regression and classification with random forests and neural networks (specifically, multi-layer perceptrons) in Stata 16. These programs are basically wrappers for estimators in the popular Python library scikit-learn. The packages automatically load the required Stata variables into Python, fit the chosen scikit-learn model to the data, and return predictions and other information to Stata's interface. This is essentially an expanded version of the example .ado file provided in Stata's release notes for the new Stata Function Interface.
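For anyone curious what the Python side of that workflow looks like, here is a rough sketch using scikit-learn directly. It is not the packages' actual code: the simulated arrays, the variable names, and the 150/50 split are all made up for illustration, and plain NumPy arrays stand in for the Stata variables so the snippet runs outside Stata.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Stand-in for Stata variables (hypothetical data): inside Stata these
# arrays would be read from the dataset in memory rather than simulated.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))  # three made-up predictors
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=200)  # made-up outcome

# Use the first 150 observations for training (arbitrary split).
train = np.arange(200) < 150

# Fit a random forest on the training rows only.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[train], y[train])

# Predict for every observation; a vector like this is what would be
# handed back to Stata as a new variable.
yhat = model.predict(X)

# "Other information" could include things like feature importances.
importances = model.feature_importances_
print(yhat[:5])
print(importances)
```

The same pattern applies to classification by swapping in RandomForestClassifier.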
I split these into two separate packages:
1. pyforest.ado - regression and classification with random forests
2. pymlp.ado - regression and classification with multi-layer perceptrons
The syntax for specifying optional arguments is nearly identical to the syntax used in scikit-learn. This means that the scikit-learn documentation is also a readable reference for using these packages. Of course, both of these packages also contain built-in Stata help files.
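To give a sense of the scikit-learn arguments in question, here is a small sketch in plain Python. The keyword names (n_estimators, max_depth, hidden_layer_sizes, max_iter) are scikit-learn's own; which of them each Stata command accepts, and how they are spelled there, is something to check in the help files. The simulated data and the particular values are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

# Simulated binary classification data (illustration only).
X, y = make_classification(n_samples=300, n_features=5, random_state=0)

# Random forest: number of trees and maximum tree depth.
forest = RandomForestClassifier(n_estimators=50, max_depth=4, random_state=0)
forest.fit(X, y)

# Multi-layer perceptron: two hidden layers with 16 and 8 units.
mlp = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=500, random_state=0)
mlp.fit(X, y)

# Predicted class labels for every observation.
forest_pred = forest.predict(X)
mlp_pred = mlp.predict(X)

# get_params() lists every optional argument with its current value,
# which is a quick way to see what can be tuned.
print(forest.get_params()["max_depth"])
print(mlp.get_params()["hidden_layer_sizes"])
```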
You can read a bit more about these packages and find installation instructions on GitHub:
https://github.com/mdroste/stata-pyforest
https://github.com/mdroste/stata-pymlp
I am still actively developing both of these packages, and I plan to submit them to SSC very soon. I am sure there are some bugs that will need to be fixed before then, since I put both of them together over the last two days or so. There's a whole bunch of stuff that I think should be added, but since both seem to be very much usable right now, I figured it's worth posting what I have for now.
If you have any issues with these packages, please let me know either on this thread or on GitHub.
I hope this is useful!
Mike