Wednesday, March 13, 2013
New package for ensembling R models
I've written a new R package called caretEnsemble for creating ensembles of caret models in R. It currently works well for regression models, and I've written some preliminary support for binary classification models.
So far, I've implemented two algorithms for combining models:
1. Greedy stepwise ensembles (returns a weight for each model)
2. Stacks of caret models
(You can also specify the weights for a greedy ensemble manually.)
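To make the idea of per-model weights concrete, here's a minimal base-R sketch of a manually weighted ensemble. It doesn't use caretEnsemble at all: the prediction matrix, the model names (glm, rf, gbm), and the weights are all made up for illustration.

```r
# Hypothetical out-of-sample predictions from three models, one column each.
preds <- cbind(
  glm = c(1.0, 2.0, 3.0),
  rf  = c(1.2, 1.8, 3.3),
  gbm = c(0.9, 2.1, 2.7)
)

# Hand-picked weights (an assumption for this example), normalized to sum to 1.
w <- c(glm = 2, rf = 1, gbm = 1)
w <- w / sum(w)

# The ensemble prediction is the weighted average across models for each row.
# Indexing w by column name keeps weights and columns aligned.
ensemble_pred <- as.vector(preds %*% w[colnames(preds)])
```

A greedy ensemble produces the same kind of weight vector; the only difference is that the weights are found by a search rather than chosen by hand.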
The greedy algorithm is based on the work of Caruana et al. (2004) and was inspired by the medley package on GitHub. The stacking algorithm simply builds a second caret model on top of the existing models, using their predictions as inputs, so it inherits all of the flexibility of the caret package.
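For readers who want to see the two ideas side by side, here's a rough base-R sketch. It is not the package's implementation: `greedy_weights` and `stack_fit` are hypothetical helper names, the data are simulated, RMSE is an assumed error metric, and a plain `lm` stands in for the second-level caret model. The greedy part follows the general recipe of forward selection with replacement, where a model's weight is the fraction of times it was selected. The prediction matrix is assumed to already hold out-of-sample predictions.

```r
rmse <- function(pred, obs) sqrt(mean((pred - obs)^2))

# Greedy forward selection with replacement: at each step, add whichever
# model gives the lowest error for the running average of selected models.
# A model's weight is the share of steps in which it was picked.
greedy_weights <- function(preds, obs, iter = 50L) {
  counts  <- setNames(numeric(ncol(preds)), colnames(preds))
  running <- numeric(nrow(preds))  # sum of predictions selected so far
  for (i in seq_len(iter)) {
    # Error of the ensemble if each candidate model were added next.
    errs <- apply(preds, 2, function(p) rmse((running + p) / i, obs))
    best <- which.min(errs)
    counts[best] <- counts[best] + 1
    running <- running + preds[, best]
  }
  counts / sum(counts)
}

# Stacking: fit a second-level model that takes the first-level
# predictions as its inputs. lm() is only a stand-in here.
stack_fit <- function(preds, obs) {
  lm(obs ~ ., data = data.frame(obs = obs, preds))
}

# Simulated example: two useful models and one that is pure noise.
set.seed(42)
obs <- rnorm(200)
preds <- cbind(
  m1 = obs + rnorm(200, sd = 0.5),
  m2 = obs + rnorm(200, sd = 0.5),
  m3 = rnorm(200)
)

w <- greedy_weights(preds, obs)
greedy_pred <- as.vector(preds %*% w)

stack <- stack_fit(preds, obs)
stack_pred <- as.vector(predict(stack, newdata = data.frame(preds)))
```

The greedy approach can only produce non-negative weights that sum to one, while the stack is free to learn any function of the predictions that the second-level model can represent.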
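All the models in the ensemble must use the same training/test folds, so that their out-of-sample predictions line up. Both algorithms work from those out-of-sample predictions: the greedy algorithm uses them to find the weights, and the stacking algorithm uses them to train the stack. Here's a brief script demonstrating how to use the package: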
Please feel free to submit any comments here or on GitHub. I'd also be happy to include any patches you feel like submitting. In particular, I could use some help writing support for multi-class models, writing more tests, and fixing bugs.
Posted by Zachary Mayer at 7:36 AM