What is Ensemble Learning? How many types of ensemble methods are there?

Medium Last updated on May 7, 2022, 1:27 a.m.

An ensemble is simply a collection of models that are all trained to perform the same task. It can consist of many different versions of the same model, or many different types of models. The final output of an ensemble is typically obtained through a (weighted) average of the models' outputs or a vote over their predictions.

It can be shown that an ensemble of different models that all achieve similar generalization performance often outperforms any of the individual models, because combining them reduces the variance of the predictions. For example, suppose we have an ensemble of binary classification functions $$ f_{k}(x), k = 1, 2, ..., K $$ and assume each has the same expected error rate, which is less than 0.5, and that these errors are independent across classifiers. The main intuition of ensembles is that on many examples where any individual classifier makes an error, the majority of the K classifiers will still be correct. A simple majority vote can therefore significantly improve classification performance by decreasing variance.
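Under the independence assumption, the majority-vote error is just a binomial tail probability, which makes the variance-reduction effect easy to quantify. A minimal sketch (the function name and the example numbers K = 11, error rate 0.3 are illustrative choices, not from the text):

```python
from math import comb

def majority_vote_error(K: int, p: float) -> float:
    """Probability that a majority of K independent classifiers,
    each with error rate p, is wrong on a given example."""
    # The vote fails when more than half of the K classifiers err.
    return sum(comb(K, k) * p**k * (1 - p)**(K - k)
               for k in range(K // 2 + 1, K + 1))

print(majority_vote_error(1, 0.3))   # single model: 0.3
print(majority_vote_error(11, 0.3))  # ensemble of 11: ~0.078
```

With 11 independent classifiers at 30% error each, the majority vote errs on only about 8% of examples, illustrating why independence of errors is the key assumption.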

The three main types of ensemble learning methods are bagging, stacking, and boosting.



Bootstrap aggregation, or bagging, approximates the idealized independent ensemble above by taking a single training set Tr and randomly sub-sampling from it K times (with replacement) to form K training sets $$ Tr_{1}, Tr_{2}, ..., Tr_{K} $$. Each of these training sets is used to train a different instance of the same classifier, yielding K classification functions $$ f_{1}(x), f_{2}(x), ..., f_{K}(x) $$

Note that the errors of these classification functions won't be fully independent, because the K training sets overlap; however, the random re-sampling usually introduces enough diversity to decrease the variance and improve performance.
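The two steps above (bootstrap resampling, then a majority vote over the K fitted classifiers) can be sketched as follows; this is an illustrative implementation using a synthetic dataset and decision trees, not a specific recipe from the text:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Draw K bootstrap samples Tr_1..Tr_K (with replacement) and fit
# one classifier f_k on each.
K = 25
models = []
for _ in range(K):
    idx = rng.integers(0, len(X_tr), size=len(X_tr))  # sample with replacement
    models.append(DecisionTreeClassifier().fit(X_tr[idx], y_tr[idx]))

# Combine the K classifiers by majority vote.
votes = np.stack([m.predict(X_te) for m in models])
ensemble_pred = (votes.mean(axis=0) > 0.5).astype(int)
acc = (ensemble_pred == y_te).mean()
print(f"bagged ensemble accuracy: {acc:.3f}")
```

In practice this loop is what `sklearn.ensemble.BaggingClassifier` automates.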

Usage of Bagging:

  1. Bagging is particularly useful for high-variance, high-capacity models.
  2. It has been generally associated with decision trees. One such widely used bagging approach is known as random forest.
  3. The random forests algorithm further decorrelates the learned trees by only considering a random subset of the available features when deciding which variable to split on at each node in the tree.
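The random-subset-of-features idea from point 3 is exposed in scikit-learn's random forest as the `max_features` parameter; a brief sketch on a synthetic dataset (the dataset and hyperparameter values are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)

# max_features="sqrt" restricts each split to a random subset of
# the features, which further decorrelates the learned trees.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                random_state=0)
score = cross_val_score(forest, X, y, cv=5).mean()
print(f"random forest CV accuracy: {score:.3f}")
```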


Boosting is an ensemble method based on iteratively re-weighting the data set instead of randomly resampling it. The main idea is to up-weight the data cases that are misclassified by the classifiers currently in the ensemble, and then add a new classifier that focuses on the data cases causing the errors.
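AdaBoost is the classic instance of this re-weighting scheme; a minimal sketch using scikit-learn's implementation, in which each round fits a weak learner (by default a depth-1 decision "stump") on the re-weighted data (the dataset here is an illustrative assumption):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Each boosting round up-weights the training cases that the
# current ensemble misclassifies, then fits the next weak learner.
boost = AdaBoostClassifier(n_estimators=50, random_state=0)
acc = boost.fit(X_tr, y_tr).score(X_te, y_te)
print(f"AdaBoost accuracy: {acc:.3f}")
```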


Unlike bagging and boosting, stacking is an algorithm for combining several different types of models. The main idea is to form a train-validation-test split and train several classifiers on the training set. The trained classifiers then make predictions on the validation set, and a new feature representation is created in which each data case is represented by the vector of predictions from each classifier in the ensemble. Finally, a meta-classifier called a combiner is trained on this representation to minimize the validation error. The extra layer of combiner learning can deal with correlated classifiers as well as classifiers that perform poorly.
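This pipeline (base classifiers of different types, held-out predictions as new features, a combiner on top) is available in scikit-learn as `StackingClassifier`, which uses cross-validated predictions in place of an explicit validation split. A brief sketch, with the base models and combiner chosen for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Base classifiers of different types; their held-out predictions
# become the feature vector fed to the combiner (logistic regression).
stack = StackingClassifier(
    estimators=[("tree", DecisionTreeClassifier(random_state=0)),
                ("knn", KNeighborsClassifier())],
    final_estimator=LogisticRegression(), cv=5)
acc = stack.fit(X_tr, y_tr).score(X_te, y_te)
print(f"stacked ensemble accuracy: {acc:.3f}")
```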