CIS 241, Dr. Ladd
🌳🌲🌳🌲
But we will focus on their more common use as classifiers!
But they are not so reliable one-at-a-time!
And what do you call a lot of trees? A forest!
You can see all the metaphors here: a forest, a musical ensemble, etc.
The decision trees are put together using “bagging”: bootstrap aggregating.
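Here is a rough sketch of what bagging does under the hood. It assumes scikit-learn and its built-in iris data (neither comes from these slides), and the names n_trees, votes, and bagged_pred are made up for illustration. A real random forest also samples the variables at each split, which this sketch leaves out.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Stand-in data: any labeled dataset would do here
X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

n_trees = 25  # hypothetical number of trees, chosen for illustration
trees = []
for _ in range(n_trees):
    # Bootstrap: sample row indices with replacement, same size as the data
    idx = rng.integers(0, len(X), size=len(X))
    trees.append(DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx]))

# Aggregate: every tree votes, and the most common class wins
votes = np.stack([tree.predict(X) for tree in trees])  # (n_trees, n_rows)
bagged_pred = np.array([np.bincount(col).argmax() for col in votes.T])

print("Agreement with the true labels:", (bagged_pred == y).mean())
```

In practice, RandomForestClassifier handles the resampling and the voting for you.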
A random forest can also rank which variables contribute most to its predictions. This is referred to as “variable importance” and takes advantage of decision trees’ skill at finding patterns in the data.
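One way to look at variable importance, as a sketch: scikit-learn's RandomForestClassifier stores a score per variable in feature_importances_ after fitting. The iris data and the variable names below are assumptions for illustration, not part of these slides.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Stand-in data with named columns
data = load_iris()
X, y = data.data, data.target

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# One score per variable; the scores are normalized to sum to 1
importances = forest.feature_importances_

# Print the variables from most to least important
ranked = sorted(zip(data.feature_names, importances), key=lambda pair: -pair[1])
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Higher scores mean the trees relied on that variable more when splitting the data.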
min_samples_leaf: the minimum number of records in a terminal node (leaf)
max_leaf_nodes: the maximum number of leaf nodes in each tree
Setting these can help you create smaller trees and avoid spurious results!
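To see the effect of these settings, here is a small sketch that compares a default forest with a constrained one. The synthetic data from make_classification and the specific values (5 and 8) are assumptions picked for illustration, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data: 500 records, 10 variables
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Default settings: trees grow until the leaves are pure
default_forest = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

# Constrained settings: at least 5 records per leaf, at most 8 leaves per tree
small_forest = RandomForestClassifier(
    n_estimators=20, min_samples_leaf=5, max_leaf_nodes=8, random_state=0
).fit(X, y)

# Count the leaves in every tree of each forest
default_leaves = [tree.get_n_leaves() for tree in default_forest.estimators_]
small_leaves = [tree.get_n_leaves() for tree in small_forest.estimators_]

print("Average leaves per tree (default):", sum(default_leaves) / len(default_leaves))
print("Average leaves per tree (constrained):", sum(small_leaves) / len(small_leaves))
```

Smaller trees are less likely to memorize noise, but setting the limits too tight can hide real patterns, so it is worth trying a few values.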
By now, you’re equipped to find out how to do this on your own, so let’s try an example.
Here’s a hint:
Use the penguins dataset.
Predict the species of the penguins.
Good luck! 🌲🌳🌲🌳