On the picture above we can see the first 10 rows of the iris dataset. The first 4 columns are the 4 features that we will use to predict the target, the iris species, which is given in the last column as a numerical value: 0 for setosa, 1 for versicolor, 2 for virginica. In total, we have 150 observations (150 rows), 50 for each iris species: the dataset is balanced.

Preparing the dataset and feature selection

To make it easier to understand how a Decision Tree works, we will only work with two features: petal width and sepal width. (We then remove the observations that are duplicates on these two features, so that every point is visible on the graphs we will plot to support our understanding.)

Modeling and Evaluating

As you will have understood, the model chosen is a…

Without optimizing the hyperparameters (such as the tree depth, or the minimum number of samples required in a leaf or to split a node) and with only two features, we already obtain 93% accuracy on the testing set.

Accuracy is the number of correct predictions divided by the total number of predictions. This metric is interesting but does not help us understand what the Decision Tree gets wrong. The confusion matrix can help us.

Confusion matrix of the Decision Tree on the testing set

The confusion matrix above has two axes: the y-axis is the target, the true species of the iris, and the x-axis is the species the Decision Tree has predicted for this iris. In the top-left square we can see that for the 5 setosa irises, the Decision Tree has predicted setosa. The second row shows that out of 16 versicolor irises, 14 have been classified as versicolor and 2 have been mistaken for virginica. This is the reason why we don't have 100% accuracy. Finally, the bottom-right square shows that all 9 virginica irises have been classified as virginica.

Thanks to the confusion matrix we can retrieve the accuracy: the diagonal elements are the correct predictions, 5+14+9=28, and the total number of predictions is the sum of all the squares, 5+14+2+9=30, which gives 28/30 ≈ 93%.

Understanding how the Decision Tree was built
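As a quick check of the confusion matrix arithmetic discussed in this post, here is a minimal sketch in plain Python. The matrix is typed in by hand from the counts quoted in the text (the off-diagonal zeros are inferred from the stated total of 30), and the variable names `confusion`, `labels` and `per_class_rate` are assumptions made for illustration, not part of the original notebook.

```python
# Confusion matrix as described in the post:
# rows are the true species, columns are the predicted species,
# both in the order setosa, versicolor, virginica.
labels = ["setosa", "versicolor", "virginica"]
confusion = [
    [5, 0, 0],    # 5 setosa, all predicted setosa
    [0, 14, 2],   # 16 versicolor, 2 mistaken for virginica
    [0, 0, 9],    # 9 virginica, all predicted virginica
]

# Correct predictions sit on the diagonal.
correct = sum(confusion[i][i] for i in range(len(labels)))

# Every square counts as one or more predictions.
total = sum(sum(row) for row in confusion)

accuracy = correct / total

# Fraction of each true species that was classified correctly
# (hypothetical helper metric, added here for illustration).
per_class_rate = {
    label: confusion[i][i] / sum(confusion[i])
    for i, label in enumerate(labels)
}

print(f"correct={correct}, total={total}, accuracy={accuracy:.1%}")
for label, rate in per_class_rate.items():
    print(f"{label}: {rate:.1%} correctly classified")
```

The per-species figures make the point of the confusion matrix concrete: the overall accuracy hides the fact that all the errors come from versicolor irises predicted as virginica.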