Introduction to Machine Learning – Decision Tree Classifiers

from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier()

new_example

	ml_experience	class_attendance	lab1	lab2	lab3	lab4	quiz1
0	1	0	1	1	0	0	0

model.predict(new_example)

NotFittedError: This DecisionTreeClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.

Detailed traceback: 
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.12/site-packages/sklearn/tree/_classes.py", line 529, in predict
    check_is_fitted(self)
  File "/usr/local/lib/python3.12/site-packages/sklearn/utils/validation.py", line 1754, in check_is_fitted
    raise NotFittedError(msg % {"name": type(estimator).__name__})

X_binary.head()

	ml_experience	class_attendance	lab1	lab2	lab3	lab4	quiz1
0	1	1	1	1	0	1	1
1	1	0	1	1	0	0	1
2	0	0	0	0	0	0	0
3	0	1	1	1	1	1	0
4	0	1	0	0	1	1	0

y.head()

0        A+
1    not A+
2    not A+
3        A+
4        A+
Name: quiz2, dtype: object

model.fit(X_binary, y);

new_example

	ml_experience	class_attendance	lab1	lab2	lab3	lab4	quiz1
0	1	0	1	1	0	0	0

model.predict(new_example)[0]

'not A+'

model.score(X_binary, y)

0.9047619047619048
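
For a classifier, score returns the mean accuracy on the data it is given, that is, the fraction of examples whose predicted label matches the true label. Because the call above passes X_binary and y, the same data used in fit, the number reported is training accuracy. The following is a minimal sketch of that calculation in plain Python; the function name accuracy and the toy labels are illustrative assumptions, not part of scikit-learn or of the course dataset.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels for illustration only (not the course data).
toy_true = ["A+", "not A+", "not A+", "A+", "A+"]
toy_pred = ["A+", "not A+", "A+", "A+", "A+"]

# Four of the five toy predictions match the true labels.
toy_accuracy = accuracy(toy_true, toy_pred)
print(toy_accuracy)
```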

How does predict work?

observation

	ml_experience	class_attendance	lab1	lab2	lab3	lab4	quiz1
0	1	0	1	1	0	1	1
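
One way to picture predict: start at the root of the tree, answer the question at each node using the observation's feature value, follow the matching branch, and return the label stored at the leaf that is reached. The sketch below walks a small hand-written tree. The tree itself (toy_tree, with its questions on lab4 and quiz1) is a hypothetical example and is not the tree learned by the model above, so the label it produces says nothing about what the fitted model would predict. The convention of going left when the value is at most the threshold mirrors how scikit-learn trees are usually displayed.

```python
# Hypothetical tree for illustration; the fitted model may ask
# different questions in a different order.
toy_tree = {
    "feature": "lab4",
    "threshold": 0.5,
    "left": {"label": "not A+"},  # branch taken when lab4 <= 0.5
    "right": {                    # branch taken when lab4 > 0.5
        "feature": "quiz1",
        "threshold": 0.5,
        "left": {"label": "not A+"},
        "right": {"label": "A+"},
    },
}


def predict_one(node, example):
    """Follow one example from the root down to a leaf and return its label."""
    while "label" not in node:
        if example[node["feature"]] <= node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node["label"]


# Feature values copied from the observation shown above.
observation = {
    "ml_experience": 1, "class_attendance": 0,
    "lab1": 1, "lab2": 1, "lab3": 0, "lab4": 1, "quiz1": 1,
}

toy_prediction = predict_one(toy_tree, observation)
print(toy_prediction)
```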

How does fit work?

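Which features are most useful for classification?
At each node, choose the question that minimizes the impurity of the resulting groups
Common criteria for measuring impurity
- Gini index
- Information gain (maximizing it is equivalent to minimizing entropy after the split)
- Cross entropy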
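
To make the impurity idea concrete, the sketch below computes the Gini index and entropy of a set of labels, then scores a candidate binary feature by the size-weighted impurity of the two groups it creates. It uses only the five rows shown by X_binary.head() and y.head() above, so the numbers illustrate the calculation rather than describe the full dataset. The helper names (gini, entropy, weighted_impurity, information_gain) are assumptions made for this example and are not scikit-learn functions.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Entropy in bits of the class distribution."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def weighted_impurity(feature_values, labels, impurity=gini):
    """Size-weighted impurity of the groups created by a 0/1 feature."""
    left = [y for x, y in zip(feature_values, labels) if x == 0]
    right = [y for x, y in zip(feature_values, labels) if x == 1]
    n = len(labels)
    return sum(len(part) / n * impurity(part) for part in (left, right) if part)


def information_gain(feature_values, labels):
    """Entropy before the split minus weighted entropy after it."""
    return entropy(labels) - weighted_impurity(feature_values, labels, entropy)


# The five rows displayed by X_binary.head() and y.head().
y_head = ["A+", "not A+", "not A+", "A+", "A+"]
lab4 = [1, 0, 0, 1, 1]
quiz1 = [1, 1, 0, 0, 0]

for name, values in [("lab4", lab4), ("quiz1", quiz1)]:
    print(name, weighted_impurity(values, y_head), information_gain(values, y_head))
```

On these five rows, splitting on lab4 leaves two pure groups, so its weighted Gini index is lower than that of quiz1. A fit procedure built on this idea would prefer the lab4 question at this node, then repeat the same search inside each resulting group.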

Decision Tree Classifiers

How does predict work?

How does fit work?

Let’s apply what we learned!