1.1. Exercises
Splitting our data
Decision Tree Outcome
Splitting Data in Action
Instructions:
The first time you run a coding exercise, it may take a few minutes for everything to load, so please be patient.
When you see ____ in a coding exercise, replace it with what you assume to be the correct code. Run it and see if you obtain the desired output. Submit your code to validate if you were correct.
Make sure you remove the hash (#) symbol in the coding portions of this question. We have commented these lines out so that they won't execute and you can test your code after each step.
Let's split our data using train_test_split() on our candy bars dataset.
Tasks:
- Split the X and y dataframes into 4 objects: X_train, X_test, y_train, y_test.
- Make the test set 0.2 (or the train set 0.8) and make sure to use random_state=7.
- Build a model using DecisionTreeClassifier().
- Save this in an object named model.
- Fit your model on the objects X_train and y_train.
- Evaluate the accuracy of the model using .score() on X_train and y_train, and save the value in an object named train_score.
- Repeat the above action, but this time evaluate the accuracy of the model using .score() on X_test and y_test (which the model has never seen before), and save the value in an object named test_score.
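
The tasks above can be sketched roughly as follows. This is a minimal illustration rather than the exercise solution: the candy bars dataset is not reproduced here, so the small dataframe, its column names (chocolate, peanuts, weight, target), and its values are all invented stand-ins. Only train_test_split(), DecisionTreeClassifier(), .fit(), and .score() come from the task list.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for the candy bars data. The column names and values
# are invented for illustration and are NOT the course dataset.
candy_df = pd.DataFrame({
    "chocolate": [1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0],
    "peanuts":   [1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1],
    "weight":    [50, 45, 30, 60, 35, 48, 28, 55, 42, 33,
                  58, 31, 47, 36, 52, 29, 61, 44, 27, 34],
    "target":    ["A", "A", "B", "A", "B", "A", "B", "A", "A", "B",
                  "A", "B", "A", "B", "A", "B", "A", "A", "B", "B"],
})

# Separate the features (X) from the column we want to predict (y).
X = candy_df.drop(columns=["target"])
y = candy_df["target"]

# Hold out 20% of the rows as a test set; random_state=7 makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=7
)

# Build the model and fit it on the training portion only.
model = DecisionTreeClassifier()
model.fit(X_train, y_train)

# Accuracy on data the model was trained on, then on data it has never seen.
train_score = model.score(X_train, y_train)
test_score = model.score(X_test, y_test)

print("Train accuracy:", round(train_score, 2))
print("Test accuracy:", round(test_score, 2))
```

Comparing train_score with test_score is the point of the split: the training score shows how well the tree reproduces examples it was fit on, while the test score estimates how it performs on unseen examples.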