3.1. Exercises

Logistic Regression Prediction

We have a movie review (given in Question 1 below) that we wish to classify as either positive or negative.

Using the words below (which are features in our model) and their associated coefficients, answer the next two questions.

The value of each feature is the number of times the word appears in the review.

Word            Coefficient
excellent        2.2
disappointment  -2.4
flawless         1.4
boring          -1.3
unwatchable     -1.7

Intercept = 1.3
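
The way these coefficients and the intercept combine into a prediction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (raw_score, prob_positive), the simple word-splitting rule, and the example review are hypothetical and not part of the exercise, and it assumes that a raw score above 0 corresponds to the "positive review" class.

```python
import math
import re
from collections import Counter

# Coefficients and intercept copied from the table above.
coefficients = {
    "excellent": 2.2,
    "disappointment": -2.4,
    "flawless": 1.4,
    "boring": -1.3,
    "unwatchable": -1.7,
}
intercept = 1.3


def raw_score(review):
    """Intercept plus the sum of (word count * coefficient) over the feature words."""
    # Assumption: words are lowercase runs of letters; punctuation is ignored.
    counts = Counter(re.findall(r"[a-z]+", review.lower()))
    return intercept + sum(coef * counts[word] for word, coef in coefficients.items())


def prob_positive(review):
    """Sigmoid of the raw score: the model's probability for the positive class."""
    return 1 / (1 + math.exp(-raw_score(review)))


# Hypothetical review, deliberately different from the one in Question 1.
example = "An excellent, flawless film."
print(round(raw_score(example), 2))      # 1.3 + 2.2 + 1.4
print(round(prob_positive(example), 3))
```

For the hypothetical review, only "excellent" and "flawless" appear, so the raw score is 1.3 + 2.2 + 1.4, or about 4.9. A raw score above 0 gives a probability above 0.5, so under the assumption above this review would be classified as positive. The same arithmetic can be done by hand for the review in Question 1.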


Question 1

I thought it was going to be excellent but instead, it was unwatchable and boring.


True or False: Logistic Regression

Applying Logistic Regression

Instructions:
The first time you run a coding exercise, it may take a few minutes for everything to load, so please be patient.

When you see ____ in a coding exercise, replace it with what you assume to be the correct code. Run it and see if you obtain the desired output. Submit your code to validate if you were correct.

Make sure you remove the hash (#) symbol in the coding portions of this question. We have commented out those lines so that they won’t execute, which lets you test your code after each step.

Let’s give a warm welcome back to our wonderful Pokémon dataset. We want to see how well our model does with logistic regression. Let’s try building a simple model with default parameters to start.

Tasks:

  • Build and fit a pipeline containing the column transformer and a logistic regression model with the parameter class_weight="balanced". Name this pipeline pkm_pipe.
  • Score your model on the test set using the default accuracy measurement. Save this in an object named lr_scores.
  • Fill in the blanks below to assess the model’s feature coefficients.
Hint 1
  • Are you fitting your pipeline?
  • Are you scoring your pipeline on the test data?
  • Are you finding the coefficients using pkm_pipe['logisticregression'].coef_[0]?
  • Are you using numeric_features to find your model’s feature names?
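
The tasks and hints above can be sketched end to end as follows. This is not the course's solution: the Pokémon dataset and the column transformer are not shown here, so the sketch swaps in a small synthetic DataFrame with hypothetical column names (attack, defense, speed, legendary) and a hypothetical preprocessor that only scales the numeric columns. The names pkm_pipe, lr_scores, and numeric_features follow the exercise; pkm_coefs is a hypothetical name.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the Pokémon data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "attack": rng.normal(75, 20, n),
    "defense": rng.normal(70, 20, n),
    "speed": rng.normal(65, 20, n),
})
# Hypothetical binary target, loosely driven by attack and speed plus noise.
noise = rng.normal(0, 15, n)
df["legendary"] = ((df["attack"] + df["speed"] + noise) > 160).astype(int)

numeric_features = ["attack", "defense", "speed"]
X_train, X_test, y_train, y_test = train_test_split(
    df[numeric_features],
    df["legendary"],
    test_size=0.2,
    random_state=0,
    stratify=df["legendary"],
)

# Assumed preprocessor: scale the numeric columns only.
preprocessor = make_column_transformer((StandardScaler(), numeric_features))

# Build and fit the pipeline with class_weight="balanced".
pkm_pipe = make_pipeline(preprocessor, LogisticRegression(class_weight="balanced"))
pkm_pipe.fit(X_train, y_train)

# Default scoring for a classifier pipeline is accuracy.
lr_scores = pkm_pipe.score(X_test, y_test)

# Pair each numeric feature with its learned coefficient.
pkm_coefs = pd.DataFrame({
    "features": numeric_features,
    "coefficients": pkm_pipe["logisticregression"].coef_[0],
})

print(lr_scores)
print(pkm_coefs)
```

Because make_pipeline names each step after its lowercased class name, the fitted model is reachable as pkm_pipe["logisticregression"]. The coefficients line up with numeric_features here only because the assumed preprocessor passes through exactly those columns, in that order; a transformer that adds or reorders columns (for example, one-hot encoding) would need its output feature names instead.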
Fully worked solution: