Introducing Linear Regression

Linear Regression

train_df.head()
      length     weight
73  1.489130  10.507995
53  1.073233   7.658047
80  1.622709   9.748797
49  0.984653   9.731572
23  0.484937   3.016555
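
Here train_df holds a single feature, length, and the target, weight. A minimal sketch of how such a split and the X_train / y_train used below might be set up, assuming a hypothetical full DataFrame df with these two columns:

import pandas as pd
from sklearn.model_selection import train_test_split

# df is assumed to already contain "length" and "weight" columns
train_df, test_df = train_test_split(df, test_size=0.2, random_state=123)
X_train, y_train = train_df[["length"]], train_df["weight"]
X_test, y_test = test_df[["length"]], test_df["weight"]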

Ridge

from sklearn.linear_model import LinearRegression

LinearRegression();
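
LinearRegression fits ordinary least squares with no regularization; the rest of this section uses Ridge, which adds an L2 penalty on the coefficients controlled by alpha. As a point of comparison, a minimal sketch of fitting plain LinearRegression on the same training split (assuming the X_train and y_train used below):

lr = LinearRegression()
lr.fit(X_train, y_train)
lr.score(X_train, y_train)  # R² on the training data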


from sklearn.linear_model import Ridge

rm = Ridge()
rm.fit(X_train, y_train);
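
After fitting, the usual scikit-learn attributes hold the learned slope and intercept; a quick check (the printed values depend on the data):

rm.coef_, rm.intercept_  # one coefficient for length, plus the intercept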


rm.predict(X_train)[:5]
array([10.09739051,  7.90823334, 10.80050927,  7.44197529,  4.81162144])
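
The same predict call works for new measurements; a small sketch, assuming a hypothetical new example with length 1.2:

rm.predict(pd.DataFrame({"length": [1.2]}))  # predicted weight for the new example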


rm.score(X_train, y_train)
0.8125029624787177
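
For regression estimators, score returns the coefficient of determination R² rather than accuracy; the same number can be computed explicitly, as a sketch:

from sklearn.metrics import r2_score
r2_score(y_train, rm.predict(X_train))  # same value as rm.score(X_train, y_train)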

The alpha hyperparameter

rm2 = Ridge(alpha=10000)
rm2.fit(X_train, y_train);


rm2.score(X_train, y_train)
0.004541128724857568
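
Larger alpha means stronger regularization: the coefficients are shrunk toward zero, which is why the training score collapses at alpha=10000. A quick way to see this on the two fitted models (printed values depend on the data):

rm.coef_, rm2.coef_  # the alpha=10000 slope is much closer to 0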
from sklearn.model_selection import cross_validate
import numpy as np
import pandas as pd

scores_dict = {
    "alpha": 10.0 ** np.arange(-2, 6, 1),  # 0.01, 0.1, ..., 100000
    "train_scores": list(),
    "cv_scores": list(),
}
for alpha in scores_dict["alpha"]:
    # Fit and cross-validate a ridge model at each alpha, keeping the mean scores
    ridge_model = Ridge(alpha=alpha)
    results = cross_validate(ridge_model, X_train, y_train, return_train_score=True)
    scores_dict["train_scores"].append(results["train_score"].mean())
    scores_dict["cv_scores"].append(results["test_score"].mean())


pd.DataFrame(scores_dict)
        alpha  train_scores  cv_scores
0        0.01      0.812961   0.799169
1        0.10      0.812945   0.799199
2        1.00      0.811461   0.798103
..        ...           ...        ...
5     1000.00      0.035217   0.003744
6    10000.00      0.003629  -0.028689
7   100000.00      0.000364  -0.032041

8 rows × 3 columns
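
Cross-validation scores are highest for the small values of alpha and drop sharply once alpha gets very large. One way to pull the best setting out of the dictionary built above, as a sketch using numpy's argmax:

best_alpha = scores_dict["alpha"][np.argmax(scores_dict["cv_scores"])]
best_ridge = Ridge(alpha=best_alpha).fit(X_train, y_train)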

Visualizing linear regression

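A minimal sketch of one way to visualize the fitted ridge model against the training data, assuming matplotlib is available and length is the only feature:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

grid = pd.DataFrame({"length": np.linspace(X_train["length"].min(), X_train["length"].max(), 100)})
plt.scatter(X_train["length"], y_train, alpha=0.5, label="training data")
plt.plot(grid["length"], rm.predict(grid), color="orange", label="ridge fit")
plt.xlabel("length")
plt.ylabel("weight")
plt.legend()
plt.show()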

Let’s apply what we learned!