model_comparison.model_comparison

model_comparison.model_comparison(
    models,
    X,
    y,
    metric='accuracy',
    greater_is_better=False,
)

Compare multiple fitted scikit-learn models and return the best-performing one.

Models are evaluated on the same dataset using a user-specified evaluation metric. The model with the best score is returned: the highest score when `greater_is_better` is True, the lowest when it is False.
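The selection step can be pictured with the minimal sketch below. It is not this function's actual implementation: `pick_best`, `accuracy`, and `ConstantModel` are hypothetical names introduced for illustration, and the sketch uses plain Python instead of scikit-learn so that it runs on its own. It scores every model on the same data and keeps the highest or lowest score depending on `greater_is_better`.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


class ConstantModel:
    """Hypothetical stand-in for a fitted estimator: always predicts one label."""

    def __init__(self, label):
        self.label = label

    def predict(self, X):
        return [self.label for _ in X]


def pick_best(models, X, y, score_fn, *, greater_is_better):
    """Score each model on (X, y) and return the one with the best score."""
    scores = [score_fn(y, model.predict(X)) for model in models]
    # Higher-is-better metrics use max; error-style metrics use min.
    choose = max if greater_is_better else min
    return models[scores.index(choose(scores))]


X = [[0], [1], [2], [3]]
y = [1, 1, 1, 0]
models = [ConstantModel(0), ConstantModel(1)]  # accuracies: 0.25 and 0.75

best = pick_best(models, X, y, accuracy, greater_is_better=True)
worst = pick_best(models, X, y, accuracy, greater_is_better=False)
print(best.label, worst.label)
```

With an accuracy-style score, passing `greater_is_better=False` selects the weakest model, so the flag should match the direction of the chosen metric.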

Parameters

Name	Type	Description	Default
models	list of sklearn.base.BaseEstimator	A list of fitted scikit-learn model objects that implement the `predict` method.	required
X	pandas.DataFrame or array-like	Feature matrix used for evaluation.	required
y	pandas.Series or array-like	True target values.	required
metric	str	Evaluation metric used for comparison. Must be a valid scikit-learn classification metric (e.g. “accuracy”, “f1”, “precision”, “recall”).	`"accuracy"`
greater_is_better	bool	Whether a higher metric value indicates a better model. If True, the metric is treated as a score (e.g. accuracy) and the highest value wins. If False, the metric is treated as an error and the lowest value wins.	`False`

Returns

Name	Type	Description
	sklearn.base.BaseEstimator	The model with the best performance according to the selected evaluation metric.

Raises

Name	Type	Description
	ValueError	If `metric` is not supported, if `models` is empty, or if any element of `models` is not a valid scikit-learn estimator.
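As a rough illustration of these checks, the sketch below shows one way such validation could look. It is an assumption-laden example, not the function's real code: `validate_inputs`, `SUPPORTED_METRICS`, and `StubModel` are hypothetical names, the supported-metric list is taken only from the examples given for `metric`, and "valid" is approximated by the presence of a callable `predict` method.

```python
# Assumption: the supported names mirror the examples listed for `metric`.
SUPPORTED_METRICS = ("accuracy", "f1", "precision", "recall")


def validate_inputs(models, metric):
    """Raise ValueError for an empty list, an invalid model, or an unknown metric."""
    if not models:
        raise ValueError("models must be a non-empty list of fitted estimators")
    for model in models:
        # Assumption: a "valid" model is approximated by a callable predict().
        if not callable(getattr(model, "predict", None)):
            raise ValueError(f"{type(model).__name__} has no predict method")
    if metric not in SUPPORTED_METRICS:
        raise ValueError(f"unsupported metric: {metric!r}")


class StubModel:
    """Hypothetical stand-in for a fitted estimator."""

    def predict(self, X):
        return [0 for _ in X]


validate_inputs([StubModel()], "accuracy")  # valid input, no exception

bad_cases = [([], "accuracy"), ([object()], "accuracy"), ([StubModel()], "rmse")]
for bad_models, bad_metric in bad_cases:
    try:
        validate_inputs(bad_models, bad_metric)
    except ValueError as exc:
        print(f"ValueError: {exc}")
```

Callers can wrap a `model_comparison` call in the same `try`/`except ValueError` pattern to handle these cases.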

Examples

>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.tree import DecisionTreeClassifier
>>> X, y = make_classification(
...     n_samples=200,
...     n_features=5,
...     n_informative=3,
...     n_redundant=0,
...     n_classes=2,
...     random_state=42,
... )
>>> models = [LogisticRegression().fit(X, y),
...           DecisionTreeClassifier().fit(X, y)]
>>> best_model = model_comparison(models, X, y, metric="accuracy",
...                               greater_is_better=True)