The MCC-F1 curve was recently proposed as an alternative, and arguably more informative, way of assessing the performance of score-based binary classifiers [1].
This Python package implements a function to compute the MCC-F1 curve, namely mcc_f1_curve, similar to the precision_recall_curve and roc_curve functions of scikit-learn.
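To make the idea of a curve point concrete, here is a minimal pure-Python sketch of how a single (MCC, F1) pair could be obtained for one decision threshold. It is an illustration, not the package's code: the helper name mcc_f1_point is hypothetical, and the sketch assumes that a sample is predicted positive when its score is greater than or equal to the threshold and that MCC is rescaled from [-1, 1] to [0, 1], as the curve is usually drawn.

```python
import math


def mcc_f1_point(y_true, y_score, threshold):
    """Return (unit-normalized MCC, F1) for one decision threshold.

    Hypothetical helper for illustration only; assumes labels are 0/1 and
    that score >= threshold means "predicted positive".
    """
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1

    # Matthews correlation coefficient; defined as 0 when a marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    mcc_norm = (mcc + 1) / 2  # rescale from [-1, 1] to [0, 1]

    # F1 score expressed directly in terms of confusion-matrix counts.
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom > 0 else 0.0
    return mcc_norm, f1


# Toy data: one curve point per distinct score used as a threshold.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
points = [mcc_f1_point(y_true, y_score, t) for t in sorted(set(y_score))]
```

Sweeping the threshold over all distinct scores, as in the last line, yields the sequence of points that makes up the curve.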
pip install py-mcc-f1
from mcc_f1 import mcc_f1_curve, plot_mcc_f1_curve
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
# Load data and train model
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
clf = LogisticRegression().fit(X_train, y_train)
# Calculate MCC-F1 metric
# @TODO
# Get predictions and MCC-F1 curve points
y_score = clf.predict_proba(X_test)[:, 1]
mcc, f1, thresholds = mcc_f1_curve(y_test, y_score)
# Plot MCC-F1 curve
plot_mcc_f1_curve(clf, X_test, y_test)
plt.show()
Please refer to the function's docstring for further comments and details.
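One way to read the curve is to pick the threshold whose point lies closest to the point of perfect performance, (1, 1). The sketch below shows this with NumPy on made-up arrays. The helper best_threshold is hypothetical and not part of this package, and it assumes that the three arrays have equal length, are aligned index by index, and that MCC is already normalized to [0, 1].

```python
import numpy as np


def best_threshold(mcc, f1, thresholds):
    """Return (threshold, distance) of the curve point nearest to (1, 1).

    Hypothetical helper for illustration; assumes the three arrays have the
    same length and are aligned index by index.
    """
    mcc = np.asarray(mcc, dtype=float)
    f1 = np.asarray(f1, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    if not (mcc.shape == f1.shape == thresholds.shape):
        raise ValueError("mcc, f1 and thresholds must have the same shape")

    # Euclidean distance of every curve point to the perfect point (1, 1).
    distances = np.hypot(1.0 - mcc, 1.0 - f1)
    best = int(np.argmin(distances))
    return float(thresholds[best]), float(distances[best])


# Made-up curve points, for illustration only.
example_mcc = [0.5, 0.8, 0.6]
example_f1 = [0.6, 0.9, 0.7]
example_thresholds = [0.2, 0.5, 0.8]
threshold, distance = best_threshold(example_mcc, example_f1, example_thresholds)
```

With real data, the arrays returned by mcc_f1_curve could be passed in directly, provided they satisfy the alignment assumption above.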
- Function to plot the MCC-F1 curve (e.g., plot_mcc_f1_curve), similar to sklearn/metrics/_plot/precision_recall_curve.py and sklearn/metrics/_plot/roc_curve.py;
- Function to compute the MCC-F1 metric, as defined in section 2.2 of the original paper.
If you would like to contribute to this package, please follow the common community guidelines.
Please also keep in mind that the main goal of this project is to match scikit-learn in implementation style and quality. Pull requests should pass the existing unit tests and add new ones when necessary.
To run the tests:
make test
This package is distributed under the MIT license.