diff --git a/docs/quickstart.rst b/docs/quickstart.rst
index bca5be431..703465ed9 100644
--- a/docs/quickstart.rst
+++ b/docs/quickstart.rst
@@ -68,7 +68,8 @@ is about binary classification, but we similarly support regression.
 Prerequisites
 ^^^^^^^^^^^^^
 
-In order to run the code samples in the Quickstart tutorial, you need to install the following dependencies:
+In order to run the code samples in the Quickstart tutorial, you need to
+install the following dependencies:
 
 .. code-block:: bash
 
@@ -77,35 +78,52 @@ In order to run the code samples in the Quickstart tutorial, you need to install
 Loading the dataset
 ^^^^^^^^^^^^^^^^^^^
 
-For this example we use the
-`UCI adult dataset `_ where the
-objective is to predict whether a person makes more (label 1) or less (0)
-than $50,000 a year.
+For this example, we use a `clinical dataset `_
+of hospital re-admissions over a ten-year period (1999-2008) for
+diabetic patients across 130 different hospitals in the U.S. This scenario
+builds upon prior research on how racial disparities impact health care
+resource allocation in the U.S. For an in-depth analysis of this dataset,
+review the `SciPy tutorial `_
+that the Fairlearn team presented in 2021.
+
+We will use machine learning to predict whether an individual in the dataset
+is readmitted to the hospital within 30 days of hospital release.
+A hospital readmission within 30 days can be viewed as a proxy indicating
+that the patient needed more assistance at the time of release.
+In the next section, we build a classification model to accomplish the
+prediction task.
 
 .. doctest:: quickstart
 
     >>> import numpy as np
     >>> import pandas as pd
    >>> import matplotlib.pyplot as plt
-    >>> from fairlearn.datasets import fetch_adult
-    >>> data = fetch_adult(as_frame=True)
-    >>> X = pd.get_dummies(data.data)
-    >>> y_true = (data.target == '>50K') * 1
-    >>> sex = data.data['sex']
-    >>> sex.value_counts()
-    sex
-    Male      32650
-    Female    16192
+    >>> from fairlearn.datasets import fetch_diabetes_hospital
+    >>> data = fetch_diabetes_hospital(as_frame=True)
+    >>> X = data.data
+    >>> X.drop(columns=["readmitted", "readmit_binary"], inplace=True)
+    >>> y = data.target
+    >>> X_ohe = pd.get_dummies(X)
+    >>> race = X['race']
+    >>> race.value_counts()
+    race
+    Caucasian          76099
+    AfricanAmerican    19210
+    Unknown             2273
+    Hispanic            2037
+    Other               1506
+    Asian                641
     Name: count, dtype: int64
 
-.. figure:: auto_examples/images/sphx_glr_plot_quickstart_selection_rate_001.png
-   :target: auto_examples/plot_quickstart_selection_rate.html
+.. figure:: _images/sphx_glr_plot_quickstart_counts_001.png
+   :target: auto_examples/plot_quickstart_counts.html
    :align: center
+
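+As a quick sanity check before any modeling, it can also help to compare the
+base rate of 30-day readmission across groups, since differences in base
+rates shape every metric computed later. A minimal sketch (output omitted),
+assuming the binary 0/1 target ``y`` and the ``race`` column loaded above:
+
+.. code-block:: python
+
+    # Mean of a 0/1 indicator = fraction readmitted within 30 days, per group
+    y.groupby(race).mean()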
 
 Evaluating fairness-related metrics
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-Firstly, Fairlearn provides fairness-related metrics that can be compared
+First, Fairlearn provides fairness-related metrics that can be compared
 between groups and for the overall population.
 Using existing metric definitions from `scikit-learn `_
@@ -115,38 +133,57 @@ we can evaluate metrics for subgroups within the data as below:
    :options: +NORMALIZE_WHITESPACE
 
     >>> from fairlearn.metrics import MetricFrame
-    >>> from sklearn.metrics import accuracy_score
+    >>> from sklearn.metrics import accuracy_score, balanced_accuracy_score
     >>> from sklearn.tree import DecisionTreeClassifier
-    >>>
+    >>> from sklearn.model_selection import train_test_split
+    >>> np.random.seed(42) # set seed for consistent results
+    >>> X_train, X_test, y_train, y_test, A_train, A_test = train_test_split(X_ohe, y, race, random_state=123)
     >>> classifier = DecisionTreeClassifier(min_samples_leaf=10, max_depth=4)
-    >>> classifier.fit(X, y_true)
+    >>> classifier.fit(X_train, y_train)
     DecisionTreeClassifier(...)
-    >>> y_pred = classifier.predict(X)
-    >>> mf = MetricFrame(metrics=accuracy_score, y_true=y_true, y_pred=y_pred, sensitive_features=sex)
+    >>> y_pred = (classifier.predict_proba(X_test)[:,1] >= 0.1)
+    >>> mf = MetricFrame(metrics=accuracy_score, y_true=y_test, y_pred=y_pred, sensitive_features=A_test)
     >>> mf.overall
-    0.8443...
+    0.514...
     >>> mf.by_group
-    sex
-    Female    0.9251...
-    Male      0.8042...
+    race
+    AfricanAmerican    0.530935
+    Asian              0.658683
+    Caucasian          0.503535
+    Hispanic           0.612524
+    Other              0.591549
+    Unknown            0.574576
     Name: accuracy_score, dtype: float64
 
-Additionally, Fairlearn has lots of other standard metrics built-in, such as
-selection rate, i.e., the percentage of the population which have '1' as
-their label:
+Note that our decision threshold for positive predictions is 0.1.
+In practice, this threshold would be driven by risk or capacity
+considerations. Here, the threshold of 0.1 corresponds to saying that
+a 10% risk of readmission is viewed as sufficient for referral to a
+post-discharge care program.
+
+Fairlearn has many standard metrics built-in, such as
+false negative rate, i.e., the rate of occurrence of negative classifications
+when the true value of the outcome label is positive.
+In the context of this dataset, the false negative rate captures the
+individuals who in reality would be readmitted to the hospital, but
+the model does not predict that outcome.
 
 .. doctest:: quickstart
    :options: +NORMALIZE_WHITESPACE
 
-    >>> from fairlearn.metrics import selection_rate
-    >>> sr = MetricFrame(metrics=selection_rate, y_true=y_true, y_pred=y_pred, sensitive_features=sex)
-    >>> sr.overall
-    0.1638...
-    >>> sr.by_group
-    sex
-    Female    0.0635...
-    Male      0.2135...
-    Name: selection_rate, dtype: float64
+    >>> from fairlearn.metrics import false_negative_rate
+    >>> mf = MetricFrame(metrics=false_negative_rate, y_true=y_test, y_pred=y_pred, sensitive_features=A_test)
+    >>> mf.overall
+    0.309...
+    >>> mf.by_group
+    race
+    AfricanAmerican    0.296089
+    Asian              0.500000
+    Caucasian          0.308555
+    Hispanic           0.307692
+    Other              0.333333
+    Unknown            0.420000
+    Name: false_negative_rate, dtype: float64
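+
+The per-group values can also be condensed into scalar disparity summaries
+for reporting. A brief sketch (outputs omitted), reusing the ``mf`` object
+from above:
+
+.. code-block:: python
+
+    # Largest gap in false negative rate between any two groups
+    mf.difference(method='between_groups')
+
+    # Worst (highest) false negative rate observed across groups
+    mf.group_max()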
 
 Fairlearn also allows us to quickly plot these metrics from the
 :class:`fairlearn.metrics.MetricFrame`
@@ -156,7 +193,7 @@ Fairlearn also allows us to quickly plot these metrics from the
    :start-after: # Analyze metrics using MetricFrame
    :end-before: # Customize plots with ylim
 
-.. figure:: auto_examples/images/sphx_glr_plot_quickstart_001.png
+.. figure:: _images/sphx_glr_plot_quickstart_001.png
    :target: auto_examples/plot_quickstart.html
    :align: center
 
@@ -167,35 +204,45 @@ Mitigating disparity
 ^^^^^^^^^^^^^^^^^^^^
 
 If we observe disparities between groups we may want to create a new model
 while specifying an appropriate fairness constraint. Note that the choice of
 fairness constraints is crucial for the resulting model, and varies based on
-application context. If selection rate is highly relevant for fairness in this
-contrived example, we can attempt to mitigate the observed disparity using the
-corresponding fairness constraint called Demographic Parity. In real world
+application context. Since both false positives and false negatives are relevant for fairness in this
+hypothetical example, we can attempt to mitigate the observed disparity using the
+fairness constraint called Equalized Odds, which bounds disparities in both types of error. In real-world
 applications we need to be mindful of the sociotechnical context when making
 such decisions. The Exponentiated Gradient mitigation technique used fits the
-provided classifier using Demographic Parity as the objective, leading to
-a vastly reduced difference in selection rate:
+provided classifier using Equalized Odds as the constraint and a suitably weighted Error Rate
+as the objective, leading to a markedly reduced difference in accuracy across groups:
 
 .. doctest:: quickstart
    :options: +NORMALIZE_WHITESPACE
 
-    >>> from fairlearn.reductions import DemographicParity, ExponentiatedGradient
-    >>> np.random.seed(0) # set seed for consistent results with ExponentiatedGradient
-    >>>
-    >>> constraint = DemographicParity()
+    >>> from fairlearn.reductions import ErrorRate, EqualizedOdds, ExponentiatedGradient
+    >>> objective = ErrorRate(costs={'fp': 0.1, 'fn': 0.9})
+    >>> constraint = EqualizedOdds(difference_bound=0.01)
     >>> classifier = DecisionTreeClassifier(min_samples_leaf=10, max_depth=4)
-    >>> mitigator = ExponentiatedGradient(classifier, constraint)
-    >>> mitigator.fit(X, y_true, sensitive_features=sex)
+    >>> mitigator = ExponentiatedGradient(classifier, constraint, objective=objective)
+    >>> mitigator.fit(X_train, y_train, sensitive_features=A_train)
     ExponentiatedGradient(...)
-    >>> y_pred_mitigated = mitigator.predict(X)
-    >>>
-    >>> sr_mitigated = MetricFrame(metrics=selection_rate, y_true=y_true, y_pred=y_pred_mitigated, sensitive_features=sex)
-    >>> sr_mitigated.overall
-    0.1661...
-    >>> sr_mitigated.by_group
-    sex
-    Female    0.1552...
-    Male      0.1715...
-    Name: selection_rate, dtype: float64
+    >>> y_pred_mitigated = mitigator.predict(X_test)
+    >>> mf_mitigated = MetricFrame(metrics=accuracy_score, y_true=y_test, y_pred=y_pred_mitigated, sensitive_features=A_test)
+    >>> mf_mitigated.overall
+    0.5251...
+    >>> mf_mitigated.by_group
+    race
+    AfricanAmerican    0.524358
+    Asian              0.562874
+    Caucasian          0.525588
+    Hispanic           0.549902
+    Other              0.478873
+    Unknown            0.511864
+    Name: accuracy_score, dtype: float64
+
+Note that :class:`ExponentiatedGradient` does not have a `predict_proba`
+method, but we can adjust the target decision threshold by specifying
+(possibly unequal) costs for false positives and false negatives.
+In our example we use a cost of 0.1 for false positives and 0.9 for
+false negatives. Without fairness constraints, this would exactly
+correspond to referring patients with a readmission risk of 10% or
+higher (as we did earlier).
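+
+To quantify what the constraint achieved, we can also compare the equalized
+odds difference (the larger of the between-group gaps in true and false
+positive rates) before and after mitigation. A short sketch (outputs
+omitted), reusing the variables defined above:
+
+.. code-block:: python
+
+    from fairlearn.metrics import equalized_odds_difference
+
+    # Maximum disparity across TPR and FPR; smaller means more equal
+    equalized_odds_difference(y_test, y_pred, sensitive_features=A_test)
+    equalized_odds_difference(y_test, y_pred_mitigated, sensitive_features=A_test)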
 
 What's next?
diff --git a/docs/user_guide/installation_and_version_guide/v0.10.0.rst b/docs/user_guide/installation_and_version_guide/v0.10.0.rst
index f8489ed79..ada16c392 100644
--- a/docs/user_guide/installation_and_version_guide/v0.10.0.rst
+++ b/docs/user_guide/installation_and_version_guide/v0.10.0.rst
@@ -4,4 +4,4 @@ v0.10.0
 * Added bootstrapping to :class:`MetricFrame`, along with a new section in the user guide
 * Added intersectionality in mental health care example notebook.
-* Added user guide for :class:`fairlearn.postprocessing.ThresholdOptimizer`
\ No newline at end of file
+* Added user guide for :class:`fairlearn.postprocessing.ThresholdOptimizer`
diff --git a/examples/plot_quickstart.py b/examples/plot_quickstart.py
index 7c6a36d33..75948c3c0 100644
--- a/examples/plot_quickstart.py
+++ b/examples/plot_quickstart.py
@@ -8,8 +8,9 @@
 """
 import pandas as pd
-from fairlearn.datasets import fetch_adult
+from fairlearn.datasets import fetch_diabetes_hospital
 from sklearn.metrics import accuracy_score, precision_score
+from sklearn.model_selection import train_test_split
 from sklearn.tree import DecisionTreeClassifier
 
 # %%
@@ -21,14 +22,19 @@
     selection_rate,
 )
 
-data = fetch_adult()
-X = pd.get_dummies(data.data)
-y_true = (data.target == ">50K") * 1
-sex = data.data["sex"]
+data = fetch_diabetes_hospital(as_frame=True)
+X = data.data
+X.drop(columns=["readmitted", "readmit_binary"], inplace=True)
+y_true = data.target
+X_ohe = pd.get_dummies(X)
+race = X["race"]
+
+X_train, X_test, y_train, y_test, A_train, A_test = train_test_split(
+    X_ohe, y_true, race, random_state=123)
 
 classifier = DecisionTreeClassifier(min_samples_leaf=10, max_depth=4)
-classifier.fit(X, y_true)
-y_pred = classifier.predict(X)
+classifier.fit(X_train, y_train)
+y_pred = classifier.predict(X_test)
 
 # Analyze metrics using MetricFrame
 metrics = {
@@ -40,7 +46,7 @@
     "count": count,
 }
 metric_frame = MetricFrame(
-    metrics=metrics, y_true=y_true, y_pred=y_pred, sensitive_features=sex
+    metrics=metrics, y_true=y_test, y_pred=y_pred, sensitive_features=A_test
 )
 metric_frame.by_group.plot.bar(
     subplots=True,
diff --git a/examples/plot_quickstart_counts.py b/examples/plot_quickstart_counts.py
new file mode 100644
index 000000000..6ebedbfea
--- /dev/null
+++ b/examples/plot_quickstart_counts.py
@@ -0,0 +1,29 @@
+# Copyright (c) Microsoft Corporation and Fairlearn contributors.
+# Licensed under the MIT License.
+
+"""
+=============
+Value counts
+=============
+"""
+from fairlearn.datasets import fetch_diabetes_hospital
+import matplotlib.pyplot as plt
+
+
+# %%
+fig, ax = plt.subplots()
+
+data = fetch_diabetes_hospital(as_frame=True)
+X = data.data
+X.drop(columns=["readmitted", "readmit_binary"], inplace=True)
+y_true = data.target
+race = X["race"]
+
+df = race.value_counts().reset_index()
+
+ax.bar(df["race"], df["count"])
+ax.set_title("Counts by race")
+ax.tick_params(axis="x", labelrotation=45)
+
+plt.tight_layout()
+plt.show()
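+
+# Purely illustrative aside (assumes the `race` Series above): printing the
+# relative share of each group makes the imbalance easy to cite in prose.
+print(race.value_counts(normalize=True).round(3))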
- -""" -================================= -Selection rates in census dataset -================================= -""" -from fairlearn.datasets import fetch_adult - -# %% -from fairlearn.metrics import MetricFrame, selection_rate - -data = fetch_adult() -X = data.data -y_true = (data.target == ">50K") * 1 -sex = X["sex"] - -selection_rates = MetricFrame( - metrics=selection_rate, y_true=y_true, y_pred=y_true, sensitive_features=sex -) - -fig = selection_rates.by_group.plot.bar( - legend=False, rot=0, title="Fraction earning over $50,000" -)