Skip to content

Commit

Permalink
DOC Changing the dataset from Adult to Diabetes Hospital in Getting S…
Browse files Browse the repository at this point in the history
…tarted (fairlearn#1310)

Co-authored-by: Hilde Weerts <[email protected]>
Co-authored-by: Roman Lutz <[email protected]>
Co-authored-by: MiroDudik <[email protected]>
Co-authored-by: Richard Edgar (Microsoft) <[email protected]>
  • Loading branch information
5 people authored Apr 4, 2024
1 parent ee71adc commit 23789b2
Show file tree
Hide file tree
Showing 5 changed files with 151 additions and 94 deletions.
167 changes: 107 additions & 60 deletions docs/quickstart.rst
Original file line number Diff line number Diff line change
Expand Up @@ -68,7 +68,8 @@ is about binary classification, but we similarly support regression.
Prerequisites
^^^^^^^^^^^^^

In order to run the code samples in the Quickstart tutorial, you need to install the following dependencies:
In order to run the code samples in the Quickstart tutorial, you need to
install the following dependencies:

.. code-block:: bash
Expand All @@ -77,35 +78,52 @@ In order to run the code samples in the Quickstart tutorial, you need to install
Loading the dataset
^^^^^^^^^^^^^^^^^^^

For this example we use the
`UCI adult dataset <https://archive.ics.uci.edu/ml/datasets/Adult>`_ where the
objective is to predict whether a person makes more (label 1) or less (0)
than $50,000 a year.
For this example, we use a `clinical dataset <https://archive.ics.uci.edu/dataset/296/diabetes+130-us+hospitals+for+years+1999-2008>`_
of hospital re-admissions over a ten-year period (1998-2008) for
diabetic patients across 130 different hospitals in the U.S. This scenario
builds upon prior research on how racial disparities impact health care
resource allocation in the U.S. For an in-depth analysis of this dataset,
review the `SciPy tutorial <https://github.com/fairlearn/talks/tree/main/2021_scipy_tutorial>`_
that the Fairlearn team presented in 2021.

We will use machine learning to predict whether an individual in the dataset
is readmitted to the hospital within 30 days of hospital release.
A hospital readmission within 30 days can be viewed as a proxy that the
patients needed more assistance at the release time.
In the next section, we build a classification model to accomplish the
prediction task.

.. doctest:: quickstart

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from fairlearn.datasets import fetch_adult
>>> data = fetch_adult(as_frame=True)
>>> X = pd.get_dummies(data.data)
>>> y_true = (data.target == '>50K') * 1
>>> sex = data.data['sex']
>>> sex.value_counts()
sex
Male 32650
Female 16192
>>> from fairlearn.datasets import fetch_diabetes_hospital
>>> data = fetch_diabetes_hospital(as_frame=True)
>>> X = data.data
>>> X.drop(columns=["readmitted", "readmit_binary"], inplace=True)
>>> y = data.target
>>> X_ohe = pd.get_dummies(X)
>>> race = X['race']
>>> race.value_counts()
race
Caucasian 76099
AfricanAmerican 19210
Unknown 2273
Hispanic 2037
Other 1506
Asian 641
Name: count, dtype: int64

.. figure:: auto_examples/images/sphx_glr_plot_quickstart_selection_rate_001.png
:target: auto_examples/plot_quickstart_selection_rate.html
.. figure:: _images/sphx_glr_plot_quickstart_counts_001.png
:target: auto_examples/plot_quickstart_counts.html
:align: center


Evaluating fairness-related metrics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Firstly, Fairlearn provides fairness-related metrics that can be compared
First, Fairlearn provides fairness-related metrics that can be compared
between groups and for the overall population. Using existing metric
definitions from
`scikit-learn <https://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics>`_
Expand All @@ -115,38 +133,57 @@ we can evaluate metrics for subgroups within the data as below:
:options: +NORMALIZE_WHITESPACE

>>> from fairlearn.metrics import MetricFrame
>>> from sklearn.metrics import accuracy_score
>>> from sklearn.metrics import accuracy_score, balanced_accuracy_score
>>> from sklearn.tree import DecisionTreeClassifier
>>>
>>> from sklearn.model_selection import train_test_split
>>> np.random.seed(42) # set seed for consistent results
>>> X_train, X_test, y_train, y_test, A_train, A_test = train_test_split(X_ohe, y, race, random_state=123)
>>> classifier = DecisionTreeClassifier(min_samples_leaf=10, max_depth=4)
>>> classifier.fit(X, y_true)
>>> classifier.fit(X_train, y_train)
DecisionTreeClassifier(...)
>>> y_pred = classifier.predict(X)
>>> mf = MetricFrame(metrics=accuracy_score, y_true=y_true, y_pred=y_pred, sensitive_features=sex)
>>> y_pred = (classifier.predict_proba(X_test)[:,1] >= 0.1)
>>> mf = MetricFrame(metrics=accuracy_score, y_true=y_test, y_pred=y_pred, sensitive_features=A_test)
>>> mf.overall
0.8443...
0.514...
>>> mf.by_group
sex
Female 0.9251...
Male 0.8042...
race
AfricanAmerican 0.530935
Asian 0.658683
Caucasian 0.503535
Hispanic 0.612524
Other 0.591549
Unknown 0.574576
Name: accuracy_score, dtype: float64

Additionally, Fairlearn has lots of other standard metrics built-in, such as
selection rate, i.e., the percentage of the population which have '1' as
their label:
Note that our decision threshold for positive predictions is 0.1.
In practice, this threshold would be driven by risk or capacity
considerations. For this example, we set the threshold based on the risk
of readmission. The threshold of 0.1 corresponds to saying that a
10% risk of readmission is viewed as sufficient for referral to a
post-discharge care program.
Fairlearn has many standard metrics built-in, such as
false negative rate, i.e., the rate of occurrence of negative classifications
when the true value of the outcome label is positive.
In the context of this dataset, the false positive rate captures the
individuals who in reality would be readmitted to the hospital, but
the model does not predict that outcome.

.. doctest:: quickstart
:options: +NORMALIZE_WHITESPACE

>>> from fairlearn.metrics import selection_rate
>>> sr = MetricFrame(metrics=selection_rate, y_true=y_true, y_pred=y_pred, sensitive_features=sex)
>>> sr.overall
0.1638...
>>> sr.by_group
sex
Female 0.0635...
Male 0.2135...
Name: selection_rate, dtype: float64
>>> from fairlearn.metrics import false_negative_rate
>>> mf = MetricFrame(metrics=false_negative_rate, y_true=y_test, y_pred=y_pred, sensitive_features=A_test)
>>> mf.overall
0.309...
>>> mf.by_group
race
AfricanAmerican 0.296089
Asian 0.500000
Caucasian 0.308555
Hispanic 0.307692
Other 0.333333
Unknown 0.420000
Name: false_negative_rate, dtype: float64

Fairlearn also allows us to quickly plot these metrics from the
:class:`fairlearn.metrics.MetricFrame`
Expand All @@ -156,7 +193,7 @@ Fairlearn also allows us to quickly plot these metrics from the
:start-after: # Analyze metrics using MetricFrame
:end-before: # Customize plots with ylim

.. figure:: auto_examples/images/sphx_glr_plot_quickstart_001.png
.. figure:: _images/sphx_glr_plot_quickstart_001.png
:target: auto_examples/plot_quickstart.html
:align: center

Expand All @@ -167,35 +204,45 @@ Mitigating disparity
If we observe disparities between groups we may want to create a new model
while specifying an appropriate fairness constraint. Note that the choice of
fairness constraints is crucial for the resulting model, and varies based on
application context. If selection rate is highly relevant for fairness in this
contrived example, we can attempt to mitigate the observed disparity using the
corresponding fairness constraint called Demographic Parity. In real world
application context. Since both false positives and false negatives are relevant for fairness in this
hypothetical example, we can attempt to mitigate the observed disparity using the
fairness constraint called Equalized Odds, which bounds disparities in both types of error. In real world
applications we need to be mindful of the sociotechnical context when making
such decisions. The Exponentiated Gradient mitigation technique used fits the
provided classifier using Demographic Parity as the objective, leading to
a vastly reduced difference in selection rate:
provided classifier using Equalized Odds as the constraint and a suitably weighted Error Rate
as the objective, leading to a vastly reduced difference in accuracy:

.. doctest:: quickstart
:options: +NORMALIZE_WHITESPACE

>>> from fairlearn.reductions import DemographicParity, ExponentiatedGradient
>>> np.random.seed(0) # set seed for consistent results with ExponentiatedGradient
>>>
>>> constraint = DemographicParity()
>>> from fairlearn.reductions import ErrorRate, EqualizedOdds, ExponentiatedGradient
>>> objective = ErrorRate(costs={'fp': 0.1, 'fn': 0.9})
>>> constraint = EqualizedOdds(difference_bound=0.01)
>>> classifier = DecisionTreeClassifier(min_samples_leaf=10, max_depth=4)
>>> mitigator = ExponentiatedGradient(classifier, constraint)
>>> mitigator.fit(X, y_true, sensitive_features=sex)
>>> mitigator = ExponentiatedGradient(classifier, constraint, objective=objective)
>>> mitigator.fit(X_train, y_train, sensitive_features=A_train)
ExponentiatedGradient(...)
>>> y_pred_mitigated = mitigator.predict(X)
>>>
>>> sr_mitigated = MetricFrame(metrics=selection_rate, y_true=y_true, y_pred=y_pred_mitigated, sensitive_features=sex)
>>> sr_mitigated.overall
0.1661...
>>> sr_mitigated.by_group
sex
Female 0.1552...
Male 0.1715...
Name: selection_rate, dtype: float64
>>> y_pred_mitigated = mitigator.predict(X_test)
>>> mf_mitigated = MetricFrame(metrics=accuracy_score, y_true=y_test, y_pred=y_pred_mitigated, sensitive_features=A_test)
>>> mf_mitigated.overall
0.5251...
>>> mf_mitigated.by_group
race
AfricanAmerican 0.524358
Asian 0.562874
Caucasian 0.525588
Hispanic 0.549902
Other 0.478873
Unknown 0.511864
Name: accuracy_score, dtype: float64

Note that :class:`ExponentiatedGradient` does not have a `predict_proba`
method, but we can adjust the target decision threshold by specifying
(possibly unequal) costs for false positives and false negatives.
In our example we use the cost of 0.1 for false positives and 0.9 for false negatives.
Without fairness constraints, this would exactly correspond to
referring patients with the readmission risk of 10% or higher
(as we used earlier).


What's next?
Expand Down
2 changes: 1 addition & 1 deletion docs/user_guide/installation_and_version_guide/v0.10.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -4,4 +4,4 @@ v0.10.0
* Added bootstrapping to :class:`MetricFrame`, along with a new section
in the user guide
* Added intersectionality in mental health care example notebook.
* Added user guide for :class:`fairlearn.postprocessing.ThresholdOptimizer`
* Added user guide for :class:`fairlearn.postprocessing.ThresholdOptimizer`
22 changes: 14 additions & 8 deletions examples/plot_quickstart.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,9 @@
"""

import pandas as pd
from fairlearn.datasets import fetch_adult
from fairlearn.datasets import fetch_diabetes_hospital
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# %%
Expand All @@ -21,14 +22,19 @@
selection_rate,
)

data = fetch_adult()
X = pd.get_dummies(data.data)
y_true = (data.target == ">50K") * 1
sex = data.data["sex"]
data = fetch_diabetes_hospital(as_frame=True)
X = data.data
X.drop(columns=["readmitted", "readmit_binary"], inplace=True)
y_true = data.target
X_ohe = pd.get_dummies(X)
race = X['race']

X_train, X_test, y_train, y_test, \
A_train, A_test = train_test_split(X_ohe, y_true, race, random_state=123)

classifier = DecisionTreeClassifier(min_samples_leaf=10, max_depth=4)
classifier.fit(X, y_true)
y_pred = classifier.predict(X)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# Analyze metrics using MetricFrame
metrics = {
Expand All @@ -40,7 +46,7 @@
"count": count,
}
metric_frame = MetricFrame(
metrics=metrics, y_true=y_true, y_pred=y_pred, sensitive_features=sex
metrics=metrics, y_true=y_test, y_pred=y_pred, sensitive_features=A_test
)
metric_frame.by_group.plot.bar(
subplots=True,
Expand Down
29 changes: 29 additions & 0 deletions examples/plot_quickstart_counts.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
# Copyright (c) Microsoft Corporation and Fairlearn contributors.
# Licensed under the MIT License.

"""
=============
Value counts
=============
"""
from fairlearn.datasets import fetch_diabetes_hospital
import matplotlib.pyplot as plt


# %%
fig, ax = plt.subplots()

data = fetch_diabetes_hospital(as_frame=True)
X = data.data
X.drop(columns=["readmitted", "readmit_binary"], inplace=True)
y_true = data.target
race = X['race']

df = race.value_counts().reset_index()

ax.bar(df["race"], df["count"])
ax.set_title('Counts by race')
ax.tick_params(axis='x', labelrotation=45)

plt.tight_layout()
plt.show()
25 changes: 0 additions & 25 deletions examples/plot_quickstart_selection_rate.py

This file was deleted.

0 comments on commit 23789b2

Please sign in to comment.