eegbci api: allow downloading multiple subjects #12918

Open
wants to merge 5 commits into
base: main
Changes from 2 commits
7 changes: 4 additions & 3 deletions doc/documentation/datasets.rst
@@ -161,9 +161,10 @@ EEGBCI motor imagery
====================
:func:`mne.datasets.eegbci.load_data`

The EEGBCI dataset is documented in :footcite:`SchalkEtAl2004`. The data set is
available at PhysioNet :footcite:`GoldbergerEtAl2000`. The dataset contains
64-channel EEG recordings from 109 subjects and 14 runs on each subject in EDF+
The EEGBCI dataset is documented in :footcite:`SchalkEtAl2004` and on the
`PhysioNet documentation page <https://physionet.org/content/eegmmidb/1.0.0/>`_.
The data set is available at PhysioNet :footcite:`GoldbergerEtAl2000`.
It contains 64-channel EEG recordings from 109 subjects and 14 runs on each subject in EDF+
format. The recordings were made using the BCI2000 system. To load a subject,
do::

7 changes: 4 additions & 3 deletions examples/decoding/decoding_csp_eeg.py
@@ -10,8 +10,9 @@

See https://en.wikipedia.org/wiki/Common_spatial_pattern and
:footcite:`Koles1991`. The EEGBCI dataset is documented in
:footcite:`SchalkEtAl2004` and is available at PhysioNet
:footcite:`GoldbergerEtAl2000`.
:footcite:`SchalkEtAl2004` and on the
`PhysioNet documentation page <https://physionet.org/content/eegmmidb/1.0.0/>`_.
The dataset is available at PhysioNet :footcite:`GoldbergerEtAl2000`.
"""
# Authors: Martin Billinger <[email protected]>
#
@@ -48,7 +49,7 @@
eegbci.standardize(raw) # set channel names
montage = make_standard_montage("standard_1005")
raw.set_montage(montage)
raw.annotations.rename(dict(T1="hands", T2="feet"))
raw.annotations.rename(dict(T1="hands", T2="feet")) # as documented on PhysioNet
raw.set_eeg_reference(projection=True)

# Apply band-pass filter
48 changes: 29 additions & 19 deletions mne/datasets/eegbci/eegbci.py
@@ -21,7 +21,9 @@ def data_path(url, path=None, force_update=False, update_path=None, *, verbose=N

This is a low-level function useful for getting a local copy of a remote EEGBCI
dataset :footcite:`SchalkEtAl2004`, which is also available at PhysioNet
:footcite:`GoldbergerEtAl2000`.
:footcite:`GoldbergerEtAl2000`. Metadata, such as the meaning of event markers,
may be obtained from the
`PhysioNet documentation page <https://physionet.org/content/eegmmidb/1.0.0/>`_.

Parameters
----------
@@ -92,8 +94,9 @@ def data_path(url, path=None, force_update=False, update_path=None, *, verbose=N

@verbose
def load_data(
subject,
subjects,
runs,
*,
Review comment (Member):

For this I'd be tempted just to live with the misnomer subject=[...] rather than switch to subjects=[...] but the deprecation also won't be so terrible:

def load_data(
    subjects=None,
    runs=None,
    *,
    subject=None,
    ...
):
    if subject is not None:
        subjects = subject
        warn(..., FutureWarning)
    del subject

And then you'll need to ensure that subjects and runs are not None, i.e., the user has actually supplied values for them (raise an error if None). Then once deprecation is complete we can remove the =None defaults safely.
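
A minimal sketch of the deprecation path described in this comment, assuming a stand-in function named `load_data_sketch` (hypothetical; it is not the MNE function and performs no download). The warning text and the extra check against passing both `subject` and `subjects` are assumptions added for illustration:

```python
import warnings


def load_data_sketch(subjects=None, runs=None, *, subject=None):
    """Hypothetical stand-in showing only the argument handling."""
    if subject is not None:
        # Assumption: passing both spellings at once is treated as an error.
        if subjects is not None:
            raise ValueError("Pass 'subjects' only; 'subject' is deprecated.")
        warnings.warn(
            "The 'subject' parameter is deprecated, use 'subjects' instead.",
            FutureWarning,
            stacklevel=2,
        )
        subjects = subject  # forward the old keyword to the new one
    del subject

    # The =None defaults exist only for the deprecation period, so reject
    # calls where the user did not actually supply a value.
    if subjects is None or runs is None:
        raise ValueError("Both 'subjects' and 'runs' must be provided.")

    # Accept a single int or an iterable of ints for either argument.
    if not hasattr(subjects, "__iter__"):
        subjects = [subjects]
    if not hasattr(runs, "__iter__"):
        runs = [runs]
    return list(subjects), list(runs)
```

With this shape, `load_data_sketch(runs=[1, 2], subject=3)` still works but emits a `FutureWarning`, while `load_data_sketch(subjects=1)` raises because `runs` was never supplied.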

path=None,
force_update=False,
update_path=None,
@@ -103,12 +106,14 @@ def load_data(
"""Get paths to local copies of EEGBCI dataset files.

This will fetch data for the EEGBCI dataset :footcite:`SchalkEtAl2004`, which is
also available at PhysioNet :footcite:`GoldbergerEtAl2000`.
also available at PhysioNet :footcite:`GoldbergerEtAl2000`. Metadata, such as the
meaning of event markers, may be obtained from the
`PhysioNet documentation page <https://physionet.org/content/eegmmidb/1.0.0/>`_.

Parameters
----------
subject : int
The subject to use. Can be in the range of 1-109 (inclusive).
subjects : int | list of int
The subjects to use. Can be in the range of 1-109 (inclusive).
runs : int | list of int
The runs to use (see Notes for details).
path : None | path-like
@@ -163,6 +168,9 @@ def load_data(

t0 = time.time()

    if not hasattr(subjects, "__iter__"):
        subjects = [subjects]

    if not hasattr(runs, "__iter__"):
        runs = [runs]

@@ -198,20 +206,22 @@ def load_data(
    # fetch the file(s)
    data_paths = []
    sz = 0
    for run in runs:
        file_part = f"S{subject:03d}/S{subject:03d}R{run:02d}.edf"
        destination = Path(base_path, file_part)
        data_paths.append(destination)
        if destination.exists():
            if force_update:
                destination.unlink()
            else:
                continue
        if sz == 0:  # log once
            logger.info("Downloading EEGBCI data")
        fetcher.fetch(file_part)
        # update path in config if desired
        sz += destination.stat().st_size
    for subject in subjects:
        for run in runs:
            file_part = f"S{subject:03d}/S{subject:03d}R{run:02d}.edf"
            destination = Path(base_path, file_part)
            data_paths.append(destination)
            if destination.exists():
                if force_update:
                    destination.unlink()
                else:
                    continue
            if sz == 0:  # log once
                logger.info("Downloading EEGBCI data")
            fetcher.fetch(file_part)
            # update path in config if desired
            sz += destination.stat().st_size

    _do_path_update(path, update_path, config_key, name)
    if sz > 0:
        _log_time_size(t0, sz)
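
To illustrate what the new nested loop produces, here is a small self-contained sketch that builds the same relative file names without touching the network. The helper name `eegbci_file_parts` is hypothetical and is not part of MNE; only the `S{subject:03d}/S{subject:03d}R{run:02d}.edf` pattern and the scalar-to-list handling are taken from the diff above:

```python
from itertools import product


def eegbci_file_parts(subjects, runs):
    """Return relative EDF file names, subject-major, as in the nested loop."""
    # Mirror the diff: wrap a scalar argument in a list.
    if not hasattr(subjects, "__iter__"):
        subjects = [subjects]
    if not hasattr(runs, "__iter__"):
        runs = [runs]
    # product() varies the last argument fastest, which matches
    # "for subject in subjects: for run in runs:".
    return [
        f"S{subject:03d}/S{subject:03d}R{run:02d}.edf"
        for subject, run in product(subjects, runs)
    ]


print(eegbci_file_parts([1, 2], [4, 8]))
```

Because the result is one flat list, a caller who needs to group files per subject has to recover the subject from each path, for example from its parent directory name.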