update 2024
LiesIsLeuk committed Aug 28, 2023
1 parent c42202d commit ab6c9d9
Showing 10 changed files with 30 additions and 181 deletions.
43 changes: 11 additions & 32 deletions content/dataset.md
@@ -6,7 +6,7 @@ weight: 80
---


The training set can be downloaded here, using the password which will be provided to all registered teams: [ICASSP-2023-eeg-decoding-challenge-dataset](https://kuleuven-my.sharepoint.com/:f:/g/personal/lies_bollens_kuleuven_be/EkaIjOmoPIRHmYLdLK8b2VQBY_2ouqNSnHHTHyRl3Zn-2w?e=KhX7d0)
The training set can be downloaded here, using the password which will be provided to all registered teams: [ICASSP-2024-eeg-decoding-challenge-dataset](https://kuleuven-my.sharepoint.com/:f:/g/personal/lies_bollens_kuleuven_be/EkaIjOmoPIRHmYLdLK8b2VQBY_2ouqNSnHHTHyRl3Zn-2w?e=KhX7d0)

For more details concerning the dataset, we refer to [the dataset paper](https://www.biorxiv.org/content/10.1101/2023.07.24.550310v1).

@@ -29,7 +29,7 @@ electrode (CMS) and current return path (DRL). The data is measured at a samplin
the spatial resolution is low, with only 64 electrodes for billions of neurons. All 64 electrodes are placed according to international 10-20
standards.

The dataset contains data from 85 young, normal-hearing subjects (all hearing thresholds <= 25 dB HL), with Dutch as their native
The dataset contains data from 105 young, normal-hearing subjects (all hearing thresholds <= 25 dB HL), with Dutch as their native
language. Subjects indicating any neurological or hearing-related medical history were excluded from the study. The study was approved by
the Medical Ethics Committee UZ KU Leuven/Research (KU Leuven, Belgium). All identifiable subject information has been removed from the dataset.

@@ -43,40 +43,19 @@ longer than 15 minutes. In this case, they are split into two trials presented c



# Training set
The training set contains EEG responses from 71 subjects. These subjects are numbered from sub-01 to sub-71. As shown in the figure above, each
subject listens to between 6 and 9 trials, each of around 15 minutes in length. Due to measuring errors, not all trials for all subjects have been
included in the training set. Subjects are divided into groups, depending on which stimuli they listened to. Each such group contains between
2 and 27 subjects. All subjects of all groups listen to a reference story, Audiobook 1.

In total, the training set contains 508 trials from 71 subjects, using 57 different stimuli, for a total of 7216 minutes (about 120 hours) of data.
Both tasks share the training set. Data is structured in a folder per subject, and the trials are named chronologically.
Each EEG trial file contains a pointer to the stimulus used to generate the specific brain EEG response and reference
to the subject identifier. The auditory stimuli are provided in a separate folder stimuli.
The dataset contains data from 105 young, normal-hearing subjects (all hearing thresholds <= 25 dB HL), with Dutch as their native language. Subjects indicating any neurological or hearing-related medical history were excluded from the study. The study was approved by the Medical Ethics Committee UZ KU Leuven/Research (KU Leuven, Belgium). All identifiable subject information has been removed from the dataset.

Each subject listened to between 8 and 10 trials, each of approximately 15 minutes in length. The order of the trials is randomized between participants. All the stimuli are single-speaker stories spoken in Flemish (Belgian Dutch) by a native Flemish speaker. We vary the stimuli between subjects (each stimulus is presented to between 2 and 26 subjects) to cover a wide range of unique speech material. The stimuli are either podcasts or audiobooks. Some audiobooks are longer than 15 minutes; in this case, they are split into two trials presented consecutively to the subject.

# Test set
The test set consists of two parts: held-out stories and held-out subjects. These sets are split into two parts, ensuring that the test sets of the
two tasks do not overlap. Both test sets will be released to the participants on January 6, 2023. However, the ground truth labels will only be available to the public after the competition is
over.
- **Test Set 1 (held-out stories)** contains data for the 71 subjects seen in training. We held out one story for each group of subjects, which never
occurs in the training set, amounting to a total of 944 minutes.
- **Test Set 2 (held-out subjects)** contains data for 14 subjects (sub-72 to sub-85) that are not in the training set, further referred to as held-out
subjects, for a total of 1260 minutes. The data for these subjects were acquired using the same protocol as for the other 71 subjects.

## Training set
The training set contains data from 85 subjects and is equal to the training + test set from the ICASSP 2023 Auditory EEG competition. In total, the training set contains 655 trials (of 15 minutes each) from 85 subjects, using 72 different stimuli, for a total of 9420 minutes (157 hours).

## Test set
The test set contains data from 20 subjects, newly measured for the ICASSP 2024 auditory EEG competition. Neither these subjects nor the stimuli appear in the training set. In total, the test set contains 15 different stimuli and 2315 minutes of data (roughly 38 hours).
The test sets will be released to the public on November 15, 2023.

# Preprocessing

We provide two versions of the dataset. The first data version is the raw EEG data, which has been downsampled from 8192 Hz to 1024 Hz.
The second version of the dataset has been preprocessed in MATLAB. First, the EEG signal was downsampled from 8192 Hz to 1024 Hz,
and artefacts were removed using a multichannel Wiener filter. Then, the EEG signal was re-referenced to a common average. Finally, the
EEG signal was downsampled to 64 Hz. These steps are commonly used in EEG signal processing, and the preprocessed version can be used
directly in machine learning models. However, challenge participants are free to perform their own preprocessing on both versions of the
dataset.
For the regression task, we define a specific version of the envelope, which will be used for evaluating the final outputs. We estimate the envelope
using a gammatone filter bank with 28 subbands, spaced by equivalent rectangular bandwidth, with center frequencies from 50 Hz to 5 kHz. Subsequently,
the absolute value of each sample in each subband is taken, followed by exponentiation with 0.6. Then, all subbands are averaged to obtain one
speech envelope. Finally, the resulting envelope is downsampled to 64 Hz. We provide code to create these envelope representations.
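
For illustration only — the official definition is whatever the provided envelope code computes — a minimal Python sketch of this pipeline might look as follows; the ERB-rate spacing formula and SciPy's IIR gammatone approximation are assumptions on our part.

```python
import numpy as np
from scipy.signal import gammatone, lfilter, resample_poly


def erb_space(low=50.0, high=5000.0, n=28):
    """Center frequencies equally spaced on the ERB-rate scale (Glasberg & Moore)."""
    def erb_rate(f):
        return 21.4 * np.log10(1.0 + 0.00437 * f)
    e = np.linspace(erb_rate(low), erb_rate(high), n)
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437


def speech_envelope(audio, fs, fs_out=64):
    """28-band gammatone envelope: |subband| ** 0.6, averaged, resampled to 64 Hz."""
    subbands = []
    for fc in erb_space():
        b, a = gammatone(fc, "iir", fs=fs)              # 4th-order IIR gammatone filter
        subbands.append(np.abs(lfilter(b, a, audio)) ** 0.6)
    env = np.mean(subbands, axis=0)                     # average the subbands
    return resample_poly(env, fs_out, fs)               # downsample to 64 Hz


# usage with a hypothetical stimulus file sampled at 48 kHz:
# env = speech_envelope(np.load("stimulus.npy"), fs=48000)
```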
We provide two different versions of the dataset. The first data version is the raw EEG data, which has been downsampled from 8192 Hz to 1024 Hz. The second version of the data has been preprocessed (we provide code to replicate these steps). First, artefacts are removed using a multichannel Wiener filter. Then, the EEG signal is re-referenced to a common average and finally downsampled to 64 Hz. These steps are commonly used in EEG signal processing, and the preprocessed version can be used directly in machine learning models. However, challenge participants are free to perform their own preprocessing on both versions of the dataset.
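
A minimal sketch of these preprocessing steps, assuming the EEG is a (channels x samples) NumPy array at 1024 Hz; the multichannel Wiener filter is only stubbed out here, since the official artefact removal is defined by the provided code.

```python
import numpy as np
from scipy.signal import resample_poly


def preprocess(eeg, fs=1024, fs_out=64):
    """Rough stand-in for the provided pipeline; artefact removal is only stubbed."""
    # 1) Artefact removal: the challenge pipeline uses a multichannel Wiener filter
    #    here; this sketch leaves the data untouched.
    clean = eeg
    # 2) Re-reference each channel to the common average of all 64 channels.
    clean = clean - clean.mean(axis=0, keepdims=True)
    # 3) Downsample from 1024 Hz to 64 Hz.
    return resample_poly(clean, fs_out, fs, axis=1)


# usage with a hypothetical trial file of shape (64, n_samples):
# eeg_64hz = preprocess(np.load("sub-01_trial-01_eeg.npy"))
```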

# Ethics
Before commencing the EEG experiments, all participants read and signed an informed consent form approved by the
6 changes: 3 additions & 3 deletions content/homepage_sections/general_description.md
@@ -20,13 +20,13 @@ to predict the EEG signal from the stimulus or to decode the stimulus from the E
signal-to-noise ratio of the EEG, this is a challenging problem, and several non-linear methods have been
proposed to improve upon the linear regression methods.
In the Auditory-EEG challenge, teams will compete to build the best model to relate speech to EEG. We
provide a large auditory EEG dataset containing data from 85 subjects who listen on average to 108 minutes
of single-speaker stimuli for a total of 157 hours of data. We define two tasks:
provide a large auditory EEG dataset containing data from 105 subjects who listen on average to 108 minutes
of single-speaker stimuli for a total of around 200 hours of data. We define two tasks:

**Task 1 match-mismatch**: given 3 or 5 segments of speech and a segment of EEG, which segment of speech
matches the EEG?

**Task 2 regression**: reconstruct the speech envelope from the EEG.
**Task 2 regression**: reconstruct the mel spectrogram from the EEG.
We provide the dataset, code for preprocessing the EEG and for creating commonly used stimulus
representations, and two baseline methods.

8 changes: 4 additions & 4 deletions content/registration.md
@@ -12,13 +12,13 @@ [email protected] with the names of the team members, emails, and

# Guidelines for participants

- Participants can submit their predictions up to five times. The latest received submission counts as the official score.
- Participants can submit their predictions up to two times. The latest received submission counts as the official score.
- The Audio-EEG challenge features two separate tasks. Participants can submit to either one track or both. Results should be accompanied by a 2-page paper describing the proposed method.
- The top 5 ranked teams will be invited to submit a 2-page paper, to be presented at ICASSP-2023, which should be submitted before
- The top 5 ranked teams will be invited to submit a 2-page paper, to be presented at ICASSP-2024, which should be submitted before
the camera-ready deadline. In addition, they will receive an invitation to submit a full paper about their work to the IEEE Open Journal
of Signal Processing (OJ-SP). Submitted OJ-SP papers will undergo peer review by the OJ-SP Editorial Board in collaboration with
the IEEE SPS CDC committee and ICASSP-2023 GC Chairs.
the IEEE SPS CDC committee and ICASSP-2024 GC Chairs.
- Winners will be selected according to the best performance on each task; one winner will be selected for each task.
- The top 5 teams will be determined as follows: the top two teams for track 1 and the top two teams for track 2. The fifth team will be
chosen as the third-ranking team in the task with the most submissions.
@@ -32,7 +32,7 @@ labels/predictions.
- Additionally, we encourage all teams to publicly share their code at the end of the contest.
- The use of external data (both training data and/or pretrained models) is allowed, on the following conditions:
1. The datasets, or pretrained models, should be publicly and freely available. Participants wishing to use data/models that are publicly available, but not free, should contact the organisers to discuss if the data can be used.
2. Only datasets/pretrained models that have been made publicly available before the start of this challenge, i.e. **put online before November 21, 2022**, are allowed to be used.
2. Only datasets/pretrained models that have been made publicly available before the start of this challenge, i.e. **put online before August 30, 2023**, are allowed to be used.
3. Upon submission of their results, teams should explicitly mention which extra datasets/model weights they have used to generate their predictions.
4. You can still use a publicly available pretrained model even if the data on which the model was trained is not publicly available.
- Participants can fine-tune their model on each test subject to have a subject specific model (this can only be done for subjects that are also present in the train set).
62 changes: 0 additions & 62 deletions content/task1/test_set.md

This file was deleted.

33 changes: 12 additions & 21 deletions content/task2/description.md
@@ -11,49 +11,40 @@ weight: 80
Task 2 is a regression problem: to reconstruct the stimulus from the EEG. After reconstruction, a metric is used to measure the similarity
between the reconstructed stimulus and the original stimulus. In this task, we use the Pearson correlation.

For this task, the stimulus representation is defined as the envelope, as described in the preprocessing section and as defined by the provided code.
For this task, the stimulus representation is defined as the mel spectrogram, as described in the preprocessing section and as defined by the provided code.
Participants are free to create their own methods. However, remember that the stimulus
objective is fixed, as defined by the python file envelope.py.
objective is fixed, as defined by the python file mel.py.
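
As a rough illustration only — the fixed target is whatever mel.py produces — a mel spectrogram at a 64 Hz frame rate could be computed along these lines; the band count and window settings below are assumptions, not the challenge definition.

```python
import numpy as np
import librosa


def mel_target(wav_path, frame_rate=64, n_mels=10):
    """Log-mel spectrogram with one frame per 1/64 s; band count is an assumption."""
    audio, sr = librosa.load(wav_path, sr=None)    # keep the native sample rate
    hop = sr // frame_rate                         # ~64 frames per second
    mel = librosa.feature.melspectrogram(
        y=audio, sr=sr, n_fft=2 * hop, hop_length=hop, n_mels=n_mels
    )
    return np.log(mel + 1e-6)                      # shape: (n_mels, n_frames)


# usage with a hypothetical stimulus file:
# target = mel_target("audiobook_1.wav")
```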

The code for this task can be found on our [github repository](https://github.com/exporl/auditory-eeg-challenge-2023-code)
The code for this task can be found on our [github repository](https://github.com/exporl/auditory-eeg-challenge-2024-code)

# Baseline

As a first baseline, we include a linear backward model. The linear model
reconstructs the speech envelope from EEG by using a linear transformation across all
reconstructs the mel spectrogram from EEG by using a linear transformation across all
channels and a certain time window (the integration window). We use an integration window of 400 ms.
We train subject-dependent models, i.e. there is one model per subject.
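
A hedged sketch of such a backward model: a ridge-regularised least-squares mapping from a 400 ms window of all 64 EEG channels (about 26 lags at 64 Hz) to the stimulus representation. The regularisation and lag handling are our own assumptions, not the provided baseline code.

```python
import numpy as np


def lag_matrix(eeg, n_lags=26):
    """Stack n_lags delayed copies of (channels x time) EEG -> (time x channels*n_lags)."""
    n_ch, n_t = eeg.shape
    X = np.zeros((n_t, n_ch * n_lags))
    for lag in range(n_lags):
        X[lag:, lag * n_ch:(lag + 1) * n_ch] = eeg[:, :n_t - lag].T
    return X


def fit_backward_model(eeg, target, n_lags=26, ridge=1e3):
    """Least-squares decoder from lagged EEG to the stimulus representation.

    target: (time x bands) array (transpose a (bands x time) spectrogram first).
    """
    X = lag_matrix(eeg, n_lags)
    XtX = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ target)       # weights: (channels*n_lags, bands)


def decode(eeg, weights, n_lags=26):
    """Reconstruct the stimulus representation from new EEG of the same subject."""
    return lag_matrix(eeg, n_lags) @ weights


# 400 ms at 64 Hz is roughly 26 samples, hence the default n_lags.
```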


As a second baseline, we include the [Very Large Augmented Auditory Inference (VLAAI) network](https://www.biorxiv.org/content/10.1101/2022.09.28.509945v2). The VLAAI network consists of
multiple (N) blocks, each consisting of 3 different parts. The first part is a CNN stack, a convolutional neural network. This CNN consists of M=4
convolutional layers. The second part is a simple, fully connected layer of 64 units, which recombines the output filters of the CNN stack. The
last part is the output context layer. This special layer enhances the predictions made by the model up to that point, by taking the previous
context into account and combining it with the current sample. At the end of each block except the last, a skip connection is present with the
original EEG input. After the last block, the linear layer at the top of the VLAAI model combines the filters of the output context layer into a
single speech envelope. When applied to the training and test sets of the challenge, an average correlation score of 0.19 is obtained.
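
Purely as a sketch of the block structure described above (CNN stack, dense recombination, skip connection with the EEG input), the simplified Keras model below substitutes a plain causal convolution for the output context layer; all layer sizes and the concatenation-style skip connection are assumptions, so refer to the VLAAI paper and code for the real architecture.

```python
import tensorflow as tf


def vlaai_like(n_blocks=4, n_channels=64, n_out=10, filters=256):
    """Simplified stand-in: per block, a CNN stack, a dense recombination layer and
    a causal convolution in place of the output context layer, plus an EEG skip."""
    eeg = tf.keras.Input(shape=(None, n_channels))            # (time, channels)
    x = eeg
    for block in range(n_blocks):
        y = x
        for _ in range(4):                                     # CNN stack with M=4 conv layers
            y = tf.keras.layers.Conv1D(filters, 8, padding="same", activation="relu")(y)
        y = tf.keras.layers.Dense(64, activation="relu")(y)    # recombine the CNN filters
        y = tf.keras.layers.Conv1D(64, 32, padding="causal")(y)  # output-context stand-in
        if block < n_blocks - 1:
            y = tf.keras.layers.Concatenate()([y, eeg])        # skip connection with the EEG input
        x = y
    out = tf.keras.layers.Dense(n_out)(x)                      # final linear layer over the filters
    return tf.keras.Model(eeg, out)                            # n_out=1 for an envelope target
```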



# Evaluation Criteria
The test set for the regression task contains half the data from test set 1 and half from test set 2. All stimuli are held-out stimuli, i.e., they
The test set for the regression task contains half the data from the test set. All stimuli are held-out stimuli, i.e., they
do not appear in the training set. We have split up the stimuli into several smaller segments of 60 seconds and made these available with a
segment ID and a subject ID for each segment.

For each segment of 60 seconds, we expect a reconstructed envelope, which will then be compared to the original envelope,
as calculated by the provided envelope script, using Pearson correlation. We will use the scipy.stats.pearsonr
function to calculate the correlation value for each segment.
For each segment of 60 seconds, we expect a reconstructed mel spectrogram, which will then be compared to the original mel spectrogram,
as calculated by the provided mel script, using Pearson correlation. We will use the scipy.stats.pearsonr
function to calculate the correlation value for each segment and average the correlation across bands.

Afterwards, the mean correlation value per subject is calculated. Then, we calculate the mean correlation values over all subjects for test set 1 and test set
2 and add both scores to obtain a final score, which will serve as the final ranking value.
Afterwards, the mean correlation value per subject is calculated. Then, we calculate the mean correlation values over all subjects to obtain a final score, which will serve as the final ranking value.
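
The scoring procedure can be reproduced with a few lines of Python; in the hedged sketch below, the nested dictionary layout keyed by subject and segment ID is an assumption for illustration.

```python
import numpy as np
from scipy.stats import pearsonr


def segment_score(pred, true):
    """Pearson correlation per band, averaged over bands (arrays of shape bands x samples)."""
    return np.mean([pearsonr(p, t)[0] for p, t in zip(pred, true)])


def final_score(predictions, ground_truth):
    """predictions / ground_truth: {subject_id: {segment_id: (bands x 3840) array}}."""
    per_subject = []
    for subject, segments in ground_truth.items():
        scores = []
        for seg_id, true in segments.items():
            pred = predictions.get(subject, {}).get(seg_id)
            # absent EEG ID entries count as a correlation of 0
            scores.append(segment_score(pred, true) if pred is not None else 0.0)
        per_subject.append(np.mean(scores))                # mean correlation per subject
    return float(np.mean(per_subject))                     # mean over subjects = ranking value
```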


Participants should submit a json dictionary file for the test set to an online form on our website, which contains the reconstructed
envelopes for all EEG segments. Afterward, we will calculate the score mentioned above and update this in the online leaderboard. Each
entry in the submitted dictionary should be of the form (EEG ID) : (Reconstructed Envelope).
mel spectrograms for all EEG segments. Afterward, we will calculate the score mentioned above and update this in the online leaderboard. Each
entry in the submitted dictionary should be of the form (EEG ID) : (Reconstructed Mel).

A correlation value of 0 will be taken in case of
absent EEG ID entries. The reconstructed envelope should be of dimensions 1 x 3840 (i.e., 60 seconds of data at the prescribed sample rate
absent EEG ID entries. The reconstructed mel spectrogram should be of dimensions 10 x 3840 (i.e., 60 seconds of data at the prescribed sample rate
of 64 Hz).
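
Packing the reconstructions into the expected dictionary might look like the sketch below; the EEG ID keys come from the released test files, so the key shown here is a placeholder.

```python
import json
import numpy as np


def write_submission(reconstructions, path="submission.json"):
    """reconstructions: {eeg_id: np.ndarray of shape (10, 3840)} -> JSON file on disk."""
    payload = {}
    for eeg_id, mel in reconstructions.items():
        assert mel.shape == (10, 3840), f"unexpected shape for {eeg_id}: {mel.shape}"
        payload[eeg_id] = mel.tolist()                     # JSON cannot store ndarrays directly
    with open(path, "w") as f:
        json.dump(payload, f)


# usage with a placeholder EEG ID:
# write_submission({"sub-086_segment-001": np.zeros((10, 3840))})
```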

{{< figure src="../../images/score_regression.png" title=" " >}}
5 changes: 0 additions & 5 deletions content/task2/leaderboard.md
@@ -11,8 +11,3 @@ weight: 80

{{< /chart_2 >}}



__Challenge winners (top 2 teams):__
1. HappyQuoka (0.1589)
2. TheBrainwaveBandits (0.1535)