Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New datasets and parameter confirmation #32

Merged
merged 21 commits into from
Apr 8, 2018
Merged

Conversation

vinay-jayaram
Copy link
Collaborator

This PR includes data mentioned in 1 from #1 (checked items) as well as a review of the epoching parameters that were mentioned as defaults. In particular:

  1. Many datasets had inconsistent start and end times when compared with the publications. I do see how cutting the beginning/end of trials eliminates ramp up/down effects, but it's not applied consistently (datasets like Alex MI use everything) and so I went through the publications and changed all epoching parameters to the cues described in the papers.

  2. We need to standardize how epochs are given to the data -- I would suggest a function set_epoch that every dataset needs to implement, so that paradigms can go through and change the epoching if they want to.

  3. Weibo 2014 and Zhou 2016 datasets are added and have been tested with a function in moabb.tests.datasets that just runs through all subjects and confirms they can be downloaded and loaded properly.

Vinay Jayaram added 3 commits April 5, 2018 13:05
Many changes also detailed in pull request
1. Added task_interval to all datasets so we know how imagination time
and epoching interval are related

2. Added docs to BaseDataset so it's less confusing

3. Re-added ImageryNClass since LvR is too restrictive for our current
datasets

4. Made paradigms look at the selected_events and not event_id
parameters since we may only care about e.g. motor imageries or
cognitive imagery

5. Separated filter bank and single-filter paradigms
@vinay-jayaram
Copy link
Collaborator Author

I solved the epoching issue by simply giving each dataset a parameter that specifies when the imagery starts and stops with respect to the events, so the interval is always relative to imagery start and not event (since events are dataset-specific and sometimes have baselines and other things)

@alexandrebarachant
Copy link
Member

Many datasets had inconsistent start and end times when compared with the publications. I do see how cutting the beginning/end of trials eliminates ramp up/down effects, but it's not applied consistently (datasets like Alex MI use everything) and so I went through the publications and changed all epoching parameters to the cues described in the papers.

Good call on this one. my initial intention was to skip some of the ERP related to the cue, but i have been sloppy. better to put the true events timing.

We need to standardize how epochs are given to the data -- I would suggest a function set_epoch that every dataset needs to implement, so that paradigms can go through and change the epoching if they want to.

We could do that using the interval from the paradigm. in that case the interval is relative to the timing of the dataset. for example

  • interval = [1, None] will epoch 1 second after the event, until the end.
  • interval = [1, 3] will take from 1 to 3 second
  • interval = [1, -1] from one to -1 (not sure about this one)

to implement this, we can use the function epochs.crop() from mne.

Weibo 2014 and Zhou 2016 datasets are added and have been tested with a function in moabb.tests.datasets that just runs through all subjects and confirms they can be downloaded and loaded properly.

We have this in tests,.download.py i put this out of the regular test suit to avoid travis downloading a bunch of data every time. I yet have to find the solution to skip those test on travis but run them localy.

I'n going to review your PR in detail, just tell me when you are done.

if not isinstance(subjects, list):
raise(ValueError("subjects must be a list"))

self.subject_list = subjects
self.n_sessions = sessions_per_subject
self.event_id = events
self.selected_events = events.copy()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think this is a good way to proceed. The paradigm should not modify the dataset object.
This expose to corner case where the same dataset object is used with different paradigm

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm. I will think of another solution

@@ -69,21 +69,22 @@ def process_raw(self, raw, dataset):

# get event id
if self.events is None:
event_id = dataset.event_id
event_id = dataset.selected_events
Copy link
Member

@alexandrebarachant alexandrebarachant Apr 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment. If we restrict to a certain set of events, this should be done by the paradigm. the dataset is context independent.

# skip raw if no event found
return

# get interval
if self.interval is None:
tmin, tmax = dataset.interval
else:
# TODO: make this work with varying offsets since not all data starts at 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets use epochs.crop() after epoching

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if someone wants to do a baseline though? Wouldn't it be better to define the window while making the Epochs object, instead of making it with the full task interval and then cropping it after?

FILES.append('https://dataverse.harvard.edu/api/access/datafile/2499179')


def eeg_data_path(base_path, subject):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we somehow avoid this code duplication ?



class Weibo2014(BaseDataset):
"""Weibo 2014 Motor Imagery dataset"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be good if we can start documenting what contain each dataset.

We can copy / paste and modify the dataset description from the publication.

We should also add a Reference section where we cite the paper, so it appears in the doc. for example : https://github.com/scikit-learn/scikit-learn/blob/a24c8b46/sklearn/ensemble/weight_boosting.py#L364



class Zhou2016(BaseDataset):
"""Zhou 2016 Imagery dataset"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idem, lets document this properly

@@ -0,0 +1,85 @@
'''
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we dont really need a docstring module for this one. its not parsed by the doc.
you can still let this doc here, but put everything you want to see on the doc in the class docstring

@@ -11,3 +11,5 @@
from .openvibe_mi import OpenvibeMI
from .bbci_eeg_fnirs import BBCIEEGfNIRS
from .upper_limb import UpperLimb
from .Weibo2014 import Weibo2014
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please, add them in docs/source/datasets.rst

Number of sessions per subject

events: dict of string: int
String codes for events matched with labels in the stim channel. Currently imagery codes codes can include:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we also have elbow and other stuff.

I'm wondering if we should start using MNE hierarchical event definition.

for example, you can define an event as hand/left and hand/right, which allow to select all hand event after epoching by doing Epochs['hand']. But this is another discussion

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well we lose nothing by adding it so why not, I'll go through and change all left_hand to hand/left etc --although there is one more level to worry about, of imagined vs actual

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can do imagined/hand/left, etc

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we can skip this for now, and see how we can deal with that later.

self.code = code
self.interval = interval
if task_interval is None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still wondering if this is the right way to proceed.
We can set this to be the default interval, i.e. interval from the start to the end of MI.

This type of thing is very specific to MI.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it generalizes OK, the case of ERPs is just that task_interval[0] is 0 and task_interval[1] is the length of time until the next trigger

@@ -125,8 +126,20 @@ def scoring(self):
return 'accuracy'


class MotorImagery(BaseMotorImagery):

def __init__(self, bp_low=8, bp_high=32, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use fmin and fmax, to be consistent with MNE.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Vinay Jayaram added 3 commits April 5, 2018 16:34
1. Use new function used_events to choose events for a paradigm within the
paradigm

2. Fixed bug in Physionet where hand and feet imagery couldn't be used together

3. Added documentation to new datasets and put them in .rst file

4. Simplified download code in Weibo2014
@vinay-jayaram
Copy link
Collaborator Author

vinay-jayaram commented Apr 5, 2018

all fixes except the changing of event codes implemented.


# pick events, based on event_id
try:
events = mne.pick_events(events, include=list(event_id.values()))
except RuntimeError:
# not all runs have all events. Reduce event dict to only occurring events
events_in_run = np.unique(events[:, 2])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not necessary if we set the parameter on_missing='ignore' in the mne.Epochs call

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will check this out

Abstract
------------

Independent component analysis (ICA) as a promising spatial filtering method
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, i did not express myself very clearly. I was thinking about the part of the article that describes the dataset.
The goal is to describe the typo of motor imagery, the timing, the task, etc.

For example, this part :

Throughout the experiment, the subject sat in a comfortable armchair facing a computer screen. The duration of each trial was 10 s as shown in Fig 1. A trial started by a short beep indicating 1 s preparation time, and followed by a red arrow pointing randomly to three directions (left, right, or bottom) lasting for 5 s and then presented a black screen for 4 s. The subject was instructed to immediately perform the imagination tasks of the left hand, right hand or foot movement respectively according to the cue direction, and try to relax during the black screen.

and

Every subject went through three sessions, each of which contained two consecutive runs with several minutes inter-run breaks, and each run comprised 75 trials (25 trials per class). The intervals between two sessions varied from several days to several months.


class Weibo2014(BaseDataset):
"""Weibo 2014 Motor Imagery dataset [1]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you do the same,

In order to investigate the differences of the EEG patterns as well as cognitive process between simple limb motor imagery and compound limb motor imagery, seven kinds of mental tasks have been designed, involving three tasks of simple limb motor imagery (left hand, right hand, feet), three tasks of compound limb motor imagery combining hand with hand/foot (both hands, left hand combined with right foot, right hand combined with left foot) and rest state.

At the beginning of each trial (8 seconds), a white circle appeared at the center of the monitor. After 2 seconds, a red circle (preparation cue) appeared for 1 second to remind the subjects of paying attention to the character indication next. Then red circle disappeared and character indication (‘Left Hand’, ‘Left Hand & Right Foot’, et al) was presented on the screen for 4 seconds, during which the participants were asked to perform kinesthetic motor imagery rather than a visual type of imagery while avoiding any muscle movement. After 7 seconds, ‘Rest’ was presented for 1 second before next trial (Fig. 1(a)). The experiments were divided into 9 sections, involving 8 sections consisting of 60 trials each for six kinds of MI tasks (10 trials for each MI task in one section) and one section consisting of 80 trials for rest state. The sequence of six MI tasks was randomized. Intersection break was about 5 to 10 minutes.

update_path=None, verbose=None):
if subject not in self.subject_list:
raise(ValueError("Invalid subject number"))
key = 'MNE_DATASETS_WEIBO2014_PATH'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually got a weird warning about the non standard MNE key. Can we track down this kind of thing and change the keys for new datasets.

update_path=None, verbose=None):
if subject not in self.subject_list:
raise(ValueError("Invalid subject number"))
key = 'MNE_DATASETS_ZHOU2016_PATH'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idem, lets change the key for MNE standard key (whatever the standard is)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NVM, this is a warning because it's not in the list a pre-approved config name. https://github.com/mne-tools/mne-python/blob/master/mne/utils.py#L1478

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should try and deal with this at some point though, as the list keeps growing...maybe our own config file?

raise(ValueError("Invalid subject number"))
key = 'MNE_DATASETS_ZHOU2016_PATH'
path = _get_path(path, key, "Zhou 2016")
_do_path_update(path, True, key, "Zhou 2016")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have any other option than forcing the path ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is what we do everywhere -- it should be another PR I think, revamping the download system

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep we should. Let's keep this for another PR (not a priority)

Number of sessions per subject

events: dict of string: int
String codes for events matched with labels in the stim channel. Currently imagery codes codes can include:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can do imagined/hand/left, etc

Number of sessions per subject

events: dict of string: int
String codes for events matched with labels in the stim channel. Currently imagery codes codes can include:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we can skip this for now, and see how we can deal with that later.

paradigm: ['p300','imagery']
Defines what sort of dataset this is (currently only imagery is implemented)

task_interval: list of 2 entries or None
Copy link
Member

@alexandrebarachant alexandrebarachant Apr 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being picky, but this change make me inconfortable, and i'm trying to avoid complicating the API more than necessary.

I do not think we need to have a double definition of time interval. The time interval should correspond to the interval of the task i.e. the motor imagery. so i would actually replace the interval by your task task_interval.

The actual timing of the trial does not bring us anything.
We changed the API significantly, and it is the paradigm itself that is doing the epoching, so we don't event need to expose tmin and tmax in the dataset anymore. If the user want to play with cropping the epoch, it is more natural to do it at the level of the paradigm (exactly like events or channels).

To my point of view, datasets are immuable objects, they are here to abstract the data as it has been recorded.

does it make sense ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I think that does make the most sense. I'll change that everywhere

@@ -642,49 +642,52 @@ def data_path(self, subject, path=None, force_update=False,
class BNCI2014001(MNEBNCI):
"""BNCI 2014-001 Motor Imagery dataset"""

def __init__(self, tmin=3.5, tmax=5.5):
def __init__(self, tmin=0.5, tmax=4):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according to this : https://lampx.tugraz.at/~bci/database/001-2014/description.pdf
the motor imagery start at 3 second after the beginning of the trial (trigger).
I dont get you new your new parameters

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i get a decrease in perf with this parameters

Copy link
Member

@alexandrebarachant alexandrebarachant Apr 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, the beep is at the trigger , the cue at 2 s, and the end around 6 s (epoch average at based on MNE epoching from the events)

image

import mne

from moabb.datasets import *
from moabb.paradigms import LeftRightImagery

dataset = BNCI2014001()

subject = 2
data = dataset.get_data([subject])
raw = mne.concatenate_raws(list(data[subject]['session_T'].values()))

raw.filter(1, 20)
events = mne.find_events(raw)

raw.pick_types(eeg=True)

events_id = dataset.event_id
epochs = mne.Epochs(raw, events, event_id=events_id,
                    tmin=-2, tmax=8, baseline=None, preload=True)


epochs.average().plot()

Copy link
Collaborator Author

@vinay-jayaram vinay-jayaram Apr 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry--you're probably right, I was confused because I didn't know if the trigger was at the cue or the beep (I think in the eeg/fnirs one it's the cue...)

Vinay Jayaram added 4 commits April 6, 2018 12:56
- use on_missing

- intervals specified by paradigm are relative to imagery start

- added min interval search to utils

- paradigm datasets property now use interval search
@vinay-jayaram
Copy link
Collaborator Author

Making the BaseMotorImagery class abstract breaks the paradigm exploration script -- but I think it's reasonable to make it so, since just "MotorImagery" isn't specific enough for a paradigm, since it's not obvious how events should be handled. I'd say we switch to FilterBankImageryNClass and ImageryNClass


Data Acquisition
----------------------------------------
EEG and NIRS data was collected in an ordinary bright room. EEG data was recorded by a multichannel BrainAmp EEG amplifier with thirty active electrodes (Brain Products GmbH, Gilching, Germany) with linked mastoids reference at 1000 Hz sampling rate. The EEG amplifier was also used to measure the electrooculogram (EOG), electrocardiogram (ECG) and respiration with a piezo based breathing belt. Thirty EEG electrodes were placed on a custom-made stretchy fabric cap (EASYCAP GmbH, Herrsching am Ammersee, Germany) and placed according to the international 10-5 system (AFp1, AFp2, AFF1h, AFF2h, AFF5h, AFF6h, F3, F4, F7, F8, FCC3h, FCC4h, FCC5h, FCC6h, T7, T8, Cz, CCP3h, CCP4h, CCP5h, CCP6h, Pz, P3, P4, P7, P8, PPO1h, PPO2h, POO1, POO2 and Fz for ground electrode).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, Try to limit the line length at 79 char.
I'm soon gonna activate flake8 on travis, to enforce coding style, and this will make the test crash

second 7.5 the screen went blank and a random interval between 1.0 and 2.0
seconds was added to the trial.

[1] R. Leeb, F. Lee, C. Keinrath, R. Scherer, H. Bischof, G. Pfurtscheller.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chech the docstring syntax for what i did in Zhou2016. if you do it right, it create a link when you reference [1]_

BTW, thanks for adding those docstring !

you can look how it render by generating the doc (cd docs & make html). the output files are in docs/build/html

@alexandrebarachant
Copy link
Member

alexandrebarachant commented Apr 6, 2018

Making the BaseMotorImagery class abstract breaks the paradigm exploration script -- but I think it's reasonable to make it so, since just "MotorImagery" isn't specific enough for a paradigm, since it's not obvious how events should be handled. I'd say we switch to FilterBankImageryNClass and ImageryNClass

I'm thinking that we could just rename ImageryNClass to MotorImagery, and use it as our main most generic class. We could make something that use all events if n_classes is None, which will cover all the usecases.

@vinay-jayaram
Copy link
Collaborator Author

vinay-jayaram commented Apr 6, 2018

The number of classes varies so much though--I'm not sure it would be a valid comparison (and it would ruin the statistics) if we let them have different numbers. I'd force them to provide n_classes (or at least have the default be something as it is now), but keep it the same otherwise. In the future we can add logic that restricts the classes to be motor imagery but as of now that would be a huge pain

Vinay Jayaram added 2 commits April 6, 2018 16:19
BaseMotorImagery is still abstract
@@ -58,6 +60,10 @@ def verify(self, dataset):

# we should verify list of channels, somehow

@abc.abstractmethod
def used_events(self, dataset):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is not super clear, i would suggest to rename this function to get_event_id()
but it would be helpful to document this abstract method so we know what we should implement in child class.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah if you have a better name go ahead and change it, I didn't put too much thought into that. paradigm_specific_events() maybe?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would call it get_events() or something like that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good 👍

Copy link
Member

@alexandrebarachant alexandrebarachant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this one. This is really really good :)

Just one addition, can you make sure the doc in docs/sources/paradigms.rst is up to date with the names of the paradigm classes.

You can merge after that.

As a side note, i have been more picky than usual on the PR review. I know adding documentation and other non code related stuff does not always seems a good use of the time at the moment, but it pays off big time in the future (proper documentation = More user & less issues to deal with on github)

# skip raw if no event found
return

# get interval
if self.interval is None:
tmin, tmax = dataset.interval
else:
tmin, tmax = self.interval
tmin, tmax = [t + dataset.interval[0] for t in self.interval]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a something to define only one part of the interval.
for example [1, None] will just start from 1 to the end of the MI period.
This usecase allow to cut out the cue that usually contain a lot of predictive power and bias the results.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of changing the input parameter interval to tmin and tmax. I can make the change

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes actually thats easier and you're right about the None. It would be great if you did that :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, im doing it

@alexandrebarachant
Copy link
Member

Okay, I made the change, but i'm not done yet. I will take this occasion to document all paradigms.
please, do not merge yet.

quick question. are SinglePass and FilterBank supposed to be used ? How do they differs from repectively MotorImagery and FilterBankMotorImagery ?

@vinay-jayaram
Copy link
Collaborator Author

I would say it only exists if people want paradigms for non-motor imagery (there's some word association/navigation/positive memories datasets out there) so they know if it's a single pass or filter bank paradigm.

@vinay-jayaram
Copy link
Collaborator Author

yes -- they don't implement used_events() so they are still abstract classes, and theres less of a chance of weird mistakes where someone inherits from FBMotorImagery or MotorImagery and forgets to update the datasets or something (we should incidentally remove the datasets property from BaseMotorImagery if it isn't gone already)

@alexandrebarachant
Copy link
Member

alexandrebarachant commented Apr 7, 2018

Putting here a list of thing to do before merging. I will take care of them tomorrow

  • Review dataset documentation, make sure they render properly on the doc and have consistant citations format.
  • rename some datasets (BBCIEEG, Upperlimb, GigaDB) to have a citation-like name (e.g. Zouh2014)
  • split BBCIEEG dataset in two (dataset A and B), since motor imagery and mental arithmetic are recorded on separate session, it should not be possible to have them together.
  • properly document paradigm child class. make sure they render correctly. remove **kargs to make sure parameters shows up in the doc.
  • flake 8 should pass for datasets.

My goal is to be done with those dataset, and not comme back on them again.

@vinay-jayaram
Copy link
Collaborator Author

I added a function to tests/datasets that runs through a dataset and checks (kinda) that it conforms to the interface. We should make that better and document it so we have an easier time adding new datasets in the future.

I have a flake8 checker when I code but do you think we could increase the maximum line size? It's a bit of a pain at 80

@alexandrebarachant
Copy link
Member

I get a lot of error E501 with your code. The number of char is actually limited to 79.

If you work alone, its fine to pick the length you want, but in our case we have to conform with what most people use.

@@ -4,6 +4,22 @@
from moabb.datasets.fake import FakeDataset


def _run_tests_on_dataset(d):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is more or less a duplicate of what i have in tests/download.py.
this require to have a local copy of the datasets. I did not find a way to skip this from travis, so i did not add it to the test suit.

@alexandrebarachant
Copy link
Member

Okay, i'm done, there is just one thing to do : document the paradigms.

I'm still unsure about our current list of paradigms.
The motorImagery is both multiclass and binary classif. Some algorithm, like CSP, are supposed to be 2 class only.
anyway, lets document this and merge.

@vinay-jayaram
Copy link
Collaborator Author

I'm not happy with the structure myself, but I don't think CSP as two-class is as much a problem, since it can always be used one vs rest. Plus it would be interesting to see how that actually performs, I think?

@alexandrebarachant
Copy link
Member

This is actually not a problem because both MNE and pyriemann implementation of CSP handle multiclass. What i was saying is that we have something not ideal because pipeline are supposed to declare their list of compatible paradigm, and if we have a pipeline that is not compatible with multiclass, then we have an issue.
That is was why i originally implemented a TwoClassMotorImagery and a MultiClassMotorImagery paradigm.

@vinay-jayaram
Copy link
Collaborator Author

It wouldn't be hard to do that now either (just change LeftRight) but I don't like where it's heading, there are just so many possible ways of structuring an analysis...

could we force the pipelines to specify their requirements and not the exact paradigm name, and then have each paradigm generate a 'requirements met' list from its arguments? E.g. instead of calling FilterBank(...) just have flag 'multiple_bandpass' as True or something? I'd say that's another PR though

@alexandrebarachant
Copy link
Member

I agree, there is just too many possible combination here. I think you are right and we need to start having a more structured requirement list on the pipeline side instead of declaring the paradigm.

This is another PR.
Okay. lets merge this one an i will revisit the paradigm structure another day.

@vinay-jayaram vinay-jayaram merged commit 9ad04c3 into master Apr 8, 2018
@vinay-jayaram vinay-jayaram deleted the new_datasets branch April 8, 2018 22:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants