
Problem with result caching if evaluation is interrupted #653

Open
toncho11 opened this issue Oct 3, 2024 · 6 comments

@toncho11
Contributor

toncho11 commented Oct 3, 2024

I am running the MOABB benchmark (or just a normal evaluation) and the process is interrupted:

  • by the user,
  • or by an error (a memory-allocation error in my case).

For example, if the interruption happened at subject 3 of 64, then running again with overwrite=False will not process the remaining 61 subjects; it will simply use the 3 already-processed subjects in the final result, as if the dataset had 3 subjects instead of 64.
The same thing happens when you manually select fewer subjects than the dataset's total.

I think this can lead to false results if you are not aware of this behavior.
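The failure mode described above can be sketched as follows. This is a hypothetical illustration, not MOABB's actual internals: a per-subject result cache that is aggregated without any completeness check.

```python
# Hypothetical sketch of the reported failure mode (not MOABB code):
# results are cached per subject, a crash interrupts the loop, and the
# final aggregation silently uses only the cached subset.

cache = {}  # subject id -> score, filled incrementally during evaluation

def evaluate_subject(subject):
    return 0.5 + subject * 0.01  # placeholder score

def run_evaluation(subjects, overwrite=False):
    for s in subjects:
        if overwrite or s not in cache:
            cache[s] = evaluate_subject(s)
        if s == 3:  # simulate an interruption after subject 3
            raise MemoryError("simulated crash")

def final_result():
    # Silently averages over whatever subjects happen to be cached.
    return sum(cache.values()) / len(cache)

try:
    run_evaluation(range(1, 65))
except MemoryError:
    pass

# Only 3 of 64 subjects contribute, and no warning is raised.
print(len(cache))                 # 3
print(round(final_result(), 3))   # 0.52
```

Re-running with overwrite=False would find all 3 subjects "done" and never revisit the dataset, which is exactly the silent-result problem.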

@PierreGtch
Collaborator

Indeed, we should save the results of the benchmark immediately after computation to avoid this issue. I tried implementing this, but something else was breaking and I could not find what. If you have time, you can take over this PR :) #422

@tomMoral
Collaborator

tomMoral commented Oct 7, 2024

Maybe a way to fix this without too much code would be to rely on joblib.Memory? (Just an uninformed guess.)
Happy to help if you think it could be a valid approach.
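A minimal sketch of the joblib.Memory idea (the function and dataset names are illustrative, not MOABB's API): each subject's score is written to disk the moment it is computed, so a run interrupted after subject 3 resumes from subject 4 instead of recomputing or silently skipping.

```python
# Sketch, assuming each subject evaluation is a pure function of its
# arguments; joblib.Memory then gives per-subject on-disk caching.
import tempfile
from joblib import Memory

cachedir = tempfile.mkdtemp()
memory = Memory(cachedir, verbose=0)

@memory.cache
def evaluate_subject(dataset_name, subject):
    # Expensive computation in the real evaluation; a placeholder here.
    return 0.5 + subject * 0.01

# First call computes and writes to disk; the second call (e.g. after a
# crash and restart) loads the stored result instead of recomputing.
scores = [evaluate_subject("BNCI2014-001", s) for s in range(1, 4)]
scores_again = [evaluate_subject("BNCI2014-001", s) for s in range(1, 4)]
print(scores == scores_again)  # True
```

Because the cache key is derived from the function's arguments, an unfinished subject simply has no cache entry yet, which sidesteps the "dataset looks complete" problem.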

@PierreGtch
Collaborator

Indeed, replacing the for loops in the evaluations with joblib would be great! We started discussing it here: #481
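A rough sketch of what replacing the evaluation loop with joblib could look like (function names are hypothetical; the thread only proposes the idea): each subject becomes an independent unit of work, which pairs naturally with per-subject caching. The threading backend is used here only to keep the example lightweight.

```python
# Sketch: joblib.Parallel over subjects instead of a plain for loop.
from joblib import Parallel, delayed

def evaluate_subject(subject):
    return {"subject": subject, "score": 0.5 + subject * 0.01}

# Results are returned in submission order regardless of completion order.
results = Parallel(n_jobs=2, prefer="threads")(
    delayed(evaluate_subject)(s) for s in range(1, 5)
)
print([r["subject"] for r in results])  # [1, 2, 3, 4]
```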

@toncho11
Contributor Author

toncho11 commented Oct 8, 2024

I think it saves the results, or at least marks the first processed subjects as done, but after the interruption it does not continue processing the remaining subjects: it decides that the dataset is already entirely processed. Is there some flag for that? There should be a check that counts how many subjects were cached and how many still need to be processed. There could even be a user message: "3 subjects already cached, proceeding with N-3", where N is the total number of subjects for this dataset.
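The check proposed above could be sketched like this (helper names are hypothetical, not MOABB's API): compare the cached subjects against the dataset's full subject list, report the difference, and evaluate only what is missing.

```python
# Sketch of a resume-aware completeness check, assuming cached results
# can be loaded as a dict mapping subject id -> score.

def resume_evaluation(all_subjects, cached_scores, evaluate):
    """Evaluate only the subjects missing from cached_scores."""
    all_subjects = list(all_subjects)
    todo = [s for s in all_subjects if s not in cached_scores]
    n_done = len(all_subjects) - len(todo)
    # The user message suggested in the comment above:
    print(f"{n_done} subjects already cached, proceeding with {len(todo)}")
    for s in todo:
        cached_scores[s] = evaluate(s)
    return cached_scores

# 3 of 64 subjects survived the interrupted run:
scores = {1: 0.51, 2: 0.52, 3: 0.53}
scores = resume_evaluation(range(1, 65), scores, lambda s: 0.5 + s * 0.01)
print(len(scores))  # 64
```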

@PierreGtch
Collaborator

@toncho11 do you have a minimal example to reproduce this behavior?

@toncho11
Contributor Author

toncho11 commented Oct 9, 2024

It is a subtle problem that I will try to explain here.

  • Run the benchmark, for P300 for example.
  • Press "Stop" (in Spyder, for example) in the middle of processing a dataset, after 2-3 subjects have already been processed. Note which dataset you stopped at.
  • Set overwrite to false.
  • Re-run the benchmark: it will skip the dataset you stopped at entirely. It takes the results only from the subjects that were processed, not including the unprocessed ones, and moves on to the next dataset even though unprocessed subjects remain. This is the problem.

The same goes for an interruption caused by an error, such as a calculation or memory-allocation failure.
It is a silent problem: your results become false.

Thank you!
