Set up benchmarking & testing against the current MATLAB version #19

Open
3 of 7 tasks
alexmorley opened this issue May 25, 2020 · 3 comments
@alexmorley
Collaborator

alexmorley commented May 25, 2020

NB: the results from this are likely to be hard to interpret without #18.

Rough steps required:

  • Run eMouse simulation in MATLAB
  • Run Kilosort2 MATLAB version, serialize results to phy format
  • Run Kilosort2 Python version, serialize results to phy format
  • Determine metrics for similarity: which units are found in both, how many spike times they share, etc. IN PROGRESS
  • Script to run all of the above automatically. IN PROGRESS
  • Following @rossant's suggestion, find the divergence point in the implementation: compare the outputs of each step (preprocessing, main loop, postprocessing) when each step receives the same inputs in both MATLAB and Python. IN PROGRESS
  • Test the process above using an eMouse simulation file and MATLAB results pulled from the internet (or from locally saved files). This test should be independent of MATLAB and will serve as a regression / parity test.
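As a rough illustration of the "metrics for similarity" step, here is a minimal sketch that matches spike times between two sortings and scores each pair of units. It assumes both results have been exported to phy-format directories containing `spike_times.npy` and `spike_clusters.npy`; the function names, the tolerance of 2 samples, and the agreement score (matched spikes divided by the union of both spike trains) are illustrative choices, not something the issue has settled on.

```python
import os
import numpy as np


def units_from_phy(phy_dir):
    """Load a phy-format directory into {cluster_id: spike_times}.

    Assumes the directory holds spike_times.npy and spike_clusters.npy.
    """
    times = np.load(os.path.join(phy_dir, "spike_times.npy")).ravel()
    clusters = np.load(os.path.join(phy_dir, "spike_clusters.npy")).ravel()
    return {int(c): times[clusters == c] for c in np.unique(clusters)}


def count_matched_spikes(times_a, times_b, tol=2):
    """Count one-to-one spike matches within `tol` samples (greedy, sorted)."""
    # Cast to signed integers so differences of uint64 times cannot wrap around.
    a = np.sort(np.asarray(times_a, dtype=np.int64))
    b = np.sort(np.asarray(times_b, dtype=np.int64))
    i = j = matched = 0
    while i < len(a) and j < len(b):
        d = a[i] - b[j]
        if abs(d) <= tol:
            matched += 1
            i += 1
            j += 1
        elif d < 0:
            i += 1
        else:
            j += 1
    return matched


def agreement(times_a, times_b, tol=2):
    """Matched spikes divided by the size of the union of both trains."""
    matched = count_matched_spikes(times_a, times_b, tol)
    denom = len(times_a) + len(times_b) - matched
    return matched / denom if denom else 0.0


def best_matches(units_a, units_b, tol=2):
    """For each unit in A, return the unit in B with the highest agreement."""
    result = {}
    for ua, ta in units_a.items():
        scores = {ub: agreement(ta, tb, tol) for ub, tb in units_b.items()}
        if scores:
            ub = max(scores, key=scores.get)
            result[ua] = (ub, scores[ub])
        else:
            result[ua] = (None, 0.0)
    return result
```

Calling `best_matches(units_from_phy(matlab_dir), units_from_phy(python_dir))` would then give, for each MATLAB unit, its closest Python unit and a score between 0 and 1. Units whose best score stays low are candidates for "found in one version only".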
@alexmorley alexmorley self-assigned this May 25, 2020
@rossant
Collaborator

rossant commented May 26, 2020

Additional notes:

  • identify specific datasets where there is a discrepancy between MATLAB and Python (I think @jaib1 has some?)
  • to find the divergence point in the implementation, compare the outputs of the different steps (preprocessing, main loop, postprocessing) when each step receives the same inputs in both MATLAB and Python
  • once there is a good match between all tested datasets, add these datasets to the automated testing suite
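The step-by-step comparison described above can be sketched as follows. This is only an outline under stated assumptions: each Python stage is a function of a single array, the MATLAB output of every stage has already been saved and loaded as a NumPy array, and the names `compare_stages` and `first_divergence` are hypothetical. The key point is that every Python stage is fed the MATLAB output of the previous stage, so an early mismatch cannot contaminate later comparisons.

```python
import numpy as np


def compare_stages(raw, stages, reference, rtol=1e-5, atol=1e-8):
    """Compare each Python stage against saved reference (MATLAB) outputs.

    raw       : input array given to the first stage
    stages    : ordered list of (name, python_function)
    reference : dict mapping stage name -> reference output array
    Returns a list of (name, matches, max_abs_error).
    """
    report = []
    stage_input = np.asarray(raw)
    for name, func in stages:
        expected = np.asarray(reference[name])
        actual = np.asarray(func(stage_input))
        if actual.shape != expected.shape:
            ok, max_err = False, float("inf")
        else:
            max_err = float(np.max(np.abs(actual - expected), initial=0.0))
            ok = bool(np.allclose(actual, expected, rtol=rtol, atol=atol))
        report.append((name, ok, max_err))
        # Feed the next stage the reference output, not the Python output,
        # so both implementations always start a stage from the same input.
        stage_input = expected
    return report


def first_divergence(report):
    """Return the name of the first stage that does not match, or None."""
    for name, ok, _ in report:
        if not ok:
            return name
    return None
```

A tolerance-based comparison is used here because floating-point results from MATLAB and Python are unlikely to be bit-identical; the `rtol` and `atol` defaults are NumPy's and would probably need tuning per stage.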

@marius10p
Collaborator

It would be great if @jaib1 could redo his comparisons after we port the modified CUDA kernels from @jenniferColonell, which make the algorithm deterministic.

@alexmorley
Collaborator Author

to find the divergence point in the implementation, compare the outputs of the different steps: preprocessing, main loop, postprocessing, when each step receives the same inputs in both MATLAB and Python

I'm doing this now. I have set up a test script that uses the MATLAB Engine API for Python (https://uk.mathworks.com/help/matlab/matlab_external/get-started-with-matlab-engine-for-python.html) to run the various steps of the MATLAB sorting alongside the pykilosort version. It's a little faster to iterate on than relying on file-based checkpoints, but I haven't nailed down where the differences are coming from yet.

I'm going to have another dig tomorrow evening and will keep this issue up to date.
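For readers unfamiliar with the MATLAB Engine API for Python, the general pattern of calling a MATLAB function from Python and comparing its result with a NumPy equivalent looks roughly like this. It is not the actual test script from this issue: `run_matlab_step` and `demo_with_matlab` are hypothetical names, and MATLAB's built-in `cumsum` stands in for a real Kilosort2 step. The demo needs a local MATLAB installation with the engine package, so it is defined but not called here.

```python
import numpy as np


def run_matlab_step(eng, func_name, *args):
    """Call a MATLAB function by name on an engine session.

    The engine exposes MATLAB functions as attributes, so getattr lets the
    step name be chosen at run time. The result is returned as an ndarray.
    """
    out = getattr(eng, func_name)(*args, nargout=1)
    return np.asarray(out)


def demo_with_matlab():
    """Compare MATLAB cumsum with NumPy cumsum (requires MATLAB installed)."""
    import matlab.engine  # imported lazily: only available with MATLAB

    eng = matlab.engine.start_matlab()
    try:
        x = np.linspace(0.0, 1.0, 5)
        # NumPy arrays are converted to a MATLAB double array via nested lists.
        ml_out = run_matlab_step(eng, "cumsum", matlab.double(x.tolist()))
        py_out = np.cumsum(x)
        return bool(np.allclose(ml_out.ravel(), py_out))
    finally:
        eng.quit()
```

Keeping one engine session open across steps avoids paying the MATLAB start-up cost on every call, which is presumably part of why this iterates faster than file-based checkpoints.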
