
How to save sorted spike time stamps in spikeinterface #3599

Open
venkatbits opened this issue Jan 8, 2025 · 14 comments
Labels
question General question regarding SI

Comments

@venkatbits

Hi all,

I have searched extensively in the SpikeInterface documentation and on the web, but I was unable to find information on how to save all the sorted spike times, hence this issue.

I am running the following commands on my data:

```python
sorting_spycir2 = ss.run_sorter(sorter_name="spykingcircus2", recording=recording_seg, output_folder="C:/Users//Desktop/spike/folder_spykingcircus2_all_chanels")

folder = 'C:/Users//Desktop/spike/waveforms_spycir2_all'
we_spycir2_all = si.extract_waveforms(recording_seg, sorting_spycir2, folder, load_if_exists=None, ms_before=1, ms_after=2., max_spikes_per_unit=500, n_jobs=1, chunk_size=30000)
print(we_spycir2_all)

sorting_analyzer = si.create_sorting_analyzer(sorting=sorting_spycir2, recording=recording_seg)
sorting_analyzer.compute(['random_spikes', 'waveforms', 'templates', 'noise_levels'])
sorting_analyzer.compute(['spike_locations'])

export_report(sorting_analyzer=sorting_analyzer, output_folder='C:/Users/Desktop/spike/folder_spykingcircus_allchanels_report')
```
In the report generated above I could not find the time stamps of each detected spike.

I want to save the time stamps of all the detected/sorted spikes. Can you kindly help with this?

Thanks,
Venkat

@yger (Collaborator) commented Jan 8, 2025

The times of the sorted spikes are stored in the sorting object. You can get them as numpy arrays by doing:

```python
sorting_spycir2.get_all_spike_trains()
```
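Spike trains can also be pulled out per unit and converted to seconds; a minimal sketch, assuming the objects from the original post:

```python
# per-unit access: get_unit_spike_train returns sample indices
fs = sorting_spycir2.get_sampling_frequency()
for unit_id in sorting_spycir2.unit_ids:
    spike_train_s = sorting_spycir2.get_unit_spike_train(unit_id) / fs
    print(unit_id, spike_train_s[:10])  # first ten spike times, in seconds
```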

@venkatbits (Author)

Thanks very much, it worked. Sorry to bother you again, but can you kindly help me with how to save this as a text file or Excel sheet? I am new to Python.

@zm711 (Collaborator) commented Jan 8, 2025

Does it have to be text or CSV? You can save the sorting object, and then you'll always have the spike times. Are you planning to switch to a different programming environment?

@venkatbits (Author) commented Jan 8, 2025 via email

@alejoe91 added the question (General question regarding SI) label Jan 8, 2025
@zm711 (Collaborator) commented Jan 8, 2025

To save it as a sorting object, all you have to do is type:

```python
sorting.save(xx)
```

where xx will be the format you want and the location to save it to.
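For example, a minimal sketch (the folder path is a placeholder, and `si` is spikeinterface as imported in the original post):

```python
# save the sorting to a folder on disk (path is a placeholder)
sorting_saved = sorting.save(folder="C:/Users//Desktop/spike/sorting_spycir2_saved")

# it can be reloaded later, so the spike times are never lost
sorting_loaded = si.load_extractor("C:/Users//Desktop/spike/sorting_spycir2_saved")
```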

For the Excel or text file, how do you need the data organized? The easiest would honestly be a couple of columns with the unit label and the spike time; then you could index into the Excel file to get the spike train for each neuron. Again, that would require some programming experience.

I guess the other way would be to save each neuron as a separate column/row of an Excel/txt file. This would be super messy and not as storage-friendly. What is the experience level of your collaborators? Or how do they want to interact with the data?

@venkatbits (Author) commented Jan 9, 2025 via email

@zm711 (Collaborator) commented Jan 9, 2025

@venkatbits thanks for that info. It actually changes a lot.

So, at a fundamental level, a sorting contains two vectors of information: the spike times (i.e. when the spikes occurred) and the spike labels (we call them unit ids in spikeinterface, but some people prefer cluster ids or neuron ids). Our sorting object also has other information, like segment info (which doesn't matter for monosegment data) as well as some metadata. What we don't explicitly have is "channel 4". This is because one unit/neuron can generate spikes on multiple channels. So if you are just counting spikes on channel 4 and on channel 3, they will likely include some of the same neurons, and you would be unfairly double-counting spikes (unless your channels are super isolated).

So our strategy is to generate unit locations, based on one of three computational strategies, during postprocessing to give you the location of your unit. But this isn't spikes/channel. Rather, the unit location is one more piece of information to go along with each unit. (I can share docs on how to use our analyzer if you are interested; a sketch follows below.)
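A rough sketch of that postprocessing step, assuming the analyzer from the original post ("center_of_mass" is one of the available location methods; "monopolar_triangulation" and "grid_convolution" are the others):

```python
# compute unit locations on the existing analyzer; "templates" and its
# dependencies must already be computed, as in the original post
sorting_analyzer.compute("unit_locations", method="center_of_mass")
unit_locations = sorting_analyzer.get_extension("unit_locations").get_data()
print(unit_locations)  # one location estimate per unit
```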

But in your case, if you care about spikes/channel, then we really need to ask why. Are your channels completely isolated, such that each channel can't "see" what is on the other channels? In that case you don't even necessarily need to spike sort. You could just threshold the data yourself (or use our peak detection tools), as in the sketch below.
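A toy sketch of the "threshold it yourself" route on one channel (it assumes a channel id of 4 exists in recording_seg, and the 5×MAD threshold is purely illustrative):

```python
import numpy as np

# pull the trace for one channel and estimate noise with the MAD
trace = recording_seg.get_traces(channel_ids=[4]).ravel()
mad = np.median(np.abs(trace - np.median(trace)))
threshold = -5 * mad  # negative-going spikes

# sample indices where the trace crosses the threshold downward
crossings = np.flatnonzero((trace[1:] < threshold) & (trace[:-1] >= threshold))
crossing_times_s = crossings / recording_seg.get_sampling_frequency()
```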

Or, rather than thinking only about spikes on channel 4, we could just give you all the spike times, and then your collaborator can look at his spike times and see which ones match. So are you trying to manually validate SC2?

Sorry we have to ask so many questions, but how we organize the data for your collaborator really depends on what you want to do with it.

@venkatbits (Author) commented Jan 9, 2025 via email

@zm711 (Collaborator) commented Jan 9, 2025

Yeah, of course. @samuelgarcia or @alejoe91, will detect_peaks work on one channel only? And @yger, do you have any opinions on whether SC2 will actually work with monotrode data?

In this case, though, your current sorting object should all be on channel 4, so if you want to save what you have now, you could take the spike vector with:
```python
import numpy as np
import pandas as pd

# this collects the information as two long, balanced vectors
spike_vector = sorting.to_spike_vector()
spike_times = spike_vector["sample_index"]  # note: sample indices, not seconds
spike_indices = spike_vector["unit_index"]

# this makes sure the labels are coordinated between the sorting object and the excel
spike_unit_ids = np.array([sorting.unit_ids[unit] for unit in spike_indices])

# dataframes are similar to excel sheets
df = pd.DataFrame({"Spike Times": spike_times, "Unit Id": spike_unit_ids})

# this will create a csv, but you will need to fill in the file name and
# location to fit what you want; see the pandas documentation for the
# arguments you need to add
df.to_csv()
```

@venkatbits (Author) commented Jan 9, 2025 via email

@jakeswann1 (Contributor)

I have used the detect_peaks() function with the 'by_channel' method to look at all detected peaks channel by channel, to get a rough sense of how much a spike sorter might be missing. I then created a sorting object from the peaks and exported it to Phy to do some very manual cluster cutting, where each channel's detected peaks were initially treated as a 'unit'. This seemed to work well for that particular purpose, and I can share some code if that would be useful.
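A rough sketch of that workflow, assuming recording_seg from the original post (the parameter values are illustrative, and exact signatures may differ between spikeinterface versions):

```python
from spikeinterface.core import NumpySorting
from spikeinterface.sortingcomponents.peak_detection import detect_peaks

# detect peaks independently on every channel
peaks = detect_peaks(recording_seg, method="by_channel",
                     peak_sign="neg", detect_threshold=5)

# treat each channel's peaks as one "unit" so they can be curated later
sorting_from_peaks = NumpySorting.from_peaks(
    peaks,
    sampling_frequency=recording_seg.get_sampling_frequency(),
    unit_ids=recording_seg.channel_ids,
)
```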

@venkatbits (Author) commented Jan 11, 2025 via email

@samuelgarcia (Member)

Yes, detect_peaks works with one unique channel.

And this should be faster:

```python
unit_indices = spike_vector["unit_index"]
spike_unit_ids = sorting.unit_ids[unit_indices]  # numpy fancy indexing instead of a Python loop
```
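If seconds rather than sample indices are needed downstream, a one-line follow-up sketch using the standard sampling-frequency accessor:

```python
# convert sample indices to seconds before writing the csv
spike_times_s = spike_vector["sample_index"] / sorting.get_sampling_frequency()
```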

@venkatbits (Author) commented Jan 13, 2025 via email
