How many peaks should I use for the input of scATAC-seq data? #15

Smilenone · 2024-07-27T18:08:13Z

I found it very slow when I used a 30k scATAC-seq dataset with the top 50k peaks. How many peaks should I use as input for scATAC-seq data?

PeterZZQ · 2024-07-28T21:23:38Z

Yes, the running time of the model depends on the number of features (especially peaks) in the data, because scDART builds a larger neural network when the number of peaks is larger. That is why we did some peak filtering before running the model.

To improve the running speed of the model, you can:

- Reduce the size of each mini-batch when training scDART.
- Select the highly variable peaks to reduce the number of peaks.
- Bin closely located peaks into larger peaks to reduce the overall number of peaks.
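
The last two options can be sketched in plain NumPy as below. This is only an illustration, not part of scDART: the helper names `select_variable_peaks` and `bin_adjacent_peaks` are hypothetical, ranking peaks by the variance of binarized accessibility is one assumed way of defining "highly variable", and `max_gap` is an assumed distance threshold you would need to choose for your own data. The binning helper also assumes the peaks are already sorted by chromosome and start position.

```python
import numpy as np


def select_variable_peaks(counts, n_top):
    """Return the sorted column indices of the n_top most variable peaks.

    counts: dense (cells x peaks) array. Variability is measured here as the
    variance of binarized accessibility, which is an assumption of this sketch.
    """
    binary = (np.asarray(counts) > 0).astype(np.float64)
    variances = binary.var(axis=0)
    top = np.argsort(variances)[::-1][:n_top]
    return np.sort(top)


def bin_adjacent_peaks(counts, chroms, starts, ends, max_gap):
    """Merge neighbouring peaks on the same chromosome into one feature.

    A peak joins the previous peak's group when the gap between them is at
    most max_gap base pairs. Peaks must be sorted by chromosome and start.
    Returns the merged (cells x groups) matrix and the group id of each peak.
    """
    counts = np.asarray(counts)
    n_peaks = counts.shape[1]
    group_ids = np.zeros(n_peaks, dtype=int)
    for i in range(1, n_peaks):
        same_chrom = chroms[i] == chroms[i - 1]
        close = (starts[i] - ends[i - 1]) <= max_gap
        # Start a new group unless the peak is close to the previous one.
        group_ids[i] = group_ids[i - 1] + (0 if (same_chrom and close) else 1)

    # Indicator matrix (peaks x groups); multiplying sums counts per group.
    n_groups = group_ids[-1] + 1
    indicator = np.zeros((n_peaks, n_groups), dtype=counts.dtype)
    indicator[np.arange(n_peaks), group_ids] = 1
    return counts @ indicator, group_ids


# Toy example: 3 cells, 4 peaks (two nearby on chr1, one far away, one on chr2).
toy_counts = np.array([[1, 0, 2, 0],
                       [0, 0, 1, 1],
                       [3, 0, 0, 1]])
toy_chroms = ["chr1", "chr1", "chr1", "chr2"]
toy_starts = [100, 700, 5000, 100]
toy_ends = [600, 1200, 5500, 600]

merged, groups = bin_adjacent_peaks(toy_counts, toy_chroms, toy_starts, toy_ends, max_gap=200)
```

For a real dataset the count matrix is usually sparse, so you would likely want a sparse equivalent of these operations rather than the dense arrays used here.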

There is no recommended number of peaks for scATAC-seq data. Fewer peaks make the model run faster but can also cause the loss of important biological information. There is definitely a trade-off, and it depends heavily on the sequencing quality of your scATAC-seq data.
