This repository includes the code for implementing the method of the paper Fast Principal Component Analysis for Cryo-EM Images, arXiv preprint, 2022.
Our code provides a fast implementation of covariance estimation based on the recent Fourier-Bessel expansion method FLE. Based on the covariance estimation, it offers a fast, ab-initio and unsupervised method to jointly correct CTFs and denoise the cryo-EM images (called fast covariance Wiener filtering). As an option, it can also estimate amplitude contrasts of individual images, using the method of Ab-initio Contrast Estimation and Denoising of Cryo-EM Images. Our covariance estimation method is very fast compared to the existing ones, especially when there are many distinct CTFs. See below for runtime comparison and an example of denoised experimental image.
Our code relies on two packages: ASPIRE-python
for single particle reconstruction and FLE
for fast Fourier-Bessel expansion. Please see the two repositories for the installation details. FLE
only allows double precision operations for higher accuracy and better numerical stability. Alternatively, one can use fle_2d_single.py
provided in this repository for single precision operations, though it's not expected to be as accurate as FLE
. To use fle_2d_single.py
, one should put it in the same directory of the file jn_zeros_n=3000_nt=2500.mat
.
After installing all the dependencies, run demo code demo_synthetic_data_white_noise.py
and demo_synthetic_data_colored_noise.py
for synthetic simulations with white and colored noise respectively. To run demo code demo_experimental_data_10081.py
for experimental data, download picked particle data from EMPIAR-10081 (use the one of size 13.7 GB) to the directory ./data/Particles/micrographs/
, and use the star file provided in ./data/particle_stacks/
, not the one on EMPIAR-10081.
Our covariance solver offers a variety of options:
default_options = {
"whiten": True,
"noise_psd": None,
"store_noise_psd": True,
"noise_var": None,
"radius": 0.9,
"batch_size": 1000,
"single_pass": True,
"single_whiten_filter": False,
"flip_sign": False,
"correct_contrast": False,
"subtract_background": False,
"dtype": np.float64,
"verbose": True
}
Here we briefly explain each one:
whiten
: whether or not whiten the image
noise_psd
: if choose None
, then noise power spectrum (we assume psd as radial functions) will be estimated. The radial function representation (1-D data) largely reduces the memory requirement, and it also reduces the noise in the psd estimation by averaging over the rings (psd on the rings are estimated by NUFFT).
store_noise_psd
: whether stores noise psd (1-D radial functions)
noise_var
: noise variance, automatically set as 1 if whiten is true
radius
: radius of mask used to estimate noise psd and background mean/variance. maximum value is 1.
batch_size
: batch size for loading image data; useful when the number of images is large.
single_pass
: whether estimate mean and covariance together (so only one pass over data), otherwise estimate mean and then covariance (two passes over data)
single_whiten_filter
: whether use only one whiten filter for all images. If choose False
, then whiten image by defocus groups.
flip_sign
: whether flip sign when loading raw images. A common choice is True
for raw experimental images.
correct_contrast
: whether correct and estimate image amplitude contrast.
subtract_background
: whether estimate and subtract background mean
dtype
: single or double precision
verbose
: whether output progress when running code
Our covariance solver also allows flexible input. All of the following three methods can be used to call our covariance estimator and denoiser. If some arguments are missing, they will be automatically estimated.
Example 1: run in three steps, two passes over data for covariance estimation
mean_est = fast_pca.estimate_mean()
_, covar_est = fast_pca.estimate_mean_covar(mean_est=mean_est)
results = fast_pca.denoise_images(mean_est=mean_est, covar_est=covar_est, denoise_options=denoise_options)
Example 2: single pass over data (combine mean and covariance estimation)
mean_est, covar_est = fast_pca.estimate_mean_covar()
results = fast_pca.denoise_images(mean_est=mean_est, covar_est=covar_est, denoise_options=denoise_options)
Example 3: combine all steps in one-line of code (equivalent to example 2)
results = fast_pca.denoise_images(denoise_options=denoise_options)