
How to improve integration #120

Open
fe4960 opened this issue May 22, 2024 · 2 comments



fe4960 commented May 22, 2024

Hello,

Thanks for developing this great software. It has helped me a lot with integrating unpaired snRNA and snATAC data. I recently ran another dataset with my previous script, following the scGLUE tutorial. The only difference between my script and the tutorial is that I used the top 20% of variable peaks ranked by episcanpy and 5,000 HVGs, which has worked quite well in all my previous scGLUE analyses, since I have many cells and peaks. However, the result of the most recent analysis seems suboptimal.
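For reference, the feature-selection step in my script looks roughly like this (a minimal sketch; `rna` and `atac` are preloaded AnnData objects, and the episcanpy call and the `variability_score` column name are assumptions from episcanpy's documented workflow, so they may differ across versions):

```python
import scanpy as sc
import episcanpy.api as epi

# RNA: 5,000 highly variable genes, as in the scGLUE tutorial
sc.pp.highly_variable_genes(rna, n_top_genes=5000, flavor="seurat_v3")

# ATAC: keep the top 20% most variable peaks.
# Assumes epi.pp.cal_var stores a per-peak "variability_score" in atac.var
# (check the column name in your episcanpy version).
epi.pp.cal_var(atac)
n_top = int(0.20 * atac.n_vars)
cutoff = atac.var["variability_score"].nlargest(n_top).min()
atac.var["highly_variable"] = atac.var["variability_score"] >= cutoff
```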

[Image: X_umap_BC_combined_label_v2 — UMAP of the combined embedding, colored by cell-type label]

In the plot above, the "Unknown" label indicates the snATAC cells, while the remaining labels are snRNA cell types. It looks like the snATAC cells do not integrate well with the snRNA cells.

Here is the output of the integration steps. Could you take a look and suggest how to improve the integration? Thanks very much! @Jeff1995

```
[INFO] fit_SCGLUE: Pretraining SCGLUE model...
[INFO] autodevice: Using GPU 1 as computation device.
[INFO] check_graph: Checking variable coverage...
[INFO] check_graph: Checking edge attributes...
[INFO] check_graph: Checking self-loops...
[INFO] check_graph: Checking graph symmetry...
[INFO] SCGLUEModel: Setting graph_batch_size = 156794
[INFO] SCGLUEModel: Setting max_epochs = 48
[INFO] SCGLUEModel: Setting patience = 4
[INFO] SCGLUEModel: Setting reduce_lr_patience = 2
[INFO] SCGLUETrainer: Using training directory: "glue/pretrain"
[INFO] SCGLUETrainer: [Epoch 10] train={'g_nll': 0.418, 'g_kl': 0.001, 'g_elbo': 0.419, 'x_rna_nll': 0.254, 'x_rna_kl': 0.005, 'x_rna_elbo': 0.259, 'x_atac_nll': 0.056, 'x_atac_kl': 0.0, 'x_atac_elbo': 0.056, 'dsc_loss': 0.693, 'vae_loss': 0.332, 'gen_loss': 0.298}, val={'g_nll': 0.417, 'g_kl': 0.001, 'g_elbo': 0.418, 'x_rna_nll': 0.254, 'x_rna_kl': 0.005, 'x_rna_elbo': 0.259, 'x_atac_nll': 0.057, 'x_atac_kl': 0.0, 'x_atac_elbo': 0.057, 'dsc_loss': 0.694, 'vae_loss': 0.333, 'gen_loss': 0.298}, 648.6s elapsed
Epoch 00012: reducing learning rate of group 0 to 2.0000e-04.
Epoch 00012: reducing learning rate of group 0 to 2.0000e-04.
[INFO] LRScheduler: Learning rate reduction: step 1
Epoch 00019: reducing learning rate of group 0 to 2.0000e-05.
Epoch 00019: reducing learning rate of group 0 to 2.0000e-05.
[INFO] LRScheduler: Learning rate reduction: step 2
[INFO] SCGLUETrainer: [Epoch 20] train={'g_nll': 0.416, 'g_kl': 0.001, 'g_elbo': 0.417, 'x_rna_nll': 0.253, 'x_rna_kl': 0.005, 'x_rna_elbo': 0.258, 'x_atac_nll': 0.056, 'x_atac_kl': 0.0, 'x_atac_elbo': 0.056, 'dsc_loss': 0.692, 'vae_loss': 0.331, 'gen_loss': 0.296}, val={'g_nll': 0.416, 'g_kl': 0.001, 'g_elbo': 0.417, 'x_rna_nll': 0.253, 'x_rna_kl': 0.005, 'x_rna_elbo': 0.258, 'x_atac_nll': 0.057, 'x_atac_kl': 0.0, 'x_atac_elbo': 0.057, 'dsc_loss': 0.691, 'vae_loss': 0.332, 'gen_loss': 0.297}, 651.7s elapsed
Epoch 00022: reducing learning rate of group 0 to 2.0000e-06.
Epoch 00022: reducing learning rate of group 0 to 2.0000e-06.
[INFO] LRScheduler: Learning rate reduction: step 3
Epoch 00025: reducing learning rate of group 0 to 2.0000e-07.
Epoch 00025: reducing learning rate of group 0 to 2.0000e-07.
[INFO] LRScheduler: Learning rate reduction: step 4
[INFO] EarlyStopping: Restoring checkpoint "21"...
[INFO] EarlyStopping: Restoring checkpoint "21"...
[INFO] fit_SCGLUE: Estimating balancing weight...
[INFO] estimate_balancing_weight: Clustering cells...
[INFO] estimate_balancing_weight: Matching clusters...
[INFO] estimate_balancing_weight: Matching array shape = (28, 29)...
[INFO] estimate_balancing_weight: Estimating balancing weight...
[INFO] fit_SCGLUE: Fine-tuning SCGLUE model...
[INFO] check_graph: Checking variable coverage...
[INFO] check_graph: Checking edge attributes...
[INFO] check_graph: Checking self-loops...
[INFO] check_graph: Checking graph symmetry...
[INFO] SCGLUEModel: Setting graph_batch_size = 156794
[INFO] SCGLUEModel: Setting align_burnin = 8
[INFO] SCGLUEModel: Setting max_epochs = 48
[INFO] SCGLUEModel: Setting patience = 4
[INFO] SCGLUEModel: Setting reduce_lr_patience = 2
[INFO] SCGLUETrainer: Using training directory: "glue/fine-tune"
[INFO] SCGLUETrainer: [Epoch 10] train={'g_nll': 0.423, 'g_kl': 0.001, 'g_elbo': 0.424, 'x_rna_nll': 0.255, 'x_rna_kl': 0.005, 'x_rna_elbo': 0.26, 'x_atac_nll': 0.056, 'x_atac_kl': 0.0, 'x_atac_elbo': 0.056, 'dsc_loss': 0.675, 'vae_loss': 0.333, 'gen_loss': 0.3}, val={'g_nll': 0.422, 'g_kl': 0.001, 'g_elbo': 0.423, 'x_rna_nll': 0.254, 'x_rna_kl': 0.005, 'x_rna_elbo': 0.259, 'x_atac_nll': 0.057, 'x_atac_kl': 0.0, 'x_atac_elbo': 0.057, 'dsc_loss': 0.684, 'vae_loss': 0.333, 'gen_loss': 0.299}, 665.4s elapsed
Epoch 00012: reducing learning rate of group 0 to 2.0000e-04.
Epoch 00012: reducing learning rate of group 0 to 2.0000e-04.
[INFO] LRScheduler: Learning rate reduction: step 1
Epoch 00018: reducing learning rate of group 0 to 2.0000e-05.
Epoch 00018: reducing learning rate of group 0 to 2.0000e-05.
[INFO] LRScheduler: Learning rate reduction: step 2
[INFO] EarlyStopping: Restoring checkpoint "16"...
[INFO] EarlyStopping: Restoring checkpoint "16"...
```

@Jeff1995 (Collaborator)

Hi @fe4960! What are the specific differences, data-wise, between the recent run and the previous ones? Does the new dataset contain a significantly larger number of cells and peaks?

Based on the training log alone, it appears that the model is converging quite fast (learning-rate reduction after just 12 epochs, though this could also be due to the large cell number). Could you compare this training log with those of previous runs that worked better, to see whether this is a premature-convergence problem? If so, setting a larger patience value could help.
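For example, something like the following (a sketch based on the tutorial's fit_SCGLUE call; it assumes fit_kws is forwarded to SCGLUEModel.fit, whose parameter names match the "Setting patience / reduce_lr_patience" lines in your log):

```python
import scglue

# Sketch: pass larger patience values through fit_kws. The exact values are
# illustrative; your log shows the auto-set defaults were patience=4 and
# reduce_lr_patience=2.
glue = scglue.models.fit_SCGLUE(
    {"rna": rna, "atac": atac}, guidance_hvf,
    fit_kws={
        "directory": "glue",
        "patience": 8,            # early-stopping patience (epochs)
        "reduce_lr_patience": 4,  # LR-scheduler patience (epochs)
    },
)
```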

fe4960 (Author) commented Jun 19, 2024

Hi @Jeff1995, thanks a lot for the suggestion! I am wondering what the difference is between the parameters "patience" and "reduce_lr_patience". Indeed, the number of cells increased significantly. Should I increase both "patience" and "reduce_lr_patience", e.g. "patience": 20 and "reduce_lr_patience": 16? Thanks again.
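For concreteness, I would pass them like this (assuming fit_kws works as in the sketch above):

```python
# The values I am considering, passed the same way (assuming fit_kws is
# forwarded to SCGLUEModel.fit):
glue = scglue.models.fit_SCGLUE(
    {"rna": rna, "atac": atac}, guidance_hvf,
    fit_kws={"directory": "glue", "patience": 20, "reduce_lr_patience": 16},
)
```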

Additionally, here are the TensorBoard plots of GLUE for another dataset, which contains a large number of cells and did not integrate well.
[Images: two screenshots (2024-06-18) of the TensorBoard training curves for this run]
