Update README and UMAP file
Scott Campit committed Jan 27, 2022
1 parent 3be4339 commit 508e2ea
Showing 3 changed files with 117 additions and 35 deletions.
37 changes: 20 additions & 17 deletions README.md
@@ -1,31 +1,34 @@
# Modeling the metabolic changes of the epithelial-to-mesenchymal transition
# Constraint-based modeling identifies cell-state specific metabolic vulnerabilities during the epithelial to mesenchymal transition

## Summary
This repository identifies metabolic enzymes that are essential to the epithelial-to-mesenchymal transition (EMT) in the context of lung adenocarcinoma.

**Three analyses are performed:**
This repository contains the code from the paper Constraint-based modeling identifies cell-state specific metabolic vulnerabilities during the epithelial to mesenchymal transition by Campit, S.E., Keshamouni, V.G., and Chandrasekaran, S.

1. Enrichment and differential expression of multiple lung adenocarcinoma omics datasets (Bulk RNA-Seq, single-cell RNA-Seq, Proteomics, and more)
2. Constraint-based metabolic reconstruction and analysis for metabolic flux analysis and fitness evaluation from gene and reaction knockouts.
3. Hypothesis generation from differential flux and growth sensitivity analysis
**Key analyses contained in notebooks:**

1. Data preprocessing for transcriptomics, proteomics, single-cell transcriptomics, CERES Score data, and other -omics datasets.
2. Constraint-based metabolic reconstruction and analysis code for simulating metabolic fluxes and growth resulting from gene and reaction knockout.
3. Statistical analyses for assessing differences between groups.

## Programming languages used in this analysis

* MATLAB version R2020b Update 4
* R version 4.03
* Python version 3.8.6

## Getting Started


## TO DO:
- [ ] Create Docker container for dependencies
- [ ] Clean up all code base further
- [ ] Edit all notebooks
- [ ] Update static website for short and graphical representation of paper

## Usage
COMING SOON
Three programming languages (Python / R / MATLAB) were used, based on availability of scientific libraries and strengths in specific tasks. Thus, we would recommend the following workflow to perform the entire analysis end-to-end. We will point to specific directories and scripts that are numbered by usage.

1. Exploratory data analysis and general understanding of data distributions: `notebooks/r/01_EDA/*.Rmd`
2. Preprocessing bulk -omics data for COBRA: `notebooks/r/02_DifferentialExpression/*.Rmd`
3. Preprocessing single-cell omics data for COBRA: `notebooks/r/03_Preprocess/*.Rmd`
4. Performing MAGIC data imputation for single-cell COBRA analysis: `notebooks/python/magic.ipynb`
5. Constraint-based reconstruction and analysis for bulk -omics data: `notebooks/matlab/01_bulk_analysis/RECON1/*.mlx`
6. Constraint-based reconstruction and analysis for single-cell -omics data: `notebooks/matlab/02_single_cell_analysis/recon1_scCOBRA.mlx`
7. Generating FBA-UMAP profiles: `notebooks/r/05_Embeddings/*.Rmd`
8. Statistical analyses: Google Colab notebooks can be found [here](https://drive.google.com/drive/folders/1kCNsrULvzgaTEH3387mAx7KbB_dJSO4p).

Note that there are additional QA/QC scripts and notebooks available as well.

## Contributing
Contributions to make this analysis better, more robust, and easier to follow are greatly appreciated. Here are the steps we ask of you:
@@ -40,4 +43,4 @@ Contributions to make this analysis better, more robust, and easier to follow ar
Distributed under the GNU License. See `LICENSE` for more information.

## Contact
For questions regarding the code deposited in this repository, please reach out to Scott Campit via email at: scampit [at] umich [dot] edu or via Twitter at @secampit.
For questions regarding the code deposited in this repository, please reach out to Scott Campit via email at: scampit [at] umich [dot] edu or via Twitter at [at] secampit.
38 changes: 31 additions & 7 deletions notebooks/r/05_Embeddings/03_MAGIC_UMAP_99.Rmd
@@ -188,22 +188,42 @@ to_use = names(mnorm_ko) %in% c("GLCt1", "HEX1", "PGI", "PFK", "FBA", "GAPD", "P
glycolysis_ko = mnorm_ko[, to_use]
merged_meanko = cbind(time, glycolysis_ko)
tmp = melt(merged_meanko, id=c('a549.meta.data.Time'))
tmp2 = tmp[tmp$variable == "ENO", ]
minmax <- function(x){(x-min(x))/(max(x)-min(x))}
tmp$density = minmax(tmp$value)
library(dplyr)
library(tidyr)
library(ggplot2)
tmp %>%
ggplot(aes(x=value, color=a549.meta.data.Time, fill=a549.meta.data.Time)) +
geom_density(alpha=0.4) +
geom_density(aes(y=..scaled.., alpha=0.4)) +
facet_wrap(~variable, ncol=4) +
#scale_y_log10()
labs(x="KO Growth Score", y="Density")
```
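
The chunk above rescales two different quantities to a 0-1 range: `minmax` rescales the KO growth scores themselves (the result is stored in a column named `density`, although it is not a density estimate), while `..scaled..` rescales each estimated density curve so that its peak is 1. The base-R sketch below illustrates the difference. The `scores` vector is a hypothetical stand-in, not data from the repository, and `density()` is used here only as an analogue of what ggplot2 computes internally.

```r
# Min-max rescaling, as defined in the chunk above:
# the smallest value maps to 0 and the largest to 1.
minmax <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

set.seed(1234)
# Hypothetical stand-in for one reaction's KO growth scores.
scores <- rnorm(200, mean = 0.8, sd = 0.1)

# Rescale the values themselves.
rescaled <- minmax(scores)

# Kernel density estimate, divided by its peak so the curve tops out at 1.
# This is analogous to the `scaled` variable computed by geom_density().
kde <- density(scores)
scaled_density <- kde$y / max(kde$y)

range(rescaled)
max(scaled_density)
```

Note that `minmax` divides by `max(x) - min(x)`, so a constant input yields `NaN`; a guard may be needed if any reaction has identical scores in every cell.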

```{r}
set.seed(1234)
df = data.frame(value =round(c(rnorm(200,
mean=100,
sd=7))))
# import libraries ggplot2
library(ggplot2)
# create density plot
ggplot(df, aes(x=value)) + geom_density()
```

```{r}
ggplot(tmp, aes(x=value, color=a549.meta.data.Time, fill=a549.meta.data.Time)) +
geom_density(aes(y=..scaled.., alpha=0.4)) +
labs(x="KO Growth Score", y="Density")
```
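
In the chunks above, `alpha=0.4` sits inside `aes()`, so ggplot2 treats it as a mapped aesthetic (adding a legend entry, and not necessarily rendering at 0.4) rather than as a fixed transparency. The `..scaled..` notation is also the older form, which ggplot2 3.4.0 and later deprecate in favour of `after_stat()`. A possible alternative is sketched below, assuming ggplot2 3.3.0 or newer. The data frame `df` and its `time` labels are hypothetical stand-ins for `tmp` and `a549.meta.data.Time`.

```r
library(ggplot2)

set.seed(1234)
# Hypothetical two-timepoint data standing in for `tmp`.
df <- data.frame(
  value = c(rnorm(200, mean = 0.6, sd = 0.1), rnorm(200, mean = 0.8, sd = 0.1)),
  time  = rep(c("0d", "7d"), each = 200)
)

p <- ggplot(df, aes(x = value, color = time, fill = time)) +
  # after_stat(scaled) rescales each group's density so its peak is 1;
  # alpha is set outside aes(), making it a fixed transparency.
  geom_density(aes(y = after_stat(scaled)), alpha = 0.4) +
  labs(x = "KO Growth Score", y = "Density")

p
```

With ggplot2 releases older than 3.3.0, `after_stat()` is unavailable, and the `..scaled..` form used in the notebook is the one to keep.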

### C. Merge with UMAP embedding
This combines the reaction ko data with the UMAP embedding.
```{r}
@@ -791,12 +811,16 @@ library(ggplot2)
tmp %>%
ggplot(aes(x=value, color=a549.meta.data.Time, fill=a549.meta.data.Time)) +
geom_density(alpha=0.4) +
geom_density(aes(y=..scaled.., alpha=0.4)) +
facet_wrap(~variable, ncol=4) +
#scale_y_log10()
labs(x="Glycolysis flux profile (individual reactions)", y="Density")
labs(x="Flux profiles", y="Density")
```

```{r}
ggplot(tmp, aes(x=value, color=a549.meta.data.Time, fill=a549.meta.data.Time)) +
geom_density(aes(y=..scaled.., alpha=0.4)) +
labs(x="Glycolysis flux profile (all)", y="Density")
```

### C. Merge with UMAP embedding