diff --git a/README.md b/README.md
index 4231ae5..d29e09e 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,14 @@
-# Modeling the metabolic changes of the epithelial-to-mesenchymal transition
+# Constraint-based modeling identifies cell-state specific metabolic vulnerabilities during the epithelial to mesenchymal transition
## Summary
-This repository identifies metabolic enzymes that are essential to the epithelial-to-mesenchymal transition (EMT) in the context of lung adenocarcinoma.
-**Three analyses are performed:**
+This repository contains the code for the paper *Constraint-based modeling identifies cell-state specific metabolic vulnerabilities during the epithelial to mesenchymal transition* by Campit, S.E., Keshamouni, V.G., and Chandrasekaran, S.
- 1. Enrichment and differential expression of multiple lung adenocarcinoma omics datasets (Bulk RNA-Seq, single-cell RNA-Seq, Proteomics, and more)
- 2. Constraint-based metabolic reconstruction and analysis for metabolic flux analysis and fitness evaluation from gene and reaction knockouts.
- 3. Hypothesis generation from differential flux and growth sensitivty analysis
+**Key analyses contained in notebooks:**
+
+ 1. Data preprocessing for transcriptomics, proteomics, single-cell transcriptomics, CERES Score data, and other -omics datasets.
+ 2. Constraint-based metabolic reconstruction and analysis code for simulating metabolic fluxes and growth resulting from gene and reaction knockouts.
+ 3. Statistical analyses for assessing differences between groups.
## Programming languages used in this analysis
@@ -15,17 +16,19 @@ This repository identifies metabolic enzymes that are essential to the epithelia
* R version 4.03
* Python version 3.8.6
-## Getting Started
-
-
-## TO DO:
-- [ ] Create Docker container for dependencies
-- [ ] Clean up all code base further
-- [ ] Edit all notebooks
-- [ ] Update static website for short and graphical representation of paper
-
## Usage
-COMING SOON
+Three programming languages (Python, R, and MATLAB) were used, chosen for the availability of scientific libraries and for their strengths in specific tasks. We recommend the following workflow to run the entire analysis end-to-end. The directories and scripts below are numbered in order of use.
+
+ 1. Exploratory data analysis and general understanding of data distributions: `notebooks/r/01_EDA/*.Rmd`
+ 2. Preprocessing bulk -omics data for COBRA: `notebooks/r/02_DifferentialExpression/*.Rmd`
+ 3. Preprocessing single-cell omics data for COBRA: `notebooks/r/03_Preprocess/*.Rmd`
+ 4. Performing MAGIC data imputation for single-cell COBRA analysis: `notebooks/python/magic.ipynb`
+ 5. Constraint-based reconstruction and analysis for bulk -omics data: `notebooks/matlab/01_bulk_analysis/RECON1/*.mlx`
+ 6. Constraint-based reconstruction and analysis for single-cell -omics data: `notebooks/matlab/02_single_cell_analysis/recon1_scCOBRA.mlx`
+ 7. Generating FBA-UMAP profiles: `notebooks/r/05_Embeddings/*.Rmd`
+ 8. Statistical analyses: Google Colab notebooks can be found [here](https://drive.google.com/drive/folders/1kCNsrULvzgaTEH3387mAx7KbB_dJSO4p).
+
+Note that there are additional QA/QC scripts and notebooks available as well.
## Contributing
Contributions to make this analysis better, more robust, and easier to follow are greatly appreciated. Here are the steps we ask of you:
@@ -40,4 +43,4 @@ Contributions to make this analysis better, more robust, and easier to follow ar
Distributed under the GNU License. See `LICENSE` for more information.
## Contact
-For questions regarding the code deposited in this repository, please reach out to Scott Campit via email at: scampit [at] umich [dot] edu or via Twitter at @secampit.
+For questions regarding the code deposited in this repository, please reach out to Scott Campit via email at: scampit [at] umich [dot] edu or via Twitter at [at] secampit.
diff --git a/notebooks/r/05_Embeddings/03_MAGIC_UMAP_99.Rmd b/notebooks/r/05_Embeddings/03_MAGIC_UMAP_99.Rmd
index 399e99d..dc67851 100644
--- a/notebooks/r/05_Embeddings/03_MAGIC_UMAP_99.Rmd
+++ b/notebooks/r/05_Embeddings/03_MAGIC_UMAP_99.Rmd
@@ -188,7 +188,10 @@ to_use = names(mnorm_ko) %in% c("GLCt1", "HEX1", "PGI", "PFK", "FBA", "GAPD", "P
glycolysis_ko = mnorm_ko[, to_use]
merged_meanko = cbind(time, glycolysis_ko)
tmp = melt(merged_meanko, id=c('a549.meta.data.Time'))
-tmp2 = tmp[tmp$variable == "ENO", ]
+
+minmax <- function(x){(x-min(x))/(max(x)-min(x))}
+
+tmp$density = minmax(tmp$value)
library(dplyr)
library(tidyr)
@@ -196,14 +199,31 @@ library(ggplot2)
tmp %>%
ggplot(aes(x=value, color=a549.meta.data.Time, fill=a549.meta.data.Time)) +
- geom_density(alpha=0.4) +
+ geom_density(aes(y=..scaled..), alpha=0.4) +
facet_wrap(~variable, ncol=4) +
- #scale_y_log10()
-
labs(x="KO Growth Score", y="Density")
```
+```{r}
+set.seed(1234)
+df = data.frame(value =round(c(rnorm(200,
+ mean=100,
+ sd=7))))
+
+# load the ggplot2 library
+library(ggplot2)
+
+# create density plot
+ggplot(df, aes(x=value)) + geom_density()
+```
+
+```{r}
+ggplot(tmp, aes(x=value, color=a549.meta.data.Time, fill=a549.meta.data.Time)) +
+geom_density(aes(y=..scaled..), alpha=0.4) +
+labs(x="KO Growth Score", y="Density")
+```
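
The plots above rely on two rescaling ideas: `minmax` maps values linearly onto [0, 1], and `..scaled..` divides each group's density estimate by its maximum so that every curve peaks at 1. The following is a minimal Python sketch of both, not the notebook's implementation. It assumes a Gaussian kernel and a hand-picked bandwidth of 0.1, and the name `scaled_density` and the toy values are hypothetical, chosen only for illustration.

```python
import math


def minmax(values):
    """Rescale values linearly to [0, 1], mirroring the R helper `minmax`."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input: the R one-liner would yield NaN (0/0);
        # this sketch returns zeros instead (an assumption of the sketch).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def scaled_density(values, grid, bandwidth):
    """Gaussian kernel density evaluated on `grid`, divided by its maximum.

    The result peaks at exactly 1, which is the idea behind plotting
    a density scaled to a maximum of 1.
    """
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    density = [
        norm * sum(math.exp(-0.5 * ((g - v) / bandwidth) ** 2) for v in values)
        for g in grid
    ]
    peak = max(density)
    return [d / peak for d in density]


# Toy values (hypothetical, not taken from the A549 dataset)
scores = [0.2, 0.4, 0.4, 0.5, 0.9]
grid = [i / 20 for i in range(21)]  # 0.00, 0.05, ..., 1.00

normalized = minmax(scores)
scaled = scaled_density(scores, grid, bandwidth=0.1)
```

Because the scaling is applied per curve, groups with very different spreads can be compared by shape, at the cost of no longer showing how much probability mass each curve carries.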
+
### C. Merge with UMAP embedding
This combines the reaction ko data with the UMAP embedding.
```{r}
@@ -791,12 +811,16 @@ library(ggplot2)
tmp %>%
ggplot(aes(x=value, color=a549.meta.data.Time, fill=a549.meta.data.Time)) +
- geom_density(alpha=0.4) +
+ geom_density(aes(y=..scaled..), alpha=0.4) +
facet_wrap(~variable, ncol=4) +
- #scale_y_log10()
+ labs(x="Glycolysis flux profile (individual reactions)", y="Density")
- labs(x="Flux profiles", y="Density")
+```
+```{r}
+ggplot(tmp, aes(x=value, color=a549.meta.data.Time, fill=a549.meta.data.Time)) +
+geom_density(aes(y=..scaled..), alpha=0.4) +
+labs(x="Glycolysis flux profile (all)", y="Density")
```
### C. Merge with UMAP embedding
diff --git a/notebooks/r/05_Embeddings/03_MAGIC_UMAP_99.nb.html b/notebooks/r/05_Embeddings/03_MAGIC_UMAP_99.nb.html
index e654387..70523c6 100644
--- a/notebooks/r/05_Embeddings/03_MAGIC_UMAP_99.nb.html
+++ b/notebooks/r/05_Embeddings/03_MAGIC_UMAP_99.nb.html
@@ -2007,7 +2007,33 @@
Create density map profiles for glycolysis
-
+
+
+
+
+
+
+
+set.seed(1234)
+df = data.frame(value =round(c(rnorm(200,
+ mean=100,
+ sd=7))))
+
+# import libraries ggplot2
+library(ggplot2)
+
+# create density plot
+ggplot(df, aes(x=value)) + geom_density()
+
+
+
+
+
+
+
+
+
+
@@ -2037,12 +2063,11 @@ i. Mean Normalization
ii. Z-score
-
-merged_zko = merge(umap_data, zko, by.x='row.names', by.y='row.names')
+
+merged_zko = merge(umap_data, zko, by.x='row.names', by.y='row.names')
+rownames(merged_zko) = merged_zko$Row.names
+merged_zko = merged_zko[, -1]
-
-Error in as.data.frame(y) : object 'zko' not found
-
@@ -2085,13 +2110,12 @@ D. Visualize KO Data
Okay, we’re going to select 4 reactions from the top 100, plus enolase.
-
+
reactions_to_visualize = c("FACOAL204i", "CSm", "ACCOACm", "ENMAN6g", "NNATn")
reaction_names = rxnmap[rxnmap$rxn_ids %in% reactions_to_visualize, ]
reaction_names = reaction_names$rxn_name
#reaction_names[4] = "Enolase"
-#reaction_names[3] = "Nicotinamide-nucleotide adenylyltransferase"
-
+#reaction_names[3] = "Nicotinamide-nucleotide adenylyltransferase"
@@ -2639,8 +2663,39 @@ Create density map profiles for glycolysis
First, change the row to map to time.
+
+time = data.frame(a549@meta.data$Time)
+to_use = names(mnorm_flux) %in% c("GLCt1", "HEX1", "PGI", "PFK", "FBA", "GAPD", "PGK", "PGM", "ENO", "PYK", "PYRt2m", "PDHm")
+glycolysis_ko = mnorm_flux[, to_use]
+merged_meanko = cbind(time, glycolysis_ko)
+tmp = melt(merged_meanko, id=c('a549.meta.data.Time'))
+tmp2 = tmp[tmp$variable == "ENO", ]
+
+library(dplyr)
+library(tidyr)
+library(ggplot2)
+
+tmp %>%
+ ggplot(aes(x=value, color=a549.meta.data.Time, fill=a549.meta.data.Time)) +
+ geom_density(aes(y=..scaled..), alpha=0.4) +
+ facet_wrap(~variable, ncol=4) +
+ labs(x="Glycolysis flux profile (individual reactions)", y="Density")
+
+
-
+
+
+
+
+
+
+
+ggplot(tmp, aes(x=value, color=a549.meta.data.Time, fill=a549.meta.data.Time)) +
+geom_density(aes(y=..scaled..), alpha=0.4) +
+labs(x="Glycolysis flux profile (all)", y="Density")
+
+
+
@@ -2939,7 +2994,7 @@ v. Visualize Z-score KO

