add figure READMEs and bash scripts (#70)

WayScience · Aug 19, 2024 · cf7a665
1 parent 74bee7e
commit cf7a665

Showing 13 changed files with 144 additions and 4 deletions.
diff --git a/3.figures/README.md b/3.figures/README.md
@@ -0,0 +1,11 @@
+# Generate manuscript figures
+After evaluation results are extracted, we generate figures describing the results of our experiment.
+There are a total of six figures (four main and two supplemental).
+All figure PNGs are found in the [figures](./figures/) folder.
+
+1. [Main figure 1](./main_figure_1/): This figure describes our workflow and displays an image montage of the wildtype and null *NF1* genotype single cells, which are hard to distinguish by eye.
+2. [Main figure 2](./main_figure_2/): This figure shows how subtle the morphological differences are between *NF1* genotypes at both the well-population and single-cell levels, which supports our reasoning to pursue a machine learning methodology.
+3. [Main figure 3](./main_figure_3/): This figure shows the results of the model evaluations (precision-recall, accuracy, and confusion matrices) as extracted in the second module of this repository.
+4. [Main figure 4](./main_figure_4/): This figure looks at the feature importances of the model when predicting *NF1* genotype. There are two image montages that show six example single cells (three with the highest values of the feature and three with the lowest), one for each of the top features for predicting each genotype.
+5. [Supplemental figure 1](./supp_figure_1/): This figure extends main figure 2 by faceting the plots by plate to show that the subtle differences between *NF1* genotypes are consistent across plates.
+6. [Supplemental figure 2](./supp_figure_2/): This figure shows the distributions of FOVs across blur (PowerLogLogSlope) and saturation (PercentMaximal) metrics and where the thresholds were assigned to detect poor-quality images.
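
The six figure folders listed above each pair a README with a bash script named after the folder. A minimal sketch of how they could be run in sequence from `3.figures/` follows. `FIGURE_DIRS`, `figure_script`, and `run_all_figures` are hypothetical names that do not exist in this repository, and the sketch assumes each script is meant to be sourced from inside its own folder. Per their READMEs, main figures 1 and 4 also include a manual ImageJ step that no script covers.

```shell
# Hypothetical driver (not part of this commit).
# Assumes each figure folder holds a script named <folder>.sh.
FIGURE_DIRS="main_figure_1 main_figure_2 main_figure_3 main_figure_4 supp_figure_1 supp_figure_2"

# Print the script path for one figure folder.
figure_script() {
    printf '%s/%s.sh\n' "$1" "$1"
}

# Source every figure script from inside its own folder,
# because the scripts use relative paths such as scripts/.
run_all_figures() {
    local fig script
    for fig in $FIGURE_DIRS; do
        script="$(figure_script "$fig")"
        if [ ! -f "$script" ]; then
            echo "Skipping $fig: $script not found" >&2
            continue
        fi
        # The subshell keeps the cd and any activated env out of the caller's shell.
        ( cd "$fig" && . "./$fig.sh" ) || echo "Figure $fig failed" >&2
    done
}
```

Calling `run_all_figures` from `3.figures/` would then attempt each folder in turn and report any folder whose script is missing or fails.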
diff --git a/3.figures/main_figure_1/README.md b/3.figures/main_figure_1/README.md
@@ -7,3 +7,9 @@ To generate the first main figure of the manuscript, there are 5 steps to follow
 3. While the crops are still in ImageJ, we manually stack the channels together into one composite image (make sure to keep source images). Then, we add 25 uM scales to each crop using 3.1065 uM/pixel in the `Analyze > Set Scale... module` (as identified from the metadata of the raw images). All crops are saved as PNGs back into the same folder.
 4. [2.create_image_montage.ipynb](./2.create_image_montage.ipynb): Using the updated and colored crops, we can now merge them together to make an image montage figure that labels each crop per channel and per genotype.
 5. [3.main_figure_1.ipynb](./3.main_figure_1.ipynb): Patch together the workflow image and image montage into one main figure.
+
+All steps can be run with the bash script using the command below:
+
+```bash
+source main_figure_1.sh
+```
diff --git a/3.figures/main_figure_1/main_figure_1.sh b/3.figures/main_figure_1/main_figure_1.sh
@@ -0,0 +1,20 @@
+#!/bin/bash
+
+# initialize the correct shell for your machine to allow conda to work (see README for note on shell names)
+conda init bash
+# activate the python based analysis env
+conda activate nf1_analysis
+
+# convert all notebooks to script files into the scripts folder
+jupyter nbconvert --to script --output-dir=scripts/ *.ipynb
+
+# run the notebook for finding single-cell crops
+python scripts/1.find_sc_crops.py
+
+# deactivate python env and activate R env
+conda deactivate
+conda activate nf1_figures
+
+# run notebooks to generate image montage and main figure 1
+Rscript scripts/2.create_image_montage.r
+Rscript scripts/3.main_figure_1.r
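
The script above calls `conda init bash`, which edits the shell startup file and typically takes effect only in a new shell session. It also keeps going after a failed step. A hedged sketch of one alternative, assuming conda is installed and on `PATH`, is to load conda's shell hook directly and stop at the first failure. `activate_env` and `run_step` are hypothetical helper names, not part of this commit.

```shell
# Hypothetical helpers (not part of this commit) for a script that is run via `source`.

# Load conda's shell functions into the current session, then activate an env.
# Assumes conda is installed and on PATH.
activate_env() {
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found on PATH" >&2
        return 1
    fi
    eval "$(conda shell.bash hook)"
    conda activate "$1"
}

# Run one command and report a failure. Uses `return`, not `exit`,
# so a sourced script does not close the caller's terminal.
run_step() {
    echo "Running: $*"
    "$@" || { echo "Step failed: $*" >&2; return 1; }
}

# Example usage inside main_figure_1.sh (commented out so the sketch has no side effects):
# activate_env nf1_analysis || return 1
# run_step python scripts/1.find_sc_crops.py || return 1
```

Because the READMEs run the scripts with `source`, the sketch avoids `set -e` and `exit`, which would act on the interactive shell itself.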
diff --git a/3.figures/main_figure_2/README.md b/3.figures/main_figure_2/README.md
@@ -0,0 +1,12 @@
+# Creating main figure 2 - Morphology differences at single-cell and well-population levels
+
+To generate the second main figure of the manuscript, there are 2 steps to follow:
+
+1. [correlation_t_test.ipynb](./correlation_t_test.ipynb): For Panel C of the figure, we perform a t-test to evaluate whether the means of the Pearson correlation distributions (between wells of the same genotype versus wells of different genotypes) are significantly different.
+2. [main_figure_2.ipynb](./main_figure_2.ipynb): Generate the counts plot, UMAP, and density plot of the correlations, and patch them together to make one figure.
+
+All steps can be run with the bash script using the command below:
+
+```bash
+source main_figure_2.sh
+```
diff --git a/3.figures/main_figure_2/main_figure_2.sh b/3.figures/main_figure_2/main_figure_2.sh
@@ -0,0 +1,19 @@
+#!/bin/bash
+
+# initialize the correct shell for your machine to allow conda to work (see README for note on shell names)
+conda init bash
+# activate the python based analysis env
+conda activate nf1_analysis
+
+# convert all notebooks to script files into the scripts folder
+jupyter nbconvert --to script --output-dir=scripts/ *.ipynb
+
+# run the notebook for running t-test
+python scripts/correlation_t_test.py
+
+# deactivate python env and activate R env
+conda deactivate
+conda activate nf1_figures
+
+# run notebooks to generate main figure 2
+Rscript scripts/main_figure_2.r
diff --git a/3.figures/main_figure_3/README.md b/3.figures/main_figure_3/README.md
@@ -0,0 +1,11 @@
+# Creating main figure 3 - Model evaluation results
+
+To generate the third main figure of the manuscript, there is one step to follow:
+
+1. [main_figure_3.ipynb](./main_figure_3.ipynb): Load in the evaluation results, generate plots, and patch them together to make one figure.
+
+All steps can be run with the bash script using the command below:
+
+```bash
+source main_figure_3.sh
+```
diff --git a/3.figures/main_figure_3/main_figure_3.sh b/3.figures/main_figure_3/main_figure_3.sh
@@ -0,0 +1,12 @@
+#!/bin/bash
+
+# initialize the correct shell for your machine to allow conda to work (see README for note on shell names)
+conda init bash
+# activate the R based analysis env
+conda activate nf1_figures
+
+# convert all notebooks to script files into the scripts folder
+jupyter nbconvert --to script --output-dir=scripts/ *.ipynb
+
+# run notebooks to generate main figure 3
+Rscript scripts/main_figure_3.r
diff --git a/3.figures/main_figure_4/README.md b/3.figures/main_figure_4/README.md
@@ -3,9 +3,12 @@
 To generate the fourth main figure of the manuscript, there are 4 steps to follow:
 
 1. [1.find_sc_crops_top_feat.ipynb](./1.find_sc_crops_top_feat.ipynb): Find the 6 top representative single cells (3 max and 3 min) for each of the two most weighted features, specifically for the Null genotype  (as the top WT feature was a correlation feature which we decided was harder to visualize).
-
 2. We manually stack the channels together into one composite image where blue is nuclei, red is actin, green is ER, and magenta is mitochondria. Then, we add 25 uM scales to each crop using 3.1065 uM/pixel in the `Analyze > Set Scale... module` (as identified from the metadata of the raw images). The composite images are saved as PNGs back into the same folder.
-
 3. [2.generate_image_montage.ipynb](./2.generate_image_montage.ipynb): Using the composite single cell crops, we can now merge them together to make an image montage figure that labels each crop per feature and as either the min/max of the feature.
-
 4. [3.main_figure_4.ipynb](./3.main_figure_4.ipynb): Patch together the coefficient plots and image montage into one main figure.
+
+All steps can be run with the bash script using the command below:
+
+```bash
+source main_figure_4.sh
+```
diff --git a/3.figures/main_figure_4/main_figure_4.sh b/3.figures/main_figure_4/main_figure_4.sh
@@ -5,7 +5,7 @@ conda init bash
 # activate the python based analysis env
 conda activate nf1_analysis
 
-# convert all notebooks to python files into the scripts folder
+# convert all notebooks to script files into the scripts folder
 jupyter nbconvert --to script --output-dir=scripts/ *.ipynb
 
 # run the notebook for finding single-cell crops

diff --git a/3.figures/supp_figure_1/README.md b/3.figures/supp_figure_1/README.md
@@ -0,0 +1,11 @@
+# Creating supplemental figure 1 - Plate facet morphology differences at single-cell and well-population levels
+
+To generate the first supplemental figure of the manuscript, there is one step to follow:
+
+1. [SuppFigure1_splitbyplate.ipynb](./SuppFigure1_splitbyplate.ipynb): Generate counts, UMAPs, and density plots that are faceted by plate, and patch the plots together to make one figure.
+
+All steps can be run with the bash script using the command below:
+
+```bash
+source supp_figure_1.sh
+```
diff --git a/3.figures/supp_figure_1/supp_figure_1.sh b/3.figures/supp_figure_1/supp_figure_1.sh
@@ -0,0 +1,12 @@
+#!/bin/bash
+
+# initialize the correct shell for your machine to allow conda to work (see README for note on shell names)
+conda init bash
+# activate the R based analysis env
+conda activate nf1_figures
+
+# convert all notebooks to script files into the scripts folder
+jupyter nbconvert --to script --output-dir=scripts/ *.ipynb
+
+# run notebooks to generate supplemental figure 1
+Rscript scripts/SuppFigure1_splitbyplate.r
diff --git a/3.figures/supp_figure_2/README.md b/3.figures/supp_figure_2/README.md
@@ -0,0 +1,11 @@
+# Creating supplemental figure 2 - Image quality control distributions
+
+To generate the second supplemental figure of the manuscript, there is one step to follow:
+
+1. [SuppFigure2_qualitycontrol.ipynb](./SuppFigure2_qualitycontrol.ipynb): Generate distribution plots for blur and saturation metrics across plates, and patch the plots together to make one figure.
+
+All steps can be run with the bash script using the command below:
+
+```bash
+source supp_figure_2.sh
+```
diff --git a/3.figures/supp_figure_2/supp_figure_2.sh b/3.figures/supp_figure_2/supp_figure_2.sh
@@ -0,0 +1,12 @@
+#!/bin/bash
+
+# initialize the correct shell for your machine to allow conda to work (see README for note on shell names)
+conda init bash
+# activate the R based analysis env
+conda activate nf1_figures
+
+# convert all notebooks to script files into the scripts folder
+jupyter nbconvert --to script --output-dir=scripts/ *.ipynb
+
+# run notebooks to generate supplemental figure 2
+Rscript scripts/SuppFigure2_qualitycontrol.r
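
Each script in this commit invokes files whose names `jupyter nbconvert --to script` derives from the notebook names, so a mismatched path only surfaces when `Rscript` or `python` runs. A small pre-flight check could catch this earlier. The sketch below is illustrative only: `expected_script` and `require_script` are hypothetical helpers, and the extension (`py` or `r`) is passed in by hand rather than read from the notebook's kernel metadata.

```shell
# Hypothetical pre-flight helpers (not part of this commit).

# Print the path nbconvert is expected to write for a notebook,
# assuming --output-dir=scripts/ as used by the scripts in this commit.
# $1 = notebook file name, $2 = script extension (py or r), supplied by the caller.
expected_script() {
    printf 'scripts/%s.%s\n' "$(basename "$1" .ipynb)" "$2"
}

# Fail with a readable message if a converted script is missing.
require_script() {
    if [ ! -f "$1" ]; then
        echo "Missing script: $1 (check the file name against the notebook)" >&2
        return 1
    fi
}

# Example usage after the nbconvert step (commented out so the sketch has no side effects):
# script="$(expected_script SuppFigure2_qualitycontrol.ipynb r)"
# require_script "$script" && Rscript "$script"
```

Deriving the path from the notebook name, rather than typing it a second time, keeps the script call in step with the notebook it came from.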