Fix typos (#48)
* Correct indexing of modalities

* Fix order of words and articles

* Fix typo: involes -> involves

* Add space

* Fix typo: howver -> however

* Fix typo: mehods -> methods
VladimirShitov authored Aug 25, 2023
1 parent 1256e54 commit 05f971a
Showing 2 changed files with 8 additions and 8 deletions.
8 changes: 4 additions & 4 deletions fundamentals/architecture.qmd
@@ -14,7 +14,7 @@ flowchart TD

1. [Ingestion](#ingestion): Convert raw sequencing data or count tables into MuData data for further processing.
2. [Splitting modalities](#sec-splitting): Creating several MuData objects, one per modality, out of a multimodal input sample.
- 3. [Unimodal Single Sample Processing](#sec-single-sample): tools applied to each modality of samples individually. Mostly involes the selection of true from false cells.
+ 3. [Unimodal Single Sample Processing](#sec-single-sample): tools applied to each modality of samples individually. Mostly involves the selection of true from false cells.
4. [Unimodal Multi Sample Processing](#sec-multisample-processing): steps that require information from all samples together. Processing is still performed per-modality.
5. [Merging](#sec-merging): Creating one MuData object from several unimodal MuData input files.
6. [Initializing Integration](#sec-initializing-integration): Performs dimensionality reduction and cell type clustering on non-integrated samples. These are popular steps that would otherwise be executed manually or they provide input for downstream integration methods.
@@ -341,7 +341,7 @@ In order to perform demultiplexing, several tools have been made available in th
* [BCL Convert](../components/modules/demux/bcl_convert.qmd): general demultiplexing software by Illumina.
* Cellranger's [mkfastq](../components/modules/demux/cellranger_mkfastq.qmd): a wrapper around BCL Convert that provides extra convenience features for the processing of 10X single-cell data.

- The alignment of reads from the FASTQ files to an appropriate genome reference is called mapping. The result of the mapping process are tables that count the number of times a read has been mapped to a certain feature and metadata information for the cells (observations) and features. There are different format that can be used to store this information together. Since OpenPipeline uses [MuData](./concepts.qmd#sec-common-file-format) as a common file format throughout its pipelines, a conversion to MuData is included in the mapping pipelines.The choice between workflows for mapping is dependant on your single-cell library provider and technology:
+ The alignment of reads from the FASTQ files to an appropriate genome reference is called mapping. The result of the mapping process are tables that count the number of times a read has been mapped to a certain feature and metadata information for the cells (observations) and features. There are different format that can be used to store this information together. Since OpenPipeline uses [MuData](./concepts.qmd#sec-common-file-format) as a common file format throughout its pipelines, a conversion to MuData is included in the mapping pipelines. The choice between workflows for mapping is dependant on your single-cell library provider and technology:

* For DB Genomics libraries, the [BD Rhapsody](../components/workflows/ingestion/bd_rhapsody.qmd) pipeline can be used.
* For 10X based libraries, either [cellranger count](../components/workflows/ingestion/cellranger_mapping.qmd) or [cellranger multi](../components/workflows/ingestion/cellranger_multi.qmd) is provided. For more information about the differences between the two and when to use which mapping software, please consult the [10X genomics website](https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/multi#when-to-use-multi).
@@ -409,7 +409,7 @@ The removal of cells based on basic count statistics is split up into two parts:

Flagging cells for removal involved adding a boolean column to the `.obs` dataframe. After the cells have been flagged for removal, the cells are actually filtered using [do_filter](../components/modules/filter/do_filter.qmd), which reads the values in `.obs` and removed the cells labeled `True`. This applies the general phylosophy of "separation of concerns": one component is responsible for labeling the cells, another for removing them. This keeps the codebase for a single component small and its functionality testable.

- The next and final step in the single-sample gene expression processing is doublet detection using [filter_with_scrublet](../components/modules/filter/filter_with_scrublet.qmd). Like `filter_with_counts`, it will not remove cells but add a column to `.obs` (which have the name `filter_with_scrublet` by default). The single-sample GEX workflow will not remove not be removed during the processing (hence no `do_filter`). Howver, you can choose to remove them yourself before doing your analyses by applying a filter with the column in `.obs` yourself.
+ The next and final step in the single-sample gene expression processing is doublet detection using [filter_with_scrublet](../components/modules/filter/filter_with_scrublet.qmd). Like `filter_with_counts`, it will not remove cells but add a column to `.obs` (which have the name `filter_with_scrublet` by default). The single-sample GEX workflow will not remove not be removed during the processing (hence no `do_filter`). However, you can choose to remove them yourself before doing your analyses by applying a filter with the column in `.obs` yourself.

~~~{.d2 layout=elk}
direction: right
@@ -687,7 +687,7 @@ style: {
~~~

- ## Integration Mehods {#sec-integration-methods}
+ ## Integration Methods {#sec-integration-methods}
Integration is the alignment of cell types across samples. There exist three different types of integration methods, based on the degree of integration across modalities:

1. Unimodal integration across batches. For example: [scVI](../components/modules/integrate/scvi.qmd), [scanorama](../components/modules/integrate/scanorama.qmd), [harmony](../components/modules/integrate/harmonypy.qmd)
8 changes: 4 additions & 4 deletions fundamentals/concepts.qmd
@@ -51,7 +51,7 @@ MuData
│ ├─ .obsm
│ ├─ .varm
│ ├─ .uns
- │ ├─ modality_1 (AnnData Object)
+ │ ├─ modality_2 (AnnData Object)
├─ .var
├─ .obs
├─ .obms
@@ -63,8 +63,8 @@ MuData
* `.X` and `.layers`: matrices storing the measurements with the columns being the variables measured and the rows being the observations (cells in most cases).
* `.var`: metadata for the variables (i.e. annotation for the columns of `.X` or any matrix in `.layers`). The number of rows in the .var datafame (or the length of each entry in the dictionairy) is equal to the number of columns in the measurement matrices.
* `.obs`: metadata for the observations (i.e. annotation for the rows of `.X` or any matrix in `.layers`). The number of rows in the .obs datafame (or the length of each entry in the dictionairy) is equal to the number of rows in the measurement matrices.
- * `varm`: multi-dimensional the variable annotation. A key-dataframe mapping where the number of rows in each dataframe is equal to the number of columns in the measurement matrices.
- * `obsm`: multi-dimensional the observation annotation. A key-dataframe mapping where the number of rows in each dataframe is equal to the number of rows in the measurement matrices.
+ * `varm`: the multi-dimensional variable annotation. A key-dataframe mapping where the number of rows in each dataframe is equal to the number of columns in the measurement matrices.
+ * `obsm`: the multi-dimensional observation annotation. A key-dataframe mapping where the number of rows in each dataframe is equal to the number of rows in the measurement matrices.
* `.uns`: A mapping where no restrictions are enforced on the dimensions of the data.

# Modularity and a language independent framework 🔳
@@ -73,4 +73,4 @@ TODO

# A graphical interface 📺

- TODO
+ TODO
