DGE2

Introduction

DGE2 is a Nextflow pipeline built using code and infrastructure developed and maintained by the nf-core initiative. It performs differential gene expression (DGE) analysis on data that has been preprocessed with the nf-core/rnaseq pipeline (v3+) using the default star_salmon alignment.

Takes Salmon quantification files and a metadata file as input.
Performs differential gene expression analysis over a specific design or, if one is not specified, over all possible designs from the metadata file.
Generates summary plots (PCA, volcano, heatmap) and text files, as well as a summary HTML report.
Runs gene set enrichment analysis (GSEA) on the pre-ranked list of genes from the DGE results.

Usage

Note

If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

If you have run the nf-core/rnaseq pipeline with the default aligner (star_salmon), you should have a results/star_salmon/ folder containing several folders and files, including a quant.sf file for each sample and a tx2gene.tsv file that maps transcript identifiers to gene identifiers:

results/star_salmon/SAMPLE_1/quant.sf
results/star_salmon/SAMPLE_2/quant.sf
results/star_salmon/SAMPLE_3/quant.sf
results/star_salmon/SAMPLE_4/quant.sf
results/star_salmon/SAMPLE_5/quant.sf
results/star_salmon/SAMPLE_6/quant.sf
results/star_salmon/tx2gene.tsv
[... other files and folders...]

In the above example, you would pass the results/ folder to the DGE2 pipeline using the --inputdir argument.
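Before launching the pipeline, it can help to confirm that the input folder has the layout described above. The following is a minimal Python sketch, not part of DGE2; the function names `find_quantified_samples` and `has_tx2gene` are hypothetical, and the only assumption about the layout is the one shown in the listing (one quant.sf per sample folder under star_salmon/, plus tx2gene.tsv).

```python
from pathlib import Path


def find_quantified_samples(results_dir):
    """Return the sorted names of sample folders that contain a quant.sf file.

    Assumes the layout results_dir/star_salmon/<SAMPLE>/quant.sf.
    """
    star_salmon = Path(results_dir) / "star_salmon"
    return sorted(p.parent.name for p in star_salmon.glob("*/quant.sf"))


def has_tx2gene(results_dir):
    """Return True if results_dir/star_salmon/tx2gene.tsv exists."""
    return (Path(results_dir) / "star_salmon" / "tx2gene.tsv").is_file()


if __name__ == "__main__":
    import sys

    # Usage: python check_inputdir.py results/
    inputdir = sys.argv[1] if len(sys.argv) > 1 else "results"
    print("samples with quant.sf:", find_quantified_samples(inputdir))
    print("tx2gene.tsv present:", has_tx2gene(inputdir))
```

Pointing the script at the same folder you intend to pass to --inputdir lists the samples that the quantification step actually produced.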

Additionally, you will need to prepare a metadata.txt file that looks as follows:

SampleID	Levels	Status
SAMPLE_1	high	ctr
SAMPLE_2	high	ctr
SAMPLE_3	med	ctr
SAMPLE_4	low	case
SAMPLE_5	low	case
SAMPLE_6	low	case

This should be a text file in which the first column contains the sample IDs and the remaining columns (one or more) give the conditions for each sample. The sample IDs must match those in the results/star_salmon folder.
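The requirement that metadata sample IDs match the quantified samples can be checked ahead of a run. The sketch below is an illustration only and not how DGE2 itself validates input: the helper names `read_metadata` and `compare_samples` are hypothetical, and it assumes that fields are separated by whitespace (tabs or spaces) and that no value contains whitespace. It reports mismatches in both directions, without assuming which of them the pipeline treats as an error.

```python
from pathlib import Path


def read_metadata(metadata_path):
    """Parse a whitespace-delimited metadata file.

    Returns (condition_names, samples), where samples maps each sample ID
    to its list of condition values, in column order.
    """
    text = Path(metadata_path).read_text()
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    if len(header) < 2:
        raise ValueError("need a sample ID column plus at least one condition column")
    samples = {}
    for row in rows:
        if len(row) != len(header):
            raise ValueError(f"row {row[0]!r} has {len(row)} fields, expected {len(header)}")
        if row[0] in samples:
            raise ValueError(f"duplicate sample ID {row[0]!r}")
        samples[row[0]] = row[1:]
    return header[1:], samples


def compare_samples(metadata_ids, results_dir):
    """Compare metadata IDs with sample folders that contain quant.sf.

    Returns (missing_quant, not_in_metadata) as two sorted lists.
    """
    star_salmon = Path(results_dir) / "star_salmon"
    quantified = {p.parent.name for p in star_salmon.glob("*/quant.sf")}
    ids = set(metadata_ids)
    return sorted(ids - quantified), sorted(quantified - ids)
```

For example, `conditions, samples = read_metadata("metadata.txt")` followed by `compare_samples(samples, "results")` returns two lists that should both be empty when the metadata and the quantification folder agree.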

Now, you can run the pipeline using:

nextflow run lconde-ucl/DGE2 \
   -profile <docker/singularity/.../institute> \
   --inputdir <PATH/TO/INPUTDIR/> \
   --metadata <PATH/TO/METADATA> \
   --outdir <OUTDIR>

For more details and further functionality, please refer to the usage documentation.

Pipeline output

The pipeline produces text files and plots with the DGE and GSEA results, as well as an HTML report that contains a summary of the DGE results. For more details about the output files and reports, please refer to the output documentation.

Credits

DGE2 was developed by Lucia Conde in 2024. It is a DSL2 version of an older (DSL1) DGE pipeline developed in 2019.

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

Citations

This pipeline uses code and infrastructure developed and maintained by the nf-core initiative, and reused here under the MIT license.

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

Additional references for the tools and data used in this pipeline are listed in CITATIONS.md.
