
Introduction

Jakob Russel edited this page Feb 11, 2021 · 5 revisions

About

DAtest is a package for comparing different differential abundance methods used in microbial marker-gene (e.g. 16S rRNA), RNA-seq and protein/metabolite abundance analysis.

There are many methods for testing differential abundance but no gold standard, and results can vary considerably between statistical methods. This package aims to help the analyst choose a method for a specific dataset based on empirical testing.

The method goes as follows:

  1. Shuffle the predictor variable (e.g. case vs. control)
  2. Spike in data for some randomly chosen features (OTU/ASV/gene/protein/metabolite), such that they are associated with the shuffled predictor
  3. Apply methods, and check:
    • whether they can find the spike-ins
    • whether the false discovery rate is controlled
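
The steps above can be illustrated with a minimal, standard-library Python sketch. It is not DAtest's implementation (DAtest is an R package), and every name in it (`simulate_counts`, `shuffle_spike_test`, `mann_whitney_p`, `bh_adjust`) is hypothetical. It assumes a binary predictor, a multiplicative spike-in, a Mann-Whitney test (normal approximation, no tie correction) standing in for "a method", and Benjamini-Hochberg adjustment for the false discovery rate check.

```python
import math
import random
from statistics import NormalDist


def simulate_counts(n_features=200, n_samples=40, seed=0):
    """Toy abundance table (features x samples) with a balanced 0/1 predictor."""
    rng = random.Random(seed)
    counts = [[rng.lognormvariate(3.0, 1.0) for _ in range(n_samples)]
              for _ in range(n_features)]
    predictor = [0] * (n_samples // 2) + [1] * (n_samples - n_samples // 2)
    return counts, predictor


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney p-value (normal approximation, no tie correction)."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):  # assign average ranks to tied values
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u = rank_sum - n1 * (n1 + 1) / 2
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values."""
    m = len(pvals)
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for pos, i in enumerate(sorted(range(m), key=lambda i: pvals[i], reverse=True)):
        running_min = min(running_min, pvals[i] * m / (m - pos))
        adjusted[i] = running_min
    return adjusted


def shuffle_spike_test(counts, predictor, n_spike=10, effect=10.0, alpha=0.05, seed=1):
    rng = random.Random(seed)
    # 1. Shuffle the predictor so any real association is broken.
    shuffled = predictor[:]
    rng.shuffle(shuffled)
    # 2. Spike randomly chosen features so they follow the shuffled predictor.
    spiked = set(rng.sample(range(len(counts)), n_spike))
    data = [row[:] for row in counts]
    for f in spiked:
        data[f] = [v * effect if g == 1 else v for v, g in zip(data[f], shuffled)]
    # 3. Apply the method and check spike recovery and error rates.
    pvals = [mann_whitney_p([v for v, g in zip(row, shuffled) if g == 0],
                            [v for v, g in zip(row, shuffled) if g == 1])
             for row in data]
    called = {i for i, p in enumerate(bh_adjust(pvals)) if p < alpha}
    raw_false = sum(1 for i, p in enumerate(pvals) if p < alpha and i not in spiked)
    return {
        "spike_detection_rate": len(called & spiked) / n_spike,
        "fdr": len(called - spiked) / len(called) if called else 0.0,
        "fpr": raw_false / (len(counts) - n_spike),
    }


counts, predictor = simulate_counts()
print(shuffle_spike_test(counts, predictor))
```

In practice the shuffle-and-spike step would be repeated many times and for several candidate methods, so that their detection and error rates can be compared on the same dataset.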

Cite

Please cite the following publication if you use the DAtest package:

Russel et al. (2018) DAtest: A framework for choosing differential abundance or expression method. bioRxiv

If relevant, remember to also cite the method you end up using for your final analysis (see implemented methods for links).

Assumptions

The method assumes that most features are NOT associated with the predictor. It is therefore advisable to first run an ordination (PCA/PCoA), PERMANOVA or similar. If there is clear separation associated with the predictor, DAtest should not be used to choose a method. In that case, only the False Positive Rate (FPR) can be trusted, and it can be used to filter out methods with a high FPR.
