Extra features

Pre-process data

It is advised to filter low abundant features before running any analyses. preDA is a convenient function for doing this. It accepts the same input as testDA. It is grouping all the counts of the filtered features in a single feature named "Others". This is ensures that the total counts per sample is retained, such that filtering doesn't affect non-filtered features after normalization. Filtering can be based on number of samples the features are present in, the total number of reads for the features, the mean relative abundance of the features, or a combination of all three.

data.new <- preDA(data, min.samples = 2, min.reads = 10, min.abundance = 0)

The above retains features that are at least present in two samples and at least have at total count of 10 across all samples.

Plot p-value histograms for all the non-spiked features

# When mytest is the output from the testDA function:
plot(mytest, p = TRUE)

Post-hoc tests for linear models and ANOVAs

For anova and linear models we can run post-hoc tests for all pairwise comparisons of multi-class predictor/covars.

### Apply TukeyHSD for each feature for a selected variable and output the adjusted p-values:
# Works on "aov", "lao", "lao2", "aoc", "aoa"
results <- DA.aov(data, predictor, allResults = TRUE)
res.tukey <- DA.TukeyHSD(results, variable = "predictor") # variable can also be the name of a covar

### Apply lsmeans for each feature for a selected variable and output the adjusted p-values:
# This requires the lsmeans package.
# Works on "poi", "neb", "lrm", "llm", "llm2", "qpo", "znb", "zpo", "lmc", "lma"
results <- DA.lrm(data, predictor, allResults = TRUE)
res.lsm <- DA.lsmeans(results, variable = "predictor") # variable can also be the name of a covar

# For paired "lrm", "llm", "llm2", "lma", "lmc" the original predictor variable has to be supplied. For example:
results <- DA.lrm(data, predictor = mypred, paired = SubjectID,  allResults = TRUE)
res.lsm <- DA.lsmeans(results, variable = "predictor", predictor = mypred)

# and if covars are used, they also need to be supplied for paired "lrm", "llm", "llm2", "lma", "lmc". 
# For example:
results <- DA.lrm(data, predictor = mypred, paired = SubjectID, covars = list(covar1 = mycovar),  allResults = TRUE)
res.lsm <- DA.lsmeans(results, variable = "predictor", predictor = mypred, covars = list(covar1 = mycovar))

Test significance of covars

For linear models the drop1/anova functions can be used to test significance of the predictor and covars variables:

### Apply drop1 for each feature and output the adjusted p-values:
# Works on "zpo", "znb", "qpo", "neb", "poi". Non-paired "lrm", "llm", "llm2", "lma", "lmc"
results <- DA.lrm(data, predictor, allResults = TRUE)
res.drop1 <- DA.drop1(results)

### Apply anova for each feature and output the adjusted p-values:
# Works on "lrm", "llm", "llm2", "lma", "lmc". Non-paired "neb"
results <- DA.lrm(data, predictor, allResults = TRUE)
res.anova <- DA.anova(results)

Adding user-defined methods

"zzz" (DA.zzz) is a placeholder for user-defined methods. You have to supply it with a function whose input is: A count_table (data.frame, samples are columns, rownames indicate features), a predictor (vector), a paired variable (factor), and a covars argument (named list with vectors). It is OK if your function doesn't use paired/covars, but they have to be there in the arguments. The output from the user-defined function should be a data.frame that at least includes: The names of the features ("Feature"), nominal p-values ("pval"), and adjusted p-values ("pval.adj"), and name of method ("Method").

See example below on how to include a simple t-test on relative abundances:

# Define our function
myfun <- function(count_table, predictor, paired, covars){ # These fours arguments should not be altered
  
  # Relative abundance
  rel <- apply(count_table, 2, function(x) x/sum(x))
  
  # t-test function
  ## Wrapping this function in tryCatch(..., error = function(e){NA}) 
  ## ensures that our main function won't fail if t.test fails on some features
  tfun <- function(x){
    tryCatch(t.test(x ~ predictor)$p.value, error = function(e){NA}) 
  }
  
  # P-values for each feature
  pvals <- apply(rel, 1, tfun)
  
  # Collect and return data
  df <- data.frame(Feature = rownames(count_table),
                   pval = pvals)
  df$pval.adj <- p.adjust(df$pval, method = "fdr")
  df$Method <- "My own t-test"
  return(df)
  
}

# Test that it works on real data
res <- DA.zzz(data, predictor, FUN = myfun)

# Use it in the testDA function
mytest <- testDA(data, predictor, tests = c("zzz","ttt","wil"), args = list(zzz = list(FUN = myfun)))

# Several user-defined methods can be included by simply putting numbers after "zzz" in both the "tests" and "args" arguments:
mytest <- testDA(data, predictor, tests = c("zzz","zzz2","ttt","wil"), args = list(zzz = list(FUN = myfun),
                                                                                   zzz2 = list(FUN = myfun2)))

Caution: If "zzz" is in the tests argument, there will be no checks of whether any of the supplied methods are suitable for the data. If e.g. your predictor is quantitative and "ttt" is in the tests argument, it will try to run a t-test and fail miserably.

Passing arguments to the different tests

Additional arguments can be passed to the internal functions with the args argument. It should be structured as a list with elements named by the tests:

E.g. passing to the DA.per function that it should only run 1000 iterations:

mytest <- testDA(...,args = list(per=list(noOfIterations=1000)))

Include that the log t.test should use a pseudocount of 0.1:

mytest <- testDA(...,args = list(per=list(noOfIterations=1000), ltt=list(delta=0.1)))

Additional arguments are simply seperated by commas.

Below is an overview of which functions get the arguments that are passed to a specific test:

per - Passed to DA.per
bay - Passed to getPriors, getLikelihoods and DA.bay
adx - Passed to aldex and DA.adx
wil - Passed to wilcox.test and DA.wil
ttt - Passed to t.test and DA.ttt
ltt - Passed to t.test and DA.ltt
tta - Passed to t.test and DA.lta
ttc - Passed to t.test and DA.ltc
ltt2 - Passed to t.test and DA.ltt2
neb - Passed to glm.nb, glmer.nb and DA.neb
erq - Passed to calcNormFactors, estimateDisp, glmQLFit, glmQLFTest and DA.erq
ere - Passed to calcNormFactors, estimateCommonDisp, estimateTagwiseDisp, exactTest and DA.ere
msf - Passed to fitFeatureModel and DA.msf
zig - Passed to fitZig and DA.zig
ds2 - Passed to DESeq and DA.ds2
ds2x - Passed to DESeq and DA.ds2x
lim - Passed to eBayes, lmFit and DA.lim
lia - Passed to eBayes, lmFit and DA.lia
lic - Passed to eBayes, lmFit and DA.lic
lli - Passed to eBayes, lmFit and DA.lli
lli2 - Passed to eBayes, lmFit and DA.lli2
kru - Passed to kruskal.test and DA.kru
aov - Passed to aov and DA.aov
aoa - Passed to aov and DA.aoa
aoc - Passed to aov and DA.aoc
lao - Passed to aov and DA.lao
lao2 - Passed to aov and DA.lao2
lrm - Passed to lm, lme and DA.lrm
lma - Passed to lm, lme and DA.lma
lmc - Passed to lm, lme and DA.lmc
llm - Passed to lm, lme and DA.llm
llm2 - Passed to lm, lme and DA.llm2
rai - Passed to raida and DA.rai
spe - Passed to cor.test and DA.spe
pea - Passed to cor.test and DA.pea
poi - Passed to glm, glmer and DA.poi
qpo - Passed to glm and DA.qpo
vli - Passed to voom, eBayes, lmFit and DA.vli
zpo - Passed to zeroinfl and DA.zpo
znb - Passed to zeroinfl and DA.znb
fri - Passed to friedman.test and DA.fri
qua - Passed to quade.test and DA.qua
sam - Passed to SAMseq and DA.sam
mva - Passed to manyglm and summary.manyglm

Provide feedback

Saved searches

Use saved searches to filter your results more quickly