-
Notifications
You must be signed in to change notification settings - Fork 9
Extra features
It is advised to filter low abundant features before running any analyses. preDA
is a convenient function for doing this. It accepts the same input as testDA
. It is grouping all the counts of the filtered features in a single feature named "Others". This is ensures that the total counts per sample is retained, such that filtering doesn't affect non-filtered features after normalization. Filtering can be based on number
of samples the features are present in, the total number of reads for
the features, the mean relative abundance of the features, or a
combination of all three.
data.new <- preDA(data, min.samples = 2, min.reads = 10, min.abundance = 0)
The above retains features that are at least present in two samples and at least have at total count of 10 across all samples.
# When mytest is the output from the testDA function:
plot(mytest, p = TRUE)
For anova and linear models we can run post-hoc tests for all pairwise
comparisons of multi-class predictor
/covars
.
### Apply TukeyHSD for each feature for a selected variable and output the adjusted p-values:
# Works on "aov", "lao", "lao2", "aoc", "aoa"
results <- DA.aov(data, predictor, allResults = TRUE)
res.tukey <- DA.TukeyHSD(results, variable = "predictor") # variable can also be the name of a covar
### Apply lsmeans for each feature for a selected variable and output the adjusted p-values:
# This requires the lsmeans package.
# Works on "poi", "neb", "lrm", "llm", "llm2", "qpo", "znb", "zpo", "lmc", "lma"
results <- DA.lrm(data, predictor, allResults = TRUE)
res.lsm <- DA.lsmeans(results, variable = "predictor") # variable can also be the name of a covar
# For paired "lrm", "llm", "llm2", "lma", "lmc" the original predictor variable has to be supplied. For example:
results <- DA.lrm(data, predictor = mypred, paired = SubjectID, allResults = TRUE)
res.lsm <- DA.lsmeans(results, variable = "predictor", predictor = mypred)
# and if covars are used, they also need to be supplied for paired "lrm", "llm", "llm2", "lma", "lmc".
# For example:
results <- DA.lrm(data, predictor = mypred, paired = SubjectID, covars = list(covar1 = mycovar), allResults = TRUE)
res.lsm <- DA.lsmeans(results, variable = "predictor", predictor = mypred, covars = list(covar1 = mycovar))
For linear models the drop1
/anova
functions can be used to test
significance of the predictor
and covars
variables:
### Apply drop1 for each feature and output the adjusted p-values:
# Works on "zpo", "znb", "qpo", "neb", "poi". Non-paired "lrm", "llm", "llm2", "lma", "lmc"
results <- DA.lrm(data, predictor, allResults = TRUE)
res.drop1 <- DA.drop1(results)
### Apply anova for each feature and output the adjusted p-values:
# Works on "lrm", "llm", "llm2", "lma", "lmc". Non-paired "neb"
results <- DA.lrm(data, predictor, allResults = TRUE)
res.anova <- DA.anova(results)
"zzz" (DA.zzz
) is a placeholder for user-defined methods. You have to
supply it with a function whose input is: A count_table
(data.frame,
samples are columns, rownames indicate features), a predictor
(vector), a paired
variable (factor), and a covars
argument (named
list with vectors). It is OK if your function doesn't use
paired
/covars
, but they have to be there in the arguments. The
output from the user-defined function should be a data.frame that at
least includes: The names of the features ("Feature"), nominal p-values
("pval"), and adjusted p-values ("pval.adj"), and name of method ("Method").
See example below on how to include a simple t-test on relative abundances:
# Define our function
myfun <- function(count_table, predictor, paired, covars){ # These fours arguments should not be altered
# Relative abundance
rel <- apply(count_table, 2, function(x) x/sum(x))
# t-test function
## Wrapping this function in tryCatch(..., error = function(e){NA})
## ensures that our main function won't fail if t.test fails on some features
tfun <- function(x){
tryCatch(t.test(x ~ predictor)$p.value, error = function(e){NA})
}
# P-values for each feature
pvals <- apply(rel, 1, tfun)
# Collect and return data
df <- data.frame(Feature = rownames(count_table),
pval = pvals)
df$pval.adj <- p.adjust(df$pval, method = "fdr")
df$Method <- "My own t-test"
return(df)
}
# Test that it works on real data
res <- DA.zzz(data, predictor, FUN = myfun)
# Use it in the testDA function
mytest <- testDA(data, predictor, tests = c("zzz","ttt","wil"), args = list(zzz = list(FUN = myfun)))
# Several user-defined methods can be included by simply putting numbers after "zzz" in both the "tests" and "args" arguments:
mytest <- testDA(data, predictor, tests = c("zzz","zzz2","ttt","wil"), args = list(zzz = list(FUN = myfun),
zzz2 = list(FUN = myfun2)))
Caution: If "zzz" is in the tests
argument, there will be no
checks of whether any of the supplied methods are suitable for the data.
If e.g. your predictor
is quantitative and "ttt" is in the tests
argument, it will try to run a t-test and fail miserably.
Additional arguments can be passed to the internal functions with the
args
argument. It should be structured as a list with elements named
by the tests:
E.g. passing to the DA.per function that it should only run 1000 iterations:
mytest <- testDA(...,args = list(per=list(noOfIterations=1000)))
Include that the log t.test should use a pseudocount of 0.1:
mytest <- testDA(...,args = list(per=list(noOfIterations=1000), ltt=list(delta=0.1)))
Additional arguments are simply seperated by commas.
Below is an overview of which functions get the arguments that are passed to a specific test:
- per - Passed to
DA.per
- bay - Passed to
getPriors
,getLikelihoods
andDA.bay
- adx - Passed to
aldex
andDA.adx
- wil - Passed to
wilcox.test
andDA.wil
- ttt - Passed to
t.test
andDA.ttt
- ltt - Passed to
t.test
andDA.ltt
- tta - Passed to
t.test
andDA.lta
- ttc - Passed to
t.test
andDA.ltc
- ltt2 - Passed to
t.test
andDA.ltt2
- neb - Passed to
glm.nb
,glmer.nb
andDA.neb
- erq - Passed to
calcNormFactors
,estimateDisp
,glmQLFit
,glmQLFTest
andDA.erq
- ere - Passed to
calcNormFactors
,estimateCommonDisp
,estimateTagwiseDisp
,exactTest
andDA.ere
- msf - Passed to
fitFeatureModel
andDA.msf
- zig - Passed to
fitZig
andDA.zig
- ds2 - Passed to
DESeq
andDA.ds2
- ds2x - Passed to
DESeq
andDA.ds2x
- lim - Passed to
eBayes
,lmFit
andDA.lim
- lia - Passed to
eBayes
,lmFit
andDA.lia
- lic - Passed to
eBayes
,lmFit
andDA.lic
- lli - Passed to
eBayes
,lmFit
andDA.lli
- lli2 - Passed to
eBayes
,lmFit
andDA.lli2
- kru - Passed to
kruskal.test
andDA.kru
- aov - Passed to
aov
andDA.aov
- aoa - Passed to
aov
andDA.aoa
- aoc - Passed to
aov
andDA.aoc
- lao - Passed to
aov
andDA.lao
- lao2 - Passed to
aov
andDA.lao2
- lrm - Passed to
lm
,lme
andDA.lrm
- lma - Passed to
lm
,lme
andDA.lma
- lmc - Passed to
lm
,lme
andDA.lmc
- llm - Passed to
lm
,lme
andDA.llm
- llm2 - Passed to
lm
,lme
andDA.llm2
- rai - Passed to
raida
andDA.rai
- spe - Passed to
cor.test
andDA.spe
- pea - Passed to
cor.test
andDA.pea
- poi - Passed to
glm
,glmer
andDA.poi
- qpo - Passed to
glm
andDA.qpo
- vli - Passed to
voom
,eBayes
,lmFit
andDA.vli
- zpo - Passed to
zeroinfl
andDA.zpo
- znb - Passed to
zeroinfl
andDA.znb
- fri - Passed to
friedman.test
andDA.fri
- qua - Passed to
quade.test
andDA.qua
- sam - Passed to
SAMseq
andDA.sam
- mva - Passed to
manyglm
andsummary.manyglm