Merge pull request #21 from AliYoussef96/dev_jeba
corrections to the package
AliYoussef96 authored Oct 5, 2024
2 parents 09f1a44 + ace8176 commit c331977
Showing 35 changed files with 1,436 additions and 711 deletions.
57 changes: 57 additions & 0 deletions .github/workflows/rworkflows.yml
@@ -0,0 +1,57 @@
name: rworkflows
'on':
  push:
    branches:
    - master
    - main
    - devel
    - RELEASE_**
  pull_request:
    branches:
    - master
    - main
    - devel
    - RELEASE_**
jobs:
  rworkflows:
    permissions: write-all
    runs-on: ${{ matrix.config.os }}
    name: ${{ matrix.config.os }} (${{ matrix.config.r }})
    container: ${{ matrix.config.cont }}
    strategy:
      fail-fast: ${{ false }}
      matrix:
        config:
        - os: ubuntu-latest
          bioc: devel
          r: auto
          cont: ghcr.io/bioconductor/bioconductor_docker:devel
          rspm: ~
        - os: macOS-latest
          bioc: devel
          r: auto
          cont: ~
          rspm: ~
        - os: windows-latest
          bioc: devel
          r: auto
          cont: ~
          rspm: ~
    steps:
    - uses: neurogenomics/rworkflows@master
      with:
        run_bioccheck: ${{ false }}
        run_rcmdcheck: ${{ true }}
        as_cran: ${{ true }}
        run_vignettes: ${{ true }}
        has_testthat: ${{ true }}
        run_covr: ${{ true }}
        run_pkgdown: ${{ true }}
        has_runit: ${{ false }}
        has_latex: ${{ false }}
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run_docker: ${{ false }}
        DOCKER_TOKEN: ${{ secrets.DOCKER_TOKEN }}
        runner_os: ${{ runner.os }}
        cache_version: cache-v1
        docker_registry: ghcr.io
18 changes: 14 additions & 4 deletions DESCRIPTION
@@ -14,10 +14,18 @@ Authors@R:
person(given = "Eleanor", family = "Coffey", role = c("aut", "ths"),
email = "[email protected]",
comment = c(ORCID = "0000-0002-9717-5610")))
Description: Differential expression analysis is a prevalent method utilised in the examination of diverse biological data.
The reproducibility-optimized test statistic (ROTS) modifies a t-statistic based on the data's intrinsic characteristics and ranks features according to their statistical significance for differential expression between two or more groups (f-statistic).
Focussing on proteomics and metabolomics, the current ROTS implementation cannot account for technical or biological covariates such as MS batches or gender differences among the samples.
Consequently, we developed LimROTS, which employs a reproducibility-optimized test statistic utilising the limma methodology to simulate complex experimental designs.
Description: Differential expression analysis is a prevalent method utilised in
the examination of diverse biological data. The
reproducibility-optimized test statistic (ROTS) modifies a
t-statistic based on the data's intrinsic characteristics and ranks
features according to their statistical significance for
differential expression between two or more groups (f-statistic).
Focussing on proteomics and metabolomics, the current ROTS
implementation cannot account for technical or biological
covariates such as MS batches or gender differences among
the samples. Consequently, we developed LimROTS, which employs a
reproducibility-optimized test statistic utilising the limma
methodology to simulate complex experimental designs.
License: Artistic-2.0
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
@@ -40,8 +48,10 @@ Imports:
utils,
stats,
doRNG,
magick,
dplyr
Suggests:
BiocStyle,
ggplot2,
testthat (>= 3.0.0),
knitr,
129 changes: 86 additions & 43 deletions R/Bootstrap_functions.r
@@ -1,43 +1,62 @@
#' Generate Bootstrap Samples with Optional Pairing
#'
#' This function generates bootstrap samples from the input metadata. It samples with replacement
#' within each group defined in the metadata, and optionally adjusts for paired groups.
#' This function generates bootstrap samples from the input metadata. It samples
#' with replacement within each group defined in the metadata, and optionally
#' adjusts for paired groups.
#'
#' @param B Integer. The number of bootstrap samples to generate.
#' @param meta.info Data frame. Metadata containing sample information, where each row corresponds to a sample.
#' @param group.name Character. The name of the column in `meta.info` that defines the grouping variable for the samples.
#' @param paired Logical. If `TRUE`, the function ensures the bootstrap samples are paired between two groups.
#' @param meta.info Data frame. Metadata containing sample information, where
#' each row corresponds to a sample.
#' @param group.name Character. The name of the column in `meta.info` that
#' defines the grouping variable for the samples.
#' @param paired Logical. If `TRUE`, the function ensures the bootstrap samples
#' are paired between two groups.
#'
#' @details
#' The function works by resampling the row names of the metadata for each group separately. If `paired` is `TRUE`,
#' it assumes there are exactly two groups and samples the second group based on the positions of the first group to maintain pairing.
#' The function works by resampling the row names of the metadata for each group
#' separately. If `paired` is `TRUE`, it assumes there are exactly two groups
#' and samples the second group based on the positions of the first group to
#' maintain pairing.
#'
#' @return A matrix of dimension \code{B} x \code{n}, where \code{n} is the number of samples. Each row corresponds
#' to a bootstrap sample, and each entry is a resampled row name from the metadata.
#' @return A matrix of dimension \code{B} x \code{n}, where \code{n} is the
#' number of samples. Each row corresponds to a bootstrap sample, and each
#' entry is a resampled row name from the metadata.
#'
#' @export
#' @examples
#' # Example usage:
#' set.seed(123)
#' meta.info <- data.frame(group = rep(c("A", "B"), each = 5), row.names = paste0("Sample", 1:10))
#' bootstrapS(B = 10, meta.info = meta.info, group.name = "group", paired = FALSE)
#' meta.info <- data.frame(
#' group = rep(c("A", "B"), each = 5),
#' row.names = paste0("Sample", 1:10)
#' )
#' bootstrapS(
#' B = 10, meta.info = meta.info, group.name = "group",
#' paired = FALSE
#' )
#'
#' # Paired bootstrap sampling
#' bootstrapS(B = 10, meta.info = meta.info, group.name = "group", paired = TRUE)
#' bootstrapS(
#' B = 10, meta.info = meta.info, group.name = "group",
#' paired = TRUE
#' )
#'
bootstrapS <- function(B, meta.info, group.name, paired) {
groups <- meta.info[, group.name]
bootsamples <- matrix(nrow = B, ncol = length(groups))
for (i in seq_len(B)) {
for (g in unique(groups)) {
g.names <- row.names(meta.info)[which(groups == g)]
bootsamples[i, which(groups == g)] <- sample(g.names, length(g.names), replace = TRUE)
bootsamples[i, which(groups == g)] <-
sample(g.names, length(g.names), replace = TRUE)
}
}
if (paired) {
for (i in seq_len(B)) {
g.names1 <- bootsamples[i, which(groups == unique(groups)[1])]
g.names2 <- match(g.names1, row.names(meta.info)) + length(g.names1)
bootsamples[i, which(groups == unique(groups)[2])] <- row.names(meta.info)[g.names2]
bootsamples[i, which(groups == unique(groups)[2])] <-
row.names(meta.info)[g.names2]
}
}
return(bootsamples)
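
Review note: the paired branch in this hunk depends on the metadata being ordered with all rows of the first group first, followed by the second group in matching order, so that row k pairs with row k + n1. The index arithmetic can be sketched in isolation as below. `pair_partner()` is a hypothetical helper written for this illustration, not a function in the package, and the ordering requirement is an assumption read off the `match(...) + length(g.names1)` line.

```r
# Hypothetical helper (not part of the package): given row names resampled
# from the first group, return the row names of their paired partners.
# Assumes rows are ordered group 1 first, then group 2, in matching order.
pair_partner <- function(first_names, meta.info) {
    offset <- length(first_names) # size of the first group
    idx <- match(first_names, row.names(meta.info)) + offset
    row.names(meta.info)[idx]
}

meta.info <- data.frame(
    group = rep(c("A", "B"), each = 5),
    row.names = paste0("Sample", 1:10)
)

# A fixed draw from group A, so the mapping is easy to follow by eye
drawn <- c("Sample2", "Sample2", "Sample5", "Sample1", "Sample3")
partners <- pair_partner(drawn, meta.info)
print(partners)
```

If the two groups are interleaved or of unequal size, the offset no longer points at the partner row, so callers may want to sort `meta.info` by group before requesting paired sampling.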
@@ -46,26 +65,32 @@ bootstrapS <- function(B, meta.info, group.name, paired) {

#' Generate Permutated Samples
#'
#' This function generates permuted samples by shuffling the row names of the metadata.
#' This function generates permuted samples by shuffling the row names of the
#' metadata.
#'
#' @param meta.info Data frame. Metadata containing sample information, where each row corresponds to a sample.
#' @param meta.info Data frame. Metadata containing sample information, where
#' each row corresponds to a sample.
#' @param B Integer. The number of permutations to generate.
#'
#' @details
#' The function creates a matrix where each row is a permuted version of the row names from `meta.info`.
#' This can be used to generate null distributions or perform randomization-based tests.
#' The function creates a matrix where each row is a permuted version of the
#' row names from `meta.info`. This can be used to generate null distributions
#' or perform randomization-based tests.
#'
#' @return A matrix of dimension \code{B} x \code{n}, where \code{n} is the number of samples (i.e., rows in `meta.info`).
#' Each row is a permutation of the row names of the metadata.
#' @return A matrix of dimension \code{B} x \code{n}, where \code{n} is the
#' number of samples (i.e., rows in `meta.info`). Each row is a permutation of
#' the row names of the metadata.
#'
#' @export
#' @examples
#' # Example usage:
#' set.seed(123)
#' meta.info <- data.frame(group = rep(c("A", "B"), each = 5), row.names = paste0("Sample", 1:10))
#' meta.info <- data.frame(
#' group = rep(c("A", "B"), each = 5),
#' row.names = paste0("Sample", 1:10)
#' )
#' permutatedS(meta.info = meta.info, B = 10)
permutatedS <- function(meta.info, B)
{
permutatedS <- function(meta.info, B) {
persamples <- matrix(nrow = B, ncol = nrow(meta.info))
for (i in seq_len(B)) {
persamples[i, ] <- sample(row.names(meta.info))
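
Review note: each row of the permutation matrix is drawn without replacement, so it should contain every sample name exactly once. The sketch below checks that property on a compact stand-in for the loop built with `replicate()`. It is an illustration under that assumption, not the package's implementation.

```r
set.seed(123)
meta.info <- data.frame(
    group = rep(c("A", "B"), each = 5),
    row.names = paste0("Sample", 1:10)
)
B <- 10

# replicate() returns one shuffled copy per column; transpose to B x n
persamples <- t(replicate(B, sample(row.names(meta.info))))

# Each row must be a rearrangement of the full set of row names
is_perm <- apply(persamples, 1, function(r) {
    setequal(r, row.names(meta.info)) && !anyDuplicated(r)
})
print(dim(persamples))
print(all(is_perm))
```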
@@ -77,32 +102,42 @@ permutatedS <- function(meta.info, B)

#' Generate Stratified Bootstrap Samples for limRots
#'
#' This function generates stratified bootstrap samples based on the groupings and additional factors in the metadata.
#' The function ensures that samples are drawn proportionally based on strata defined by the interaction of factor columns in the metadata.
#' This function generates stratified bootstrap samples based on the groupings
#' and additional factors in the metadata. The function ensures that samples
#' are drawn proportionally based on strata defined by the interaction of
#' factor columns in the metadata.
#'
#' @param B Integer. The number of bootstrap samples to generate.
#' @param meta.info Data frame. Metadata containing sample information, where each row corresponds to a sample. Factor columns in `meta.info` are used to define strata for sampling.
#' @param group.name Character. The name of the column in `meta.info` that defines the grouping variable for the samples.
#' @param meta.info Data frame. Metadata containing sample information,
#' where each row corresponds to a sample. Factor columns in `meta.info`
#' are used to define strata for sampling.
#' @param group.name Character. The name of the column in `meta.info` that
#' defines the grouping variable for the samples.
#'
#' @details
#' The function works by first identifying the factors in the `meta.info` data frame that are used to create strata for sampling.
#' Within each group defined by `group.name`, the function samples according to the strata proportions, ensuring that samples are drawn
#' from the correct groups and strata in a proportional manner.
#' The function works by first identifying the factors in the `meta.info` data
#' frame that are used to create strata for sampling. Within each group defined
#' by `group.name`, the function samples according to the strata proportions,
#' ensuring that samples are drawn from the correct groups and strata in a
#' proportional manner.
#'
#' @return A matrix of dimension \code{B} x \code{n}, where \code{n} is the number of samples. Each row corresponds
#' to a bootstrap sample, and each entry is a resampled row name from the metadata, stratified by group and additional factors.
#' @return A matrix of dimension \code{B} x \code{n}, where \code{n} is the
#' number of samples. Each row corresponds to a bootstrap sample, and each
#' entry is a resampled row name from the metadata, stratified by group and
#' additional factors.
#'
#' @export
#' @examples
#' # Example usage:
#' set.seed(123)
#' meta.info <- data.frame(group = rep(c(1, 2), each = 5),
#' meta.info <- data.frame(
#' group = rep(c(1, 2), each = 5),
#' batch = rep(c("A", "B"), 5),
#' row.names = paste0("Sample", 1:10))
#' row.names = paste0("Sample", 1:10)
#' )
#' meta.info$batch <- as.factor(meta.info$batch)
#' bootstrapSamples.limRots(B = 10, meta.info = meta.info, group.name = "group")
bootstrapSamples.limRots <- function(B, meta.info, group.name)
{
bootstrapSamples.limRots <- function(B, meta.info, group.name) {
labels <- as.numeric(meta.info[, group.name])
samples <- matrix(nrow = B, ncol = length(labels))
for (i in seq_len(B)) {
@@ -112,7 +147,8 @@ bootstrapSamples.limRots <- function(B, meta.info, group.name)
meta.info.factors <- c()
for (j in seq_len(ncol(meta.info))) {
if (is.factor(meta.info.pos[, j])) {
meta.info.factors <- c(meta.info.factors, colnames(meta.info.pos)[j])
meta.info.factors <-
c(meta.info.factors, colnames(meta.info.pos)[j])
}
}
if (is.null(meta.info.factors)) {
@@ -124,14 +160,21 @@ bootstrapSamples.limRots <- function(B, meta.info, group.name)
)
return(samples)
}
meta.info.factors <- meta.info.factors[meta.info.factors != group.name]
meta.info.pos$stratum <- interaction(meta.info.pos[, meta.info.factors])
meta.info.factors <-
meta.info.factors[meta.info.factors != group.name]
meta.info.pos$stratum <-
interaction(meta.info.pos[, meta.info.factors])
stratum_sizes <- table(meta.info.pos$stratum)
stratum_samples <- round(length(pos) * prop.table(stratum_sizes))
sampled_indices <- unlist(lapply(names(stratum_samples), function(stratum) {
stratum_indices <- row.names(meta.info.pos)[which(meta.info.pos$stratum == stratum)]
sample(stratum_indices, stratum_samples[stratum], replace = TRUE)
}))
sampled_indices <-
unlist(lapply(names(stratum_samples), function(stratum) {
stratum_indices <-
row.names(meta.info.pos)[which(meta.info.pos$stratum ==
stratum)]
sample(stratum_indices, stratum_samples[stratum],
replace = TRUE
)
}))
samples[i, pos] <- sampled_indices
}
}
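
Review note: because the stratum proportions are computed from the same rows that are being resampled, `round(length(pos) * prop.table(stratum_sizes))` reduces to the stratum sizes themselves, so each stratum is resampled to its own size. The sketch below walks through one group from the roxygen example to show this. The variable names `counts` and `sampled` are introduced here for illustration only.

```r
meta.info <- data.frame(
    group = rep(c(1, 2), each = 5),
    batch = factor(rep(c("A", "B"), 5)),
    row.names = paste0("Sample", 1:10)
)

# Restrict to the first group, as the inner loop does for each label
pos <- which(meta.info$group == 1)
meta.info.pos <- meta.info[pos, ]
stratum <- interaction(meta.info.pos[, "batch"])

# Proportional allocation: n * (n_k / n) is just n_k
stratum_sizes <- table(stratum)
stratum_samples <- round(length(pos) * prop.table(stratum_sizes))
counts <- setNames(as.integer(stratum_samples), names(stratum_samples))
print(counts)

# Draw with replacement inside each stratum, then concatenate
set.seed(123)
sampled <- unlist(lapply(names(counts), function(s) {
    members <- row.names(meta.info.pos)[stratum == s]
    sample(members, counts[[s]], replace = TRUE)
}))
print(sampled)
```

Note that the concatenated result is ordered stratum by stratum, so a given column of the output matrix does not necessarily hold a sample from the same stratum as the original row in that position.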