Follow development on GitHub: https://github.com/seandavi/wdlRunR
This package executes Workflow Description Language (WDL) files from within R. Compute platforms currently supported by the Broad Cromwell workflow engine include:
- Local execution (good for testing)
- Sun GridEngine clusters (and probably other HPC schedulers)
- HTCondor
- Google Compute Engine
- Apache Spark
Install the development version from GitHub:

```r
# install devtools first if needed: install.packages('devtools')
require(devtools)
devtools::install_github('seandavi/wdlRunR')
```
This package leverages the typical data munging and analysis capabilities of R and Bioconductor and adds the ability to orchestrate nearly arbitrarily large and complex workflows described in WDL (workflows that are portable and maintained outside of this package).
Features of this package include:
- With an appropriate backend (Google, for example), scale to very large computational capacity
- Submit single or batches of jobs
- Monitor jobs (a status-check sketch follows this list)
- Retrieve metadata from submitted, completed, and running jobs
- Review log files from completed and failed jobs
- Track inputs and outputs of jobs
- Optional "caching" of jobs to avoid costly recomputation
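As a concrete illustration of job monitoring, a workflow's status can be polled through the Cromwell REST API. Below is a minimal sketch using httr against a local Cromwell server; the workflow id is a hypothetical placeholder, not output from a real run.

```r
library(httr)

# Hypothetical workflow id, as returned by a prior submission
workflow_id <- "11111111-2222-3333-4444-555555555555"

# Ask a local Cromwell server (default port 8000) for the workflow status
resp <- GET(sprintf(
  "http://localhost:8000/api/workflows/v1/%s/status",
  workflow_id
))

content(resp)  # a list with the workflow id and a status such as "Running"
```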
To run Cromwell on AWS Batch, first build a custom AMI (the key-pair name and profile below are examples to adjust for your account):

```sh
python create-genomics-ami.py \
    --user-data cromwell-genomics-ami.cloud-init.yaml \
    --key-pair-name EveryDay \
    --scratch-mount-point /cromwell_root \
    --profile default \
    --ami-description "AMI for use with Cromwell"
```
TODO: Do this with Packer.
Cromwell also needs a configuration file pointing it at the AWS Batch backend and the S3 filesystem. A template `aws.conf`:

```hocon
// aws.conf
include required(classpath("application"))

aws {
  application-name = "cromwell"
  auths = [{
    name = "default"
    scheme = "default"
  }]
  #
  # be sure to set this!!
  #
  region = "us-east-1"
}

engine {
  filesystems {
    s3 { auth = "default" }
  }
}

backend {
  default = "AWSBATCH"
  providers {
    AWSBATCH {
      actor-factory = "cromwell.backend.impl.aws.AwsBatchBackendLifecycleActorFactory"
      config {
        #
        # Change this to an EXISTING bucket;
        # Cromwell does not create the bucket for you
        #
        root = "s3://<your-s3-bucket-name>/cromwell-execution"
        auth = "default"
        numSubmitAttempts = 3
        numCreateDefinitionAttempts = 3
        concurrent-job-limit = 16
        default-runtime-attributes {
          #
          # You need to set up your AWS Batch
          # queues and compute environments.
          # Then, paste in the queue ARN,
          # available from the AWS Batch console
          # under the queue details
          #
          queueArn = "<your-queue-arn>"
        }
        filesystems {
          s3 {
            auth = "default"
          }
        }
      }
    }
  }
}
```
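With the configuration in place, Cromwell can be started in server mode. A minimal sketch of launching it from R follows; `cromwell.jar` and `aws.conf` are assumed to sit in the working directory (both paths are assumptions, so adjust for your setup).

```r
# Start a local Cromwell server with the AWS configuration above.
# "cromwell.jar" and "aws.conf" are assumed paths.
system2(
  "java",
  args = c("-Dconfig.file=aws.conf", "-jar", "cromwell.jar", "server"),
  wait = FALSE  # return immediately; the server keeps running
)
```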
With the server listening (by default on port 8000), submit a workflow and its inputs through the REST API:

```sh
curl -X POST --header "Accept: application/json" \
    "localhost:8000/api/workflows/v1" \
    -F workflowSource=@get_ebi_fastq.wdl \
    -F workflowInputs=@get_ebi_fastq.inputs
```
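The same submission can be made directly from R; a minimal sketch using httr, with the same WDL and inputs files as in the curl example:

```r
library(httr)

# POST the workflow source and inputs to a local Cromwell server,
# mirroring the curl command above
resp <- POST(
  "http://localhost:8000/api/workflows/v1",
  body = list(
    workflowSource = upload_file("get_ebi_fastq.wdl"),
    workflowInputs = upload_file("get_ebi_fastq.inputs")
  ),
  encode = "multipart",
  accept_json()
)

content(resp)  # workflow id and initial status, e.g. "Submitted"
```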