Advanced Notes

Sam Minot edited this page Apr 28, 2022 · 2 revisions

Hardware Requirements For Local Execution

The amount of computational resources (CPU and RAM) needed to run any analysis depends on the number and size of the genes and genomes being analyzed. However, there is no easy way to predict the precise requirements based on the inputs alone.

We have set general defaults for the computational resources used by each step of the workflow, but you may need to adjust them. All of the resource specifications are defined in the file nextflow.config. To provide alternate values, (1) make a copy of this file on your own computer, (2) modify the relevant settings, and (3) pass the modified configuration file with the -c flag when running the workflow (e.g. nextflow run FredHutch/gig-map -c YOUR.nextflow.config). There is no need to fork the repo to change any of the settings in nextflow.config, since settings in the file provided with the -c flag take priority over the configuration in the repo.
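As a minimal sketch of steps (1) through (3), the snippet below writes a small override file and shows how it would be passed with -c. The file name my.nextflow.config and the cpus and memory values are placeholders chosen for illustration, not gig-map's actual defaults. In practice you would start from a copy of the repository's nextflow.config and edit the fields it already defines, since those are the settings the workflow uses.

```shell
# Write a small override file.
# The file name and the resource values are illustrative assumptions.
cat > my.nextflow.config <<'EOF'
// Default resources requested by each process; adjust to suit your inputs.
process {
    cpus = 4
    memory = '16 GB'
}
EOF

# Then launch the workflow with the override file, adding your usual
# workflow parameters:
#   nextflow run FredHutch/gig-map -c my.nextflow.config
```

Keeping the override in a separate file means your changes stay apart from the repository's copy, so updates to the workflow do not overwrite them.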

Using Singularity instead of Docker

The documentation in this wiki is written for users who run the software containers needed by gig-map with Docker. For users who do not have access to Docker (e.g., users of a shared computing cluster), the best alternative may be Singularity. To support this use case, we provide more details here on using Singularity to run gig-map.

Configure Nextflow

The wonderful thing about Nextflow is that it can be used to run bioinformatics workflows on many different types of computational systems. However, with that power comes the responsibility of telling Nextflow how you would like it to run. Thankfully, this configuration process only needs to be performed once. We recommend creating a single file named nextflow.config in your home directory (whose path you can print by running echo $HOME), which contains the configuration for your system. Whenever Nextflow runs, one of the first things it will do is look for a file in this location and follow those instructions for configuring the execution of whatever code is being run. Once you have set up your system configuration, it will be reused automatically for every subsequent run. But don't worry, you can always override or change this configuration in the future if you need to!

Example Configurations:

To help you get started, a set of example configurations is provided above. Please note that THESE FILES ARE INCOMPLETE and contain fields which must be filled out by the user (such as identifying the appropriate work directory, as described above). You can get started by saving one of the example configuration files to your home directory, and then modifying it as appropriate for your particular computational resources.
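For orientation, here is a rough sketch of what a Singularity-oriented configuration might contain. It is not one of the example files mentioned above. The file name and every path are placeholders, and the executor line assumes a SLURM cluster, which may not match your system. The file is written to the current directory so that nothing in your home directory is overwritten.

```shell
# Write an illustrative Singularity configuration for Nextflow.
# All paths below are placeholders that must be replaced before use.
cat > singularity.nextflow.config <<'EOF'
// Run software containers with Singularity instead of Docker.
singularity {
    enabled = true
    autoMounts = true
    // Placeholder: directory where downloaded images are cached.
    cacheDir = '/path/to/singularity/cache'
}

// Placeholder: scratch directory for intermediate files.
workDir = '/path/to/nextflow/work'

// Assumption: jobs are submitted through SLURM; remove this line to run locally.
process.executor = 'slurm'
EOF

# Review the result before saving it as your system configuration.
cat singularity.nextflow.config
```

Once the placeholders are filled in, the file can be saved as your system configuration or passed to a single run with the -c flag.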