Nextflow pipeline for preparatory analyses for Whale

heche-psb/whaleprep

 
 

Whaleprep

Arthur Zwaenepoel (2018-2019)

Nextflow pipeline and other utilities for preparing an analysis with Whale. The pipeline runs three steps for each gene family:

  1. Alignment with PRANK
  2. MCMC with MrBayes
  3. CCD construction with ALEobserve

The dependencies are nextflow, python3, PRANK, MrBayes and ALEobserve. The pipeline is configured to work with SGE cluster environments, and should be submitted from the head node.

If you have not installed Nextflow or are not familiar with it, we recommend working through a brief Nextflow tutorial first. To install Nextflow, run the following command:

curl -s https://get.nextflow.io | bash

To run the pipeline in a virtualenv environment, do the following, replacing /path/to with the actual locations on your system:

virtualenv -p=python3 ENV_whaleprep
source /path/to/ENV_whaleprep/bin/activate
/path/to/nextflow whaleprep.nf --fasta <fastadir> --tools /path/to/whaleprep.py

where <fastadir> is a directory containing multi-FASTA files of protein sequences for all gene families of interest (one file per family). The optional arguments currently are:

--tools         path to whaleprep.py
--out           output directory name
--ngen          number of MCMC generations used in MrBayes
--samplefreq    sample frequency for the MCMC 
--burnin        the number of samples to discard as burn-in in ALEobserve
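The three MCMC options interact: the chain yields roughly ngen / samplefreq samples, and --burnin must be smaller than that or nothing is left to build the CCD from. The sketch below illustrates this arithmetic only. The function name and the example values are hypothetical, not pipeline defaults, and the count ignores any initial sample MrBayes may write at generation 0.

```python
def retained_samples(ngen: int, samplefreq: int, burnin: int) -> int:
    """Approximate number of MCMC samples left after discarding burn-in.

    Assumes one sample every `samplefreq` generations and that `burnin`
    counts samples (not generations), as described for --burnin above.
    """
    if ngen <= 0 or samplefreq <= 0 or burnin < 0:
        raise ValueError("ngen and samplefreq must be positive, burnin non-negative")
    total = ngen // samplefreq  # approximate number of sampled trees
    if burnin >= total:
        raise ValueError(
            f"burnin ({burnin}) discards all {total} samples; "
            "lower --burnin or raise --ngen"
        )
    return total - burnin


if __name__ == "__main__":
    # Hypothetical values, chosen only to illustrate the calculation.
    print(retained_samples(ngen=110000, samplefreq=10, burnin=1000))
```

Running a check like this before submitting can catch a burn-in that is larger than the number of sampled trees.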

Please make sure that no gene name contains a "-" character, otherwise MrBayes may fail with an error.
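A quick pre-flight check for hyphens could look like the following sketch. It assumes that the "gene name" is the first whitespace-delimited token of each FASTA header line. The function names are hypothetical and are not part of whaleprep.py.

```python
from pathlib import Path
from typing import Dict, List


def ids_with_hyphen(fasta_text: str) -> List[str]:
    """Return sequence identifiers that contain '-' in FASTA-formatted text."""
    bad = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            # Assumption: the identifier is the first token after '>'.
            if fields and "-" in fields[0]:
                bad.append(fields[0])
    return bad


def check_directory(fastadir: str) -> Dict[str, List[str]]:
    """Map each file name in `fastadir` to its offending identifiers."""
    report = {}
    for path in sorted(Path(fastadir).iterdir()):
        if path.is_file():
            bad = ids_with_hyphen(path.read_text())
            if bad:
                report[path.name] = bad
    return report


if __name__ == "__main__":
    import sys

    # Usage: python check_names.py <fastadir>
    for name, bad_ids in check_directory(sys.argv[1]).items():
        print(f"{name}: {', '.join(bad_ids)}")
```

Only header lines are inspected, so gap characters in sequence lines are not reported.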

Note on updates: the Bio.Alphabet module was removed in Biopython 1.78 (September 2020); see the Biopython documentation for details. Biopython <= 1.77 is therefore required. To install a compatible version, run

pip install biopython==1.77
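To confirm that the active environment has a compatible Biopython, a small check along these lines may help. This is a sketch, not part of the pipeline: it only tests whether Bio.Alphabet can be imported, and the function name is hypothetical.

```python
def bio_alphabet_available() -> bool:
    """Return True if Bio.Alphabet can be imported in this environment."""
    try:
        import Bio.Alphabet  # noqa: F401  (fails on Biopython >= 1.78)
    except ImportError:
        # Also covers the case where Biopython is not installed at all.
        return False
    return True


if __name__ == "__main__":
    if bio_alphabet_available():
        print("Bio.Alphabet found: Biopython version looks compatible.")
    else:
        print("Bio.Alphabet missing: install biopython==1.77 in this environment.")
```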
