FastOMA Subpackages

Sina Majidian edited this page Jun 21, 2024 · 2 revisions

FastOMA is built from five sub-packages, written in Python, each with its own command-line entry point:

  1. fastoma-check-input

  2. fastoma-infer-roothogs

  3. fastoma-batch-roothogs

  4. fastoma-infer-subhogs

  5. fastoma-collect-subhogs

1) fastoma-check-input

$ fastoma-check-input -h
usage: fastoma-check-input [-h] --proteomes PROTEOMES --species-tree SPECIES_TREE --out-tree OUT_TREE [--splice SPLICE] [--hogmap HOGMAP]
                           [--omamer_db OMAMER_DB] [-v]

checking parameters for FastOMA

optional arguments:
  -h, --help            show this help message and exit
  --proteomes PROTEOMES
                        Path to the folder containing the input proteomes
  --species-tree SPECIES_TREE
                        Path to the input species tree file in newick format
  --out-tree OUT_TREE   Path to output file for sanitised species tree.
  --splice SPLICE       Path to the folder containing the splice information files
  --hogmap HOGMAP       Path to the folder containing the hogmap files
  --omamer_db OMAMER_DB
                        Path to the omamer database
  -v                    Increase verbosity to info/debug
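
The flags above can be assembled programmatically. The following is a minimal sketch, not part of FastOMA itself: `build_check_input_cmd` is a hypothetical helper, and the paths `proteome`, `species_tree.nwk` and `species_tree_checked.nwk` are placeholders chosen for illustration. Only flags listed in the help text are used.

```python
import shlex


def build_check_input_cmd(proteomes, species_tree, out_tree,
                          splice=None, hogmap=None, omamer_db=None,
                          verbose=False):
    """Assemble an argument list for fastoma-check-input (hypothetical helper)."""
    # The three required flags from the usage line.
    cmd = [
        "fastoma-check-input",
        "--proteomes", str(proteomes),
        "--species-tree", str(species_tree),
        "--out-tree", str(out_tree),
    ]
    # Optional flags are added only when a value is supplied.
    if splice is not None:
        cmd += ["--splice", str(splice)]
    if hogmap is not None:
        cmd += ["--hogmap", str(hogmap)]
    if omamer_db is not None:
        cmd += ["--omamer_db", str(omamer_db)]
    if verbose:
        cmd.append("-v")
    return cmd


# Placeholder paths, not prescribed by FastOMA.
cmd = build_check_input_cmd("proteome", "species_tree.nwk",
                            "species_tree_checked.nwk")
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where FastOMA is installed.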

2) fastoma-infer-roothogs

$ fastoma-infer-roothogs -h

usage: fastoma-infer-roothogs [-h] --proteomes PROTEOMES [--splice SPLICE] [--hogmap HOGMAP] --out-rhog-folder OUT_RHOG_FOLDER [-v]

checking parameters for FastOMA

optional arguments:
  -h, --help            show this help message and exit
  --proteomes PROTEOMES
                        Path to the folder containing the input proteomes
  --splice SPLICE       Path to the folder containing the splice information files
  --hogmap HOGMAP       Path to the folder containing the hogmap files
  --out-rhog-folder OUT_RHOG_FOLDER
                        Folder where the roothog fasta files are written
  -v                    Increase verbosity to info/debug
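
As a rough illustration of how the two required flags and the optional ones combine, here is a small sketch. `build_infer_roothogs_cmd` is a hypothetical helper, and the folder names `proteome`, `hogmap` and `omamer_rhogs` are assumptions made for the example, not names required by the tool.

```python
import shlex
from pathlib import Path


def build_infer_roothogs_cmd(proteomes, out_rhog_folder,
                             splice=None, hogmap=None, verbose=False):
    """Assemble an argument list for fastoma-infer-roothogs (hypothetical helper)."""
    cmd = ["fastoma-infer-roothogs", "--proteomes", str(proteomes)]
    # --splice and --hogmap are optional according to the usage line.
    if splice is not None:
        cmd += ["--splice", str(splice)]
    if hogmap is not None:
        cmd += ["--hogmap", str(hogmap)]
    # --out-rhog-folder is required: the roothog fasta files are written here.
    cmd += ["--out-rhog-folder", str(out_rhog_folder)]
    if verbose:
        cmd.append("-v")
    return cmd


# Placeholder folder names, chosen only for illustration.
cmd = build_infer_roothogs_cmd(Path("proteome"), Path("omamer_rhogs"),
                               hogmap=Path("hogmap"))
print(shlex.join(cmd))
```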

3) fastoma-batch-roothogs

$ fastoma-batch-roothogs -h

usage: fastoma-batch-roothogs [-h] --input-roothogs INPUT_ROOTHOGS --out-big OUT_BIG --out-rest OUT_REST [-v]

Analyse roothog families and create batches for analysis

optional arguments:
  -h, --help            show this help message and exit
  --input-roothogs INPUT_ROOTHOGS
                        folder where input roothogs are stored
  --out-big OUT_BIG     folder where the big single family hogs should be stored
  --out-rest OUT_REST   folder where the remaining families should be stored inbatch subfolder structure.
  -v                    incrase verbosity
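
The sketch below builds a call from the three required flags and then takes a quick look at what landed in the two output folders. Both functions are hypothetical helpers. The help text does not spell out file names or the exact subfolder layout, so `summarise_outputs` only counts files under `--out-big` and immediate subfolders of `--out-rest` without assuming any naming scheme. The folder names in the example are placeholders.

```python
import shlex
from pathlib import Path


def build_batch_roothogs_cmd(input_roothogs, out_big, out_rest, verbose=False):
    """Assemble an argument list for fastoma-batch-roothogs (hypothetical helper)."""
    cmd = [
        "fastoma-batch-roothogs",
        "--input-roothogs", str(input_roothogs),
        "--out-big", str(out_big),
        "--out-rest", str(out_rest),
    ]
    if verbose:
        cmd.append("-v")
    return cmd


def summarise_outputs(out_big, out_rest):
    """Count files under out_big and immediate subfolders of out_rest.

    Assumption: no particular file or folder naming is expected; whatever
    the tool wrote is simply counted.
    """
    big_files = [p for p in Path(out_big).rglob("*") if p.is_file()]
    rest_dirs = [p for p in Path(out_rest).iterdir() if p.is_dir()]
    return {"big_files": len(big_files), "rest_subfolders": len(rest_dirs)}


# Placeholder folder names, chosen only for illustration.
cmd = build_batch_roothogs_cmd("omamer_rhogs", "rhogs_big", "rhogs_rest")
print(shlex.join(cmd))
```

After a real run, `summarise_outputs("rhogs_big", "rhogs_rest")` gives a quick sanity check that both folders were populated.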

4) fastoma-infer-subhogs

$ fastoma-infer-subhogs -h

usage: fastoma-infer-subhogs [-h] [--logger-level LOGGER_LEVEL] [--version] [--species-tree-checked SPECIES_TREE_CHECKED]
                             [--input-rhog-folder INPUT_RHOG_FOLDER] [--parallel | --no-parallel] [--fragment-detection | --no-fragment-detection]
                             [--low-so-detection | --no-low-so-detection]

This is FastOMA

optional arguments:
  -h, --help            show this help message and exit
  --logger-level LOGGER_LEVEL
  --version             Show version and exit.
  --species-tree-checked SPECIES_TREE_CHECKED
  --input-rhog-folder INPUT_RHOG_FOLDER
  --parallel, --no-parallel
  --fragment-detection, --no-fragment-detection
  --low-so-detection, --no-low-so-detection
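
The paired `--x | --no-x` switches can be modelled as three-state options: explicitly on, explicitly off, or left to the tool's default. The sketch below takes that approach. `toggle` and `build_infer_subhogs_cmd` are hypothetical helpers, and the paths are placeholders. The usage line marks every flag as optional, so treating the input folder and the species tree as mandatory here is an assumption of this sketch. The help text does not list accepted values for `--logger-level`, so none is shown.

```python
import shlex


def toggle(name, value):
    """Map True to --name, False to --no-name, and None to no flag at all."""
    if value is None:
        return []  # leave the tool's own default in place
    return [f"--{name}" if value else f"--no-{name}"]


def build_infer_subhogs_cmd(input_rhog_folder, species_tree_checked,
                            parallel=None, fragment_detection=None,
                            low_so_detection=None, logger_level=None):
    """Assemble an argument list for fastoma-infer-subhogs (hypothetical helper)."""
    cmd = [
        "fastoma-infer-subhogs",
        "--input-rhog-folder", str(input_rhog_folder),
        "--species-tree-checked", str(species_tree_checked),
    ]
    cmd += toggle("parallel", parallel)
    cmd += toggle("fragment-detection", fragment_detection)
    cmd += toggle("low-so-detection", low_so_detection)
    if logger_level is not None:
        cmd += ["--logger-level", str(logger_level)]
    return cmd


# Placeholder paths; fragment detection is switched off explicitly,
# low-SO detection is left at the tool's default.
cmd = build_infer_subhogs_cmd("rhogs_rest/0", "species_tree_checked.nwk",
                              parallel=True, fragment_detection=False)
print(shlex.join(cmd))
```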

5) fastoma-collect-subhogs

$ fastoma-collect-subhogs -h

usage: fastoma-collect-subhogs [-h] --pickle-folder PICKLE_FOLDER --roothogs-folder ROOTHOGS_FOLDER --gene-id-pickle-file GENE_ID_PICKLE_FILE
                               [--out OUT] [-v] [--roothog-tsv ROOTHOG_TSV] [--marker-groups-fasta MARKER_GROUPS_FASTA] --species-tree SPECIES_TREE
                               [--id-transform {UniProt,noop}]

collecting all computed HOGs and combine into a single orthoxml

optional arguments:
  -h, --help            show this help message and exit
  --pickle-folder PICKLE_FOLDER
                        folder containing the pickle files. Will be searched recursively
  --roothogs-folder ROOTHOGS_FOLDER
                        folder containing the omamer roothogs
  --gene-id-pickle-file GENE_ID_PICKLE_FILE
                        file containing the gene-id dictionary in pickle format
  --out OUT             output filename in orthoxml
  -v
  --roothog-tsv ROOTHOG_TSV
                        If specified, a tsv file with the given path will be produced containing the roothog assignments as TSV file. In addition, a
                        folder named RootHOGsFasta will be generatedwith one fasta file per inferred RootHOG.
  --marker-groups-fasta MARKER_GROUPS_FASTA
                        If specified, a folder named OrthologousFasta and a TSV file with the name provided in this argument will be generated that
                        contains single copy groups, i.e. groups which have at most one gene per species. Useful as phylogenetic marker genes to
                        reconstruct species trees.
  --species-tree SPECIES_TREE
                        Path to the species tree used to infer the hogs
  --id-transform {UniProt,noop}
                        ID transformer from fasta files to orthoxml / OrthologGroup protein IDs. By default, no transformation will be done. Existing
                        values are: noop: No transformation - entire ID of fasta header UniProt: '>sp|P68250|1433B_BOVIN' --> P68250
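
Two of the descriptions above lend themselves to a short illustration: the `--id-transform` choices and the "at most one gene per species" rule for marker groups. The code below is a sketch of what those descriptions imply, not FastOMA's own implementation. `transform_id` and `is_single_copy` are hypothetical names, the fallback for headers that are not in `db|accession|name` form is an assumption, and the gene and species names in the example are made up.

```python
from collections import Counter


def transform_id(fasta_id, mode="noop"):
    """Mimic the two documented --id-transform choices on a fasta ID."""
    fasta_id = fasta_id.lstrip(">")
    if mode == "noop":
        # No transformation: the entire ID is kept.
        return fasta_id
    if mode == "UniProt":
        parts = fasta_id.split("|")
        # Assumption: IDs without the db|accession|name layout are returned unchanged.
        return parts[1] if len(parts) >= 3 else fasta_id
    raise ValueError(f"unknown id transform: {mode}")


def is_single_copy(species_of_gene):
    """True if a group has at most one gene per species.

    species_of_gene maps each gene ID in the group to its species.
    """
    counts = Counter(species_of_gene.values())
    return all(n <= 1 for n in counts.values())


# The example from the help text.
print(transform_id(">sp|P68250|1433B_BOVIN", mode="UniProt"))

# Made-up groups: the first qualifies as a marker group, the second does not.
print(is_single_copy({"geneA1": "speciesA", "geneB1": "speciesB"}))
print(is_single_copy({"geneA1": "speciesA", "geneA2": "speciesA"}))
```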