Details of arguments
Sina Majidian edited this page Oct 19, 2023 · 2 revisions
You can see the details of the arguments of the read2tree package by running `read2tree -h`.
usage: read2tree [-h] [--version] [--output_path OUTPUT_PATH]
--standalone_path STANDALONE_PATH [--reads READS [READS ...]]
[--read_type READ_TYPE] [--threads THREADS] [--split_reads]
[--split_len SPLIT_LEN] [--split_overlap SPLIT_OVERLAP]
[--split_min_read_len SPLIT_MIN_READ_LEN] [--sample_reads]
[--genome_len GENOME_LEN] [--coverage COVERAGE]
[--min_cons_coverage MIN_CONS_COVERAGE]
[--dna_reference DNA_REFERENCE] [--sc_threshold SC_THRESHOLD]
[--ngmlr_parameters NGMLR_PARAMETERS] [--check_mate_pairing]
[--debug] [--sequence_selection_mode SEQUENCE_SELECTION_MODE]
[-s SPECIES_NAME] [--tree] [--merge_all_mappings] [-r]
[--min_species MIN_SPECIES] [--single_mapping SINGLE_MAPPING]
[--ref_folder REF_FOLDER]
[--remove_species_mapping REMOVE_SPECIES_MAPPING]
[--remove_species_ogs REMOVE_SPECIES_OGS] [--keep_all_ogs]
[--ignore_species IGNORE_SPECIES]
read2tree is a pipeline that uses read data in combination with the output of
an OMA standalone run to produce high-quality trees.
optional arguments:
-h, --help show this help message and exit
--version Show programme's version number and exit.
--output_path OUTPUT_PATH
[Default is current directory] Path to output
directory.
--standalone_path STANDALONE_PATH
[Default is current directory] Path to the folder where marker genes
(i.e. reference orthologous groups) in fasta format are located.
--reads READS [READS ...]
[Default is none] Reads to be mapped to reference. For
paired-end reads, give both files separated by a space.
--read_type READ_TYPE
[Default is "short"] Type of reads to use for
mapping, either "short" or "long". ngm is used for short reads and ngmlr for
long reads.
--threads THREADS [Default is 1] Number of threads for the mapping using
ngm / ngmlr!
--split_reads [Default is off] Splits reads as defined by split_len
(200) and split_overlap (0) parameters.
--split_len SPLIT_LEN
[Default is 200] Parameter for selection of read split
length. Can only be used in combination with the long
read option.
--split_overlap SPLIT_OVERLAP
[Default is 0] Reads are split with an overlap defined
by this argument.
--split_min_read_len SPLIT_MIN_READ_LEN
[Default is 200] Reads longer than this value are cut
into smaller pieces as defined by --split_len.
--sample_reads [Default is off] Subsamples the reads to the coverage
given by --coverage, using the genome size given by --genome_len.
--genome_len GENOME_LEN
[Default is 2000000] Genome size in bp.
--coverage COVERAGE [Default is 10] Coverage in X. Only considered if
--sample_reads is selected.
--min_cons_coverage MIN_CONS_COVERAGE
[Default is 1] Minimum number of nucleotides at a
column.
--dna_reference DNA_REFERENCE
[Default is None] Reference file that contains
nucleotide sequences (fasta, hdf5) with `.fa` extension. If not given, it
will use the REST API and retrieve sequences from
http://omabrowser.org directly. NOTE: internet
connection required!
--sc_threshold SC_THRESHOLD
[Default is 0.25; Range 0-1] Parameter for selection
of sequences from mapping by completeness compared to
its reference sequence (number of ACGT basepairs vs
length of sequence). By default, all sequences are
selected.
--ngmlr_parameters NGMLR_PARAMETERS
[Default is none] In case these parameters need to be
changed, all 3 values have to be changed
[x,subread-length,R]. The standard is: ont,256,0.25.
Possibilities for these parameters can be found in the
original documentation of ngmlr.
--check_mate_pairing Check whether, in case of paired-end reads, we have
consistent mate pairing. Setting this option will
automatically select the overlapping reads and will not
consider single reads.
--debug [Default is false] Changes to debug mode: * bam files
are saved! * reads are saved by mapping to OG
--sequence_selection_mode SEQUENCE_SELECTION_MODE
[Default is sc] Selection mode for the mapped sequence.
Other possibilities are cov and cov_sc.
-s SPECIES_NAME, --species_name SPECIES_NAME
[Default is name of the first read file] Name of species
for mapped sequence.
--tree [Default is false] Compute tree, otherwise just output
concatenated alignment!
--merge_all_mappings [Default is off] In case multiple species were mapped
to the same reference, this allows merging these
mappings and building a tree with all included species!
-r, --reference [Default is off] Just generate the reference dataset
for mapping.
--min_species MIN_SPECIES
Min number of species in selected orthologous groups.
If not selected it will be estimated such that around
1000 OGs are available.
--single_mapping SINGLE_MAPPING
[Default is none] Single species file allowing to map
in a job array.
--ref_folder REF_FOLDER
[Default is none] Folder containing reference files
with sequences sorted by species.
--remove_species_mapping REMOVE_SPECIES_MAPPING
[Default is none] Remove species present in data set
after mapping step completed and only do analysis on
subset. Input is comma separated list without spaces,
e.g. XXX,YYY,AAA.
--remove_species_ogs REMOVE_SPECIES_OGS
[Default is none] Remove species present in data set
after mapping step completed to build OGs. Input is
comma separated list without spaces, e.g. XXX,YYY,AAA.
--keep_all_ogs [Default is on] Keep all orthologs after addition of
mapped seq, which means also the OGs that have no
mapped sequence. Otherwise only OGs are used that have
the mapped sequence for alignment and tree inference.
--ignore_species IGNORE_SPECIES
[Default is none] Ignores species that are part of the OMA
standalone pipeline. Input is comma separated list
without spaces, e.g. XXX,YYY,AAA.
read2tree (C) 2017-2022 David Dylus
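
To show how some of the options listed above combine into a single call, here is a minimal Python sketch that assembles a read2tree command line as an argument list. It only uses flags that appear in the help text (`--standalone_path`, `--output_path`, `--reads`, `--read_type`, `--threads`, `-s`, `--tree`). The helper name `build_read2tree_command` and all file and folder names are hypothetical and chosen for illustration only; this is not part of read2tree itself.

```python
import shlex
from typing import List, Optional, Sequence


def build_read2tree_command(
    standalone_path: str,
    reads: Sequence[str],
    output_path: str = ".",
    read_type: str = "short",
    threads: int = 1,
    species_name: Optional[str] = None,
    tree: bool = False,
) -> List[str]:
    """Assemble a read2tree argument list (hypothetical helper, not part of read2tree)."""
    # The help text names exactly two read types.
    if read_type not in ("short", "long"):
        raise ValueError('read_type must be "short" or "long"')
    if not reads:
        raise ValueError("at least one read file is required")

    cmd = [
        "read2tree",
        "--standalone_path", standalone_path,
        "--output_path", output_path,
        # Paired-end reads are passed as two space-separated files.
        "--reads", *reads,
        "--read_type", read_type,
        "--threads", str(threads),
    ]
    if species_name is not None:
        cmd += ["-s", species_name]
    if tree:
        # Without --tree only the concatenated alignment is produced.
        cmd.append("--tree")
    return cmd


if __name__ == "__main__":
    # Hypothetical folder and file names, for illustration only.
    example = build_read2tree_command(
        standalone_path="marker_genes",
        reads=["sample_1.fastq", "sample_2.fastq"],
        output_path="output",
        threads=4,
        tree=True,
    )
    # Print a shell-quoted version of the command for inspection.
    print(shlex.join(example))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where read2tree is installed. Building a list rather than a single string avoids shell quoting problems when paths contain spaces.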