Parameters to change the resulting profiles
There are many parameters that affect the way the profile is printed. The following list of parameters can be used with motus profile.
Check this page for more information on the printed profile.
By default, the result is printed as relative abundance. With -c it is possible to print the result as counts. For example, the result of motus profile -s test1_single.fastq -n test2 -c is:
# git tag version 2.0.0 | motus version 2.0.0 | map_tax 2.0.0 | gene database: nr2.0.0 | calc_mgc 2.0.0 -y insert.scaled_counts -l 75 | calc_motu 2.0.0 -k mOTU -g 3 -c | taxonomy: ref_mOTU_2.0.0 meta_mOTU_2.0.0
# call: python mOTUs_v2/motus profile -s test1_single.fastq -n test2 -c
#consensus_taxonomy test2
Kandleria vitulina [ref_mOTU_v2_0001] 36
Methyloversatilis universalis [ref_mOTU_v2_0002] 0
Megasphaera genomosp. [ref_mOTU_v2_0003] 12
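A profile like the one above can be loaded for downstream analysis with a few lines of Python. This is only a minimal sketch, not part of the mOTUs tool: the function name parse_profile is hypothetical, and it assumes that header lines start with # and that the last whitespace-separated field of every data line is the value.

```python
def parse_profile(text):
    """Parse a two-column profile into {taxon_name: value}.

    Assumptions (not taken from the mOTUs code): header lines start
    with '#', and the last whitespace-separated field is the number.
    """
    profile = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        name, value = line.rsplit(None, 1)  # split off the last field only
        profile[name] = float(value)
    return profile


sample = """\
# call: python mOTUs_v2/motus profile -s test1_single.fastq -n test2 -c
#consensus_taxonomy test2
Kandleria vitulina [ref_mOTU_v2_0001] 36
Methyloversatilis universalis [ref_mOTU_v2_0002] 0
Megasphaera genomosp. [ref_mOTU_v2_0003] 12
"""

counts = parse_profile(sample)
print(counts["Kandleria vitulina [ref_mOTU_v2_0001]"])
```

Splitting from the right keeps taxon names that contain spaces intact.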
With -k it is possible to change the taxonomic level. For example, the result of motus profile -s test1_single.fastq -n test3 -k class is:
# git tag version 2.0.0 | motus version 2.0.0 | map_tax 2.0.0 | gene database: nr2.0.0 | calc_mgc 2.0.0 -y insert.scaled_counts -l 75 | calc_motu 2.0.0 -k class -g 3 | taxonomy: ref_mOTU_2.0.0 meta_mOTU_2.0.0
# call: python mOTUs_v2/motus profile -s test1_single.fastq -n test3 -k class
#consensus_taxonomy test3
Mamiellophyceae 0.0000000000
Chthonomonadetes 0.0557659685
Cyanobacteria 0.0090374928
With -p you add the NCBI taxonomy id to the profile. The profile then has 3 columns, where the second one is the NCBI id. For example, the result of motus profile -s test1_single.fastq -n test4 -p is:
# git tag version 2.0.0 | motus version 2.0.0 | map_tax 2.0.0 | gene database: nr2.0.0 | calc_mgc 2.0.0 -y insert.scaled_counts -l 75 | calc_motu 2.0.0 -k mOTU -g 3 -p | taxonomy: ref_mOTU_2.0.0 meta_mOTU_2.0.0
# call: python mOTUs_v2/motus profile -s test1_single.fastq -n test4 -p
#consensus_taxonomy NCBI_tax_id test4
Kandleria vitulina [ref_mOTU_v2_0001] 1630 0.0688211617
Methyloversatilis universalis [ref_mOTU_v2_0002] 378211 0.0000000000
Megasphaera genomosp. [ref_mOTU_v2_0003] 699192 0.0234955832
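With the extra NCBI column, each data line carries three fields. The sketch below shows one way to read them in Python; the function name parse_profile_with_ids is hypothetical, and the code assumes that the last two whitespace-separated fields are the NCBI id and the abundance. The id is kept as a string because it is an identifier, not a quantity.

```python
def parse_profile_with_ids(text):
    """Return a list of (taxon_name, ncbi_id, abundance) tuples.

    Assumption: the last two whitespace-separated fields of each data
    line are the NCBI id and the abundance.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        name, ncbi_id, abundance = line.rsplit(None, 2)
        rows.append((name, ncbi_id, float(abundance)))
    return rows


sample = """\
#consensus_taxonomy NCBI_tax_id test4
Kandleria vitulina [ref_mOTU_v2_0001] 1630 0.0688211617
Methyloversatilis universalis [ref_mOTU_v2_0002] 378211 0.0000000000
Megasphaera genomosp. [ref_mOTU_v2_0003] 699192 0.0234955832
"""

rows = parse_profile_with_ids(sample)
for name, ncbi_id, abundance in rows:
    print(ncbi_id, name, abundance)
```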
With -q you can print the full rank taxonomy (up to the level selected with -k). For example, the result of motus profile -s test1_single.fastq -n test5 -k class -q is:
# git tag version 2.0.0 | motus version 2.0.0 | map_tax 2.0.0 | gene database: nr2.0.0 | calc_mgc 2.0.0 -y insert.scaled_counts -l 75 | calc_motu 2.0.0 -k class -g 3 | taxonomy: ref_mOTU_2.0.0 meta_mOTU_2.0.0
# call: python mOTUs_v2/motus profile -s test1_single.fastq -n test5 -k class -q
#consensus_taxonomy test5
k__Archaea|p__Nanoarchaeota|c__Nanoarchaeota class incertae sedis 0.0235010600
k__Archaea|p__Crenarchaeota|c__Thermoprotei 0.0005106200
k__Archaea|p__Crenarchaeota|c__Crenarchaeota class incertae sedis [YNPFFA] 0.0000000000
If you add -p you will also get the NCBI taxonomy ids for the full rank. Calling motus profile -s test1_single.fastq -n test6 -k class -q -p will produce:
# git tag version 2.0.0 | motus version 2.0.0 | map_tax 2.0.0 | gene database: nr2.0.0 | calc_mgc 2.0.0 -y insert.scaled_counts -l 75 | calc_motu 2.0.0 -k class -g 3 -p | taxonomy: ref_mOTU_2.0.0 meta_mOTU_2.0.0
# call: python mOTUs_v2/motus profile -s test1_single.fastq -n test6 -k class -q -p
#consensus_taxonomy NCBI_tax_id test6
k__Archaea|p__Nanoarchaeota|c__Nanoarchaeota class incertae sedis 2157|192989|NA 0.0235010600
k__Archaea|p__Crenarchaeota|c__Thermoprotei 2157|28889|183924 0.0005106200
k__Archaea|p__Crenarchaeota|c__Crenarchaeota class incertae sedis [YNPFFA] 2157|28889|NA 0.0000000000
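In the -q -p output the taxonomy path and the NCBI id path are both separated by |, so the two can be paired level by level. The following Python sketch illustrates this; parse_lineage_row is a hypothetical helper, and it assumes that the last two whitespace-separated fields are the id path and the abundance and that both paths have the same number of levels.

```python
def parse_lineage_row(line):
    """Split one '-q -p' data row into (rank_prefix, name, ncbi_id) triples.

    Assumption: the last two whitespace-separated fields are the NCBI id
    path and the abundance; everything before them is the lineage.
    """
    lineage, id_path, abundance = line.strip().rsplit(None, 2)
    levels = []
    for entry, tax_id in zip(lineage.split("|"), id_path.split("|")):
        prefix, name = entry.split("__", 1)  # "c__Thermoprotei" -> "c", "Thermoprotei"
        levels.append((prefix, name, tax_id))
    return levels, float(abundance)


row = "k__Archaea|p__Crenarchaeota|c__Thermoprotei 2157|28889|183924 0.0005106200"
levels, abundance = parse_lineage_row(row)
for prefix, name, tax_id in levels:
    print(prefix, name, tax_id)
```

Levels without an NCBI id appear as NA in the id path, so the ids are kept as strings.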
You can print the result in BioBoxes format with -C. Note that the mOTUs species definition and the NCBI species definition are not always congruent. As a result, you can choose among three methods to save the result in CAMI format: "precision", where the discrepancies are deleted; "recall", where the relative abundances of the discrepancies are split; and "parenthesis", where all the discrepancies are kept.
The following examples show what happens to the species Pseudomonas sp. GM67 (NCBI tax id: 1144335) and Pseudomonas sp. GM60 (NCBI tax id: 1144334), which the mOTUs clustering classifies as belonging to the same species.
Calling motus profile -s test1_single.fastq -n test7 -C parenthesis produces:
# Taxonomic Profiling Output
# git tag version 2.0.0 | motus version 2.0.0 | map_tax 2.0.0 | gene database: nr2.0.0 | calc_mgc 2.0.0 -y insert.scaled_counts -l 75 | calc_motu 2.0.0 -k mOTU -g 3 -C parenthesis | taxonomy: ref_mOTU_2.0.0 meta_mOTU_2.0.0
# call: python mOTUs_v2/motus profile -s test1_single.fastq -n test7 -C parenthesis
@SampleID: test7
@Version:0.9.1
@Ranks:superkingdom|phylum|class|order|family|genus|species
@TaxonomyID: Sep 16 2015
@@TAXID RANK TAXPATH TAXPATHSN PERCENTAGE
2 superkingdom 2 Bacteria 100.0
...
28221 class 2|1224|28221 Bacteria|Proteobacteria|Deltaproteobacteria 2.20702
...
34029 species 2|1224|28216|80840||88|34029 Bacteria|Proteobacteria|Betaproteobacteria|Burkholderiales||Leptothrix|Leptothrix cholodnii 0.04191
(1144335/1144334) species 2|1224|1236|72274|135621|286|(1144335/1144334) Bacteria|Proteobacteria|Gammaproteobacteria|Pseudomonadales|Pseudomonadaceae|Pseudomonas|(Pseudomonas sp. GM67/Pseudomonas sp. GM60) 2.4000
Calling motus profile -s test1_single.fastq -n test7 -C precision produces:
...
@@TAXID RANK TAXPATH TAXPATHSN PERCENTAGE
2 superkingdom 2 Bacteria 100.0
...
28221 class 2|1224|28221 Bacteria|Proteobacteria|Deltaproteobacteria 2.20702
...
34029 species 2|1224|28216|80840||88|34029 Bacteria|Proteobacteria|Betaproteobacteria|Burkholderiales||Leptothrix|Leptothrix cholodnii 0.04191
Calling motus profile -s test1_single.fastq -n test7 -C recall produces:
...
@@TAXID RANK TAXPATH TAXPATHSN PERCENTAGE
2 superkingdom 2 Bacteria 100.0
...
28221 class 2|1224|28221 Bacteria|Proteobacteria|Deltaproteobacteria 2.20702
...
34029 species 2|1224|28216|80840||88|34029 Bacteria|Proteobacteria|Betaproteobacteria|Burkholderiales||Leptothrix|Leptothrix cholodnii 0.04191
1144335 species 2|1224|1236|72274|135621|286|(1144335/1144334) Bacteria|Proteobacteria|Gammaproteobacteria|Pseudomonadales|Pseudomonadaceae|Pseudomonas|(Pseudomonas sp. GM67/Pseudomonas sp. GM60) 1.2000
1144334 species 2|1224|1236|72274|135621|286|(1144335/1144334) Bacteria|Proteobacteria|Gammaproteobacteria|Pseudomonadales|Pseudomonadaceae|Pseudomonas|(Pseudomonas sp. GM67/Pseudomonas sp. GM60) 1.2000
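The difference between the three modes can be summarized with a small Python sketch. It is an illustration only, not the code used by mOTUs: resolve_discrepancy is a hypothetical function, and the equal split in "recall" mode is inferred from the example above, where 2.4000 becomes two rows of 1.2000.

```python
def resolve_discrepancy(tax_ids, percentage, mode):
    """Return the species rows emitted for one mOTU that maps to several
    NCBI species, as a list of (taxid_label, percentage) tuples.

    Illustration only; the equal split in "recall" mode is an assumption
    based on the example output.
    """
    if mode == "precision":
        return []  # the discrepancy is dropped from the profile
    if mode == "recall":
        share = percentage / len(tax_ids)
        return [(tax_id, share) for tax_id in tax_ids]
    if mode == "parenthesis":
        label = "(" + "/".join(tax_ids) + ")"
        return [(label, percentage)]
    raise ValueError("unknown mode: " + mode)


pseudomonas = ["1144335", "1144334"]
for mode in ("parenthesis", "precision", "recall"):
    print(mode, resolve_discrepancy(pseudomonas, 2.4, mode))
```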
You can print the result in BIOM format version 1.0 with -B. Calling motus profile -s test1_single.fastq -n test8 -B will produce a JSON file:
{
"id": "test8",
"format": "Biological Observation Matrix 1.0.0",
"format_url": "http://biom-format.org",
"type": "OTU table",
"generated_by": "motus v2.0.0",
"date": "2018-06-13T14:55:00",
"rows":[
{"id":"ref_mOTU_v2_0001", "metadata":{"name":"Kandleria vitulina",
"NCBI_id":"1630"}},
{"id":"ref_mOTU_v2_0002", "metadata":{"name":"Methyloversatilis universalis",
"NCBI_id":"378211"}},
With -u it is possible to print the full name of the mOTUs. For example, the result of motus profile -s test1_single.fastq -n test9 -u is:
# git tag version 2.0.0 | motus version 2.0.0 | map_tax 2.0.0 | gene database: nr2.0.0 | calc_mgc 2.0.0 -y insert.scaled_counts -l 75 | calc_motu 2.0.0 -k mOTU -g 3 -u | taxonomy: ref_mOTU_2.0.0 meta_mOTU_2.0.0
# call: python mOTUs_v2/motus profile -s test1_single.fastq -n test9 -u
#mOTU consensus_taxonomy test9
...
ref_mOTU_v2_0033 Streptococcus mitis 0.0304056010
ref_mOTU_v2_0034 Escherichia albertii 0.0000394910
ref_mOTU_v2_0035 Escherichia sp. [C KTE11/KTE52/KTE96/KTE159/TW09308] 0.0037182409
Note that now we have 3 columns. The second column is the full name of the species. The result for ref_mOTU_v2_0035 without -u is:
Escherichia sp. [ref_mOTU_v2_0035] 0.0037182409
For visualization purposes we print a shorter version of the full name. In NCBI these five genomes (KTE11/KTE52/KTE96/KTE159/TW09308) are classified as Escherichia sp. (genus-level information only). mOTUs clusters these five genomes together and recognizes that they belong to the same species.
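When working with the short names, the mOTU identifier can be recovered from the square brackets. The Python sketch below is a hedged example: split_short_name is a hypothetical helper, and the identifier pattern (ref or meta, followed by _mOTU_v, digits, underscore, digits) is inferred from the identifiers shown on this page.

```python
import re

# Pattern inferred from identifiers such as ref_mOTU_v2_0035 or
# meta_mOTU_v25_13597; it is an assumption, not an official definition.
MOTU_ID = re.compile(r"^(.*?)\s*\[((?:ref|meta)_mOTU_v\d+_\d+)\]$")


def split_short_name(name):
    """Return (taxon, motu_id); motu_id is None if no identifier is found."""
    match = MOTU_ID.match(name.strip())
    if match is None:
        return name.strip(), None
    return match.group(1), match.group(2)


print(split_short_name("Escherichia sp. [ref_mOTU_v2_0035]"))
print(split_short_name("Crenarchaeota class incertae sedis [YNPFFA]"))
```

Brackets that do not contain a mOTU identifier are left untouched, since bracketed text also occurs in other taxonomy names.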
With -e it is possible to print only the ref_mOTUs; the abundances of all the meta_mOTUs are added to the -1 row. For example, the result of motus profile -s test1_single.fastq -n test10 -e is:
# git tag version 2.0.0 | motus version 2.0.0 | map_tax 2.0.0 | gene database: nr2.0.0 | calc_mgc 2.0.0 -y insert.scaled_counts -l 75 | calc_motu 2.0.0 -k mOTU -g 3 -e | taxonomy: ref_mOTU_2.0.0 meta_mOTU_2.0.0
# call: python mOTUs_v2/motus profile -s test1_single.fastq -n test10 -e
#consensus_taxonomy test10
Kandleria vitulina [ref_mOTU_v2_0001] 0.0688211617
Methyloversatilis universalis [ref_mOTU_v2_0002] 0.0000000000
...
Thermoproteus uzoniensis [ref_mOTU_v2_5304] 0.0000000000
Paenibacillus sp. [ref_mOTU_v2_5305] 0.0030541740
-1 0.5016916385
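Because the -1 row is not a taxon, it is usually handled separately from the ref_mOTUs. A minimal Python sketch, assuming the profile has already been read into a dictionary (split_unassigned is a hypothetical helper; the values are copied from the example above):

```python
def split_unassigned(profile, unassigned_key="-1"):
    """Separate the '-1' row from the named rows of a profile."""
    assigned = {name: value for name, value in profile.items()
                if name != unassigned_key}
    return assigned, profile.get(unassigned_key, 0.0)


# A few rows copied from the example output; the elided rows are omitted.
profile = {
    "Kandleria vitulina [ref_mOTU_v2_0001]": 0.0688211617,
    "Methyloversatilis universalis [ref_mOTU_v2_0002]": 0.0,
    "Paenibacillus sp. [ref_mOTU_v2_5305]": 0.0030541740,
    "-1": 0.5016916385,
}

assigned, unassigned = split_unassigned(profile)
print(len(assigned), unassigned)
```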
With -A it is possible to print all taxonomic levels together. It produces the same result as running with -q at each of the 7 taxonomic levels.
For example, the result of motus profile -s test1_single.fastq -n test11 -A is:
# git tag version 2.5.0 | motus version 2.5.0 | map_tax 2.5.0 | gene database: nr2.5.0 | calc_mgc 2.5.0 -y insert.scaled_counts -l 75 | calc_motu 2.5.0 -k mOTU -g 3 -A | taxonomy: ref_mOTU_2.5.0 meta_mOTU_2.5.0
# call: python mOTUs_v2/motus profile -s test1_single.fastq -n test11 -A
#mOTUs2_clade test11
k__Bacteria 1.0000000000
k__Bacteria|p__Proteobacteria 0.0783503519
k__Bacteria|p__Firmicutes 0.6122416678
k__Bacteria|p__Thermodesulfobacteria 0.0061668664
k__Bacteria|p__Actinobacteria 0.2441684405
...
k__Bacteria|p__Firmicutes|c__Erysipelotrichia|o__Erysipelotrichales|f__Erysipelotrichaceae|g__Kandleria|s__Kandleria vitulina [ref_mOTU_v25_04327] 0.0763622813
k__Bacteria|p__Thermodesulfobacteria|c__Thermodesulfobacteria|o__Thermodesulfobacteriales|f__Thermodesulfobacteriaceae|g__Thermodesulfobacterium|s__Thermodesulfobacterium commune [ref_mOTU_v25_05094] 0.0061668664
k__Bacteria|p__Firmicutes|c__Firmicutes class incertae sedis|o__Firmicutes order incertae sedis|f__Firmicutes fam. incertae sedis|g__Firmicutes gen. incertae sedis|s__Firmicutes species incertae sedis [meta_mOTU_v25_13597] 0.1128345600
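Since -A mixes all levels in one file, a common first step is to select the rows of a single level. The Python sketch below does this by looking at the rank prefix of the last entry in each path; rows_at_rank is a hypothetical helper, and it assumes that the last whitespace-separated field of each data line is the abundance.

```python
def rows_at_rank(text, prefix):
    """Return {name: abundance} for rows whose deepest level has the given
    rank prefix (for example "k" for kingdom or "p" for phylum).

    Assumption: the last whitespace-separated field is the abundance.
    """
    selected = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        lineage, value = line.rsplit(None, 1)
        deepest = lineage.split("|")[-1]
        if deepest.startswith(prefix + "__"):
            selected[deepest.split("__", 1)[1]] = float(value)
    return selected


sample = """\
#mOTUs2_clade test11
k__Bacteria 1.0000000000
k__Bacteria|p__Proteobacteria 0.0783503519
k__Bacteria|p__Firmicutes 0.6122416678
k__Bacteria|p__Thermodesulfobacteria 0.0061668664
k__Bacteria|p__Actinobacteria 0.2441684405
"""

phyla = rows_at_rank(sample, "p")
for name, abundance in phyla.items():
    print(name, abundance)
```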