Merge profiles into a table
Once you have profiled all your samples, you can merge them into a single file (a tab-separated matrix).
There are two ways to merge profiles:
- providing a list of files
- merging all files contained in a directory
mOTUs 2.6 also releases 11,164 profiles of public metagenomes and metatranscriptomes from 23 environments. The merge function lets users add those profiles to their own data with the -a flag.
You can merge files by providing a comma-separated list of files. Example with three files:
$ ls
sample_34.motus
sample_XY.motus
sampZ.motus
You can merge them with:
motus merge -i sample_34.motus,sample_XY.motus,sampZ.motus
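When only some of the profiles in a directory should be merged, typing the comma-separated list by hand gets tedious. The following is a minimal sketch of a shell helper that builds the list from its arguments; `join_by_comma` is a hypothetical name and not part of mOTUs, and the helper assumes that no file name contains a comma.

```shell
# join_by_comma: print all arguments joined by commas (no spaces).
# The function body is a subshell, written with ( ) instead of { },
# so the IFS change does not leak into the calling shell.
join_by_comma() (
  IFS=,
  printf '%s\n' "$*"
)
```

With this helper, `motus merge -i "$(join_by_comma sample_*.motus)"` would merge only the profiles whose file names start with `sample_`.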
If you saved all your profile files in a directory, for example:
$ ls results/
sample_34.motus
sample_XY.motus
sampZ.motus
You can merge all files contained in the directory with the -d option:
motus merge -d results
You can merge your own profiles and append public profiles to them.
For example, if you are interested in adding all profiles from the human
environment, simply use:
motus merge -d results -a human
Or, if you want to add public profiles from human and cattle:
motus merge -d results -a human,cattle
Or simply add all public profiles:
motus merge -d results -a all
The list of all environments covered by the public profiles is shown in the help message of the merge command:
motus merge
...
-a STR[,STR] Add public profiles from different environments. [all, air, bioreactor, bee, cat,
cattle, chicken, dog, fish, freshwater, human,
marine, mouse, pig, sheep, soil, termite, wastewater]
...
You can specify the output file with -o. The merged file looks like:
# motus version 2.6.0 | merge 2.6.0 | info merged profiles: # git tag version 2.6.0 | motus version 2.6.0 | map_tax 2.6.0 | gene database: nr2.6.0 | calc_mgc 2.6.0 -y insert.raw_counts -l 75 | calc_motu 2.6.0 -k mOTU -g 3 -c | taxonomy: ref_mOTU_2.6.0 meta_mOTU_2.6.0
# call: python mOTUs_v2/motus merge -i test/test1.motus,test/test2.motus
#consensus_taxonomy sample1 sample2
Kandleria vitulina [ref_mOTU_v2_0001] 93 42
Methyloversatilis universalis [ref_mOTU_v2_0002] 0 0
Megasphaera genomosp. [ref_mOTU_v2_0003] 11 3
Streptococcus anginosus [ref_mOTU_v2_0004] 22 298
Streptococcus anginosus [ref_mOTU_v2_0005] 0 0
Note that the sample names are in the third header line (#consensus_taxonomy sample1 sample2).
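To check which samples ended up in a merged table, the header line can be read directly. The sketch below assumes the layout shown in the example above: a tab-separated file whose column header line starts with `#consensus_taxonomy`. The function name `list_sample_names` is hypothetical and not part of mOTUs.

```shell
# list_sample_names FILE: print the sample names of a merged table, one per line.
# Assumes a tab-separated header line that starts with "#consensus_taxonomy".
list_sample_names() {
  awk -F '\t' '
    /^#consensus_taxonomy/ {
      # Field 1 is the taxonomy column label; the remaining fields are sample names.
      for (i = 2; i <= NF; i++) print $i
      exit
    }
  ' "$1"
}
```

If a merged table was produced with different options and its header line starts with another label, the pattern would need to be adjusted accordingly.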
You need to specify the name of each sample when profiling, using the -n option. For example:
motus profile -s test1.fq -n sample1 -o test1.motus
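Putting the pieces together, profiling and merging can be scripted in one go. The following is a rough sketch that uses only the options shown on this page (`-s`, `-n` and `-o` for profile, `-d` and `-o` for merge). The function name `profile_and_merge`, the `.fq` extension, and the choice to name each sample after its input file are assumptions made for illustration.

```shell
# profile_and_merge IN_DIR OUT_DIR MERGED
# Profiles every *.fq file in IN_DIR, naming each sample after its file,
# then merges all resulting profiles into the file MERGED.
# The body is a subshell, so its variables do not leak into the caller.
profile_and_merge() (
  in_dir=$1
  out_dir=$2
  merged=$3
  mkdir -p "$out_dir"
  for fq in "$in_dir"/*.fq; do
    [ -e "$fq" ] || continue        # skip the unexpanded pattern if nothing matches
    name=$(basename "$fq" .fq)       # e.g. test1.fq -> test1
    motus profile -s "$fq" -n "$name" -o "$out_dir/$name.motus" || exit 1
  done
  motus merge -d "$out_dir" -o "$merged"
)
```

Calling `profile_and_merge reads results merged.tsv` would then profile each file in a hypothetical `reads` directory and write the merged table to `merged.tsv`, with the file names (without extension) appearing as sample names in the header.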