Merge profiles into a table

Hans-Joachim Ruscheweyh edited this page Mar 5, 2021 · 5 revisions

Once you have profiled all your samples, you can merge them into a single file (a tab-separated matrix).

There are two ways to merge profiles:

  • providing a list of files
  • merging all files contained in a directory

mOTUs 2.6 also releases 11,164 profiles of public metagenomes and metatranscriptomes from 23 environments. The merge function lets users add these profiles to their own data with the -a flag.

Provide a list of files

You can merge files by providing a comma-separated list of files with the -i option. Example with three files:

$ ls
sample_34.motus
sample_XY.motus
sampZ.motus

You can merge them with:

motus merge -i sample_34.motus,sample_XY.motus,sampZ.motus
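When there are many profiles, typing the comma-separated list by hand is error-prone. The following is a minimal Python sketch of how the argument list could be assembled. The helper name build_merge_command is hypothetical and not part of mOTUs, and the check for commas in file names is an assumption about how a comma-separated list would be split. Only the -i and -o options documented on this page are used.

```python
from pathlib import Path


def build_merge_command(profile_paths, output=None):
    """Return the argument list for `motus merge -i file1,file2,...`.

    Hypothetical helper, not part of mOTUs.
    """
    paths = [str(p) for p in profile_paths]
    if not paths:
        raise ValueError("at least one profile file is required")
    for p in paths:
        # Assumption: a comma inside a file name would be read as a separator.
        if "," in p:
            raise ValueError(f"file name contains a comma: {p}")
    cmd = ["motus", "merge", "-i", ",".join(paths)]
    if output is not None:
        cmd += ["-o", str(output)]
    return cmd


# Collect every *.motus file in the current directory and show the command.
profiles = sorted(Path(".").glob("*.motus"))
if profiles:
    print(" ".join(build_merge_command(profiles)))
```

The returned list can be passed to subprocess.run if motus is installed and on the PATH.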

Merge all files contained in a directory

Suppose you save all your profile files in one directory, for example:

$ ls results/
sample_34.motus
sample_XY.motus
sampZ.motus

You can merge all files contained in the directory with the -d option:

motus merge -d results

Merge with public profiles

You can merge your own profiles and append public profiles to them.

For example, to add all public profiles from the human environment, simply use:

motus merge -d results -a human

Or if you want to add public profiles from human and cattle:

motus merge -d results -a human,cattle

Or, to add all public profiles:

motus merge -d results -a all

A list of all environments covered in the public profiles can be retrieved by the help function:

motus merge
...
   -a STR[,STR]  Add public profiles from different environments. [all, air, bioreactor, bee, cat,
		 cattle, chicken, dog, fish, freshwater, human,
		 marine, mouse, pig, sheep, soil, termite, wastewater]
...
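In a script, it can help to check environment names before calling motus merge. The sketch below is illustrative only: the function public_profiles_arg is hypothetical, and the list of names is copied from the help text shown above, so it may differ in other mOTUs versions.

```python
# Environment names copied from the `motus merge` help text shown above.
PUBLIC_ENVIRONMENTS = (
    "all", "air", "bioreactor", "bee", "cat", "cattle", "chicken", "dog",
    "fish", "freshwater", "human", "marine", "mouse", "pig", "sheep",
    "soil", "termite", "wastewater",
)


def public_profiles_arg(environments):
    """Return the comma-separated value for -a, rejecting unknown names.

    Hypothetical helper, not part of mOTUs.
    """
    requested = list(environments)
    if not requested:
        raise ValueError("at least one environment is required")
    unknown = [e for e in requested if e not in PUBLIC_ENVIRONMENTS]
    if unknown:
        raise ValueError("unknown environment(s): " + ", ".join(unknown))
    return ",".join(requested)


# Build the value used in `motus merge -d results -a human,cattle`.
print(public_profiles_arg(["human", "cattle"]))
```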

Result

You can specify the output file with -o. The merged file looks like:

# motus version 2.6.0 | merge 2.6.0 | info merged profiles: # git tag version 2.6.0 |  motus version 2.6.0 | map_tax 2.6.0 | gene database: nr2.6.0 | calc_mgc 2.6.0 -y insert.raw_counts -l 75 | calc_motu 2.6.0 -k mOTU -g 3 -c | taxonomy: ref_mOTU_2.6.0 meta_mOTU_2.6.0 
# call: python mOTUs_v2/motus merge -i test/test1.motus,test/test2.motus
#consensus_taxonomy	sample1	sample2
Kandleria vitulina [ref_mOTU_v2_0001]	93	42
Methyloversatilis universalis [ref_mOTU_v2_0002]	0	0
Megasphaera genomosp. [ref_mOTU_v2_0003]	11	3
Streptococcus anginosus [ref_mOTU_v2_0004]	22	298
Streptococcus anginosus [ref_mOTU_v2_0005]	0	0

Note that the sample names appear in the third header line (#consensus_taxonomy sample1 sample2). You set the name of each sample when profiling, with the -n option. For example:

motus profile -s test1.fq -n sample1 -o test1.motus
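The merged table can be read with a few lines of Python. This is a minimal sketch that assumes the layout shown in the Result section: comment lines starting with #, a column header line whose first field is #consensus_taxonomy, and tab-separated values. The function name read_merged_profiles is hypothetical, the first header line in the example is abbreviated, and values are parsed as floats on the assumption that a profile may contain non-integer abundances.

```python
import csv
import io


def read_merged_profiles(handle):
    """Parse a merged table into (sample_names, {taxon: [values]}).

    Hypothetical helper, not part of mOTUs.
    """
    samples = None
    table = {}
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue  # skip blank lines
        first = row[0]
        if first == "#consensus_taxonomy":
            samples = row[1:]  # sample names set with -n when profiling
        elif first.startswith("#"):
            continue  # version and call comment lines
        else:
            if samples is None:
                raise ValueError("data row found before the column header")
            table[first] = [float(v) for v in row[1:]]
    return samples, table


# Abbreviated copy of the example output above.
EXAMPLE = (
    "# motus version 2.6.0 | merge 2.6.0\n"
    "# call: python mOTUs_v2/motus merge -i test/test1.motus,test/test2.motus\n"
    "#consensus_taxonomy\tsample1\tsample2\n"
    "Kandleria vitulina [ref_mOTU_v2_0001]\t93\t42\n"
    "Streptococcus anginosus [ref_mOTU_v2_0004]\t22\t298\n"
)

samples, table = read_merged_profiles(io.StringIO(EXAMPLE))
print(samples)
print(table["Kandleria vitulina [ref_mOTU_v2_0001]"])
```

To read a real file, open it with open(path, newline="") and pass the handle to the function.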