
output files

Sina Majidian edited this page Nov 21, 2022 · 6 revisions

 

After running the test example, the folder 'tests/output' should contain the following folders and files:

folder/file                description
01_ref_ogs_aa              contains the selected orthologous groups (OGs) with amino acid data
01_ref_ogs_dna             contains the selected OGs with DNA data
02_ref_dna                 contains the OGs reshuffled by available species
03_align_aa                contains the MAFFT alignments of the amino acid data
03_align_dna               contains the DNA alignments obtained by codon replacement of the amino acid alignments
04_mapping_sample_1        contains the consensus sequences from the mapping
05_ogs_map_sample_1_aa     contains the amino acid OGs with the additional sequence from sample_1
05_ogs_map_sample_1_dna    contains the DNA OGs with the additional sequence from sample_1
06_align_sample_1_aa       contains the amino acid alignments with the additional sequence from sample_1
06_align_sample_1_dna      contains the DNA alignments with the additional sequence from sample_1
concat_sample_1_aa.phy     concatenated alignments from the 06 amino acid folder
concat_sample_1_dna.phy    concatenated alignments from the 06 DNA folder
sample_1_all_cov.txt       summary of the average number of reads used for the selected sequences
sample_1_all_sc.txt        summary of the average consensus length of the reconstructed sequences
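
To confirm that a run produced everything listed above, a small check along these lines may help. It is only a sketch: the function name `missing_outputs` is hypothetical and not part of read2tree, and the expected names are copied from the listing above under the assumption that the sample is called sample_1.

```python
from pathlib import Path

# Names copied from the listing above; "{s}" stands for the sample name.
EXPECTED_DIRS = [
    "01_ref_ogs_aa", "01_ref_ogs_dna", "02_ref_dna",
    "03_align_aa", "03_align_dna", "04_mapping_{s}",
    "05_ogs_map_{s}_aa", "05_ogs_map_{s}_dna",
    "06_align_{s}_aa", "06_align_{s}_dna",
]
EXPECTED_FILES = [
    "concat_{s}_aa.phy", "concat_{s}_dna.phy",
    "{s}_all_cov.txt", "{s}_all_sc.txt",
]


def missing_outputs(output_dir, sample="sample_1"):
    """Return the expected folders/files that are absent from output_dir.

    Hypothetical helper, not part of read2tree.
    """
    root = Path(output_dir)
    missing = []
    for template in EXPECTED_DIRS:
        name = template.format(s=sample)
        if not (root / name).is_dir():
            missing.append(name)
    for template in EXPECTED_FILES:
        name = template.format(s=sample)
        if not (root / name).is_file():
            missing.append(name)
    return missing
```

Calling `missing_outputs("tests/output")` after the test run should return an empty list if all folders and files from the listing are present.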

 

You can check the inferred species tree for the sample and five reference species in Newick format:

$ cat output/tree_sample_1.nwk
(sample_1:0.0106979811,((HUMAN:0.0041202790,GORGO:0.0272785216):0.0433094119,(XENLA:0.1715052824,MNELE:0.9177670816):0.1141311779):0.0613339433,RATNO:0.0123413734);
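
If you want to inspect the tree programmatically, the leaf names and their terminal branch lengths can be pulled out of the Newick string with the standard library alone. The following is a minimal sketch, not a full Newick parser: the function name `leaf_branch_lengths` is hypothetical, and it assumes unquoted leaf labels that each carry a branch length, as in the output above.

```python
import re


def leaf_branch_lengths(newick):
    """Map each leaf label to its terminal branch length.

    A leaf is recognised as a label that directly follows "(" or ","
    and is followed by ":<number>". Internal branches follow ")" and
    are therefore skipped.
    """
    pattern = re.compile(r"[(,]\s*([^():,;\s]+):([0-9.eE+-]+)")
    return {name: float(length) for name, length in pattern.findall(newick)}


# The tree shown above for the test example.
tree = (
    "(sample_1:0.0106979811,((HUMAN:0.0041202790,GORGO:0.0272785216)"
    ":0.0433094119,(XENLA:0.1715052824,MNELE:0.9177670816):0.1141311779)"
    ":0.0613339433,RATNO:0.0123413734);"
)

leaves = leaf_branch_lengths(tree)
# Every leaf except the sample is one of the reference species.
references = sorted(name for name in leaves if name != "sample_1")
print(references)
```

For anything beyond a quick look (rerooting, distances between leaves), a dedicated phylogenetics library is the better choice.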

Note that species names are given as 5-letter codes, e.g. XENLA = Xenopus laevis. If you want to rerun your analysis from scratch, first move or delete the output files of the previous run. Otherwise, read2tree resumes from the progress of the previous analysis.

On a cluster, you can run only the first step of read2tree, so that folders 01, 02 and 03 are computed; these are what the mapping step needs. This is done with the '--reference' option. Because read2tree regroups the OGs by species, the mapping step can be split per species, with multiple threads for the mapper. The '--single_mapping' option is available for this.
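
When splitting the mapping per species, it helps to know which per-species reference files the first step produced. The sketch below only lists them, so that each could be handed to a separate cluster job; it does not call read2tree. The function name `species_references` is hypothetical, and the `<CODE>_OGs.fa` naming pattern is taken from the test example's 02_ref_dna folder.

```python
from pathlib import Path


def species_references(output_dir):
    """Map 5-letter species codes to their reference files in 02_ref_dna.

    Hypothetical helper; assumes files are named <CODE>_OGs.fa as in
    the test example.
    """
    suffix = "_OGs.fa"
    ref_dir = Path(output_dir) / "02_ref_dna"
    refs = {}
    for path in sorted(ref_dir.glob("*" + suffix)):
        code = path.name[: -len(suffix)]
        refs[code] = path
    return refs
```

For the test example, `species_references("tests/output")` would be expected to have the keys GORGO, HUMAN, MNELE, RATNO and XENLA. How each file is then passed to read2tree is best checked against the tool's own help output.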

Hint: Because read2tree uses the progress package, unfinished runs can be continued. However, if you want to conduct a new analysis with different inputs, you need to remove the output of previous runs or change the output_path.
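
One way to clear the way for a new analysis without losing earlier results is to move the old output folder aside rather than delete it. This is a minimal sketch using only the Python standard library; the function name `archive_previous_output` and the timestamp suffix are illustrative choices, not part of read2tree.

```python
import shutil
import time
from pathlib import Path


def archive_previous_output(output_dir):
    """Move an existing output folder aside.

    Returns the new path, or None if there was nothing to move.
    Hypothetical helper, not part of read2tree.
    """
    src = Path(output_dir)
    if not src.exists():
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dst = src.with_name(f"{src.name}_{stamp}")
    # Avoid clobbering an archive created within the same second.
    counter = 1
    while dst.exists():
        dst = src.with_name(f"{src.name}_{stamp}_{counter}")
        counter += 1
    shutil.move(str(src), str(dst))
    return dst
```

For example, `archive_previous_output("tests/output")` renames the folder to something like `tests/output_<timestamp>`, after which a fresh run can write to `tests/output` again.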

The following is the folder structure for the test example:

Before the run:

$ tree tests/
tests/
├── marker_genes
│   ├── OMAGroup_1032177.fa
│   ├── OMAGroup_1059464.fa
│   ├── OMAGroup_1064207.fa
│   ├── OMAGroup_1080404.fa
│   ├── OMAGroup_1103036.fa
│   ├── OMAGroup_1103803.fa
│   ├── OMAGroup_1107105.fa
│   ├── OMAGroup_638532.fa
│   ├── OMAGroup_648288.fa
│   ├── OMAGroup_741671.fa
│   ├── OMAGroup_742036.fa
│   ├── OMAGroup_778504.fa
│   ├── OMAGroup_783172.fa
│   ├── OMAGroup_799356.fa
│   ├── OMAGroup_852256.fa
│   ├── OMAGroup_852375.fa
│   ├── OMAGroup_852570.fa
│   ├── OMAGroup_853308.fa
│   ├── OMAGroup_853454.fa
│   └── OMAGroup_853960.fa
├── sample_1.fastq
├── sample_2.fastq
├── test_aligner.py
├── test_og.py
├── test_ogset.py
├── test_reads.py
├── test_seqCompleteness.py
└── test_use.py
1 directory, 28 files

After running Read2Tree:

$ tree tests/
tests/
├── marker_genes
│   ├── OMAGroup_1032177.fa
│   ├── OMAGroup_1059464.fa
│   ├── OMAGroup_1064207.fa
│   ├── OMAGroup_1080404.fa
│   ├── OMAGroup_1103036.fa
│   ├── OMAGroup_1103803.fa
│   ├── OMAGroup_1107105.fa
│   ├── OMAGroup_638532.fa
│   ├── OMAGroup_648288.fa
│   ├── OMAGroup_741671.fa
│   ├── OMAGroup_742036.fa
│   ├── OMAGroup_778504.fa
│   ├── OMAGroup_783172.fa
│   ├── OMAGroup_799356.fa
│   ├── OMAGroup_852256.fa
│   ├── OMAGroup_852375.fa
│   ├── OMAGroup_852570.fa
│   ├── OMAGroup_853308.fa
│   ├── OMAGroup_853454.fa
│   └── OMAGroup_853960.fa
├── mplog.log
├── output
│   ├── 01_ref_ogs_aa
│   │   ├── OG1032177.fa
│   │   ├── OG1059464.fa
│   │   ├── OG1064207.fa
│   │   ├── OG1080404.fa
│   │   ├── OG1103036.fa
│   │   ├── OG1103803.fa
│   │   ├── OG1107105.fa
│   │   ├── OG638532.fa
│   │   ├── OG648288.fa
│   │   ├── OG741671.fa
│   │   ├── OG742036.fa
│   │   ├── OG778504.fa
│   │   ├── OG783172.fa
│   │   ├── OG799356.fa
│   │   ├── OG852256.fa
│   │   ├── OG852375.fa
│   │   ├── OG852570.fa
│   │   ├── OG853308.fa
│   │   ├── OG853454.fa
│   │   └── OG853960.fa
│   ├── 01_ref_ogs_dna
│   │   ├── OG1032177.fa
│   │   ├── OG1059464.fa
│   │   ├── OG1064207.fa
│   │   ├── OG1080404.fa
│   │   ├── OG1103036.fa
│   │   ├── OG1103803.fa
│   │   ├── OG1107105.fa
│   │   ├── OG638532.fa
│   │   ├── OG648288.fa
│   │   ├── OG741671.fa
│   │   ├── OG742036.fa
│   │   ├── OG778504.fa
│   │   ├── OG783172.fa
│   │   ├── OG799356.fa
│   │   ├── OG852256.fa
│   │   ├── OG852375.fa
│   │   ├── OG852570.fa
│   │   ├── OG853308.fa
│   │   ├── OG853454.fa
│   │   └── OG853960.fa
│   ├── 02_ref_dna
│   │   ├── GORGO_OGs.fa
│   │   ├── HUMAN_OGs.fa
│   │   ├── MNELE_OGs.fa
│   │   ├── RATNO_OGs.fa
│   │   └── XENLA_OGs.fa
│   ├── 03_align_aa
│   │   ├── OG1032177.phy
│   │   ├── OG1059464.phy
│   │   ├── OG1064207.phy
│   │   ├── OG1080404.phy
│   │   ├── OG1103036.phy
│   │   ├── OG1103803.phy
│   │   ├── OG1107105.phy
│   │   ├── OG638532.phy
│   │   ├── OG648288.phy
│   │   ├── OG741671.phy
│   │   ├── OG742036.phy
│   │   ├── OG778504.phy
│   │   ├── OG783172.phy
│   │   ├── OG799356.phy
│   │   ├── OG852256.phy
│   │   ├── OG852375.phy
│   │   ├── OG852570.phy
│   │   ├── OG853308.phy
│   │   ├── OG853454.phy
│   │   └── OG853960.phy
│   ├── 03_align_dna
│   │   ├── OG1032177.phy
│   │   ├── OG1059464.phy
│   │   ├── OG1064207.phy
│   │   ├── OG1080404.phy
│   │   ├── OG1103036.phy
│   │   ├── OG1103803.phy
│   │   ├── OG1107105.phy
│   │   ├── OG638532.phy
│   │   ├── OG648288.phy
│   │   ├── OG741671.phy
│   │   ├── OG742036.phy
│   │   ├── OG778504.phy
│   │   ├── OG783172.phy
│   │   ├── OG799356.phy
│   │   ├── OG852256.phy
│   │   ├── OG852375.phy
│   │   ├── OG852570.phy
│   │   ├── OG853308.phy
│   │   ├── OG853454.phy
│   │   └── OG853960.phy
│   ├── 04_mapping_sample_1
│   │   ├── GORGO_OGs_consensus.fa
│   │   ├── GORGO_OGs_cov.txt
│   │   ├── GORGO_OGs.fa.bam
│   │   ├── GORGO_OGs_sc.txt
│   │   ├── HUMAN_OGs_consensus.fa
│   │   ├── HUMAN_OGs_cov.txt
│   │   ├── HUMAN_OGs.fa.bam
│   │   ├── HUMAN_OGs_sc.txt
│   │   ├── MNELE_OGs_cov.txt
│   │   ├── MNELE_OGs.fa.bam
│   │   ├── RATNO_OGs_consensus.fa
│   │   ├── RATNO_OGs_cov.txt
│   │   ├── RATNO_OGs.fa.bam
│   │   ├── RATNO_OGs_sc.txt
│   │   ├── XENLA_OGs_consensus.fa
│   │   ├── XENLA_OGs_cov.txt
│   │   ├── XENLA_OGs.fa.bam
│   │   └── XENLA_OGs_sc.txt
│   ├── 05_ogs_map_sample_1_aa
│   │   ├── OG1032177.fa
│   │   ├── OG1059464.fa
│   │   ├── OG1064207.fa
│   │   ├── OG1080404.fa
│   │   ├── OG1103036.fa
│   │   ├── OG1103803.fa
│   │   ├── OG1107105.fa
│   │   ├── OG638532.fa
│   │   ├── OG648288.fa
│   │   ├── OG741671.fa
│   │   ├── OG742036.fa
│   │   ├── OG778504.fa
│   │   ├── OG783172.fa
│   │   ├── OG799356.fa
│   │   ├── OG852256.fa
│   │   ├── OG852375.fa
│   │   ├── OG852570.fa
│   │   ├── OG853308.fa
│   │   ├── OG853454.fa
│   │   └── OG853960.fa
│   ├── 05_ogs_map_sample_1_dna
│   │   ├── OG1032177.fa
│   │   ├── OG1059464.fa
│   │   ├── OG1064207.fa
│   │   ├── OG1080404.fa
│   │   ├── OG1103036.fa
│   │   ├── OG1103803.fa
│   │   ├── OG1107105.fa
│   │   ├── OG638532.fa
│   │   ├── OG648288.fa
│   │   ├── OG741671.fa
│   │   ├── OG742036.fa
│   │   ├── OG778504.fa
│   │   ├── OG783172.fa
│   │   ├── OG799356.fa
│   │   ├── OG852256.fa
│   │   ├── OG852375.fa
│   │   ├── OG852570.fa
│   │   ├── OG853308.fa
│   │   ├── OG853454.fa
│   │   └── OG853960.fa
│   ├── 06_align_sample_1_aa
│   │   ├── OG1032177.fa
│   │   ├── OG1059464.fa
│   │   ├── OG1064207.fa
│   │   ├── OG1080404.fa
│   │   ├── OG1103036.fa
│   │   ├── OG1103803.fa
│   │   ├── OG1107105.fa
│   │   ├── OG638532.fa
│   │   ├── OG648288.fa
│   │   ├── OG741671.fa
│   │   ├── OG742036.fa
│   │   ├── OG778504.fa
│   │   ├── OG783172.fa
│   │   ├── OG799356.fa
│   │   ├── OG852256.fa
│   │   ├── OG852375.fa
│   │   ├── OG852570.fa
│   │   ├── OG853308.fa
│   │   ├── OG853454.fa
│   │   └── OG853960.fa
│   ├── 06_align_sample_1_dna
│   │   ├── OG1032177.fa
│   │   ├── OG1059464.fa
│   │   ├── OG1064207.fa
│   │   ├── OG1080404.fa
│   │   ├── OG1103036.fa
│   │   ├── OG1103803.fa
│   │   ├── OG1107105.fa
│   │   ├── OG638532.fa
│   │   ├── OG648288.fa
│   │   ├── OG741671.fa
│   │   ├── OG742036.fa
│   │   ├── OG778504.fa
│   │   ├── OG783172.fa
│   │   ├── OG799356.fa
│   │   ├── OG852256.fa
│   │   ├── OG852375.fa
│   │   ├── OG852570.fa
│   │   ├── OG853308.fa
│   │   ├── OG853454.fa
│   │   └── OG853960.fa
│   ├── concat_sample_1_aa.phy
│   ├── concat_sample_1_dna.phy
│   ├── sample_1_all_cov.txt
│   ├── sample_1_all_sc.txt
│   └── tree_sample_1.nwk
├── sample_1.fastq
├── sample_2.fastq
├── test_aligner.py
├── test_og.py
├── test_ogset.py
├── test_reads.py
├── test_seqCompleteness.py
└── test_use.py
12 directories, 217 files