PhaseME is a toolset for assessing the quality of per-read phasing information and for reducing the errors introduced during phasing.
1- Your VCF file must be read-based phased; such a file can be generated by e.g. WhatsHap.
2- Run PhaseME using Python3 on Linux to obtain stats and improve the quality of phase blocks. The only requirement is NumPy.
tar -xzf precomputed/pairlist.tar.gz
python phaseme.py improver my.vcf output_prefix
If you only want the quality assessment report, use quality instead of improver.
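Before running either mode, it can be useful to confirm that the input VCF really carries phased genotypes. The following is a minimal sketch, not part of PhaseME: the function name count_genotypes and the demo records are hypothetical, and it assumes an uncompressed, tab-separated VCF in which GT is the first FORMAT key (as the VCF specification requires when GT is present). Phased genotypes use "|" as the allele separator, unphased ones use "/".

```python
def count_genotypes(vcf_lines, sample_index=0):
    """Return (phased, unphased) genotype counts for one sample column."""
    phased = 0
    unphased = 0
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10 + sample_index:
            continue  # no genotype column for this sample
        # GT is the first colon-separated value of the sample column.
        genotype = fields[9 + sample_index].split(":")[0]
        if "|" in genotype:
            phased += 1
        elif "/" in genotype:
            unphased += 1
    return phased, unphased


# Hypothetical records, for illustration only.
demo_vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:PS\t0|1:100\n",
    "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT:PS\t1|0:100\n",
    "chr1\t300\t.\tG\tA\t50\tPASS\t.\tGT\t0/1\n",
]

print(count_genotypes(demo_vcf))  # (2, 1)
```

For a real file, pass an open file handle instead of the list, e.g. with open("my.vcf") as handle: count_genotypes(handle). A phased count of zero suggests the file has not been through a read-based phaser yet.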
Please try our sample data to verify that the pipeline is installed correctly. It can be found in the folder example.
python phaseme.py improver example/my.vcf example/out
The output will be a quality assessment report, example/out/quality.csv, as well as an improved version of the input phased VCF, example/out/improved.vcf.
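One way to compare the input and improved VCFs is to tally the variants in each phase block. The sketch below is an illustration and not PhaseME's own report: the function name phase_block_sizes and the demo records are hypothetical. It assumes phase blocks are labelled with the PS FORMAT field, as WhatsHap writes them, and whether the improved VCF keeps that field is an assumption to check against your own output.

```python
from collections import Counter


def phase_block_sizes(vcf_lines, sample_index=0):
    """Map (chromosome, PS value) to the number of phased variants in that block."""
    sizes = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10 + sample_index:
            continue
        # Pair FORMAT keys with the sample's values, e.g. {"GT": "0|1", "PS": "100"}.
        keys = fields[8].split(":")
        values = fields[9 + sample_index].split(":")
        record = dict(zip(keys, values))
        # Count only phased genotypes that carry a phase set identifier.
        if "|" in record.get("GT", "") and record.get("PS", ".") != ".":
            sizes[(fields[0], record["PS"])] += 1
    return sizes


# Hypothetical records: two phase blocks on chr1 plus one unphased variant.
demo_blocks = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:PS\t0|1:100\n",
    "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT:PS\t1|0:100\n",
    "chr1\t400\t.\tG\tA\t50\tPASS\t.\tGT\t0/1\n",
    "chr1\t500\t.\tT\tC\t50\tPASS\t.\tGT:PS\t0|1:500\n",
]

block_sizes = phase_block_sizes(demo_blocks)
print(len(block_sizes), "phase blocks")
for (chrom, phase_set), n_variants in sorted(block_sizes.items()):
    print(chrom, phase_set, n_variants)
```

Running the same function on the input VCF and on the improved VCF gives two tallies that can be compared side by side.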
The quick start section uses precomputed linkage information. Here, individual-specific linkage information is used instead. To take full advantage of PhaseME, a few steps are needed before running it.
1- You need to download the 1000 Genomes reference panel haplotypes.
wget https://mathgen.stats.ox.ac.uk/impute/1000GP_Phase3.tgz
Warning: This download is larger than 10 GB.
2- You need to download Shapeit.
3- Now, run PhaseME as follows.
python phaseme.py improver my.vcf output_prefix /path/to/shapeit /path/to/1000G/dataset
PhaseME can also assess and improve the phasing results using parental data instead of linkage information. The user should prepare a three-sample VCF containing the SNVs of the son, mother and father, in this order. This can be done using e.g. bcftools merge. Prior to that, you may need to run bgzip, tabix and bcftools index on all three samples.
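Since the sample order matters, it may be worth checking the merged VCF header before running the trio mode. The following is a minimal sketch under stated assumptions: the helper names sample_names and check_trio and the sample IDs are hypothetical, and the code only reads the #CHROM header line of an uncompressed VCF. It cannot tell which sample is which, so the returned mapping simply shows how the columns would be interpreted in son, mother, father order.

```python
def sample_names(vcf_lines):
    """Return the sample names listed on the #CHROM header line."""
    for line in vcf_lines:
        if line.startswith("#CHROM"):
            # Columns 0-8 are fixed VCF fields; samples start at column 9.
            return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found")


def check_trio(vcf_lines):
    """Check for exactly three samples and label them in the expected order."""
    names = sample_names(vcf_lines)
    if len(names) != 3:
        raise ValueError(f"expected 3 samples, found {len(names)}: {names}")
    return dict(zip(("son", "mother", "father"), names))


# Hypothetical header with made-up sample IDs.
demo_trio = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tchild01\tmother01\tfather01\n",
]

for role, name in check_trio(demo_trio).items():
    print(role, "->", name)
```

If the printed mapping does not match the actual family roles, the samples need to be reordered before the file is passed to PhaseME.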
To obtain quality insights:
python phaseme.py quality example/trio.vcf example/out_trio_q trio
To improve phasing results:
python phaseme.py improver example/trio.vcf example/out_trio trio
To use PhaseME on a Mac, please check the folder mac.
Please see and cite our manuscript: "PhaseME: automatic assessment of phasing quality and phasing improvement", GigaScience, 2020.
PhaseME has been registered in BioTools.