FOCUS: Fine mapping TWAS associations in a single ancestry group

FOCUS fine-maps TWAS associations at GWAS risk regions within a single ancestry group. It takes three inputs:

GWAS summary statistics
reference LD
eQTL weight database.

Given these data, FOCUS can fine-map using either a tissue-agnostic or a tissue-prioritized approach.

The basic command for fine-mapping is:

focus finemap SUMSTATS PLINK_REFLD WEIGHT_DB --locations RISK_REGION

where SUMSTATS is the GWAS summary statistics file, PLINK_REFLD is the path to PLINK-formatted genotype data for computing reference LD, WEIGHT_DB is the path to a FOCUS weight database, and RISK_REGION is the path to the independent genomic regions (pre-generated region files are available; see the wiki Home page). To list all options and functionality, enter

focus finemap --help
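Before starting a run, it can help to confirm that the three inputs exist. The following is a minimal sketch, not part of FOCUS: the function name check_focus_inputs is hypothetical, and it assumes PLINK_REFLD is a binary PLINK prefix, so that the .bed, .bim, and .fam files are expected alongside it.

```shell
#!/bin/sh
# Report any missing input file before starting a run.
# Assumption: PLINK_REFLD is a binary PLINK prefix (.bed/.bim/.fam).
# Usage: check_focus_inputs SUMSTATS PLINK_REFLD WEIGHT_DB
check_focus_inputs() {
    sumstats=$1
    refld=$2
    weight_db=$3
    missing=0
    for f in "$sumstats" "$refld.bed" "$refld.bim" "$refld.fam" "$weight_db"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    # Returns 0 when every file is present, 1 otherwise.
    return "$missing"
}
```

The function returns a non-zero status if any file is missing, so it can guard a run: check_focus_inputs SUMSTATS PLINK_REFLD WEIGHT_DB && focus finemap ...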

For example, the following command performs tissue-agnostic fine-mapping on chromosome 1 using the GWAS summary data LDL_2010.clean.sumstats.gz, the reference genotypes 1000G.EUR.QC.1, and the eQTL weights gtex_v7.db. The risk regions 37:EUR were generated by LDetect on GRCh37 for European ancestry:

focus finemap LDL_2010.clean.sumstats.gz 1000G.EUR.QC.1 gtex_v7.db --locations 37:EUR --chr 1 --out LDL_2010.chr1
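The same command can be repeated for every autosome. The sketch below only prints the commands (a dry run) so they can be reviewed first. It reuses the file names from the example above and assumes the reference genotypes are split per chromosome under prefixes of the form 1000G.EUR.QC.<chr>; the function name focus_commands is hypothetical.

```shell
#!/bin/sh
# Dry run: print one `focus finemap` command per autosome (1-22).
# Assumption: reference genotypes are stored per chromosome as
# 1000G.EUR.QC.<chr>; adjust the names to match your own files.
SUMSTATS=LDL_2010.clean.sumstats.gz
WEIGHT_DB=gtex_v7.db

focus_commands() {
    chr=1
    while [ "$chr" -le 22 ]; do
        echo "focus finemap $SUMSTATS 1000G.EUR.QC.$chr $WEIGHT_DB" \
             "--locations 37:EUR --chr $chr --out LDL_2010.chr$chr"
        chr=$((chr + 1))
    done
}

focus_commands
```

Once the printed commands look right, they can be executed by piping the output to sh.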

For the tissue-prioritized approach, add the flag --tissue TISSUE:

focus finemap LDL_2010.clean.sumstats.gz 1000G.EUR.QC.1 gtex_v7.db --locations 37:EUR --chr 1 --tissue LIVER --out LDL_2010.chr1

FOCUS can generate a figure for each region showing the predicted expression correlation, TWAS summary statistics, and PIP for each gene. To do so, add the --plot flag:

focus finemap LDL_2010.clean.sumstats.gz 1000G.EUR.QC.1 gtex_v7.db --locations 37:EUR --chr 1 --tissue LIVER --plot --out LDL_2010.chr1

Here is an example image illustrating the local correlation structure, TWAS p-values, and PIPs for each model

The output from the finemap operation is a table:

Column	Description
block	Independent genomic region (chrom:start-chrom:stop)
ens_gene_id	Ensembl gene ID
ens_tx_id	Ensembl transcript ID
mol_name	Name of the gene/linc/pseudogene
tissue	Tissue the original expression was measured in
ref_name	Name of the QTL reference panel
type	Type of molecular feature (gene, lncRNA, lincRNA, pseudogene)
chrom	Chromosome
tx_start	Transcription start site
tx_stop	Transcription stop site
block_genes	Number of genes in the region, used to set the prior probability that a gene is causal
inference_pop1	Inference procedure for model (e.g., LASSO, BSLMM)
inter_z_pop1	Intercept of the Z scores when regressing out average tagged pleiotropic associations; None if intercept = False
cv.R2_pop1	Cross-validation predictive R-squared
cv.R2.pval_pop1	P-value of the cross-validation R-squared
twas_z_pop1	Marginal TWAS Z score
pip_pop1	Marginal posterior inclusion probability
in_cred_set_pop1	Flag indicating whether or not model is included in the credible set
ldregion_pop1	LD regions from reference genome
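
To pull the high-confidence models out of this table, the rows can be filtered on pip_pop1. The following is a minimal sketch, not part of FOCUS: the function name high_pip and the default threshold of 0.5 are illustrative choices, and it assumes the table is tab-delimited with the header row described above.

```shell
#!/bin/sh
# Print the header plus every row whose pip_pop1 is at least a threshold.
# The column is located by its header name, so column order does not matter.
# Usage: high_pip TABLE.tsv [THRESHOLD]
# The default threshold of 0.5 is an arbitrary illustrative choice.
high_pip() {
    awk -F '\t' -v thr="${2:-0.5}" '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == "pip_pop1") col = i
            if (!col) { print "pip_pop1 column not found" > "/dev/stderr"; exit 1 }
            print
            next
        }
        ($col + 0) >= (thr + 0)
    ' "$1"
}
```

For example, high_pip RESULTS.tsv 0.9 keeps only models with a PIP of at least 0.9, where RESULTS.tsv stands for the table your run wrote.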

We recommend using reference LD from LDSC.

We recommend using a multi-tissue weight database that spans multiple eQTL reference panels, available here. It combines GTEx v7 weights from PrediXcan with METSIM, NTR, YFS, and CMC weights from the FUSION software into a single database usable by FOCUS.
