This repository accompanies the paper:
Millard, LAC, et al. Searching for the causal effects of BMI in over 300 000 individuals, using Mendelian randomization, bioRxiv, 2017.
I use the following software versions: R-3.3.1-ATLAS, Stata v14, Matlab-r2015a, and the PHESANT package v0.15.
For details of PHESANT see our IJE software profile.
The code uses environment variables that need to be set in your Linux environment. I set some permanently (those I use across projects) and some temporarily (those relevant only to this project).
I set the results directory and project data directory temporarily with:
export RES_DIR="${HOME}/2016-biobank-mr-phewas-bmi/results/sample500k"
export PROJECT_DATA="${HOME}/2016-biobank-mr-phewas-bmi/data/sample500"
I set the IEU shared UKB data directory and the PHESANT code directory (i.e. my path to the code from the PHESANT git repository) permanently, by adding the following to my ~/.bash_profile file:
export UKB_DATA="/path/to/ukb/data"
export PHESANT="/path/to/phesant/package"
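Before running any of the scripts, it can help to confirm that these variables are actually set. The following is a minimal sketch, not part of this repository: `check_env_vars` is a hypothetical helper name, and the list of variables is simply the four described above.

```shell
# Hypothetical helper (not part of this repository): report every named
# environment variable that is unset or empty.
check_env_vars() {
  missing=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "Missing environment variable: ${name}" >&2
      missing=1
    fi
  done
  # Return 0 if all variables are set, 1 otherwise.
  return "${missing}"
}

# Check the four variables described above.
check_env_vars RES_DIR PROJECT_DATA UKB_DATA PHESANT \
  || echo "Set the variables listed above and try again." >&2
```

The header-renaming commands later in this README also read `UKB_DATA_PHENO`, which is not set in the snippets above; if your environment uses it, add it to the list passed to the helper.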
The following commands rename the phenotypes in the column header of the phenotype file to the format required by the PHESANT package.
They add an 'x' to the start of each phenotype name and replace the '.' and '-' characters with '_'.
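datadir="${PROJECT_DATA}/phenotypes/derived/"
origdir="${UKB_DATA_PHENO}/_latest/UKBIOBANK_Phenotypes_App_16729/data/"
# Rewrite the header row: prefix every column name after the first with 'x', then replace '-' and '.' with '_'
head -n 1 "${origdir}data.21753.csv" | sed 's/,"/,"x/g' | sed 's/-/_/g' | sed 's/\./_/g' > "${datadir}data.21753-phesant_header.csv"
# Append the data rows (everything after the header) unchanged
awk '(NR>1) {print $0}' "${origdir}data.21753.csv" >> "${datadir}data.21753-phesant_header.csv"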
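To see what the renaming does, the same `sed` pipeline can be run on a small sample header. This is only an illustration: the three column names below are made up in the style of a UK Biobank CSV header and are not taken from the actual phenotype file.

```shell
# Hypothetical three-column header, for illustration only.
sample_header='"eid","31-0.0","21001-0.0"'

# Apply the same three sed substitutions as the commands above.
renamed_header=$(printf '%s\n' "${sample_header}" \
  | sed 's/,"/,"x/g' | sed 's/-/_/g' | sed 's/\./_/g')

echo "${renamed_header}"
# prints: "eid","x31_0_0","x21001_0_0"
```

Note that the first column is left without the 'x' prefix, because the first substitution only matches names that follow a comma.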
We perform a Mendelian randomization phenome-wide association study (MR-pheWAS) of BMI, using a BMI genetic score (on the full UKB sample).
There are 4 main steps:
- Data preprocessing: constructing a BMI genetic risk score. See the 1-BMI-genetic-score directory.
- Generating confounder files to use as covariates in analyses. See the 2-confounder-files directory.
- Running MR-pheWAS using PHESANT. See the 3-PHESANT directory.
- Follow-up analyses on nervousness/anxiety phenotypes. See the 4-follow-up directory.
The results directory has the following structure:
results-PHESANT-main-noCIs/
results-PHESANT-sensitivity-noCIs/
nervous-followup/
The data directory has the following structure and files:
bridging/
phenotypes/derived/
phenotypes/original/data.21753.csv
snps/derived/
qc/
participants-withdrawn.txt
The file participants-withdrawn.txt contains a list of IDs of participants who have withdrawn from the UK Biobank study.
The file data.21753.csv is our phenotype file downloaded from UK Biobank.
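The two directory layouts above can be created in one go. The following is a rough sketch, assuming the results layout sits directly under `RES_DIR` and the data layout directly under `PROJECT_DATA`; the fallback to a temporary directory is an addition here so that the snippet can be tried without touching real data.

```shell
# Use the configured directories, or temporary ones if the variables are unset
# (the fallback is for trying the sketch only, not for real analyses).
res_dir="${RES_DIR:-$(mktemp -d)}"
data_dir="${PROJECT_DATA:-$(mktemp -d)}"

# Results directory layout.
for sub in results-PHESANT-main-noCIs results-PHESANT-sensitivity-noCIs nervous-followup; do
  mkdir -p "${res_dir}/${sub}"
done

# Data directory layout.
for sub in bridging phenotypes/derived phenotypes/original snps/derived qc; do
  mkdir -p "${data_dir}/${sub}"
done

# The phenotype file must be supplied by you; this only reports whether it is in place.
pheno_file="${data_dir}/phenotypes/original/data.21753.csv"
if [ -f "${pheno_file}" ]; then
  echo "Found phenotype file: ${pheno_file}"
else
  echo "Phenotype file not found: ${pheno_file}" >&2
fi
```

The snippet only creates empty directories; the phenotype file and participants-withdrawn.txt still need to be obtained from UK Biobank and copied into place.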