MSMC Analysis

We used msmc2 to estimate changes in effective population size over time from deeply sequenced genomes. Two such genomes were available, one from Magnetic Island (MI-1-4) and one from Fitzroy Island (FI-1-3). Genome-wide read coverage for these was assessed using the bedtools genomecov utility and is plotted below. Both genomes had a peak coverage depth of slightly less than 20x, with a long tail of higher-coverage regions.
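As a rough illustration of how the peak depth can be read from coverage output, the sketch below finds the modal depth in the default histogram format of `bedtools genomecov` (tab-separated columns: sequence name or `genome`, depth, number of bases at that depth, sequence length, fraction). The function name `peak_depth`, the `min_depth` parameter, and the toy numbers are assumptions made for illustration; they are not taken from 04_genomecov.sh.

```python
def peak_depth(genomecov_lines, min_depth=1):
    """Return the depth with the most bases in the genome-wide histogram.

    Only rows whose first column is "genome" (the whole-genome summary
    written by `bedtools genomecov`) are used. Depths below `min_depth`
    are skipped so that uncovered bases do not count as the peak.
    """
    best_depth, best_count = None, -1
    for line in genomecov_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5 or fields[0] != "genome":
            continue
        depth, n_bases = int(fields[1]), int(fields[2])
        if depth >= min_depth and n_bases > best_count:
            best_depth, best_count = depth, n_bases
    return best_depth


# Toy histogram (hypothetical values, not the real coverage data).
toy_histogram = [
    "contig_1\t19\t400\t5000\t0.08",
    "genome\t0\t500\t10000\t0.05",
    "genome\t18\t900\t10000\t0.09",
    "genome\t19\t1200\t10000\t0.12",
    "genome\t20\t1000\t10000\t0.10",
    "genome\t60\t50\t10000\t0.005",
]
print(peak_depth(toy_histogram))
```

Skipping depth 0 is a design choice for this sketch: in a fragmented assembly the uncovered class can be the largest bin, which would hide the true coverage peak.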

Data were therefore prepared for msmc analysis as follows:

  • The genome was masked using the SNPable suite of utilities. See 02_snpable.sh for details.
  • Only contigs larger than 1 Mb were included.
  • A mappability mask was generated using makeMappabilityMask.py from msmc-tools.
  • SNPs were called using the bamCaller.py script from msmc-tools, assuming a mean genome coverage of 20x. See 04_genomecov.sh for details.
  • Inputs for a single run were generated with the script generate_multihetsep.py from msmc-tools.
  • Inputs for bootstraps were generated using the script multihetsep_bootstrap.py from msmc-tools. 100 bootstraps were generated by taking 20 random chunks (per chromosome) of size 500 kb and assembling these into 20 "chromosomes".
  • The msmc2 program was run on each bootstrap using options appropriate for a single diploid sample (see script 07_bootstrap.sh).
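The bootstrap step above is a block bootstrap over genomic chunks. The sketch below shows the resampling idea only: it draws 500 kb chunks at random (with replacement) and groups them into 20 artificial chromosomes of 20 chunks each. It is not the implementation in multihetsep_bootstrap.py; the function name, its parameters, and the contig lengths are hypothetical, and sampling with replacement is an assumption.

```python
import random


def bootstrap_chromosomes(contig_lengths, chunk_size=500_000,
                          chunks_per_chrom=20, n_chroms=20, seed=None):
    """Return one bootstrap replicate as a list of artificial chromosomes.

    Each artificial chromosome is a list of (contig, start, end) chunks
    drawn with replacement from all whole chunks in the input contigs.
    """
    rng = random.Random(seed)
    # Enumerate every complete, non-overlapping chunk on every contig.
    chunks = [
        (name, start, start + chunk_size)
        for name, length in contig_lengths.items()
        for start in range(0, length - chunk_size + 1, chunk_size)
    ]
    if not chunks:
        raise ValueError("no contig is long enough to hold one chunk")
    return [
        [rng.choice(chunks) for _ in range(chunks_per_chrom)]
        for _ in range(n_chroms)
    ]


# Hypothetical contig lengths, used only to exercise the function.
toy_contigs = {"contig_1": 3_000_000, "contig_2": 1_200_000}
replicate = bootstrap_chromosomes(toy_contigs, seed=1)
print(len(replicate), len(replicate[0]))
```

Repeating the call with 100 different seeds would give 100 replicates, matching the number of bootstraps described above.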

To convert msmc outputs into a demographic history, we assumed a mutation rate of 1.86e-08 (see 02_mutation_rates.md) and a generation time of 5 years. The inferred population history is shown below.
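The conversion from msmc2 output to real units can be sketched as follows. msmc2 reports scaled time boundaries and a scaled coalescence rate `lambda` per time segment; the usual scaling is years = left_time_boundary / mu × generation time, and Ne = 1 / (2 × mu × lambda). The sketch assumes the mutation rate is per site per generation and that the input has the four-column layout of an msmc2 `.final.txt` file; the function name and the toy values are illustrative only.

```python
def scale_msmc2(final_txt_lines, mu=1.86e-8, gen_time=5.0):
    """Convert msmc2 output rows to (years_ago, effective_population_size).

    Expects a header line followed by whitespace-separated columns:
    time_index, left_time_boundary, right_time_boundary, lambda.
    `mu` is assumed to be the per-site, per-generation mutation rate.
    """
    scaled = []
    for line in final_txt_lines[1:]:          # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        left_boundary = float(fields[1])
        lam = float(fields[3])
        years = left_boundary / mu * gen_time  # scaled time -> years
        ne = 1.0 / (2.0 * mu * lam)            # coalescence rate -> Ne
        scaled.append((years, ne))
    return scaled


# Toy input (hypothetical values, not real msmc2 results).
toy_output = [
    "time_index\tleft_time_boundary\tright_time_boundary\tlambda",
    "0\t0\t1.86e-05\t2000",
    "1\t1.86e-05\t4.0e-05\t1000",
]
for years, ne in scale_msmc2(toy_output):
    print(round(years), round(ne))
```

Using the left boundary of each segment gives a step function when plotted, which is the conventional way to display these estimates.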

With Climate Data

We used climate data from Bintanja and van de Wal (2008) to provide an estimate of globally averaged sea level. Local factors are not taken into account, but these generally account for differences on the order of ~1 m, which is roughly two orders of magnitude smaller than the largest changes shown.

Finally, we write out the bootstrap-averaged data in msmc format for later use in simulating data under these demographic histories (see 05_sf2_thresholds.md).