Analyses of introgression with Dsuite
Milan Malinsky, Hannes Svardal, 29 January 2020
Background and Aim
Admixture between populations and hybridisation between closely related species are common, and a bifurcating tree is often insufficient to capture their evolutionary history (example). Patterson’s D, also known as the ABBA-BABA statistic, and related statistics are commonly used to assess evidence of gene flow between populations or closely related species in genomic datasets. They are based on examining patterns of allele sharing across populations or closely related species. In this exercise, we are going to use Dsuite, a software package that implements Patterson’s D and related statistics in a way that is straightforward to use and computationally efficient. It also implements some tools and statistics that are not available in any other software, for example the f_dM statistic introduced by Milan in this crater lake cichlid study and Hannes’ ‘f-branch’ approach from our Lake Malawi cichlid genomics paper. While exploring Dsuite, we are also going to learn or revise concepts related to the application, calculation, and interpretation of D and related statistics.
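To make the idea of "patterns of allele sharing" concrete, here is a minimal Python sketch of how Patterson's D can be computed for one trio of populations (P1, P2, P3) plus an outgroup, using per-site derived-allele frequencies. It is an illustration only and not Dsuite's own code: the function name `patterson_d` and the toy frequencies are made up for this example, and we assume that the derived allele (B) has already been polarised against the outgroup.

```python
def patterson_d(site_freqs):
    """Patterson's D for the trio ((P1, P2), P3) with an outgroup.

    site_freqs: iterable of (p1, p2, p3, p_out) tuples, one per biallelic
    site, each value being the derived-allele (B) frequency in that
    population. This is a hypothetical helper, not part of Dsuite.
    """
    abba = 0.0  # weight of sites where P2 and P3 share the derived allele
    baba = 0.0  # weight of sites where P1 and P3 share the derived allele
    for p1, p2, p3, p_out in site_freqs:
        abba += (1 - p1) * p2 * p3 * (1 - p_out)
        baba += p1 * (1 - p2) * p3 * (1 - p_out)
    if abba + baba == 0:
        raise ValueError("no informative ABBA or BABA sites")
    # D > 0: excess sharing between P2 and P3; D < 0: between P1 and P3
    return (abba - baba) / (abba + baba)


# Toy data (invented for illustration): four sites, outgroup fixed ancestral
toy_sites = [
    (0.0, 1.0, 1.0, 0.0),  # pure ABBA site
    (1.0, 0.0, 1.0, 0.0),  # pure BABA site
    (0.0, 1.0, 1.0, 0.0),  # pure ABBA site
    (0.0, 0.5, 1.0, 0.0),  # derived allele segregating in P2
]
d_value = patterson_d(toy_sites)
print(f"D = {d_value:.3f}")
```

Under the null hypothesis of no gene flow, ABBA and BABA patterns arise only through incomplete lineage sorting and are expected to be equally frequent, so D should be close to zero. A single D value is not a test on its own: its significance is usually assessed with a block-jackknife across the genome, which this sketch leaves out.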
Requirements
- Dsuite: latest version is already preinstalled on the Amazon instances
Datasets
- Simulated data: 20 populations, 5 gene flow events, 2 individuals per species
- Lake Malawi cichlids: 73 closely related species, 135 individuals
- Lake Tanganyika cichlid dataset from yesterday’s SNAPP activity
Table of contents
- Exploring a simulated dataset
- Finding adaptive introgression in Malawi cichlids
- Finding gene-flow in Tanganyika cichlids
1. Exploring a simulated dataset
We have prepared a Jupyter notebook that is going to guide you through the commands and output files encountered in a typical Dsuite analysis while exploring a simulated dataset with 20 populations, 5 gene flow events and 2 individuals per species. To start working with the Jupyter notebook:
- Connect to your Amazon cloud instance (AMI).
  - Either use Guacamole. In your web browser, go to the address
    http://ec2-XXX-XXX-XXX-XXX.compute-1.amazonaws.com:8080/guacamole
    where XXX-XXX-XXX-XXX is replaced by the Amazon instance IP address assigned to you. You can find that address on the web page.
    - username: popgen, password: same as always
  - Or use ssh from your terminal:
    ssh popgen@ec2-XXX-XXX-XXX-XXX.compute-1.amazonaws.com
    (replace XXX-XXX-XXX-XXX with your Amazon instance IP address)