Analyses of introgression with Dsuite

Milan Malinsky, Hannes Svardal, 29 January 2020

Background and Aim

Admixture between populations and hybridisation between closely related species are common, so a bifurcating tree is often insufficient to capture their evolutionary history. Patterson’s D, also known as the ABBA-BABA statistic, and related statistics are commonly used to assess evidence of gene flow between populations or closely related species in genomic datasets. They are based on examining patterns of allele sharing across populations or closely related species. In this exercise, we are going to use Dsuite, a software package that implements Patterson’s D and related statistics in a way that is straightforward to use and computationally efficient. It also implements some tools and statistics that are not available in any other software, for example the f_dM statistic introduced by Milan in a crater lake cichlid study and Hannes’ ‘f-branch’ approach from our Lake Malawi cichlid genomics paper. While exploring Dsuite, we are also going to learn or revise concepts related to the application, calculation, and interpretation of D and related statistics.
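To make the idea of allele sharing concrete, here is a minimal sketch of how Patterson’s D can be computed from derived-allele frequencies in a trio of populations (P1, P2, P3) plus an outgroup. It uses the standard frequency-based definition of the ABBA and BABA patterns. The function name `patterson_d` and the toy frequencies are hypothetical and chosen for illustration only; this is not Dsuite’s implementation, and it leaves out the block-jackknife significance test that a real analysis needs.

```python
def patterson_d(sites):
    """Return Patterson's D for a trio (P1, P2, P3) and an outgroup.

    `sites` is an iterable of (p1, p2, p3, p_out) tuples, each holding the
    derived-allele frequency at one biallelic SNP in the four populations.
    This is an illustrative sketch, not Dsuite's code.
    """
    abba = 0.0
    baba = 0.0
    for p1, p2, p3, p_out in sites:
        # ABBA: derived allele shared by P2 and P3, ancestral in P1 and outgroup
        abba += (1 - p1) * p2 * p3 * (1 - p_out)
        # BABA: derived allele shared by P1 and P3, ancestral in P2 and outgroup
        baba += p1 * (1 - p2) * p3 * (1 - p_out)
    if abba + baba == 0:
        raise ValueError("no informative sites: ABBA + BABA is zero")
    return (abba - baba) / (abba + baba)


# Made-up toy data: three sites support ABBA (one only partially), one supports BABA.
toy_sites = [
    (0.0, 1.0, 1.0, 0.0),
    (1.0, 0.0, 1.0, 0.0),
    (0.0, 1.0, 1.0, 0.0),
    (0.0, 0.5, 1.0, 0.0),
]
print(patterson_d(toy_sites))
```

Under this definition, D is expected to be close to zero when there is no gene flow, because incomplete lineage sorting produces ABBA and BABA patterns equally often. A positive value indicates excess allele sharing between P2 and P3, and a negative value indicates excess sharing between P1 and P3.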


  • Dsuite: latest version is already preinstalled on the Amazon instances


  • Simulated data: 20 populations, 5 gene flow events, 2 individuals per species
  • Lake Malawi cichlids: 73 closely related species, 135 individuals
  • Lake Tanganyika cichlid dataset from yesterday’s SNAPP activity

Table of contents

  1. Exploring a simulated dataset
  2. Finding adaptive introgression in Malawi cichlids
  3. Finding gene-flow in Tanganyika cichlids

1. Exploring a simulated dataset

We have prepared a Jupyter notebook that is going to guide you through the commands and output files encountered in a typical Dsuite analysis, while exploring a simulated dataset with 20 populations, 5 gene flow events, and 2 individuals per species. To start working with the Jupyter notebook:

  • Connect to your Amazon cloud instance (AMI).