Structurama is a program for inferring population structure from genetic data. The program assumes that the sampled loci are in linkage equilibrium and that the allele frequencies for each population are drawn from a Dirichlet probability distribution. It implements two models of population structure. First, it implements the method of Pritchard et al. (2000), in which the number of populations is fixed. Second, it allows the number of populations to be a random variable that follows a Dirichlet process prior (Pella and Masuda, 2006; Huelsenbeck and Andolfatto, 2007); under this prior, the program can estimate the number of populations. Markov chain Monte Carlo (MCMC) is used to approximate the posterior probability that individuals are assigned to specific populations. Structurama also allows individuals to be admixed. The program implements several methods for summarizing the results of a Bayesian MCMC analysis of population structure. Most notably, it finds the mean partition, a partitioning of individuals among populations that minimizes the squared distance to the sampled partitions.
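The idea of a mean partition can be illustrated with a minimal Python sketch. This is not Structurama's code. It makes two simplifying assumptions that may differ from what the program does: the distance between two partitions is taken to be the number of pairs of individuals that are grouped together in one partition but not in the other, and the search is restricted to the partitions that were actually sampled. The function names and the sample data are hypothetical.

```python
from itertools import combinations


def pair_disagreements(p, q):
    """Count pairs of individuals that are grouped together in one
    partition but not in the other (a stand-in distance, assumed here)."""
    return sum(
        (p[i] == p[j]) != (q[i] == q[j])
        for i, j in combinations(range(len(p)), 2)
    )


def mean_partition(samples):
    """Return the sampled partition with the smallest sum of squared
    distances to all sampled partitions."""
    def cost(candidate):
        return sum(pair_disagreements(candidate, s) ** 2 for s in samples)
    return min(samples, key=cost)


# Hypothetical MCMC output: each tuple holds the population label of
# five individuals in one sampled partition.
samples = [
    (0, 0, 1, 1, 1),
    (0, 0, 1, 1, 1),
    (1, 1, 0, 0, 0),  # same grouping as the first two, labels swapped
    (0, 0, 0, 1, 1),
    (0, 1, 2, 2, 2),
]

best = mean_partition(samples)
print(best)
```

Because the distance compares groupings rather than labels, the third sample counts as identical to the first two even though its population labels are swapped. This matters when summarizing MCMC output, where the labels themselves carry no meaning.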

Structurama web site


Huelsenbeck, J. P., and P. Andolfatto. 2007. Inference of population structure under a Dirichlet process model. Genetics. 175:1787-1802.

Huelsenbeck, J. P., P. Andolfatto, and E. T. Huelsenbeck. Submitted. Structurama: Bayesian inference of population structure. Bioinformatics.

Pella, J., and M. Masuda. 2006. The Gibbs and split-merge sampler for population mixture analysis from genetic data with incomplete baselines. Can. J. Fish. Aquat. Sci. 63:576-596.

Pritchard, J. K., M. Stephens, and P. Donnelly. 2000. Inference of population structure using multilocus genotype data. Genetics. 155:945-959.