Faculty Highlight: Bill Cresko

Bill Cresko
University of Oregon

Bill Cresko heads up a lab focused on ecological and evolutionary functional genomics. The lab is very well known for its work on Stickleback, a fish that had no data in GenBank when Bill started his work. Bill met me for breakfast and we had a great conversation covering his interests in Stickleback and Pipefish and his philosophy on working in a highly quantitative field.

Note: All his responses are paraphrased from our breakfast meeting.

Why Stickleback?

  • Yes, actually I really liked math as a kid and ended up majoring in Physics as an undergraduate at the University of Pennsylvania, and I really liked applying math to biology. I met Bob Paine, an ecologist at the University of Washington, and after hearing of my interests he pointed me in the direction of his graduate student who had just finished, Susan Foster. He thought it would be a good match, and Susan was working on Stickleback. It ended up being the perfect storm of events that led me to Susan’s lab; we clicked as a group and we got some good research going.

Alright then...Stickleback it is. I know several of you are very, very interested in using RAD-seq in your research, so we are going to visit the seminal work in the field that bore out RAD-seq and put it in the limelight as a great PopGen technique, one for which Julian, our very talented whisky-loving unix ninja, would later build the program STACKS.

Hohenlohe et al., 2010. Population Genomics of Parallel Adaptation in Threespine Stickleback using Sequenced RAD Tags. PLoS Genetics. 6:e1000862

 “Population genetics provides a rich and mathematically rigorous framework for understanding evolutionary processes in natural populations. This theory was built … by modeling the processes of selection, genetic drift, mutation and migration in spatially distributed populations. The field has concentrated primarily on the dynamics of one or a small number of genetic loci, largely because of methodological limitations. However, genes are not islands, but rather form part of a genomic community, integrated both by physical proximity on chromosomes and by various evolutionary processes. With technological advances, such as Next Generation Sequencing (NGS), the emerging field of population genomics now allows us to address evolutionary processes at a genomic scale in natural populations.”

The Stickleback (Gasterosteus aculeatus) is a small fish that lives in North America, Europe and Asia. They live in estuarine and freshwater environments and also in the ocean. Many freshwater Stickleback populations have been independently derived from their oceanic ancestors: deglaciation created freshwater habitats that were like ‘small watery’ islands, isolating the fish.

Interestingly, despite being geographically isolated, with little to no gene flow between them, populations in similar freshwater environments have evolved in parallel along the same phenotypic trajectory at local, regional and global scales.

Stickleback gives us a unique opportunity to examine the developmental genetic and genomic basis of rapid adaptation by comparing freshwater and oceanic populations. Having a draft reference genome for the Stickleback greatly assisted in population genomic analyses as well…so if you can find a reference genome, even a draft one, it will help a lot.

Through lab crosses they were able to identify nearly two dozen quantitative trait loci (QTLs) and found that in some cases parallel phenotypic evolution was due to parallel genetic changes. These changes occurred via fixation of alleles of the same genes from the standing genetic variation found in oceanic populations….though they couldn’t say for sure, as these alleles could’ve been the result of multiple or single mutational events.

“Are these instances of parallel evolution in individual loci representative of genome-wide patterns of parallel evolution in independently derived freshwater populations?”

Assumption: Because previous evidence suggests that these populations are quite young and are independently derived from oceanic populations with little/no gene flow, we expect most of the adaptive evolution in the freshwater habitats that we might find to be the result of selection on standing genetic variation present in the founding populations.

What They Did:

  • Collected Stickleback from five populations in Alaska: Rabbit Slough (oceanic), Resurrection Bay (oceanic), Bear Paw Lake (freshwater), Boot Lake (freshwater),  and Mud Lake (freshwater).
  • 100 individuals in total, 20 from each population
  • RAD tag libraries were created from genomic DNA digested with SbfI. Only barcodes that differed by at least 3 nucleotides were used.
  • Filtered the reads to remove poor-quality ones, then sorted and aligned them to the Stickleback genome using Bowtie.
  • Inferred genotypes using a maximum likelihood approach.
  • Calculated population genomic statistics: expected heterozygosity and Fst (adapted to account for unequal sample sizes) to estimate differentiation among populations. They also applied kernel smoothing, examined the allele frequency spectrum using Tajima’s D, and used bootstrap resampling to assess significance.
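
For anyone who wants to see what those statistics look like in practice, here is a minimal Python sketch of expected heterozygosity, a sample-size-weighted Fst for a single biallelic SNP, and Gaussian kernel smoothing along a chromosome. This is an illustration under simple assumptions, not the estimator or pipeline used in the paper: the function names, the weighting scheme and the `sigma` bandwidth are all my own placeholders.

```python
import math

def expected_heterozygosity(p):
    """Expected heterozygosity at a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)

def fst(freqs, sizes):
    """Sample-size-weighted Fst = (H_T - H_S) / H_T for one biallelic SNP.

    freqs: allele frequency in each population
    sizes: number of sampled alleles in each population (used as weights)
    This weighting is an illustrative assumption, not the paper's estimator.
    """
    total = sum(sizes)
    p_bar = sum(p * n for p, n in zip(freqs, sizes)) / total
    h_t = expected_heterozygosity(p_bar)
    if h_t == 0.0:
        return 0.0  # site is monomorphic across all populations
    h_s = sum(expected_heterozygosity(p) * n for p, n in zip(freqs, sizes)) / total
    return (h_t - h_s) / h_t

def kernel_smooth(positions, values, sigma):
    """Gaussian-kernel weighted average of a statistic at each SNP position.

    sigma is the kernel bandwidth in base pairs (a hypothetical value here).
    """
    smoothed = []
    for centre in positions:
        weights = [math.exp(-((pos - centre) ** 2) / (2.0 * sigma ** 2))
                   for pos in positions]
        smoothed.append(sum(w * v for w, v in zip(weights, values)) / sum(weights))
    return smoothed

# Toy example: one SNP sampled in an oceanic and a freshwater population.
print(fst([0.2, 0.8], [40, 40]))  # approximately 0.36
```

Smoothing is what turns noisy per-SNP Fst values into the genome-wide peaks and valleys you see in the paper's figures; a wider `sigma` gives smoother curves at the cost of resolution.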

What They Found:

  • 45,000 SNPs spread evenly throughout the genome.
  • Sequenced to depth of 5-10x in each individual
  • Significant variation within and across populations
  • Oceanic populations showed the least amount of differentiation between them; in contrast freshwater populations had quite a bit of differentiation between themselves and when compared to oceanic populations.
  • Linkage Groups (LG): LGII and LGV showed significantly low levels of diversity and heterozygosity
  • LGIII and LGXIII showed high diversity and heterozygosity
  • LGIII: genes implicated in the first line of defense against pathogens, inflammation pathway genes
  • LGXIII: genes involved in the innate immune response
  • No significant genomic differentiation between oceanic populations
  • Significant differentiation between oceanic and freshwater populations
  • The large majority of genomic regions of elevated Fst are shared across the three freshwater populations (see B,C,D in figure below) as compared to A or E,F

[Figure 6 from Hohenlohe et al., 2010 (journal.pgen.1000862.g006)]

  • The Tajima’s D statistic was negative in oceanic populations, which corresponded to peaks in freshwater/oceanic differentiation.
  • Private alleles were more frequent in oceanic populations relative to freshwater than vice versa.
  • Fst peaks were due to alleles not found in oceanic populations.
  • There were exceptions to the pattern: on linkage groups (LG) LGI and LGXI, private allele density doesn’t differ significantly between freshwater and the genome-wide average.
  • Private allele density in the ocean relative to freshwater is significantly higher.
  • Diversity also elevated in oceanic populations most likely due to the environment being more permissive to multiple haplotypes whereas the freshwater environment allows only a subset to have high fitness.
  • There was a lot of correspondence with past QTL studies using microsatellite markers.
  • Identified genes of adaptive significance: Ectodysplasin A (Eda), implicated in loss of lateral plate; osteogenesis genes (skeletal character)…candidates include genes involved in patterning and homeostasis of skeletal traits, osmotic stress, development of osmoregulatory organs and others with pleiotropic roles…again in skeletogenesis and osmoregulation.
  • LGVII and IV appear to be important in differentiation of freshwater stickleback (this region again, holding skeletal candidate genes).
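
Since private alleles come up a lot in these results, here is a minimal Python sketch of what counting them means: an allele is private to one group if it is observed there and not in the other. The function name and the toy data are hypothetical, and the paper works with densities along the genome rather than the raw counts shown here.

```python
def count_private_alleles(sites):
    """Count alleles seen in only one of two groups.

    sites: iterable of (alleles_a, alleles_b) pairs, one pair per SNP,
           where each element is the set of alleles observed in that group.
    Returns (private_to_a, private_to_b).
    """
    private_a = 0
    private_b = 0
    for alleles_a, alleles_b in sites:
        private_a += len(alleles_a - alleles_b)  # seen in a, absent from b
        private_b += len(alleles_b - alleles_a)  # seen in b, absent from a
    return private_a, private_b

# Toy data: (oceanic alleles, freshwater alleles) at four SNPs.
sites = [
    ({"A", "G"}, {"A"}),        # G is private to oceanic
    ({"C", "T"}, {"C", "T"}),   # both alleles shared
    ({"G"}, {"G", "T"}),        # T is private to freshwater
    ({"A", "C"}, {"C"}),        # A is private to oceanic
]
oceanic_private, freshwater_private = count_private_alleles(sites)
print(oceanic_private, freshwater_private)  # prints: 2 1
```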

What does it all Mean??!!

  • Oceanic Stickleback populations have few barriers to dispersal, large amounts of gene flow and little population genetic subdivision.
  • The freshwater populations, despite their younger age, are more divergent both from the oceanic ancestral populations and from each other, consistent with our supposition that they represent independent colonizations from the ancestral oceanic population.
  • RAD tags are a tool that can be used for population genetics studies in organisms that do not yet have a sequenced genome (the reads could instead be assembled de novo).
  • Large populations of oceanic stickleback have given rise repeatedly to freshwater populations which then became phenotypically differentiated on a background of minor neutral population divergence.

“Are these instances of parallel evolution in individual loci representative of genome-wide patterns of parallel evolution in independently derived freshwater populations?”

  • …the previously identified parallel genetic basis for the loss of armor traits in stickleback appears to be a general rule across the genome, in that much of the adaptation of stickleback populations to freshwater conditions likely involves the repeated use of the same repertoire of developmental and physiological systems, genes, and perhaps even alleles. However, the details of this parallel evolution – for example, whether it involves independent fixation of alleles that are identical by descent in multiple derived populations, or fixation of different alleles at the same locus – appear to differ in different parts of the genome

Final Thoughts:

  • This was the first whole-genome analysis of threespine stickleback where high-density SNP markers show signatures of selection in natural populations.
  • All current work has confirmed and extended past work (QTL/microsatellite)
  • This work revealed genomic regions important in the transition from oceanic to freshwater habitats.

“…parallel phenotypic evolution in stickleback may be underlain by extensive, genome-wide, parallel genetic evolution”

…and that’s pretty neat. I suggest you read the full paper, which has many more details on the methods and reasoning underlying their conclusions, and it’s just an overall good read.

How did you get the idea for RAD-seq?

  • We had a faculty seminar series and in one of these we were musing about using SNP chips, and I was wondering if I needed to throw all my financial eggs into one basket and do a SNP chip for the stickleback…this was around 2006. So Eric Johnson and I sat down and hammered out the details of a microarray and, along with two very talented undergrads, Mike Miller and Joe Dunham, just ‘made it work’. Oddly enough, when Illumina sequencing came out we contacted them to see if we could get their adapter sequences, so we could see if they were modifiable for our sticky-ends approach, and Illumina wouldn’t tell us. So we bought some, TOPO cloned it, sequenced it, then went back to them saying: we have the sequence, here it is…can we modify it for our purposes? They said yes. We successfully ran the Illumina run and got 6 million 35 bp sequences and were like…now what? The loci were random around the genome and we just decided to take a look and see what the data would tell us…

Mel interjects –So what then is your feeling on discovery versus hypothesis driven science given we’ve been attempting to drive the idea of more hypothesis driven science in NGS at this workshop in many presentations?

  • There is a role for discovery based science in getting a handle on a system but that’s not enough…eventually hypotheses have to be addressed.

So discovery based techniques do not absolve you of creating/defining and testing a hypothesis…

  • Next generation sequencing is still an experimental system and we have to treat it as such. We need to think about all the same things as we do in the wet lab…number of samples, number of replicates, a hypothesis we are trying to test…we have to design a next generation sequencing experiment as we would design a wet lab protocol. We troubleshoot, we try different parameters, ‘levels of enzyme’ if you will, we try a variety of assays and we use the information to better inform future experimental design and hypotheses.

What is your philosophy on bioinformatics training and interdisciplinary research?

  • We really need to integrate raw statistical training for bioinformaticists. In the end, everyone needs help doing their research and ‘raising’ their kids; that’s what mentorship and training are for. But that doesn’t absolve us from really getting into our research and knowing what is going on in our system at all levels…for biologists that means getting into the programming, the math, the statistics…for programmers and mathematicians it means getting into the biology, the evolution; it’s the only way to really understand your research and your system.

“Don’t be beholden to the black box…”

Python or Perl?

  • Python and R

awww….poor Perl, she’s really taking a beating at this workshop; though she still has a champion in Mike Zody!

Who was a great influence in your development and continued career as a scientist?

  • Susan Foster definitely! She was a great mentor and was strong in the face of great criticisms at times (mid 90’s/early 2000’s)…it wasn’t that easy to be a woman in science
  • My PostDoc advisor: John Postlethwait and mentor Chuck Kimmel at University of Oregon, both of whom have very humbling CVs! They really developed Zebrafish as a model organism and paved the way for evolutionary developmental biology and interrogating how gene families evolve.

Favorite organism?

  • Aside from our new puppy? Probably the Leafy Sea Dragon. How could you not love this ridiculous animal? You just look at it and wonder, “How the hell did that happen!?”

So what’s in store for the future?

  • We’d really like to dig into the Pipefish genome and develop it into a model organism, there’s so much novelty to discover and test.

So…I’m interested in microbial/viral ecology and I need to ask…Do Stickleback get sick?

  • Huh…bacterial or viral I don’t know but they sure do carry a lot of parasites.

Three words to sum up your field… ecological and evolutionary functional genomics

  • “Well, that’s cool.”

Bill has his lab website, of course, where you can learn all about the Stickleback, the Pipefish and all manner of ridiculously awesome organisms that they are studying, in addition to the field of ecological and evolutionary functional genomics. And he’s a pretty nifty guy who loves to talk science. He is also involved in some bacterial work via a P50 grant that he obtained to do microbiome studies on germ-free Stickleback…remember the gnotobiotic mice? Well, gnotobiotic stickleback are cheaper, easier to maintain and really cool to have around in the corner of the lab to subject to all manner of biological material!

Thanks Bill!

[Image: seadragon]
How the hell did that happen? ~Bill Cresko, Workshop on Genomics 2014

Cheers…Dr. Mel