Today was a marathon day of blogging for me (this is my 5th and last post of the evening)…attempting to catch up with all the sessions as well as the final faculty highlights for this year’s Workshop on Genomics. As I was furiously typing, Chris sat next to me pondering his own slides for the talk he is to give tomorrow, occasionally tossing me glances. When my fingers took a break he commented…
“Wow, I feel productive just sitting next to you with how much you are typing…”
Kind words indeed from a man who actually epitomizes productivity and collaboration to the nth degree! Degrees from Emory and Stanford, a postdoc at the Max Planck and another with Penn State/University of Helsinki. He is an associate editor for Evolution and BMC Evolutionary Biology; he’s published in highly recognized and respected journals including Molecular Ecology, BMC Genomics and PNAS; he has 9 collaborators ranging in expertise from genomics to chemical ecology; and he teaches at workshops…a gander at his picture on his website has me wondering: “What thoughts must be thought in that mind of his as he gazes out in wonder at the world?” His CV is impressive, and his good-naturedness and willingness to try a glass of absinthe, green fairies be damned, are admirable.
When I was first looking around at Chris Wheat’s work (because I routinely internet stalk people I am considering hitting up for information, research, interviews and whatnot), what featured most prominently on his site was butterflies…which makes sense given that his research deals primarily with butterflies.
So you have a very nice description of Why Butterflies on your site…did you just ‘know’ you were going to study butterflies or did a mentor point you to them as a good ‘model’ system? Is there a personal backstory to your butterfly focus or is it purely a great model of convenience for your work?
“Well as an undergrad I realized I like research, even though my first research project was on stream bacteria using allozymes to look at gene flow in stream bed sediment. My university (Emory) didn’t have much in the way of molecular ecology, so during my last year of college I was commuting 90 minutes twice a week to continue my project at the University of Georgia. While there the PI of the lab offhandedly suggested I attend the talk by the visiting professor. I got there a bit late, and made my way to the back of the room…and my life changed. Dr. Watt was presenting his work, smoothly integrating genetics + biochemistry + physiology + ecology + population genetics. I’d never seen that type of work before and it made all of my undergraduate classes in each of those subjects finally fall into a cohesive, unified view of the world. I was addicted.”
“The host of the talk invited those interested to have pizza with the speaker for lunch, so I did. Eventually I managed to get the courage to talk to Dr. Watt. He was very friendly and even though I was wearing a black leather rocker jacket and had somewhat unwashed hair halfway down my back, he told me to write him a letter about my research interest. I wrote him and eventually did my Ph.D. with him. So, butterflies were the model for the integrative work.”
So let’s jump into some of that model work via Chris’ paper: Wheat et al., 2011. Functional genomics of life history variation in a butterfly metapopulation. Molecular Ecology. 20: 1813. I am going to quote from the paper for part of the intro as I am not nearly so eloquent in describing their work…so I’ll let them do it.
“Genomic studies of wild populations have the potential to reveal the genetic basis of traits affecting fitness and may ultimately lead to a synthesis of population biology and genomics (Ellegren & Sheldon 2008). Historically, genetic variation affecting fitness in populations was expected to be uncommon due to fixation by selection (Fisher 1958), though opposing views were also expressed (Dobzhansky 1955; Lewontin 1974). Over time, research focus has shifted to understanding the dynamics that can maintain such variation (Wade & Goodnight 1998). Theoretical (Frank & Slatkin 2007) and empirical studies (Cain et al. 1990; Gibbs & Grant 2002) have shown that temporally varying selection due to changing environmental conditions may maintain genetic variation with large fitness effects (Bell 2010). Similarly, it is well known that selection may vary from one habitat type to another in a heterogeneous environment, thereby maintaining genetic variation at the landscape level (Levene 1953; Schaeffer 2008). Less well understood is what happens in fragmented landscapes in the absence of habitat differences but with a high rate of population turnover.”
- Can extinction–colonization dynamics lead to diversifying selection and maintain variation with large fitness effects?
- To examine the consequences of repeated local extinctions and re-establishment of new populations for the pattern of genetic variation with large fitness effects across a heterogeneous landscape.
- To apply functional genomics tools to further investigate how genetic variation interacts with metapopulation processes.
- To examine how gene expression and flight metabolic phenotypes vary (Pgi metabolic gene).
- To determine how allele frequencies are related to metapopulation dynamics.
- Hypothesis: gene expression phenotypes and alleles associated with large phenotypic effects on fecundity and dispersal become assorted according to population age by the metapopulation dynamics (Hanski et al. 2004).
What they Did:
- Compared gene expression in butterflies from new (habitat patches recently colonized by females that came from the metapopulation) and old populations
- Butterflies from the Glanville fritillary metapopulation from the Åland Islands in Finland
- Used 2nd generation butterflies reared in garden conditions on host plant Plantago lanceolata
- Multilocus single nucleotide polymorphism (SNP) genotyping
- Examined reproductive physiology and made rearing adjustments to minimize potential effects of circadian rhythmicity
- 454 sequencing to inform their microarray development (assembly v1.0)
- Measured peak metabolic rate (PMR) and CO2 emitted during 10 min of flight
- Developed a microarray printed with 14,251 probes at least in triplicate (~9000 unigenes).
- Obtained additional ~600,000 EST reads using 454 to form another assembly (assembly v2.0)
- They normalized their expression data
- Mixed model analysis (to determine expression differences associated with population age and to examine the effects of other factors independently from their association with population age).
- Gene set enrichment tests (using GO annotations) to detect weak but significant expression signals.
- Minimized redundancy in microarray probes and assigned IDs to probes (KEGG is used for ID assignment).
- Genotyping of Pgi and Sdhd genes
- Systematic scanning of nucleotide diversity for population age-related differences
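For those who have not met gene set enrichment before, here is a minimal sketch of its simplest flavor, an over-representation test based on the hypergeometric distribution. This is my own illustration and not necessarily the procedure used in the paper; the function name and every number except the ~9000 unigenes are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_in_set, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value, P(X >= n_overlap).

    n_universe: genes on the array (the background)
    n_in_set:   background genes carrying the annotation (e.g. one GO term)
    n_selected: genes flagged as differentially expressed
    n_overlap:  flagged genes that also carry the annotation
    """
    max_overlap = min(n_in_set, n_selected)
    total = comb(n_universe, n_selected)
    # Sum the probability of every overlap at least as large as the one observed.
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 9000 unigenes, 120 with the GO term,
# 300 flagged genes, 12 of which carry the term.
print(hypergeom_enrichment_p(9000, 120, 300, 12))
```

In a real analysis the p-values for all tested gene sets would also need a multiple-testing correction, which this sketch leaves out.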
What they Found:
- There was higher expression of larval serum protein genes and lipid transporters, and of Ace (which regulates oviposition), in new population (new pop) butterflies
- New pop females mobilize protein from fat reserves to provision developing eggs more rapidly than old population (old pop) females.
- New pop females had higher metabolic rate during flight
- Two protease inhibitors in the thorax were most strongly associated with population age differences (reduced expression in new pops).
- The genetic variation affecting physiological measures of flight performance is the same variation being sorted by the metapopulation dynamics.
- Physiological variation may stem from Pgi since both abdomen and thorax had gene expression phenotypes that varied with the Pgi genotype.
- Found an indel in Sdhd (succinate dehydrogenase d) that caused higher expression of chorion genes and carbohydrate metabolism genes in thorax…meaning individuals with this indel phenotype had higher flight endurance.
- Butterflies possessing both Pgi_111_AC (heterozygotes) and Sdhd D had the greatest flight endurance and they were more likely to disperse.
What does it all Mean??!!
- “Female offspring of the founder of new populations exhibit differences in life history traits related to dispersal and reproduction,” thereby facilitating colonization of extinct patches and potentially perpetuating the successful genotype/phenotype via their offspring. This positive relationship is unusual: typically flight and egg development would compete for protein in the body of the insect, but in this case there doesn’t appear to be a trade-off between dispersal and fecundity.
How genetic variation with a fitness advantage is maintained in the metapopulation is still a question that cannot be fully answered at this time. Spatially and/or temporally varying selection may play a role, as may heterozygote advantage in concert with other effects. The current data point strongly to heterozygote advantage.
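To see why heterozygote advantage can keep both alleles around, here is a minimal sketch of the textbook one-locus selection model. It is only an illustration: the selection coefficients are hypothetical values I picked, not estimates from the paper. With fitnesses 1 - s, 1 and 1 - t for the AA, Aa and aa genotypes, the frequency of allele A settles at t / (s + t) regardless of where it starts.

```python
def next_freq(p, s, t):
    """One generation of selection on allele A at frequency p.

    Genotype fitnesses (hypothetical): AA = 1 - s, Aa = 1, aa = 1 - t.
    """
    q = 1.0 - p
    w_AA, w_Aa, w_aa = 1.0 - s, 1.0, 1.0 - t
    # Mean fitness of the population under random mating.
    w_bar = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    # Frequency of A among the survivors.
    return p * (p * w_AA + q * w_Aa) / w_bar


def simulate(p0, s, t, generations):
    """Iterate the selection model from starting frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_freq(p, s, t)
    return p


s, t = 0.2, 0.1  # hypothetical selection against each homozygote
for p0 in (0.05, 0.9):
    print(p0, simulate(p0, s, t, 500), t / (s + t))
```

Both starting frequencies end up at the same intermediate value, which is the sense in which overdominance maintains variation instead of fixing one allele.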
“These results demonstrate that integrating functional genomics with population ecology is a powerful way to obtain mechanistic insights into life history ecology and evolution (Ronce & Olivieri 2004) and to identify new candidate genes affecting eco-evolutionary dynamics (Saccheri & Hanski 2006). Our findings have significance for conservation biology, because the life history traits we have studied affect metapopulation persistence in fragmented landscapes (Hanski & Ovaskainen 2000).”
What role do you think timing plays in extinction-colonization dynamics, in terms of time from extinction to colonization and do you think that would affect the eventual fitness outcome or ability of the metapopulation to maintain the adaptive variation? Would we expect the same outcome if recolonization occurred days/weeks versus months or years later?
“Space and time are really important factors in these issues. After colonization, when the cessation of gene flow and allelic loss starts, each generation is another cycle in the extinction vortex. It is only the dispersal of individuals that keeps some of these isolated populations ‘alive’ and that is really an issue of landscape.”
So you used 454 analysis to inform your microarray development and followed up with qPCR and genotyping analyses for genes that looked interesting in differentiating your old/new populations.
“Exactly, I view transcriptome and genome scale analyses as hypothesis generating tools that require additional study in other samples at other levels of biological organization. There are so many ways to end up chasing false positive conclusions that I view followup work as essential. I’m not so worried about p-values but rather…what is real vs. what is fiction.”
I was interested in your position on RNA-seq/ChIP-seq in the field at the moment; I’ve heard it is the ‘wild west’ of sequencing nowadays. Do you have any advice for those diving into the world of RNA-seq?
Do you favor microarrays?
“I favor the best tool for the hypothesis to be tested. I’m still happy to look at allozyme data if they are the right tool for the question. So, I’d say I favor the right tool, and if both microarray and RNA-seq could be the right tool, I’d favor the one that allowed more biological replication and robust analysis. In some cases that is using a microarray, for others, RNA-seq could be better.”
Do you suggest they do both microarray/RNA-seq to gain more confidence in their dataset (along with other follow-up confirmation work, i.e., qPCR)? Or do you think RNA/ChIP-seq have moved far enough that you don’t have to take it with such a boulder of salt?
“Well these are important issues. What comes to mind is this…in many cases science follows fads and is led by technological developments rather than hypothesis testing. I focus on non-model species, so my field is dominated by tech developments in the biomedical community and the tools that focus upon them. Ever since the advent of allozymes, through to DNA sequencing, then microarray, then 2nd generation sequencing (genome and RNA-seq), all these tech revolutions follow a similar pattern:
- Studies emerge using new technology with experimental designs that violate all the statistics you have ever been taught. But they are the first to publish, they are sexy and the cost argument wins the day.
- The field finally matures in terms of the sample sizes, hypothesis testing and experiment design (These are all different sides of the same cube).
- A new related field comes along, and as the tech innovation cycle starts again, those trying to move forward with the now ‘old’ but robust tool are in many cases criticized for not using the latest tool.
What bothers me about this cycle is that hypothesis should always come first, followed by rigorous experimental design. Microarray analysis is now a very mature field, where linear mixed models can be used for complex experimental designs to detect expression variation among several levels of interacting treatment groups, at expression levels qPCR could never detect. RNA-seq is awesome but its analysis power is at a very early stage compared to microarrays.
Currently only one statistical method exists for RNA-seq that can moderately handle multifactor experimental design complexity. But I don’t know if it does it correctly, since the field is too new and no simulation studies have been conducted.
In sum, I view RNA-seq or ChIP-seq, when used for expression analysis, as hypothesis generating tools that need further study and validation. One may be better than another for a given question…but they are just starting points. Following up on their insights at other levels of organization, like proteomics or metabolomics, is much more informative than using qPCR on the same samples.”
What do you feel is the most important consideration when conducting an expression study? Or set of considerations?
“Know your question/hypothesis. There is no substitute for this, and it’s not an easy thing to know unless you invest some time doing your homework. When we started, we wanted to know the gene involved in heritable dispersal differences. From there we decided that gene expression could take us there and then we had to make decisions as to what tissue and developmental stage to look at. Whether we should focus on the head, thorax, or abdomen was a large choice, and using whole bugs was a very new idea since expression variation was a strange average across different body regions. Ideally we would have used individual organs etc but we needed lots of biological replication to accurately capture our phenotype, so we needed to limit the time point and tissue we sampled.
I also think that we were terribly naive to use the transcriptome to try and find the genes affecting dispersal…since gene expression variation seen at 100’s of genes can be caused by a single mutation that itself doesn’t give rise to much expression variation at all. We got rather lucky, but that was because we hacked our system to test a very clear hypothesis.”
Do you think it is possible to conduct this sort of ecological/evolutionary dynamics study in a non-model system that does not have good or any reference genome, but perhaps is well characterized in terms of ecology and behavior, using modern sequencing techniques like RNA-seq with cross-validation such as qPCR? Or do you really need a tight well characterized system first? Example: A student was working on fireflies last year attempting to attach behavior and ecology to gene expression…however there was no reference genome.
“Well, when we started there were only microsatellites and some COI sequence for this species. So we did this study in a truly model ecological system which had no genomic resources. I focus upon developing approaches in non-model species and I think there is a lot of progress that can be made. Using RNA-seq coupled with pooled genomic sequencing could rapidly change any species into one where clear candidate genes could emerge.
Let me also add that I don’t think qPCR per se is the best way to validate global gene expression insights. Since I’m interested in knowing what is happening in wild populations, I prefer to validate expression observations at higher levels and in independent samples, by quantifying enzyme kinetics, splicing dynamics or assessing expression after the manipulation of factors that I think might be important.”
“This is the beauty and power of hypothesis testing.”
“After your RNA-seq study, you should get some biological signal of where to look next to chase down your phenotype of interest. This is exactly what we did in our next study, where we were able to show that mutation in hypoxia signalling is likely a huge factor affecting dispersal propensity (Marden et al., 2013).”
Has there been any evidence in other model or non-model organism systems of this extinction-colonization diversifying selection ‘dynamics’ that you might point me to as other examples to highlight?
“Here’s another article of ours: The genomic and physiological basis of life history variation in a butterfly metapopulation.
But for other systems that look in the wild at metapopulation dynamics effects on selection, there are not many…but there are nice examples of diversifying selection on dispersal variation. The most recent paper looks at Drosophila flight and the For gene.”
You had a link: Using microarrays to find new candidate genes for study in the wild, the paper I read seems to cover this…is this the best example or is there a newer study coming out soon?
“Well, I think our follow-up paper in Evolution is perhaps much more insightful.” http://onlinelibrary.wiley.com/doi/10.1111/evo.12004/abstract
So you are a biologist by training, did you teach yourself the computational side? Programming?
“Well, I taught myself. I was starting with genomic scale data in 2004-5, and I wanted to check if the core facility was doing things correctly. Initially I started finding ways to graph the data to see if it fit with my 1st principle expectations based upon my biology/genomic training. After realizing that core facilities generally make mistakes, I needed to keep digging into the data, and soon was doing bioinformatics.
The most effective way for me to learn Python and bash scripting was to have a bioinformatics master’s student sit next to me at my desk, while working on a summer project that I put them on. Basically, I watched them write code, took notes, copied certain lines of text, and before I knew it I was off and running!”
Do you have any advice for students in the same position in terms of how to approach augmenting their knowledge and where (stats, programming etc…)?
“I highly recommend this type of paired learning, as both parties greatly benefit.”
Finally, who was the defining force/mentor or if you prefer, defining moment in your life that set you on the scientific path you are on now?
“Honestly, the single most important person for my academic career was the guy who got me hooked on asking questions: Dr. Eisen at Emory. In my intro biology course he never gave us answers, but used Socratic-like guiding through well placed questioning.
I also have to say that I’ve had some significant challenges in my career and am quite lucky to be where I am today. I think that came from learning tough lessons from people who never knew they were teaching me. Currently, I collaborate with people that I really like and that makes my job incredibly precious. I suggest that people work towards creating and fostering such communities while not compromising the higher principles of academic life.”
Three words that you feel sum up your experience(s) with Butterflies or ecological genomics?
COLLABORATIVE DIALECTIC REWARDS
You study the Butterfly but is it your favorite organism as well?
“Well for work, yeah, butterflies are great. For fun, chasing them on horseback can’t be beat.”
You can find out more about Chris and his research on his website or the University Page (linked at the top of the post).
An amazing Thank You to Chris for answering all these questions and sharing his insights and experiences in research with us, it is an amazing ride and we look forward to your talk tomorrow!
Alright…my fellow bioinformatic minions, that rounds out our Faculty Highlights for this year. I hope you’ve enjoyed getting to know your faculty and I hope you are inspired by their research, their dedication, their unfailing sense of humor and good-naturedness toward all of our questions…
The Access Hollywood of Next Gen Sequencing and Genomics is now closed for the season…