Prologue
It began – as in so many stories – as an unlikely sequence of not really connected events: a sort of accident. In my case: climate and cycads.
It was 1999, and I had just handed in my Diplom thesis in geology-palaeontology (considered equivalent to an M.Sc., but usually substantially more work than that and hence taking longer to complete), entitled "Phylogenie der Cycadales" (phylogeny of the cycads: Grimm 1999; on my homepage, you'll find an HTML version [abstract in English] that I hand-coded more than a decade ago; the footnotes no longer work).
|The "coil" (a doodled phylogenetic tree), summarising the results of my Diplom-thesis|
|The "best tree" found for my morphological data matrix.|
Mainly, I used the literature to add some fossils to an earlier morphological matrix and, because of my background in genetics, I tried to get some genetic data. In a joint effort, mainly thanks to the magic hands of my later (and now unfortunately deceased) technician, Karin Stögerer – also known as "Karin die Hexe" (the Witch), because she worked miracles with any material – we succeeded in getting two handfuls of cycad sequences covering the variable D4 region of the nuclear-encoded 25S rRNA gene, which we added to some available in gene banks to get a molecular tree. Naturally, the molecular and the morphological tree were quite incongruent. This would become the motto of my entire professional scientific career: there is always an alternative (tree).
|The results of my "cladistic" analysis of the molecular data|
Act 1. And the Story of the Beech (the bitchy beast) commences
Our first meeting was a truly Livingstonian one. Someone had told Thomas that his name appeared among the authors of a poster on beech. Curious why, he came up to me as I stood at my poster (back then, poster presentations had their own dedicated slot, so a lot of people would go around and compliment and encourage the youngest researchers like me for their work), smiling:
"Hello, my name is Thomas Denk."
And I answered: "You are that Thomas Denk?!"
As it turned out, after he sent his material to Germany – he had collected beeches across their entire range in the wild for morphological studies (Denk 1999a, b, c) – he hadn't heard anything about any genetic results. And here was a guy he had never heard of, showing a molecular tree based on his material.
One reason for this was published in our first paper on beeches (Denk et al. 2002).
Evolution, in particular the closer we get to the leaves of the Tree of Life, is not treelike, but involves reticulation. Today and in the past. And we had sequenced the so-called "ITS region", a non-coding but transcribed part of the nuclear-encoded 35S rDNA cistron: a gene region that encodes three of the four nuclear ribosomal RNAs, essential for all living organisms. It is inherited from both father (paternal) and mother (maternal) and occurs in so-called tandem repeats (adding up to the thousands per genome) in one or several so-called "Nucleolus Organizer Regions" (you find some explanatory figures in e.g. Grimm et al. 2005), genetic particulars at the end of certain chromosomes. A widely cited paper in 1993 or 1994 stated that a process called "concerted evolution" usually homogenises the many repeats across the genome, which was partly a lie already then. In many plants they are homogenised, but in as many they are not, and sometimes you can find evidence of even ancient hybridisation. For beeches, we were lucky to have opened a botanical Pandora's Box. And this being a scientific snake pit, people usually skipped organisms showing things similar to what we had in the beech trees.
For our first beech phylogeny paper, we got two short, half-page-long reviews by anonymous (male) experts in the field, both suggesting that the paper should be rejected. The one said our tree conflicts with earlier ones (which used a dozen accessions, all from botanical gardens) and hence must be wrong. The other, a palaeobotanist, said that since we don't show a parsimony tree, which should be "standard" (the parsimonists had lost the so-called "phylogenetic wars" about five years earlier, but news travels more slowly in palaeontology than in neontology and in botany than in zoology, and he was both), all our results are naught.
Normally, the paper would not have been published, but our editor, a woman, recruited a third anonymous reviewer (who may have been a woman, too), who wrote a five-page-long report pointing out all our many actual errors (mostly terminology; we had just followed what was published in the phylogenetic literature, a bad guideline) but concluding that the results alone would be worth publishing and that the paper was a must-publish for the journal. A close call, even though we had submitted our paper to a respectable but low-impact journal (it has had quite an impact despite this).
Act 2. Grasping complexity of molecular evolution – a first phylogenetic framework for the beech
The beech trees and the maples would become the topic of my Ph.D. thesis: Tracing the Mode and Speed of Intrageneric Evolution (Grimm 2003, online 2005, open access), a work that – to my knowledge – only two other persons read completely (neither a native speaker; it should e.g. be "supported" whenever I wrote "sustained"): my second supervisor, the geneticist (a woman), and another woman (we see a pattern here, right?), who assisted the professor officially writing the external review. Back then, German Ph.D.s involved thesis reports by your official supervisor and another faculty member, and there was a 1-hour private exam joined by the dean in the last five or ten minutes (in my case, we had a good time; I was asked some questions and then we essentially discussed future projects). I needed an external third reviewer, because #1 and #2 decided I should get a summa cum laude. (A German/continental thing, we like to grade; in the old times a "Summa", the best-possible grade, meant a sure career, but like all things, it lost its value.)
We compiled more ITS data reflecting intra-individual and intra- and inter-specific variation. I spent a lot of time looking at my alignment and started seeing the recurring mutational patterns. In the maples it was like reading a book; but in beech, most genetic differentiation came not from fixing a mutation at a site, but from gaining or losing an intragenomic (intra-individual) site variation. Something we much later called "twisps" (2ISP; Potts et al. 2014).
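The gain or loss of intra-individual site variation described above can be illustrated with a minimal sketch. It collapses several aligned ITS clones of one individual into a single sequence, writing each intra-individual polymorphic site as a standard IUPAC ambiguity code. The clone sequences and the function names are invented for illustration, and this is not the coding scheme of Potts et al. (2014), just a way of seeing what such a site looks like in data.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases observed.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def observed_bases(column):
    """Return the set of unambiguous bases seen in one alignment column."""
    return frozenset(b.upper() for b in column if b.upper() in "ACGT")


def consensus_with_2isps(clones):
    """Collapse aligned clones of ONE individual into one sequence.

    Sites where the clones disagree are written as IUPAC ambiguity codes.
    Simplification (an assumption of this sketch): gaps are ignored unless
    all clones have a gap at that site.
    """
    if not clones:
        raise ValueError("need at least one clone")
    if any(len(c) != len(clones[0]) for c in clones):
        raise ValueError("clones must be aligned to equal length")
    out = []
    for column in zip(*clones):
        bases = observed_bases(column)
        out.append(IUPAC[bases] if bases else "-")
    return "".join(out)


def polymorphic_sites(clones):
    """Zero-based positions at which the clones of one individual differ."""
    return [i for i, col in enumerate(zip(*clones)) if len(observed_bases(col)) > 1]


# Hypothetical clones: individual 1 carries two ITS variants, individual 2 only one.
individual_1 = ["ACGTAC", "ACGTAC", "ATGTAC"]
individual_2 = ["ACGTAC", "ACGTAC"]

print(consensus_with_2isps(individual_1))  # AYGTAC
print(consensus_with_2isps(individual_2))  # ACGTAC
print(polymorphic_sites(individual_1))     # [1]
```

In this toy case the two individuals do not differ by a fixed substitution: one simply carries a C/T polymorphism (Y) where the other has only C. Treating Y as "missing" or as "any base" would throw away exactly the signal that differentiates them.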
Thanks to peer-review confidentiality, our first maple paper surfaced three years after my Ph.D. (Grimm et al. 2006). Our next beech paper (Denk, Grimm & Hemleben 2005), with more data and a better coverage of the other species, went smoothly (after a fashion) through the Forest of Reviews. We used Thomas' earlier morphological matrix (Denk 2003), added a few fossils, and compared it 1:1 with the molecular differentiation. To make the reviewers (and readers) aware of what we are looking at, I graphed some of the mutational patterns.
|Selected differentiation patterns in the ITS of beech (figs 2–4 in Denk et al. 2005). Within species lineages, consistent trends can be observed, or increased variation including a shared type. This is how gene pools evolve in a non-trivial case.|
The shadow-beasts in the Forest of Review growled again: one negative ("reject") review, and no other reviewer would respond to the editor's request to review the paper. So, the editor read the paper himself, told us to pretty much ignore the first review and provided a second, stating (in response to the reviewer's opinion) that he thinks "also negative results are worth publishing". Some editors do a great job, even behind the Impermeable Fog. In our case, the result was the only – and still pretty unchallenged – phylogenetic hypotheses available for the beech trees.
|A summary of various phylogenetic inferences for beech trees.|
We soon realised beech species are in permanent flux. With reticulation being not uncommon, the situation cannot be directly modelled using the standard one-dimensional stick graphs we know as phylogenetic trees. Our study objects obviously had it all: cross-species messing, population-dynamic effects, incongruent genealogies, "paralogy" (in a phylogenetic sense, meaning that variants of the same molecule prefer different trees, not to be confused with genetic paralogues, which are the opposite of orthologues) etc., and we had little reason to assume that it was completely different in the past. We had to deal with the possibility that the beech may speciate and then fuse again when getting into contact, and that this is the reason for the complex ITS differentiation. And, working not with extremely rare individual skeletons but with widespread, common macro- and microfossils, we had to expect that at least some of those fossils are the (direct) ancestors of the modern species.
Welcome to the Dark Side of Phylogenetics. The dark realm, in which the much-cherished bright sunlight of cladistics – you just infer a one-dimensional stick graph, and this tells you all you need to know about evolution – diminishes into an ephemeral line at the horizon.
Act 3. Stepping out of the comfort zone into the past
Despite the genetic chaos (to some degree), with a little training anyone can identify the few generally accepted species in nature (good species in accordance with Mallet's object-orientated solution to the species question). With more training and insight you can even realise that we have three or four contemporary species in western Eurasia, and not one (Denk 1999b,c; Gömöry & Paule 2010). However, we never found the time and leisure to formalise this.
Thus, one could make a morphology-based tree that made sense at the time (Denk 2003), but linking that tree with the genetics of the genus (Denk et al. 2005) did not make things easier. Especially when we started to add fossils, including some that may be ancestors of others (and of modern-day species). Something the classical trees (and Farrisian cladistics) simply cannot handle (we even tried explicit biogeographic inferences, but their results were completely for the bin). We moved on to networks and exploratory data analysis (EDA; Denk & Grimm 2009).
EDA is something I had never heard anything about before I learned about the Genealogical World of Phylogenetic Networks (GWoN) and met David Morrison about five years later, when he came to visit me to see with his own eyes the little heretic sitting at Farris' own institution. But even ignorant of what EDA means, I had been doing it since my Diplom thesis. A bit late (and too late for me to stay in professional science), but it was good to know I was not travelling a totally obscure path, but one that had been taken by others a good time before me.
Networks are a natural choice for morphological data, and data sets including fossils (see e.g. posts on GWoN). When I was still in Tübingen, a fellow mycologist, Markus Göker, pointed me to a paper in the Journal of Theoretical Biology by Spencer et al. (2004). Later, we teamed up for some pretty heretical papers: one providing the first individual-based genetic network of beeches and a solution to the ITS variation problem by using a “host-associate” framework (Göker & Grimm 2008); another providing an alignment-free way to define cryptic species in planktonic foraminifers (Göker et al. 2010). You know, things mycologists and geno-geologists typically team up for.
Spencer et al. found that the Neighbour-net (Bryant & Moulton 2002, 2004), a planar (2-dimensional) distance-based (meta)phylogenetic graph, outperforms all tree-based inference methods when it comes to actual ancestor-descendant relationships. Due to my analysis of genetic data in beeches (and maples, and then oaks; Denk & Grimm 2010) and the sheer coincidence that Daniel Huson, the programmer of SplitsTree, got a professorship in Tübingen, sitting in the same building as the husband of my boss, I knew about the potential of consensus networks to visualise internal data conflict and topological alternatives. Just from the spatio-temporal distribution of beech fossils (map above), it is pretty obvious that some fossil morphotypes represent precursors (and likely direct ancestors) of one or more modern-day species, while others are similar but harder to place. Neighbour-nets and "bipartition networks" (Grimm et al. 2006), better labelled support consensus networks (Schliep et al. 2017), seemed a natural choice.
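Since a Neighbour-net starts from a distance matrix, it may help to see how such distances can be obtained from a morphological matrix that includes incompletely scored fossils. The following is a minimal sketch of a mean character difference, with missing data skipped and polymorphic scorings treated as sets of states. The taxon names, the scorings and the function names are all hypothetical, and the distance measure is one simple option among several, not necessarily the one used in Denk & Grimm (2009).

```python
from itertools import combinations


def state_set(code):
    """Turn a scoring such as '0', '1' or '01' (polymorphic) into a set of states.

    '?' marks a missing observation and is returned as None.
    """
    return None if code == "?" else frozenset(code)


def mean_character_distance(row_a, row_b):
    """Share of comparable characters in which two taxa differ.

    A character is comparable if it is scored in both taxa; the taxa are
    counted as different only if their state sets do not overlap.
    Returns None when no character is comparable.
    """
    compared = differing = 0
    for code_a, code_b in zip(row_a, row_b):
        a, b = state_set(code_a), state_set(code_b)
        if a is None or b is None:
            continue  # skip characters missing in either taxon
        compared += 1
        if not (a & b):
            differing += 1
    return differing / compared if compared else None


# Hypothetical matrix: two modern species and one incompletely preserved fossil.
matrix = {
    "modern_A": ["0", "1", "1", "0", "01"],
    "modern_B": ["1", "1", "0", "0", "1"],
    "fossil_X": ["0", "1", "?", "?", "0"],
}

distances = {
    (a, b): mean_character_distance(matrix[a], matrix[b])
    for a, b in combinations(matrix, 2)
}
for pair, d in distances.items():
    print(pair, d)
```

In this toy matrix the fossil has a distance of zero to one of the modern species in all characters that can be compared. A tree has to place such a fossil as a sister tip, whereas a distance-based network can place it at, or very near, the modern species, which is closer to what a potential ancestor should look like.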
The 2009 paper remains a pretty unique piece to this day, a decade later, as it used distance-based, morphology-based planar (meta)phylogenetic networks (above), support consensus networks based on parsimony bootstrap replicates and Bayesian-inferred tree samples (example below), and last but not least "distance suns" and other simple graphs visualising morphological change.
|Four graphs visualising diversity and potential relationships between fossil and modern beech species. Knowing all aspects of your data can be the key for putting up a holistic evolutionary framework.|
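What a support consensus network summarises can be sketched in a few lines: the frequency of every bipartition (split) in a sample of trees, such as bootstrap replicates or a Bayesian tree sample, including splits that contradict each other. The sketch below uses nested tuples as a stand-in for real tree files, and the five taxa, the tree sample and the function names are invented for illustration. It only counts splits; drawing the network would be left to a program such as SplitsTree.

```python
from collections import Counter


def leaves(node):
    """Set of taxon names below a node (a taxon name or a tuple of child nodes)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def splits(tree):
    """Non-trivial bipartitions of a tree.

    Each split is stored as the side that does NOT contain a fixed reference
    taxon, so the same split is recognised regardless of rooting.
    """
    all_taxa = leaves(tree)
    reference = min(all_taxa)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        side = all_taxa - clade if reference in clade else clade
        if 1 < len(side) < len(all_taxa) - 1:  # both sides hold >= 2 taxa
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return found


def split_support(trees):
    """Proportion of trees in the sample that contain each split."""
    counts = Counter()
    for tree in trees:
        counts.update(splits(tree))
    return {split: n / len(trees) for split, n in counts.items()}


def compatible(split_1, split_2, all_taxa):
    """Two splits fit on one tree if one of the four side intersections is empty."""
    rest_1, rest_2 = all_taxa - split_1, all_taxa - split_2
    return (not (split_1 & split_2) or not (split_1 & rest_2)
            or not (rest_1 & split_2) or not (rest_1 & rest_2))


# Hypothetical sample of ten replicate trees with two competing topologies.
tree_1 = (("A", "B"), ("C", ("D", "E")))
tree_2 = (("A", "C"), ("B", ("D", "E")))
sample = [tree_1] * 6 + [tree_2] * 4

support = split_support(sample)
for split, value in sorted(support.items(), key=lambda item: -item[1]):
    print(sorted(split), value)
```

Here the split separating D and E from the rest is found in every tree, while the two splits {C, D, E} and {B, D, E} are incompatible and occur in 60% and 40% of the sample. A majority-rule consensus tree would show only the first and hide the second; a support consensus network shows both as a box-like structure.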
All these explicit, data-based analyses were rounded off by the considerable knowledge of Thomas, the actual palaeobotanist in our 2-man team, about the fossil record, morphological change and variation, and allowed us to come up with a pretty neat history of the genus through space and time. In 2009, the oldest beech fossil (i.e. representing Fagus, the modern genus) was Fagus langevinii from the spectacular, c. 50 Ma old McAbee Flora, which has all the characteristics of a modern-day beech. Recently, my colleagues in Vienna found an about 10 Ma older Fagus pollen in in-situ formed nodules embedded in absolutely dated strata of "Hare Island" in Disko Bay, western Greenland (Grímsson et al. 2016; molecular data predicted that the lineage comprising Fagus evolved at the latest in the Late Cretaceous, 80 Ma ago, but all Fagaceae/Fagales dating so far has been fundamentally flawed, see below).
Act 4. (Tree-based) "clocks" and "rocks" unite
For us, and for science, this appeared to be the end of the story. I lost my technician and lab when I had to leave Germany in 2008 (to "develop my career by gaining international experience", as the German Science Foundation put it), so we had no resources to do anything anymore. Based on what we found, and what readily became obvious from earlier and later species-level studies on the plastids (little species correlation, just provenance; as in their distant sister genus, the oaks), no-one would take up the task of producing more gene data to prove us wrong or pick up from where we left off. It's "publish or perish" in professional science, and complex situations require much more investment than trivial ones, with less output (and, often, more annoyance during the so-called "confidential single-blind peer review" process). Also, it is beneficial if researchers with different backgrounds work together, in reality and not only by sharing co-authorships on papers, which is too rarely the case.
But it was not.
A magical seven years later, we had the opportunity to publish (Renner et al. 2016) an explicit molecular dating experiment (with a yet again updated fossil record by my veteran co-author Thomas Denk), finding that everything matches up astonishingly well. With few resources, hence few genes, but still a nice piece. It is one of the rare dating studies in botany that demonstrate the importance of cross-disciplinary research and inter-disciplinary data sets to bridge the artificial gap usually seen and much highlighted in various papers: too young clocks, too old fossils.
Even when you then find new (and older) fossils (Grímsson et al. 2016; the dating paper was submitted and done when my new colleagues in Vienna found their pollen).
There is no discrepancy between "clocks" and "rocks" when the "rock" part is well understood or selected with care when it comes to the "clock". It may be a bold statement, but I would think that all the too young molecular dating estimates and most "long-distance" dispersal would not have been published or put forward as results if knowledgeable palaeobotanists had actively contributed to or reviewed these papers. Or if molecular phylogeneticists and bioinformaticians worked for palaeontologists (like I did) and started addressing the many elephants in the room.
Epilogue: The end?
For me, it is. I'm not a professional scientist anymore. Chances are that it will take a decade or more until the next step ahead is taken. But there is plenty to do.
First, somebody should generate more nuclear data, and see if it changes something for the (fossilised birth-death) dating. Some of our estimates for within-genus diversification appear to be a bit old, and our data set was a compromise due to lack of resources. When you do so, you have to rely on material collected in the wild; arboretum trees will lead you nowhere close to reality. For phylogeny in (extratropical) Fagaceae, you need nuclear data; plastid genealogies are interesting, also for comparison and testing ancient reticulation scenarios, but entirely decoupled from the speciation processes in the past: the evolutionary unfolding of the genus in space and time.
Second, somebody should formalise the actual species found in western Eurasia, which is quite nasty work (long synonymy lists, and a lot of dusty work figuring out which name would be valid or whether a new name needs to be coined). Data-wise everything is clear: there are geographically and genetically-morphologically coherent groups within what is still all called Fagus sylvatica (literally "Forest Beech", but known as Common Beech or Rotbuche, "Red Beech").
Third, and least likely*, some people could make an open access database including any revised Fagus fossil, so one could make detailed maps of the distribution in the past.
Which other plant genus has such a good basis to start from? Let me conclude with a German proverb, a not really good suggestion for what to do in a thunderstorm to avoid being struck by lightning.
Buchen sollst du suchen; vor Eichen, sollst du weichen!
(translation: Look out for beeches, and avoid the oaks; Thomas and I were lucky to deal with both ...)
* Least likely, because universities and research institutes dealing in biodiversity and evolution still hire molecular systematicists (obsolete these days, the only thing you need is one bioinformatician) instead of field botanists (needed to gather proper material) and palaeobotanists (needed to do the non-automatable work required to get good palaeontological data). A perfectly staffed phylogenetic department has three (proper) zoologists, six botanists, and one bioinformatician for analysing the data the other nine produce. Why twice as many botanists as zoologists? Plants are usually trickier and more challenging than animals (for a start, you usually need to study many more individuals), and have been much less well studied for over 150 years now. That is why you have a highly detailed Mesozoic Tree of Life on the animal side, but nothing comparable on the much more common coeval plants that a good deal of the very few dinosaurs found thrived on. When you have to increase the number of zoologists (typically it'd be 6:3 or worse for the botany side), take those that work with common animals (invertebrates).
References
(Mostly our own papers, I'm afraid. Even though it is one of the most economically and ecologically important tree genera in the extratropical Northern Hemisphere, has fewer than 10 species and a fine fossil record, it never got much attention from phylogeneticists and palaeobotanists. 10 may not be enough to look into...)
Blakey RC. 2008. Gondwana paleogeography from assembly to breakup—A 500 m.y. odyssey. In: Fielding CR, Frank TD, and Isbell JL, eds. Resolving the Late Paleozoic Ice Age in Time and Space. Boulder: Geological Society of America, p. 1–28.
Bryant D, Moulton V. 2002. NeighborNet: an agglomerative method for the construction of planar phylogenetic networks. In: Guigó R, and Gusfield D, eds. Algorithms in Bioinformatics, Second International Workshop, WABI. Rome, Italy: Springer Verlag, Berlin, Heidelberg, New York, p. 375–391.
Bryant D, Moulton V. 2004. Neighbor-Net: An agglomerative method for the construction of phylogenetic networks. Molecular Biology and Evolution 21:255–265.
Denk T. 1999a. The taxonomy of Fagus in western Eurasia and the ancestors of Fagus sylvatica s.l. Acta Palaeobotanica Supplement 2:633–641.
Denk T. 1999b. The taxonomy of Fagus in western Eurasia, 1: Fagus sylvatica subsp. orientalis (= F. orientalis). Feddes Repertorium 110:177–200.
Denk T. 1999c. The taxonomy of Fagus in western Eurasia, 2: Fagus sylvatica subsp. sylvatica. Feddes Repertorium 110:381–412.
Denk T. 2003. Phylogeny of Fagus L. (Fagaceae) based on morphological data. Plant Systematics and Evolution 240:55–81.
Denk T, Grimm G, Stögerer K, Langer M, Hemleben V. 2002. The evolutionary history of Fagus in western Eurasia: Evidence from genes, morphology and the fossil record. Plant Systematics and Evolution 232:213–236.
Denk T, Grimm GW. 2009. The biogeographic history of beech trees. Review of Palaeobotany and Palynology 158:83–100.
Denk T, Grimm GW. 2010. The oaks of western Eurasia: traditional classifications and evidence from two nuclear markers. Taxon 59:351–366.
Denk T, Grimm GW, Hemleben V. 2005. Patterns of molecular and morphological differentiation in Fagus: implications for phylogeny. American Journal of Botany 92:1006–1016.
Göker M, Grimm GW. 2008. General functions to transform associate data to host data, and their use in phylogenetic inference from sequences with intra-individual variability. BMC Evolutionary Biology 8:86.
Göker M, Grimm GW, Auch AF, Aurahs R, Kučera M. 2010. A clustering optimization strategy for molecular taxonomy and its application to planktonic foraminifera SSU rDNA. Evolutionary Bioinformatics 6:97–112.
Gömöry D, Paule L. 2010. Reticulate evolution patterns in western-Eurasian beeches. Botanica Helvetica 120:63–74.
Grimm GW. 1999. Phylogenie der Cycadales. Diploma thesis. Eberhard-Karls Universität, Tübingen.
Grimm GW. 2003. Tracing the mode and speed of intrageneric evolution - a case study of genus Acer L. and Fagus L. D.Sc. thesis. Eberhard-Karls University.
Grimm GW, Denk T. 2008. ITS evolution in Platanus: homoeologues, pseudogenes, and ancient hybridization. Annals of Botany 101:403–419.
Grimm GW, Denk T. 2010. The reticulate origin of modern plane trees (Platanus, Platanaceae) - a nuclear marker puzzle. Taxon 59:134–147.
Grimm GW, Denk T, Hemleben V. 2007. Coding of intraspecific nucleotide polymorphisms: a tool to resolve reticulate evolutionary relationships in the ITS of beech trees (Fagus L., Fagaceae). Systematics and Biodiversity 5:291–309.
Grimm GW, Renner SS, Stamatakis A, Hemleben V. 2006. A nuclear ribosomal DNA phylogeny of Acer inferred with maximum likelihood, splits graphs, and motif analyses of 606 sequences. Evolutionary Bioinformatics 2:279–294.
Grimm GW, Schlee M, Komarova NY, Volkov RA, Hemleben V. 2005. Low-level taxonomy and intrageneric evolutionary trends in higher plants. In: Endress PK, Lüttge U, and Parthier B, eds. From Plant Taxonomy to Evolutionary Biology. Stuttgart: Wissenschaftl. Verlagsges. mbH, p. 129–145.
Grímsson F, Grimm GW, Zetter R, Denk T. 2016. Cretaceous and Paleogene Fagaceae from North America and Greenland: evidence for a Late Cretaceous split between Fagus and the remaining Fagaceae. Acta Palaeobotanica 56:247–305.
Heath TA, Huelsenbeck JP, Stadler T. 2014. The fossilized birth–death process for coherent calibration of divergence-time estimates. Proceedings of the National Academy of Sciences 111:E2957–E2966.
Hennig W. 1950. Grundzüge einer Theorie der phylogenetischen Systematik. Berlin: Dt. Zentralverlag.
Potts AJ, Hedderson TA, Grimm GW. 2014. Constructing phylogenies in the presence of intra-individual site polymorphisms (2ISPs) with a focus on the nuclear ribosomal cistron. Systematic Biology 63:1–16.
Renner SS, Grimm GW, Kapli P, Denk T. 2016. Species relationships and divergence times in beeches: New insights from the inclusion of 53 young and old fossils in a birth-death clock model. Philosophical Transactions of the Royal Society B DOI:10.1098/rstb.2015.0135.
Schliep K, Potts AJ, Morrison DA, Grimm GW. 2017. Intertwining phylogenetic trees and networks. Methods in Ecology and Evolution DOI:10.1111/2041-210X.12760.
Spencer M, Davidson EA, Barbrook AC, Howe CJ. 2004. Phylogenetics of artificial manuscripts. Journal of Theoretical Biology 227:503–511.