One way to avoid having direct ancestors in our dataset, and other issues of palaeophylogenetic studies such as problematic coding of characters, is to not use composite or higher taxa. Instead of (fossil) species and genera, one scores each individual or local population as an operational taxonomic unit (OTU). In my time as a professional scientist and blogger, I have dealt with two such data sets: our Osmundaceae and Osmundales matrices (Bomfleur, Grimm & McLoughlin 2015, 2017; see also this post) and the data matrix of Tschopp, Mateus & Benson (2015) on "long-necks" (see this post), the dinosaurs familiar to us at least since Littlefoot's adventures. A third, on plesiosaurian "sea monsters" (Fischer et al. 2018), came to my attention via Twitter and will probably be the topic of another post on the Genealogical World of Phylogenetic Networks.
|A graph (neighbour-net) depicting morphological diversity in the rhizomes of king ferns and relatives (order Osmundales). We (Bomfleur et al., 2017, open access) used 122 OTUs covering everything from the Permian (often individuals, rarely mass finds of rhizomes) till the modern-day species (accounting for merely a dozen of the dots). Recognised taxa (using a phylogenetic classification) are highlighted by colour.|
For taxonomic purposes – the definition of biologically meaningful units comprising similar and closely related individuals such as fossil genera – such elaborate matrices are a huge leap forward in our understanding of the past. Palaeontological nomenclature is often overly specific, and I spare you (and me) a discussion of species concepts in palaeontology. There's little point in theoretically discussing something for which even neontology has no universal answer, only a working practical one (Mallet 1995; for those that are into it, read Mallet 2010). To me, it's simple: organisms that form a coherent group of (putative or likely) relatives in space and time should have a collective name. If they are indistinguishable, or when there is evidence for largely overlapping features (as e.g. in the case of the ancestors of the western Eurasian Common Beech, Fagus sylvatica s.l., Denk 1999), one should be careful assigning different species epithets.
Organisms that are part of the same evolutionary lineage should have a collective name, too.
In palaeontology, we tend to give (or are forced to by our peers during review) very similar things different names when we find them in different places or stratigraphic units (time zones). This leaves us facing two main problems:
- Diagnoses that are not mutually exclusive. A common phenomenon for plants, e.g. in the above-mentioned king ferns, aggravated by the fact that plant fossils are usually found as disarticulated, dispersed organs.
- Diagnoses so specific that any other fossil must directly represent a different species (not rare in case of animals, especially vertebrates). Especially when they are well preserved and show a lot of traits, as e.g. a fully preserved skeleton or completely preserved plant organ.
(Mis)Guided by theoretically fixed but impractical species concepts, palaeontologists have used missing information to argue for different names. In extreme cases, this has led authors to assign a flower to one genus, the seeds in it (i.e. in-situ) to another, and the pollen found in-situ to a third, ending up describing several new species based on a single fossil. Why? Because the same seeds and pollen are found isolated (dispersed), and we can't be sure that they come from the same species (or genus) that produced the flower one was lucky to find. This avoids being wrong (in a way), because you don't have to make a call. But it also renders palaeo-taxonomy a completely useless endeavour. If we want to trace a lineage through space and time, we need names that point us to the right fossils. If we give a thing a name, that name should have a purpose (and meaning).
For phylogenetic inference, however, these matrices pose some new problems. They inevitably have a higher percentage of missing data when we try to really take in whatever is possible. For our (character-limited, we could score just 45) Osmundales rhizomes, a fifth of the matrix cells are blank (i.e. question marks). A Swiss cheese. In the case of the 10-times larger (character-wise) matrix of Tschopp et al., two-thirds are blank, a cheesy aerosol, and a bit more than half in the 5-times larger (than ours) matrix of Fischer et al. A special subcategory of missing data bias is lack of overlap in character suites. Imagine you only find fossils A and C of the Blue Cow, the Not-Blue Cow and the Blue Cow-oid, but not fossil B, the Blu-ish Cow that can connect the two. There would be a very high chance of interpreting the shared bits of bluish-ness of the Cow-oid as convergences, and placing it far away from the (Blue) Cow clade, its actual species.
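Both quantities mentioned here, the share of blank cells and the overlap in scored characters between two OTUs, are easy to check before any inference. The following is a minimal sketch, assuming the matrix is held as a mapping from OTU name to a string of character states with `?` for missing data; the toy OTU names and scores are made up for illustration and are not taken from any of the cited matrices.

```python
from itertools import combinations

MISSING = "?"


def missing_fraction(matrix):
    """Share of cells scored as missing across the whole matrix."""
    cells = [state for row in matrix.values() for state in row]
    return sum(state == MISSING for state in cells) / len(cells)


def shared_characters(row_a, row_b):
    """Number of characters scored (not missing) in both OTUs."""
    return sum(a != MISSING and b != MISSING for a, b in zip(row_a, row_b))


# Hypothetical toy matrix: 4 OTUs x 6 binary characters.
toy = {
    "BlueCow_A": "01??1?",
    "BlueCow_C": "0?1?10",
    "BlueCowoid": "???011",
    "RedCow": "110010",
}

print(f"missing cells: {missing_fraction(toy):.0%}")
for otu_a, otu_b in combinations(toy, 2):
    print(otu_a, otu_b, shared_characters(toy[otu_a], toy[otu_b]))
```

In this toy case the first Blue Cow fossil and the Cow-oid share only a single scored character, so any statement about their relationship rests on that one cell.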
Even under better circumstances, missing data can affect phylogenetic inference. Best-case scenario is that it reduces the decision capacity of the optimisation algorithm. Worst-case scenario that it leads to inferring clades that do not reflect a monophyletic group (sensu Hennig).
Let's assume we have all three Blue Cow fossils and a well-preserved fossil of Red Cows, the sister species of the Blue Cow. For our Blue Cow problem, the following would happen during tree inference (very likely when using parsimony or distance-based inferences, and possibly when using a probabilistic framework, maximum likelihood or Bayesian inference).
Branch support, usually not very high to start with, may decrease further; topological ambiguity can increase. This is a practical problem, which we can solve by reducing the matrix to the best-preserved OTUs to infer a backbone tree or network, and then embed the poorer-preserved OTUs into the phylogeny by associating them with the found groups or clades by additional means (such as sheer similarity, as e.g. done by Tschopp et al. and us).
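The backbone idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure used in the cited papers: the completeness threshold of 0.7 is arbitrary, "sheer similarity" is approximated by the share of mismatching states among characters scored in both OTUs, and all OTU names are hypothetical.

```python
def completeness(row, missing="?"):
    """Share of characters that are actually scored for one OTU."""
    return 1 - row.count(missing) / len(row)


def split_backbone(matrix, threshold=0.7):
    """Separate well-preserved OTUs (backbone) from those to be placed later."""
    backbone = {otu: row for otu, row in matrix.items()
                if completeness(row) >= threshold}
    to_place = {otu: row for otu, row in matrix.items() if otu not in backbone}
    return backbone, to_place


def mean_mismatch(row_a, row_b, missing="?"):
    """Share of differing states among characters scored in both rows.

    Returns None when the two rows share no scored character.
    """
    pairs = [(a, b) for a, b in zip(row_a, row_b) if missing not in (a, b)]
    if not pairs:
        return None
    return sum(a != b for a, b in pairs) / len(pairs)


def nearest_backbone(row, backbone):
    """Name of the most similar backbone OTU, or None without any overlap."""
    scored = {otu: mean_mismatch(row, ref) for otu, ref in backbone.items()}
    scored = {otu: d for otu, d in scored.items() if d is not None}
    return min(scored, key=scored.get) if scored else None


# Hypothetical toy matrix.
toy = {
    "well_preserved_1": "010011",
    "well_preserved_2": "110100",
    "fragment": "0??0?1",
}
backbone, to_place = split_backbone(toy)
for otu, row in to_place.items():
    print(otu, "->", nearest_backbone(row, backbone))
```

The backbone tree or network itself would be inferred from `backbone` with whatever method one prefers; the sketch only covers the filtering and the similarity-based association.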
The change of branch support when using data subsets (fewer taxa, fewer characters) is also a good indicator for missing data bias. Which brings us to under-utilised bootstrapping and jack-knifing resampling procedures. For the cows, the bootstrap consensus network (see Schliep et al. 2017, open access, for an introduction) could inform us that the colour-less cow is either a Red or Blue Cow and, hence, should be excluded from a tree inference, and that Parabos is not a good sister genus, but a close relative, potential sister, or even co-specific with the Blue Cow.
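For readers who have not set up such resampling by hand: both procedures resample the characters (columns) of the matrix, and each pseudoreplicate is then analysed like the original data. A minimal sketch, assuming the same OTU-to-string layout as above; the `keep=0.5` default (a delete-half jackknife) and the toy data are assumptions for illustration, and the tree inference on each replicate is left out.

```python
import random


def bootstrap_columns(n_chars, rng):
    """Draw character indices with replacement (non-parametric bootstrap)."""
    return [rng.randrange(n_chars) for _ in range(n_chars)]


def jackknife_columns(n_chars, rng, keep=0.5):
    """Draw a fraction of character indices without replacement (jackknife)."""
    k = max(1, round(n_chars * keep))
    return sorted(rng.sample(range(n_chars), k))


def resample(matrix, columns):
    """Build a pseudoreplicate matrix from the chosen character indices."""
    return {otu: "".join(row[i] for i in columns) for otu, row in matrix.items()}


# Hypothetical toy matrix; a fixed seed keeps the replicates reproducible.
toy = {"OTU_1": "010011", "OTU_2": "110100", "OTU_3": "0??0?1"}
rng = random.Random(42)
boot_replicates = [resample(toy, bootstrap_columns(6, rng)) for _ in range(100)]
jack_replicates = [resample(toy, jackknife_columns(6, rng)) for _ in range(100)]
```

Comparing the support obtained from the full matrix with that from a reduced set of OTUs or characters then only requires running the same resampling on the subset.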
Another issue is that the more honest (unfiltered) and comprehensive a matrix is, the less treelike its signal will be (see Osmundales graph above, which essentially lacks tree-like portions). Most morphological traits have evolved more than once, independently or in parallel and at different times, and some can be pretty variable already at the population, intraspecific or interspecific level; by using individuals, we cover both (see our Osmundales dataset and papers). For lineages with living representatives (as in the case of the Osmundales) this can be quantified or at least qualified: a molecular phylogeny can be used to identify tree-compatible and tree-incompatible morphological traits. For extinct lineages, we are essentially crossing marginal sea ice during polar night hoping the candle in our hand holds all the way. Unless we have e.g. a marine invertebrate with nearly continuous fossil records providing thousands of individuals and reflecting transition in time and space at fine scale.
|Using the Osmundales approach on Tschopp et al.'s long-necks matrix. Colouration of dots indicate whether the taxon would behave well or not in tree-inference. Taxa with little missing data and under-average iDV (individual Delta Value; see Göker & Grimm 2008, open access, for an explanation) are favourable (green dots), those with poor coverage and over-average iDV will act as "rogue" OTUs, i.e. reduce branch support and increase topological ambiguity.|
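To make the iDV less of a black box, here is a small sketch of how such values can be computed from a pairwise distance matrix. It assumes the usual quartet-based definition of the Delta Value as I understand it (for each quartet, the difference between the two largest of the three pairwise-sum combinations, divided by the difference between the largest and the smallest), with the individual value being the mean over all quartets that include the OTU. It is not the code used for the figure, and the toy distances are made up.

```python
from itertools import combinations


def quartet_delta(d, w, x, y, z):
    """Delta for one quartet: 0 = perfectly tree-like, 1 = maximally conflicting."""
    sums = sorted(
        (d[w][x] + d[y][z], d[w][y] + d[x][z], d[w][z] + d[x][y]),
        reverse=True,
    )
    span = sums[0] - sums[2]
    return 0.0 if span == 0 else (sums[0] - sums[1]) / span


def individual_delta_values(d):
    """Mean quartet delta per OTU, averaged over all quartets containing it."""
    n = len(d)
    totals = [0.0] * n
    counts = [0] * n
    for quartet in combinations(range(n), 4):
        delta = quartet_delta(d, *quartet)
        for taxon in quartet:
            totals[taxon] += delta
            counts[taxon] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


# Hypothetical distances that fit the tree ((A,B),(C,D)) exactly.
tree_like = [
    [0, 2, 3, 3],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [3, 3, 2, 0],
]
# Same matrix with the A-C distance inflated, which no single tree can explain.
conflicting = [
    [0, 2, 4, 3],
    [2, 0, 3, 3],
    [4, 3, 0, 2],
    [3, 3, 2, 0],
]
print(individual_delta_values(tree_like))
print(individual_delta_values(conflicting))
```

With only four OTUs there is a single quartet, so all individual values are identical; with real matrices the values differ between OTUs, and those well above the average are the candidates for "rogue" behaviour.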
Individuals are snapshots
It is crucial to realise this: An individual (fossil or living) is not only the product of the evolutionary history of the group but also of ± reticulate population processes within a group of closest-relatives (populations, species, species aggregates, genera). Reticulation – interbreeding, introgression, hybridisation – is much more common than generally assumed, also in animals (an example fresh from the press).
But it goes deeper than this. Just give the following a thought. We use an inference method that assumes that a biological (composite) unit, a lineage or species, always splits into two (lineages or species) that then evolve independently from each other for a primary biological unit, an individual, that – in the case of sexual reproduction – combines two lineages, namely that of the father and that of the mother.
And fatherly and motherly inherited traits (genetic or morphological) are free to mix in subsequent populations, and this mixing decides what the species – the sum of interacting individuals and populations – looked like at a certain time in a particular place. And speciation is easy (see this brilliant paper by Mallet 2008, open access).
|Left: what we need to infer trees, right: what we really deal with. S – species, K – (pseudo)cryptic species, P – individual populations. Blue/red – inherited from the founding ("first") father and mother, respectively.|
Plus, the fossil record is usually discontinuous and fragmentary.
|The true tree and two solutions (reconstructions) for explaining the observed (preserved) pattern. MS — recognisable morphospecies, S — (actual) species.|
Adding the time frames one usually deals with (in case of the Osmundales, we dealt with 250 million years), what we have in the case of a fossil individual is a tiniest window in the past: one individual of a population of unknown size (population size matters a lot when it comes to fixing geno- or phenotypes), representing a single generation and part of a species of unknown extent. What we try to do is to get as complete a picture as possible with a dozen puzzle pieces of a 10,000-piece puzzle at hand. Reconstructing a hay-stack by finding the needles in it.
Natural and un-natural hybrids
Now you may think (and rightfully so): Come on, how frequent is reticulation? And do population processes affect more than a few generations, blinks of geological time? We are looking at millions of years!
We can't know for extinct groups. But we know that the closer two species are that come into contact (S1 and S2 in the example above just diverged), the more likely it is they will have viable offspring or even fuse (Mallet gives many real-world examples in his papers on hybrid speciation). Backcrossing will eliminate occasional hybrids, but stable hybrid zones are found today. And there is plenty of evidence for complete takeover by asymmetric hybridisation: introgression. Some hybrids (or crosses) have intermediate morphologies, others do not.
|Ancient and recent reticulation in the formation of modern-day plane species (Grimm & Denk 2010)|
And at least in angiosperms (which also go back 150 Ma or more, depending on the definition), we have something known as "hybrid vigor": the hybrid (often allopolyploids) will be something intermediate or new (phenotypically) and very productive.
Also for (still) living animals, genetics provide plenty of evidence of reticulation, today and in the past. Just take the bear data we re-used for our Intertwining trees and networks paper (Schliep et al. 2017).
A few more things to keep in mind in this context: Professional snake breeders (an example of what you can buy) make a fortune by breeding obscure hybrids between snake lineages that must have been isolated for 100 or more million years: today they live on different continents separated by wide, old oceans. Lizards are notoriously ignorant of species barriers (again, see Mallet's papers). And we shoot Grizzly (a Brown Bear variant/subspecies) – Ice Bear hybrids to keep the species apart and ensure the survival of the Ice Bear (while melting the poles). These are natural hybrids of two highly specialised bear species that every kid can distinguish, and any vertebrate zoologist just by part of the skeleton. The unnatural inter-continental hybrid snakes are usually infertile, but a pup of an occasional love affair between a Grizzly and an Ice Bear is not. The "Grolar Bear" is also featured in this post providing images of ten exotic ("bizarre") mammal hybrids, and this post has pictures of most-loved (some alleged) hybrids between the domestic cats and their wild feline relatives (including those of other genera); U.S. legislation especially has little concern regarding unnatural cross-breeding for making money. More details can be found here.
|Cat hybrids (natural and artificial) in a dated phylogenetic context. The dating strikes me as rough; nonetheless, there are still occasional viable intergeneric hybrids today. Now imagine how the situation was 10 or more million years ago, when the ancestors of all those genera were still species (or populations of the same species). Image source: www.messybeast.com|
Phylogeny's Damocles' Sword – neutral evolution
Phenotypic change observed in future individuals of the same species (or population) does not need to involve cladogenesis (or reticulation) at all. I was taught this by my population genetics professor in Tübingen, the late Diether Sperlich [ResearchGate, Memorandum], who was a great fan of Finland and (from the very beginning) neutral evolution (his daughter translated Motoo Kimura's book into German). Like probably most (or all) population geneticists, he did not hold much regard for Hennigian phylogenetics or Farrisian cladistics, which he considered unnatural frameworks. Already retired when I started studying genetics (because it was much less effort for me than other things), he gave a population genetics course in the semester breaks, usually attended by a few Finnish (summer) guest students, and one year two Finns were joined by me (I did most of my Diplom-biology courses in the break, because during the semester I was busy with my actual studies: geology-palaeontology; back then, in the usual relaxed way, going on as many excursions as possible).
|Prof. Sperlich's experimental set-up teaching you the basics of neutral evolution and population genetics/phenetics.|
Naturally this takes time, and if a species is large and its populations already a bit isolated, the new trait may fix only in some part of the species. The evolutionary history (and genetics) of the beech provides such examples. When isolation becomes permanent, we may have one sibling species with no change, and another carrying the new trait. There may be more traits scattered in the ancestral species, which are then passed to some but not all descendants. But with fossil histories, time is what we have plenty of. And Mother Nature is permanently throwing coins.
In the (very) rare case that a mutation is beneficial, selection steps in and massively accelerates the process. So as long as the population with the individual having the beneficial mutation is not isolated, the new beneficial (= more offspring/dominant in heterozygotes) phenotypic (or genotypic) characteristic will spread through the entire species: a new species emerges from the hollow shell of an old one, like a butterfly from a cocoon.
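The coin-throwing can be replayed on a computer. The sketch below is not Prof. Sperlich's set-up but a bare-bones, haploid Wright-Fisher-style simulation, with all parameter values chosen arbitrarily: each generation, every individual of a constant-size population draws its state from the previous generation's frequency, and an optional selection coefficient tilts the coin in favour of the new variant.

```python
import random


def wright_fisher(pop_size, start_freq, generations, rng, selection=0.0):
    """Track the frequency of a variant in a haploid population of constant size.

    With selection=0 the variant is neutral and changes by drift alone;
    a positive value gives carriers a relative fitness of 1 + selection.
    """
    freq = start_freq
    history = [freq]
    for _ in range(generations):
        # Expected frequency after selection, before random sampling.
        expected = freq * (1 + selection) / (1 + selection * freq)
        carriers = sum(rng.random() < expected for _ in range(pop_size))
        freq = carriers / pop_size
        history.append(freq)
        if freq in (0.0, 1.0):  # variant lost or fixed; nothing more can change
            break
    return history


rng = random.Random(2018)
neutral = wright_fisher(pop_size=50, start_freq=0.1, generations=500, rng=rng)
favoured = wright_fisher(pop_size=50, start_freq=0.1, generations=500, rng=rng,
                         selection=0.2)
print("neutral:", len(neutral) - 1, "generations, final frequency", neutral[-1])
print("favoured:", len(favoured) - 1, "generations, final frequency", favoured[-1])
```

Rerunning with different seeds and population sizes shows the point made above: a neutral variant is usually lost but can go to fixation with no cladogenesis involved, and the smaller the population, the faster either outcome is reached.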
Let's add this spice to the example I used in my earlier post.
My theoretical example uses two basic observations:
- Isolated lineages will accumulate unique derived features (illustrated by colour gradients and new colours), either because of cladogenesis (a lineage splits into two; e.g. the differentiation of the ancestral species into its geographic "races", forming four species: violet, still-black splitting into lush and pale green, grey, and yellow) or neutral evolution (e.g. blue into dark blue, yellow into light orange).
- Already established lineages coming into contact may introgress when close relatives (direct sisters, e.g. the violet and purple) or hybridise when less close but sharing a common ancestry (blue and red, both surviving descendants of the violet and purple sisters, form pink).
With real-world data the chance for inferring partly or entirely wrong clades increases.
- Long-branch attraction would ensure that X*1 would be placed as sister to any of the other most derived (asterisks = modern) individuals, because it is too distinct from its actual ancestor Z2. The As may be drawn closer to the root, because of their general similarity to the non-derived Zs. As a consequence, the B- and C-comprising clades would become sisters (I call this phenomenon "short-branch culling").
- The cross S1 may act as rogue OTU, or draw its putative sister S2 to the blue B-clade away from the T and U clades. A tree needs to decide what has more weight: joining S1 with the Bs (then S2 will move, too) or join S2 in a STU-clade (then S1 will stay).
- One of the most frequent branching artefacts in morphology-based trees (and occasionally, in molecular trees) is that derived, long-branched taxa are placed as early diverged sisters to their (much) less diverged relatives (sisters or ancestors). Likely inevitable for our example in case of the B+P clade (monophyletic grades – paraphyletic clades paradox).
For comparison, the evolutionary tree depicting correctly all relationships between the sampled individuals (and their populations/species).
Quite a different thing.
Why palaeontologists should keep in mind modern-day dynamics
There is currently no method to reconstruct such an evolutionary tree. Since we are looking at needles separated by many generations, we may be tempted to discard the population and speciation processes shaping and messing up morphologies (phenotypes). But populations are often not homogeneous (morphologically and genetically). And not all populations of a species are necessarily homogenised. They eventually will be, but we have no control over the point at which we tapped into the pool with our fossil individual. Was it during a speciation process, or was it long after isolation and trait sorting took place? The tinier and more detailed the puzzle pieces we use in our matrix, the more we have to deal with a mix of lineage-sorted (phylogenetically controlled) and highly volatile, single-tree-incompatible inter-/intra-species phenotypic characters.
The more complex an organism gets, the more fickle it becomes in selecting its breeding partners/measures, and the more dynamic (volatile) and diverse its manifesting species. And so even geographically restricted species/populations can have quite a complex population structure, as recently shown by de los Ángeles Bayas-Rea, Felix and Montufar (PeerJ, 2018) for bottlenose dolphins found in the Gulf of Guayaquil of Ecuador (the list of "bizarre hybrids" includes the "wholphin", a bottlenose dolphin hybrid with another dolphin genus). What remains after millions of years as phenotype and pops up in the fossil record can be considered the product of many stochastic filtering processes and permanently changing levels of out-breeding, two processes that have little to do with cladogenesis. And the latter is all we model when inferring a phylogenetic tree.
Other factors increasing species complexity are fragmented, dynamic ranges or being a plant; worst-case: a wind-pollinated one such as the beech. The papers by Mallet instructed me to stop pondering about species concepts. Still, the currently living beech species are easy to circumscribe (morphologically and genetically) but are obviously not the product of a strictly dichotomous evolution, a trivial series of cladogeneses as reflected in the commonly seen phylogenetic trees.
When you look at palaeo-phylogenetic papers (individual- or composite-taxon-based), the authors usually work under the following assumptions:
- Species are stable, fixed biological primary units that don't change over time.
- Species can be characterised by morphology.
- Speciation is a strictly dichotomous process (a series of cladogenesis events); whenever there is a visible morphological change, we have two new (sister) species, and there has been a cladogenesis event.
- Morphology mirrors phylogeny (often: the most-parsimonious solution for the morphological data reflects the true tree).
- The better papers realised: Genera are no good units; a phylogeny should be ideally based on individuals and not on (potentially artificial) composite taxa — there are two sides to this medallion (see above).
- Species are not stable, fixed biological units. Neither morphologically nor genetically.
- Morphology can out-run molecular systematics when it comes to defining species (coherent-as-possible, genetic and morphology-wise), but performs worse when it comes to inferring inter-species relationships (or higher up: phylogenies).
- Reticulation is not rare. Species barriers are permeable and will be broken, more so in plants but also in animals. A rule of thumb is, the more primitive the animal, the more permeable the species boundaries. The only question is: will this have an impact on what we see? Also, speciation can be a complex process (see doodles above and the example of the beech).
- Morphological evolution is not parsimonious in general.
- Genera (at least in plants) are usually good units for tracing a group in the past. Because they can be defined in a much more stable way, so that they don't vanish in a blink of time (in the case of beech: some 60 Ma, in the case of modern Osmundaceae: ~ 200 Ma).
So, what can I do?
Given the fragmentation of the fossil record, we have no (simple) way to determine how much of the (individual) morphological diversity is the result of lineage-sorting, the phylogeny of the group we study — (true) tree-compatible signals; and how much relates to stochastic, potentially reticulate low-level differentiation processes — tree-incompatible signals. And there is more enemy fire: Parallel or convergent evolution of the same traits in related or unrelated lineages, epigenetic phenomena leading to phenotypes not fixed by genotypes but reacting to the current environment. Using morphology alone, we also have no objective way to define species or higher taxa (such as genera) to filter lineage-specific traits.
Hence, we need to explore the signals in our data with methods that allow visualising conflict (e.g. support consensus networks, Schliep et al. 2017) or can deal with incompatible signals (e.g. the quick-and-easy to do "Neighbor-Net", Bryant & Moulton 2002, 2004). Not just infer a tree. Most relationships seen in those trees may not be (entirely) wrong, but all (unfiltered) morphological data sets show conflict. At best, the inferred tree(s) capture(s) aspects, but they don't capture evolutionary history at large. Less-so using individual-based matrices than the classic ones using composite taxa, because they (the individual-based ones) are data-wise much better (usually less biased by concepts). And closer to the complex reality.
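A quick way to convince oneself that a matrix holds conflict, before building any network, is a pairwise character compatibility check. The sketch below is a deliberately simple stand-in and not a replacement for a Neighbor-Net or a support consensus network. It assumes binary, unordered characters without polymorphisms: two such characters can both evolve without homoplasy on one tree unless all four state combinations (00, 01, 10, 11) are observed among the OTUs scored for both. The toy matrix is hypothetical.

```python
from itertools import combinations


def compatible(col_a, col_b, missing="?"):
    """True unless all four state pairs occur among OTUs scored for both characters."""
    observed = {(a, b) for a, b in zip(col_a, col_b) if missing not in (a, b)}
    return len(observed) < 4


def incompatible_pairs(matrix):
    """Index pairs of binary characters that cannot share a tree without homoplasy."""
    rows = list(matrix.values())
    columns = ["".join(row[i] for row in rows) for i in range(len(rows[0]))]
    return [
        (i, j)
        for i, j in combinations(range(len(columns)), 2)
        if not compatible(columns[i], columns[j])
    ]


# Hypothetical toy matrix: characters 0 and 1 agree, character 2 conflicts with both.
toy = {"A": "000", "B": "001", "C": "110", "D": "111"}
print(incompatible_pairs(toy))
```

Note that missing data make this check lenient: a pair of characters with little overlap in scored OTUs will rarely show all four combinations, so a low count of incompatible pairs in a gappy matrix is no proof of tree-likeness.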
Long-known, by the way. When the (then, mostly a) theory of evolution was still young, researchers realised that signals from morphology are riddled by internal conflict, which can be visualised using networks but not trees.
|A network based on St George Mivart's 1865 and 1867 primate trees. Depicted in D. Morrison's Is this the first network from conflicting datasets? post (GWoN, September 2012)|
See the link list of my post on Where have all the ancestors gone? for a list of links for further reading.
In The challenging and puzzling ordinary beech – a history, I tell the story of how we reconstructed the evolutionary history of the beech. It's no miracle to put up a holistic hypothesis about how a genus (or any other group) unfolds in space and time, just a joint effort.
Bomfleur B, Grimm GW, McLoughlin S. 2015. Osmunda pulchella [now: Osmundastrum pulchellum] sp. nov. from the Jurassic of Sweden—reconciling molecular and fossil evidence in the phylogeny of modern royal ferns (Osmundaceae). BMC Evolutionary Biology 15:126.
Bomfleur B, Grimm GW, McLoughlin S. 2017. The fossil Osmundales (Royal Ferns)—a phylogenetic network analysis, revised taxonomy, and evolutionary classification of anatomically preserved trunks and rhizomes. PeerJ 5:e3433.
Bryant D, Moulton V. 2002. NeighborNet: an agglomerative method for the construction of planar phylogenetic networks. In: Guigó R, and Gusfield D, eds. Algorithms in Bioinformatics, Second International Workshop, WABI. Rome, Italy: Springer Verlag, Berlin, Heidelberg, New York, p. 375-391.
Bryant D, Moulton V. 2004. Neighbor-Net: An agglomerative method for the construction of phylogenetic networks. Molecular Biology and Evolution 21:255-265.
Denk T. 1999. The taxonomy of Fagus in western Eurasia and the ancestors of Fagus sylvatica s.l. Acta Palaeobotanica Supplement 2:633-641.
Fischer V, Benson RBJ, Druckenmiller PS, Ketchum HF, Bardet N. 2018. The evolutionary history of polycotylid plesiosaurians. Royal Society Open Science 5:17217 [e-pub].
Göker M, Grimm GW. 2008. General functions to transform associate data to host data, and their use in phylogenetic inference from sequences with intra-individual variability. BMC Evolutionary Biology 8:86.
Grimm GW, Denk T. 2010. The reticulate origin of modern plane trees (Platanus, Platanaceae) - a nuclear marker puzzle. Taxon 59:134-147.
Heath TA, Huelsenbeck JP, Stadler T. 2014. The fossilized birth–death process for coherent calibration of divergence-time estimates. Proceedings of the National Academy of Sciences 111:E2957–E2966.
Mallet J. 1995. A species definition for the Modern Synthesis. Trends in Ecology and Evolution 10:294-299.
Mallet J. 2001. The speciation revolution. Journal of Evolutionary Biology 14:887-888.
Mallet J. 2007. Hybrid speciation. Nature 446:279-283.
Mallet J. 2008. Hybridization, ecological races, and the nature of species: empirical evidence for the ease of speciation. Philosophical Transactions of the Royal Society of London, Series B 363:2971–2986.
Mallet J. 2010. Why was Darwin’s view of species rejected by twentieth century biologists? Biology and Philosophy 25:497-527.
Schliep K, Potts AJ, Morrison DA, Grimm GW. 2017. Intertwining phylogenetic trees and networks. Methods in Ecology and Evolution DOI:10.1111/2041-210X.12760.
Tschopp E, Mateus O, Benson RBJ. 2015. A specimen-level phylogenetic analysis and taxonomic revision of Diplodocidae (Dinosauria, Sauropoda). PeerJ 3:e857.