What you should show in a palaeophylogenetic study

By far most palaeophylogenetic studies rely exclusively on tree inference as their methodological framework, thus ignoring the fundamental properties of the underlying data: matrices that provide few tree-like signals. A recommendation on what to show (and why).

No matter whether we use individuals or composite taxa, it makes little sense to just infer a tree based on morphological puzzle pieces (individuals, populations, singleton species) or higher-level conceptual taxa (widespread species, genera, etc.) covering millions of years, vast space, and an unknown (usually untraceable) amount of low-level reticulate processes and convergent evolution.

Often, we will have no means to test the proportion of lineage-sorted, i.e. phylogenetically relevant, traits (tree-compatible signal) versus those that are the consequence of reticulation and (to some degree) stochastic processes (tree-incompatible signal). Thus, we need to make ourselves and our readers aware of possible signal issues in our data and of the alternative evolutionary hypotheses, and explore both as far as possible. Reviewers of palaeontological papers that include phylogenetic inferences seem to care very little about a consistent, meaningful presentation of the inference part of such studies. To fill this void, here is a four-point list.

Left: phylogram, which should be shown; right, what typically is shown: a cladogram. Data source: Tschopp et al., PeerJ, 2015, open access (see also this post on Genealogical World of Phylogenetic Networks).

1. Always show a phylogram, a tree with branch lengths, not a cladogram.

It makes a difference whether an older, early-branching "sister" has a long terminal branch (probably a sister lineage) or a short one (possibly an ancestral form or actual ancestor). Also, one may question the taxonomic concept when realising that the "sister" taxa from different regions (or time periods) are virtually identical.

Phylograms are also imperative when chronograms are shown (Bayesian tip-dating of morphological matrices is currently fashionable, but poorly understood; Matzke & Irmis, PeerJ, 2018): they give an idea of the original branch lengths, which directly reflect the amount of change, behind the age estimates.
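To make the point about terminal branches concrete, here is a minimal sketch that pulls the terminal branch lengths out of a phylogram and flags conspicuously short ones. The Newick string, the taxon names and the cut-off (a tenth of the median terminal length) are made up for illustration; they are not taken from Tschopp et al.'s data.

```python
import re

# Hypothetical toy phylogram in Newick format (illustration only).
NEWICK = ("((old_form:0.01,young_form:0.25):0.10,"
          "(taxon_c:0.30,taxon_d:0.28):0.05,outgroup:0.40);")

# A leaf label directly follows "(" or ","; internal-node labels follow ")".
LEAF = re.compile(r"(?<=[(,])\s*([^(),:;\s]+):([0-9.eE+-]+)")


def terminal_branch_lengths(newick):
    """Return {leaf label: terminal branch length} for a Newick string."""
    return {name: float(length) for name, length in LEAF.findall(newick)}


def flag_short_tips(lengths, fraction=0.1):
    """Leaves whose terminal branch is shorter than `fraction` of the median.

    The cut-off is an arbitrary assumption, not an established threshold.
    """
    ordered = sorted(lengths.values())
    median = ordered[len(ordered) // 2]
    return sorted(name for name, value in lengths.items()
                  if value < fraction * median)


lengths = terminal_branch_lengths(NEWICK)
short_tips = flag_short_tips(lengths)
print(lengths)
print("candidates for ancestral forms:", short_tips)
```

A cladogram of the same tree would show old_form and young_form as plain sisters; only the branch lengths reveal that one of them sits almost directly on the internal node.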

Same tree, with branch support annotated. Here, non-parametric bootstrap support (BS) under three optimality criteria: LS – Least-squares, ML – Maximum Likelihood, MP – Maximum Parsimony. * = LS/MP-BS < 15.

2. Indicate (non-parametric) bootstrap support for all branches

Not because it is the best support measure. Theoretically, Bayesian posterior probabilities (PP) are superior: they provide a (mathematically sound) probability for a clade. The bootstrap, on the other hand, may at best approximate this probability. In contrast to PP, the bootstrap is a resampling procedure that gives an impression of the robustness of the signal supporting the tree and can be quickly estimated under different optimality criteria (see also Some things you probably don't know about the bootstrap). Not knowing how many characters are tree-compatible and how many are tree-incompatible, it's exactly what one needs. A Bayesian analysis will force the data to converge on one tree, even when there is substantial conflict in the underlying data (see also Two papers you may want to read before inferring trees from morphological data).
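What "resampling procedure" means can be sketched with the smallest possible case: four OTUs and binary characters. With four taxa there are only three unrooted trees, and under unweighted parsimony the best one is simply the one whose split is backed by most informative characters, so no tree-search software is needed. The matrix below is invented (six characters for AB|CD, three for AC|BD, one for AD|BC, two uninformative); real analyses would use a dedicated phylogenetics package.

```python
import random
from collections import Counter

SPLITS = ("AB|CD", "AC|BD", "AD|BC")

# Hypothetical matrix, one string per character: states of OTUs A, B, C, D.
COLUMNS = ["1100"] * 6 + ["1010"] * 3 + ["1001"] + ["1000", "0000"]


def supported_split(column):
    """Return the quartet split a binary character supports, or None."""
    a, b, c, d = column
    if a == b != c == d:
        return "AB|CD"
    if a == c != b == d:
        return "AC|BD"
    if a == d != b == c:
        return "AD|BC"
    return None  # constant or autapomorphic: parsimony-uninformative


def bootstrap_support(columns, replicates=1000, seed=1):
    """Non-parametric bootstrap: resample characters with replacement."""
    rng = random.Random(seed)
    support = Counter()
    for _ in range(replicates):
        sample = rng.choices(columns, k=len(columns))
        counts = Counter(s for s in map(supported_split, sample) if s)
        if not counts:
            continue  # no informative character drawn: unresolved replicate
        best = max(counts.values())
        winners = [s for s, n in counts.items() if n == best]
        for split in winners:
            support[split] += 1 / len(winners)  # tied trees share the replicate
    return {split: 100 * support[split] / replicates for split in SPLITS}


bs = bootstrap_support(COLUMNS)
for split, value in bs.items():
    print(f"{split}: BS = {value:.1f}")
```

Note that the two losing splits do not drop to zero: the competing characters get resampled as well, which is exactly the information a single optimised tree hides.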

Using thresholds for what is worth showing (traditionally, only bootstrap supports of BS > 50) is generally a bad idea. For instance, the example above is based on a large (more than 400 characters, i.e. scored traits) but also very gappy character matrix. A very low BS of 10 may reflect that about 40 characters support this branch and the rest don't oppose it. That can be quite a lot, and may include one or another much sought-after synapomorphy.
  • A long branch with low support usually relates to internal conflict: characters favouring different trees (more accurately: different taxon-set bipartitions). This is quite common when analysing morphological data sets.
  • A short branch with high support points either to perfect lineage sorting (rarely found in the case of morphological data), where few characters conflict with the found "best tree" (optimised topology), or to methodological artefacts, in case the high support is only found with one optimality criterion.
  • Long branches with high support reflect trivial (data-wise) relationships: the clade represents a likely monophyletic group characterised by a series of potential synapomorphies (unique, shared derived traits) or (just) shared derived traits; a coherent group of taxa that are consistently more similar to each other than to any other taxon in the data set and can effectively be pinpointed without any prior inference.
  • Short branches with low support relate to a lack of discriminating signal: characters stochastically back one of many equally possible alternatives; the data are rather impotent regarding this aspect of the tree. When using molecular data, such branches usually relate to fast ancient radiations; in the case of morphological data, there may be additional reasons: crucial characters missing in critical taxa, scored traits that don't capture the evolutionary processes that shaped this part of the tree, and ancestor-descendant relationships embedded in the matrix.
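The four combinations above can be written down as a small look-up, for instance to annotate a branch table. This is only a sketch: the cut-offs for "long" (0.05) and "high" (BS of 70) are arbitrary assumptions, and flagging every disagreement between criteria as a possible artefact is a simplification of the second bullet point.

```python
def interpret_branch(length, support, long_cutoff=0.05, high_cutoff=70):
    """Map a branch length and per-criterion BS values to a rough reading.

    `support` is a dict such as {"LS": 30, "ML": 25, "MP": 12}.
    Both cut-offs are hypothetical and need to be adapted to the data.
    """
    n_high = sum(value >= high_cutoff for value in support.values())
    if 0 < n_high < len(support):
        return "criteria disagree: possible methodological artefact"
    is_long = length >= long_cutoff
    is_high = n_high == len(support)
    if is_long and is_high:
        return "trivial (data-wise) relationship: coherent group"
    if is_long:
        return "internal conflict: characters favour different bipartitions"
    if is_high:
        return "near-perfect lineage sorting"
    return "lack of discriminating signal"


# Invented example branches: (length, BS under each criterion).
examples = {
    "branch_1": (0.12, {"LS": 30, "ML": 25, "MP": 12}),
    "branch_2": (0.01, {"LS": 95, "ML": 98, "MP": 91}),
    "branch_3": (0.01, {"LS": 20, "ML": 88, "MP": 15}),
    "branch_4": (0.005, {"LS": 12, "ML": 8, "MP": 5}),
}
readings = {name: interpret_branch(*values) for name, values in examples.items()}
for name, reading in readings.items():
    print(name, "->", reading)
```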

Trees need clear signals. Same character matrix, but only OTUs with less than 50% missing data. Result: a pretty well-supported tree.

Since we are dealing with potential missing-data artefacts and an unknown amount of stochasticity, it cannot hurt to compare values from criteria other than just homoplasy-vulnerable and change-probability-naive (or post-inference weighted) parsimony. When one restricts the taxon sample to those OTUs with more than 50% data coverage, Tschopp et al.'s data allow inferring a quite similar tree with much higher support (see on the right). The low support seen above is mostly due to poorly covered OTUs (here: individual specimens) acting as 'rogue taxa'.
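Restricting the taxon sample by data coverage is a one-liner once the matrix is in memory. A minimal sketch, assuming the matrix is held as a dictionary of state strings and that "?" and "-" code for missing or inapplicable data; the specimen names and scorings are invented.

```python
MISSING = {"?", "-"}  # assumed codes for missing / inapplicable states

# Hypothetical matrix: OTU name -> string of character states.
MATRIX = {
    "specimen_1": "0110100101",
    "specimen_2": "01?01??1?1",
    "specimen_3": "???1????0?",
    "specimen_4": "0-10?-----",
}


def coverage(row):
    """Proportion of characters that are actually scored for an OTU."""
    scored = sum(state not in MISSING for state in row)
    return scored / len(row)


def well_covered(matrix, min_coverage=0.5):
    """Keep only OTUs with more than `min_coverage` of characters scored."""
    return {otu: row for otu, row in matrix.items()
            if coverage(row) > min_coverage}


reduced = well_covered(MATRIX)
for otu, row in MATRIX.items():
    status = "kept" if otu in reduced else "dropped"
    print(f"{otu}: coverage {coverage(row):.0%} ({status})")
```

Running the inference on both the full and the reduced matrix, and comparing the support values, shows how much of the low support is owed to the poorly covered OTUs.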

When LS and ML (or MP) support differ substantially, or when ML differs from MP (and from LS using simple Hamming distances), this can be evidence of branching artefacts in the latter, or point to signal quality issues. (In the example shown above, the missing data prevented the estimation of a sensible distance matrix when all OTUs were included; when the taxon set is reduced to well-covered taxa, LS-BS and ML-BS are quite similar.) ML allows and optimises for rate variation across lineages (character changes will have different weights), whereas (unweighted) MP counts each change as an equally probable step. MP (less so ML) will be less decisive with an increasing number of homoplasious characters, whereas LS may compensate (to some degree), since it does not use individual character changes but the overall similarity patterns. For instance, the above taxon-reduced tree, an ML tree, fits the original paper's conclusions (an LS-optimised neighbour-joining tree shows the same topology, but the corresponding most-parsimonious tree resolves the Apatosaurinae as a grade), and all branches have their highest support under LS and their lowest under MP. [Side-note: To increase the support/decisiveness under MP, the original study used post-inference weighting, which down-weights characters incompatible with the found tree before re-running the analysis. Something commonly done, but effectively a snake-biting-its-tail approach.]
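Why a gappy matrix can prevent a sensible distance matrix is easy to see when the simple Hamming distance is computed by hand. The sketch below compares only characters scored in both OTUs and returns None when there is no overlap at all; the OTU names, the scorings and the choice of "?" and "-" as missing-data codes are assumptions for illustration.

```python
from itertools import combinations

MISSING = {"?", "-"}  # assumed codes for missing / inapplicable states

# Hypothetical matrix; otu_a/otu_b and otu_c are scored for different regions.
MATRIX = {
    "otu_a": "0101??????",
    "otu_b": "0111??????",
    "otu_c": "????110010",
    "otu_d": "0101110010",
}


def hamming(row_a, row_b):
    """Proportion of differing states among characters scored in both OTUs.

    Returns None when the two OTUs share no scored character.
    """
    shared = [(a, b) for a, b in zip(row_a, row_b)
              if a not in MISSING and b not in MISSING]
    if not shared:
        return None
    return sum(a != b for a, b in shared) / len(shared)


def undefined_pairs(matrix):
    """OTU pairs for which no pairwise distance can be established."""
    return [(x, y) for x, y in combinations(sorted(matrix), 2)
            if hamming(matrix[x], matrix[y]) is None]


print(hamming(MATRIX["otu_a"], MATRIX["otu_b"]))
print(undefined_pairs(MATRIX))
```

Any OTU showing up in such a list has to be dropped (or the distance imputed) before a distance-based method can be applied.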

What is usually shown, a strict consensus tree, and what to show: a consensus network.

3. Use consensus networks to visualise topological uncertainty and alternatives

Use the consensus network of the equally parsimonious solutions, the "most-parsimonious trees" (MPTs), instead of the showing-little, masking-much strict consensus tree. Whereas the strict consensus tree depicts only the trivial relationships seen in the tree sample (above: 3000 equally parsimonious solutions for Tschopp et al.'s complete matrix used as-is, i.e. no re-weighting or ordering applied), the consensus network shows where the trees disagree and how they differ from each other. Rogue OTUs (e.g. Diplodocus YPM 1922), which mess with the tree inference by inflicting topological ambiguity (spanning prominent box-like structures), are easy to identify using the strict consensus network, but impossible to depict using the strict consensus tree. Furthermore, we can see that, despite placement ambiguity, the potential Diplodocinae and Apatosaurinae often group together in the MPTs.

Similarly, support consensus networks (see Schliep et al., Methods Ecol. Evol., 2017, open access) based on the bootstrap replicate samples and the Bayesian sampled topologies outperform the majority-rule or all-compatible consensus trees in every possible way.
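The difference between the two summaries comes down to which splits (taxon bipartitions) of the tree sample are kept. A minimal sketch with three invented trees, written as nested tuples, in which one rogue OTU jumps around: the strict consensus keeps only splits found in every tree, whereas the split set behind a consensus network keeps everything above a frequency threshold (here: all splits). Drawing the network itself needs dedicated software; this only tabulates its input.

```python
from collections import Counter

# Three hypothetical equally parsimonious trees; "rogue" changes position.
TREES = [
    ((("D1", "D2"), "rogue"), ("A1", "A2"), "out"),
    (("D1", "D2"), (("A1", "A2"), "rogue"), "out"),
    ((("D1", "D2"), ("A1", "A2")), "rogue", "out"),
]


def leaves(tree):
    """Set of leaf labels below a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree, taxa, reference):
    """Non-trivial bipartitions, each stored as the side lacking `reference`."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaves(node)
        if reference in side:
            side = taxa - side
        if 1 < len(side) < len(taxa) - 1:
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return found


taxa = leaves(TREES[0])
counts = Counter(s for tree in TREES for s in splits(tree, taxa, "out"))
frequency = {s: n / len(TREES) for s, n in counts.items()}

strict = {s for s, f in frequency.items() if f == 1.0}
network = {s for s, f in frequency.items() if f > 0.0}  # threshold: show all

print("strict consensus splits:", [sorted(s) for s in strict])
print("additional network splits:", [sorted(s) for s in network - strict])
```

All three extra splits involve the rogue OTU or its neighbourhood, which is what the box-like structures in the network make visible.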

Parsimony bootstrap support consensus network for the reduced taxon set. Note that the MP-optimised tree (green and red edges) showed branches that were not the best-supported alternatives. When compared between three optimality criteria, one can see that although distance-based LS and character-based ML optimisations are largely congruent, the latter prefers (ML-BS of 50 vs. 39) to place NSMT PV 20375 as sister to the Diplodocinae clade.

Given the complexity of the signal in morphological data sets we need to ask: Are there better supported alternatives or are all alternatives essentially random and without support? Ambiguous support may be due to lack of signal or competing alternatives. In the case of morphological data ...
  • ... a branch with e.g. a "low" (or "no") bootstrap (BS) support such as 35 and no alternatives with BS > 10 may indicate that one-third of the discriminating characters support the branch (which is quite a lot when you think about the data one deals with), while the other two-thirds don't contest it in a consistent way. Hence, it points to a "good" clade, a valid hypothesis.
  • ... a branch with, e.g., a "moderate" BS support of 60 and a single competing alternative with BS support of 35 directly reflects substantial internal conflict and points to a primary and a secondary alternative, both of which need to be considered when interpreting the results of the reconstruction.
The general guideline should be: Don't hide but show (and explore) the alternatives to the preferred (optimised) tree. Ideally, you should be able to explain the level of support of every branch in your preferred tree.
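Finding the competing alternatives for a branch is a matter of checking which other splits in the bootstrap sample are incompatible with it: two splits can sit on the same tree only if at least one of the four intersections of their sides is empty. A minimal sketch with invented taxa and BS values that mirror the second case above (60 versus 35); the cut-off of BS ≥ 10 for reporting an alternative is an assumption.

```python
TAXA = frozenset("abcde")

# Hypothetical split supports from a bootstrap sample: one side -> BS.
SUPPORT = {
    frozenset("ab"): 60,
    frozenset("bc"): 35,
    frozenset("de"): 90,
    frozenset("ac"): 4,
}


def compatible(side_a, side_b, taxa):
    """Two splits fit on one tree if any of the four intersections is empty."""
    other_a, other_b = taxa - side_a, taxa - side_b
    return any(not (x & y)
               for x in (side_a, other_a)
               for y in (side_b, other_b))


def competing_alternatives(focal, support, taxa, min_bs=10):
    """Splits conflicting with `focal`, best-supported first."""
    return sorted(
        ((bs, sorted(side)) for side, bs in support.items()
         if side != focal and bs >= min_bs
         and not compatible(focal, side, taxa)),
        reverse=True,
    )


alternatives = competing_alternatives(frozenset("ab"), SUPPORT, TAXA)
print(alternatives)
```

An empty list for a branch with BS of 35 would correspond to the first case: low support, but nothing consistent speaking against it.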

4. If the fossil sample is dense enough, provide reconstructions for different time periods 

This can help to eliminate miscellaneous signal due to ancestor-descendant patterns in the data or branching artefacts. Many data sets reflect a general trend from older, underived, literally primitive, to younger, more derived (complex, often better preserved) taxa, which will force the trees into a staircase-like structure; and there may be "temporal" convergences and (inevitable) long-branch attraction (or "short branch culling").

Trees in their basic form, i.e. unrooted as optimised by the tree-inference programmes, could be stacked in the same way as networks (see the corresponding posts on the Genealogical World of Phylogenetic Networks: Stacking neighbour-nets and Stacking neighbour-nets – a real-world example).
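The bookkeeping for such time-wise reconstructions is simple: sort the OTUs into time slices and run the inference once per slice. A minimal sketch, assuming each OTU has a single point age in million years (Ma); the OTU names, ages and slice boundaries are invented, and real fossils will often come with age ranges instead.

```python
# Hypothetical point ages in Ma (million years before present).
AGES = {
    "otu_1": 157.0,
    "otu_2": 152.5,
    "otu_3": 151.0,
    "otu_4": 147.2,
    "otu_5": 150.0,
}


def time_slices(ages, boundaries):
    """Assign each OTU to the slice (older, younger) with younger < age <= older."""
    bounds = sorted(boundaries, reverse=True)
    slices = {(old, young): [] for old, young in zip(bounds, bounds[1:])}
    for otu, age in sorted(ages.items()):
        for old, young in slices:
            if young < age <= old:
                slices[(old, young)].append(otu)
                break
    return slices


slices = time_slices(AGES, boundaries=[160, 155, 150, 145])
for (old, young), otus in slices.items():
    print(f"{old}-{young} Ma: {otus}")
```

Each list can then be used to subset the character matrix before inferring the tree (or network) for that time period.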

Why propose a better tree-based analysis?

Reading my posts or even some of my papers, you may have realised that I am a heretic with limited regard for trees (and less for cladistics). So, why do I outline a tree-based analysis framework?

Personally, given the complexity and tree-unlikeness of the signal, I would rely exclusively on neighbour-nets and consensus networks to analyse morphological data sets of extinct organisms and to put forward taxonomic schemes and evolutionary hypotheses (keep in mind that, being a distance method, neighbour-nets require that meaningful pairwise distances can be established for all included OTUs). When the rate of change, the overall diversity, is low, the fully parsimonious median networks (unweighted or counter-weighted against homoplasy) may be an option, too, e.g. to explore within-lineage details and to explicitly reconstruct ancestor-descendant relationships (things never done are always worth a try). Based on the networks, one could decide on the most sensible topological alternatives, tree hypotheses, to optimise and test using e.g. time-aware Bayesian inferences such as the now fashionable Bayesian tip-dating with BEAST2 (in case you want to do them, too).

BUT! Not showing a tree will get you into severe trouble during review. Cladistia still rules the Seven Palaeo-Seas, especially when the peer review process is confidential instead of fulfilling basic standards (in modern society) of transparency. Trees, on the other hand, always go through smoothly, even when their branches have little support or are obviously biased (one or the other is the case for, I'd say, 80% of all published morphology-based phylogenetic papers, independent of the journals' respective impact factors). They are just such pleasingly simple graphs for a very complex problem.
