Cladistics vs Phylogenetics: What's the difference?

While working on a two-part post for the Genealogical World of Phylogenetic Networks (A bit of heresy: networks for matrices used in Cladistics studies), I stumbled across a thread on ResearchGate where someone asked this question. I browsed through the answers and felt obliged to answer as well.

[The following is a fairly literal copy of my answer on RG, graphics are added]

Cladistics is about clades, defined as subtrees in rooted trees. There's a nice chapter on this in Joe Felsenstein's 2004 book, Inferring Phylogenies, which also points out why we should clearly distinguish between clades, subtrees in a rooted graph, and monophyla, an interpretative concept. Cladistics goes back to Farris (1983) and not Hennig (1950).

An inferred tree: all but one subtree can be diagnosed by form features. Members of genus Oval are part of the all-rounded subtree, but the olive Oval is sister to a subtree comprising Donut and the purple Oval.

By rooting the tree using the outgroup, the subtrees become clades and we can put names to the clades, either branch-based (bold internodes) or node-based (dots representing the hypothetical 'most-recent common ancestor', MRCA). Only under the assumption that the inferred tree reflects the true evolutionary tree is such a cladistic classification a phylogenetic classification.

Hennig "just" provided a new (indeed better, because it can be tested) concept for monophyly in the framework of his "Kladistik" (which differs in quite a few respects from what later became "Cladistics").

The problem Hennig tried (half succeeded, half failed) to solve: evolution of two reciprocally ("mutually") monophyletic lineages, Roundish (all descendants of Rounded) and Pointish (all descendants of Pointed), in time and morphospace. Note that each cladogenesis, i.e. dichotomous split, is accompanied by a unique change in form or colour. But whereas forms evolved only once, i.e. are or were(!) synapomorphies, colours also evolved in parallel (lush blue octagon) or independently in the Roundish and Pointish lineages (olives).

Realising that form is more important than colour, we can put up an intuitive phylogenetic classification: all groups go back to a defined common ancestor, i.e. are monophyletic (in a pre-Hennigian sense). Hennig noted the difference between groups of inclusive common origin, his monophyla (green; Ashlock, 1971, proposed the term 'holophyla' to avoid confusion), and those of exclusive common origin, which he termed paraphyla (red). Since paraphyla are impossible to define without recognising monophyla, he suggested avoiding them at all cost. For example, Roundish is a monophyletic group defined by a smooth, rounded outline. It includes one sub-monophylum, Donutia, whose members are defined by a uniquely derived donut shape, a synapomorphy (found in all descendants of the first donut). But the other two shapes collect only part of the descendants of the first oval (Ovalia), also the ancestor of all Roundish, and of the first turned-over oval (Obovalia).
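To make the distinction concrete, here is a minimal Python sketch that is not part of the original RG answer. It assumes a made-up rooted tree written as nested tuples (the tip names and the topology are hypothetical and only loosely follow the figure) and asks whether a named group equals the smallest clade that contains it. A group that does is a clade and a candidate monophylum in Hennig's sense; a group that leaves out some tips of that clade is not.

```python
# Hypothetical rooted tree as nested tuples; tip names are invented for illustration.
ROUNDISH = ("oval_a", ("oval_b", ("obovate_a", ("obovate_b", ("donut_a", "donut_b")))))
TREE = ("star_outgroup", ROUNDISH)


def leaf_set(tree):
    """Return the set of tip names in a (sub)tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def smallest_clade(tree, group):
    """Return the tips of the smallest subtree containing every member of group."""
    if not isinstance(tree, str):
        for child in tree:
            if group <= leaf_set(child):
                return smallest_clade(child, group)
    return leaf_set(tree)


def left_out(tree, group):
    """Tips descending from the group's common ancestor that are not in the group."""
    group = frozenset(group)
    return smallest_clade(tree, group) - group


donutia = {"donut_a", "donut_b"}
ovalia = {"oval_a", "oval_b"}

print(sorted(left_out(TREE, donutia)))  # nothing left out: Donutia is a clade
print(sorted(left_out(TREE, ovalia)))   # the obovate and donut tips are left out
```

In this toy tree the two ovals only form a complete group once the obovate and donut tips are added, which is the situation described above for Ovalia.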

A proper Hennigian phylogenetic classification, recognising three monophyla instead of the paraphyla. It also illustrates what Hennig had to ignore and many cladists still do: change over time (also called evolution). The Pinkoids are all descendants of the first pink oval. Prior to the evolution of the monophyletic Donutia, pink was the synapomorphy of the Pinkoids. Today, due to the evolution of darker shades, it is a symplesiomorphy: an ancestral, primitive trait shared by some Pinkoids but not by the Donutia. For their sister lineage, the Orangeoids, we have no synapomorphy at all, a dead end for Hennig. Nevertheless, we can characterise the monophyla viz clades (this is still a rooted tree) by their unique combination of traits: the Orangeoids are rounded, but not pinkish; the Turqoids are greenish stars. This is where cladistics sets in: the tree topology is inferred from all traits, no matter whether they represent modern or ancient synapomorphies.

Last year we had two posts on this (on the Genealogical World of Phylogenetic Networks).

PS Statements [see Victor Orrico's answer on RG] that NJ trees are "phenetic" are wrong (a common error): the NJ algorithm produces phylogenetic trees fulfilling either the minimum evolution (ME) or least-squares (LS) optimality criterion (depending on how it is set up). The algorithms for UPGMA and NJ are both cluster algorithms (so "phenetic", if you want), but for NJ it has been shown that it succeeds in finding a good estimate of the ME or LS tree (which UPGMA does only by accident). NJ is just a shortcut to find an ME- or LS-optimised phylogenetic tree from a distance matrix (again, see e.g. Felsenstein, 2004, Inferring Phylogenies). A perfect matrix, where each cladogenesis is represented by at least two subsequent synapomorphies, will result in a perfect distance matrix, and the ME or LS tree inferred from this matrix will be the true tree and identical to the single MPT inferred from the character matrix. If convergences outcompete synapomorphies, the MPT will have clades that are not monophyletic, as will (to a lesser degree, it seems) the ME or LS tree, whereas compatibility and probabilistic methods can handle this to some degree.
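As a rough illustration of why NJ is not simply clustering by overall similarity, the following Python sketch implements the textbook neighbour-joining selection and distance-update steps (topology only, no branch lengths) and runs them on a small additive distance matrix. The four taxa and their distances are invented: they come from a hypothetical true tree ((A,B),(C,D)) in which B and D sit on long branches, so the smallest distance in the matrix is between the non-sisters A and C. This is a sketch under those assumptions, not the implementation used by any particular program.

```python
from itertools import combinations


def neighbor_joining(taxa, dist):
    """Return an unrooted NJ topology as nested tuples.

    dist maps frozenset({x, y}) to the distance between x and y.
    """
    nodes = list(taxa)
    d = dict(dist)
    while len(nodes) > 3:
        n = len(nodes)
        # Net divergence of each node from all other nodes.
        r = {x: sum(d[frozenset((x, k))] for k in nodes if k != x) for x in nodes}

        def q(pair):
            i, j = pair
            return (n - 2) * d[frozenset((i, j))] - r[i] - r[j]

        # Join the pair with the smallest Q value, not the smallest distance.
        i, j = min(combinations(nodes, 2), key=q)
        u = (i, j)
        for k in nodes:
            if k != i and k != j:
                d[frozenset((u, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - d[frozenset((i, j))]
                )
        nodes = [k for k in nodes if k != i and k != j] + [u]
    return tuple(nodes)


# Invented additive distances; terminal branches A=1, B=4, C=1, D=4, internal branch=1.
DIST = {
    frozenset("AB"): 5, frozenset("AC"): 3, frozenset("AD"): 6,
    frozenset("BC"): 6, frozenset("BD"): 9, frozenset("CD"): 5,
}

closest_pair = min(DIST, key=DIST.get)
topology = neighbor_joining("ABCD", DIST)
print(sorted(closest_pair))  # the most similar pair is A and C
print(topology)              # NJ nevertheless joins A with B
```

A method that simply merges the most similar pair first would start by grouping A with C here, whereas the Q criterion corrects for the unequal branch lengths and recovers the split AB|CD.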

Phylogenetics is about phylogeny, evolutionary pathways, and goes back to Darwin and Wallace's age. The first phylogenetic trees were published in the 19th century, one of the earliest at my alma mater, the University of Tübingen, by Franz-Martin Hilgendorf (who also published possibly the first phylogenetic network). Haeckel did a lot to advocate phylogenetic trees (and also coined monophyly, if I remember correctly). Regarding the first phylogenetic trees, including a definition of what a phylogenetic tree is, see this post by David Morrison.
[Side-remark: A phylogenetic tree is a tree depicting ancestor-descendant relationships, which, ironically, no cladogram, the still commonly seen rooted trees without branch lengths, can; and phylograms, rooted trees with branch lengths, only indirectly by zero-length terminal branches.]

Left, Hilgendorf's 1866 phylogenetic tree depicting ancestor-descendant relationships (monophyletic groups coloured); right, a cladogram depicting most of the monophyla, but no ancestor-descendant relationships.

I gave it a quick search and found this nice set of lecture slides giving a quite comprehensive introduction to "evolutionary (phylogenetic) trees" and three of the methods to infer them: "Parsimony; Distance matrix based; Maximum likelihood" [link to PDF].

Cladistics is hence a (quite restricted) subset of phylogenetics (not synonymous with Hennig's "Kladistik").

So, to be on the safe side, always go for phylogenetics.

An optimal (dated) reconstruction for our example including only tip-taxa. For the modern-day taxa, the inferred tree equals the true tree (assuming perfectly clear, tree-like, molecular data). Fossil taxa placed based on morphology. Using this result, we can label the clades ...

... some of which fulfil Hennig's monophyly (green), others are (inevitably) paraphyletic (orange). Or even diphyletic (red): because of its colour, which is only found in two of the taxa, the extinct Fivestar is placed as sister to the extant Fourstar, although it represents an extinct side lineage of all modern Staroids. To escape this branching error, we would need to feed the analysis (constrain it) with the (phylogenetic) information (informed assumption) that 5-star-morphologies and turquoise colour are primitive ("plesiomorphic") within the Staroids, and predate the divergence of the modern lineages. Only by going back to Hennig's philosophical framework can we decide which clades to keep (the likely monophyletic ones) and which to drop (the probably not monophyletic ones) to evolve a cladistic classification into a phylogenetic (here: Hennigian) one.

And largely irrelevant these days. Not a few are aware (openly or shyly) that clades in rooted trees often correspond to monophyla, i.e. groups of inclusive common origin, but not necessarily do so. Incomplete lineage sorting is cladistics' greatest foe. Just take the many cases where different genomes tell different stories: the nuclear, mitochondrial and/or plastid trees may have different highly supported clades, but there can only be one monophylum (or two overlapping ones, in case of hybridisation). Which we try to infer based e.g. on the coalescent tree (which is a special form of coalescent network).
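One way to make "different genomes tell different stories" operational is to compare the clades of two rooted gene trees: two clades can sit on the same tree only if they are nested or do not overlap. The Python sketch below lists the conflicting pairs. The species names and the two clade sets are hypothetical and stand in for, say, a nuclear and a plastid tree.

```python
def compatible(clade_a, clade_b):
    """Two clades fit on one rooted tree if they are nested or do not overlap."""
    a, b = frozenset(clade_a), frozenset(clade_b)
    return a <= b or b <= a or a.isdisjoint(b)


def conflicts(clades_1, clades_2):
    """Return pairs of clades (one per tree) that cannot be clades of a single tree."""
    return [
        (sorted(a), sorted(b))
        for a in clades_1
        for b in clades_2
        if not compatible(a, b)
    ]


# Hypothetical clades from two gene trees of the same five species.
nuclear = [{"sp1", "sp2"}, {"sp1", "sp2", "sp3"}, {"sp4", "sp5"}]
plastid = [{"sp2", "sp3"}, {"sp1", "sp2", "sp3"}, {"sp4", "sp5"}]

for pair in conflicts(nuclear, plastid):
    print(pair)
```

In this toy case both trees agree on two clades but disagree on whether sp2 groups with sp1 or with sp3, so at most one of those two clades can correspond to a monophylum (unless hybridisation is involved, as noted above).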

Or think of a misplaced root or ingroup-outgroup long-branch attraction, which can easily turn a grade into a clade and vice versa. Parsimony trees in particular can be severely misleading (see e.g. this recent paper by Scotland RW, Steel M. 2015. Circumstances in which parsimony but not compatibility will be provably misleading. Systematic Biology 64:492–504).

Ingroup-outgroup long-branch attraction. The outgroup flips around the ingroup tree, the splits remain the same, but all monophyla (green boxes) become grades (more in Clades, cladograms, ... on GWoN)
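The effect of a misplaced root can be reproduced in a few lines. The Python sketch below stores a hypothetical unrooted four-taxon ingroup tree as an adjacency map (the node names n1 and n2 and the taxa A to D are invented), roots it on two different branches and lists the resulting clades. The unrooted tree, and hence its splits, is identical in both cases; only the root position differs.

```python
# Hypothetical unrooted ingroup tree: A and B attach to n1, C and D attach to n2.
ADJ = {
    "A": ["n1"], "B": ["n1"], "C": ["n2"], "D": ["n2"],
    "n1": ["A", "B", "n2"],
    "n2": ["C", "D", "n1"],
}


def clades_when_rooted(adj, root_edge):
    """Tip sets below each internal node when the root is placed on root_edge."""
    found = set()

    def tips_below(node, parent):
        children = [n for n in adj[node] if n != parent]
        if not children:  # a tip
            return frozenset([node])
        tips = frozenset().union(*(tips_below(child, node) for child in children))
        found.add(tips)
        return tips

    a, b = root_edge
    tips_below(a, b)
    tips_below(b, a)
    return found


correct_root = clades_when_rooted(ADJ, ("n1", "n2"))  # root on the internal branch
misplaced_root = clades_when_rooted(ADJ, ("A", "n1"))  # root pulled onto the branch to A

for label, clades in (("correct", correct_root), ("misplaced", misplaced_root)):
    print(label, sorted(sorted(c) for c in clades))
```

With the root on the internal branch, A+B and C+D are both clades. With the root on the branch leading to A, C+D survives but A+B is replaced by B+C+D, so A and B now form a grade.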

Plus, there are many evolutionary/biological processes that inflict reticulation, i.e. ancestor-descendant relationships that cannot be modelled by a tree at all. A phylogenetic tree is just a special phylogenetic network, i.e. a phylogenetic network without reticulation.

A notable exception is classification. Cladistic classification, putting names to clades in inferred trees (under the implicit assumption that all clades represent monophyla fide Hennig), is still the holy goal.

Although, we often bend the rules and use (more general) phylogenetic classification concepts. Oaks being an example: the first multigene trees placed them in two separated, well-supported clades, but no-one was bold enough to divide this (most likely monophyletic) genus into two genera fitting the two clades in the trees or include the chestnuts etc. in the oaks. We formalised the two oak clades last year as subgenera (paywalled final version; free Pre-Print with one major change: Ponticae and Virentes accepted as additional sections in final version), the new infrageneric classification of oaks is hence a cladistic one based on nuclear oligo-gene and phylogenomic trees. But we are confident that it is also a phylogenetic one: our subgenera and sections are not only clades, but also monophyla (today and back into the past).

Cladistic or Hennig-phylogenetic classification (e.g. PhyloCode, using 'clade' as synonym for 'monophylum') is, however, impractical (to impossible, see e.g. Brummitt 2002, How to chop up a tree) when extended to fossils. We summarised the different concepts (those used in reality) in Fig. 8 of our 2017 Osmundales paper (open access). Naming (likely) paraphyla, or groups that may be para- or monophyletic, is inevitable. Ancestral forms and groups need names, too (no mention of fossil/ancestral taxa in the PhyloCode).

Why is there so much confusion?

Apparently many still hang on to parts of the 80s intellectual cladist package as summarised by Joe Felsenstein (this list can be found in his 2001 piece for Systematic Biology, open access). So you not infrequently get odd (and wrong) comments from (anonymous) reviewers (I got them quite often, since I usually used networks for phylogenetics and frequently had to deal with fundamentally "a-cladistic" data).

Quoted from Felsenstein (Syst. Biol., 2001, p. 466):

"The cladists of that era had accepted a number of points as an intellectual package. At one point in the mid-1980s I tried to summarize the package and came up with these points, in order of importance

Use Hennig’s terminology—autapomorphy, symplesiomorphy, and so forth—rather than terms like ancestral or derived. [still very common]
Classify cladistically; use only monophyletic groups. [still the official standard, see e.g. dinosaur part of the Tree of Life/ Wikipedia (e.g. Coelurosauria and subclades); with funny consequences: the full hierarchy for modern-day birds is (Wikipedia, 17/2/2020): Kingdom Animalia, Phylum Chordata, Clade Dinosauria, Clade Saurischia, Clade Theropoda, Clade Avetheropoda, Clade Coelurosauria, Clade Tyrannoraptora, Clade Maniraptoromorpha, Clade Maniraptoriformes, Clade Maniraptora, Clade Aveairfolia, Clade Pennaraptora, Clade Paraves, Clade Eumaniraptora, Clade Averaptora, Clade Avialae (= "flying dinosaurs"), Clade Euavialae, Clade Avebrevicaudata, Clade Pygostylia, Clade Ornithothoraces, Clade Euornithes, Clade Ornithuromorpha, Clade Ornithurae, Class Aves)]
Do biogeography by vicariance (pace Hennig). [not exclusive anymore but still common, a (bad) example: How not to make a biogeographic study]
Use only computer programs written by leaders in the Hennig Society, all others are fundamentally flawed. [both rarely openly stated, but I experienced this during review still in the zeroes]
Use only parsimony methods. Compatibility methods are evil. [recent example: Ockham's Razor applied but not used...]
Do not weight characters. [has become rarer, but is often still frowned upon, even by those who then use TNT's post-inference character weighting option to increase branch support]
Be hostile to molecular data [see Ockham's Razor..., and follow-up post: Why we want to map trait evolution along networks].
Consider your methods to be hypothetico-deductive. [see e.g. Wilf et al.'s response to Denk et al.'s comment on their 2019 paper]
Fossils are to be treated the same as living species. [this is still standard, and beyond cladistics]
Parasites always have exactly the same phylogenies as their hosts.
It is important to go around saying that one cannot infer ancestor–descendant relationships. [this is a wide-spread belief, partly out of necessity: tree-inference programmes do not allow placing ancestors on the nodes or internal branches, all OTUs have to be tip taxa]
It is important to go around saying that species are individuals, not classes. [many still think species are the only "natural" biological unit, fundamentally different from e.g. genera; which everyone who has worked with data from more than one individual per species knows to be nonsense]
Be sceptical of the reality of the species as nonoperational. [see above]
History: William of Ockham told Popper to tell Hennig to use parsimony." [still a belief, especially in palaeontology]

20 comments:

Joe Felsenstein, 17/02/2020, 14:26
Das Grimm, thanks for reprinting my list of points that constituted the Hennig Society official thought package of the early 1980s. Could I ask for a little reformatting to make this an accurate quote from my 2001 account? There need to be separate list items for "Use only computer programs ...", for "Do not weight characters ...", for "Consider your methods ...", for "Fossils are to be treated ...", and for "It is important to go around saying that one ...". Otherwise people will be confused about whether the "computer programs" entry is somehow part of the point about biogeography. And so on. And thanks for your thoughts on the extent to which these points are still adhered to. I am somewhat horrified to see how valid these generalizations still are!

Das Grimm, 12/02/2021, 11:18
Nice.

A classic: Of course, I must have been indoctrinated by viewpoints and (unfounded, I suppose) assumptions (you may want to point them out in the post you comment on) because my "opinion" doesn't agree with your "knowledge".

So, if you are not a disciple of Farris' school, why did you choose the title Cladistics for an open-minded book about phylogenetic(!) classification? And not, e.g. "Phylogenetic classification in the Genomic Era"?

PS Before you claim indoctrination by some (by the way, non-existing) philosophy, you may want to actually read any of the open access systematic papers I co-authored (all providing anonymous access to the used data) and posts (you are probably the first person noticing my indoctrination; many of my peers were appalled by my lack thereof).

And stop commenting on your pay-to-view book with its, as you clarified, misleading title, and point out the flaws in this very post. I hope my post is not doing exactly what you wanted to do in your book, where you simply couldn't get rid of the branding? After all, putting forward phylogenetic but not cladistic classifications is still a dead end in systematic biology. But that has nothing to do with the peers being indoctrinated, of course.

David Williams, 12/02/2021, 11:25
Since when has cladistics = Farris school? That is your assumption, not mine. It is incorrect. I am a disciple of no one.

Yes, it is a pity that CUP charge -- I wish it were otherwise. But the answers to your question are within.

I have read some of your papers.

Das Grimm, 12/02/2021, 13:07
Re your first question: see the interlinked GWoN post by David Morrison

Let's distinguish between Hennig and Cladistics

The reason he wrote it is that (in our fields) many still equate Farris' cladistics with Hennig's Kladistik, and make no difference between a "kladistische" classification fide Hennig, naming monophyla (holophyla sensu Ashlock), and a cladistic classification fide Farris, i.e. naming clades in an outgroup-rooted, inferred tree (most striking and funny example: dinosaur classification, based on nodes and branches seen in unstable parsimony consensus cladograms).

If you haven't experienced this indifference, consider yourself lucky. We had to force papers through single-blind review because our peers simply didn't understand why we only take clades as an indication of inclusive common origin but not as the sole criterion. And didn't want to understand why we name groups that do not correspond to a clade with a BS ≥ 70. In my experience the indifference is much more predominant in vertebrate palaeozoology and botany (neo- and palaeo-) than in any other organismal group. Vertebrate palaeozoologists are sure there are no ancestors in the fossil record, and systematic botany was the last retreat of the Farrisian faction when they lost the so-called Phylogenetic Wars (I escaped the usual indoctrination because of my background in genetics and geology, not systematic biology).

Re "...pity that CUP charge...answers to your question are within.", so, why not repeat it here for those who cannot (or are not willing) to pay the CUP?

My answer to the question of how to classify in the presence of ancestral taxa and topological uncertainty, not to mention reticulation and budding speciation, is very simple. Love to repeat it:
Don't classify cladistically but phylogenetically!

Make use of holophyla, paraphyla, and monophyla in a pre-Hennigian sense, and take into account lineage coherence and diagnosability (what Mayr called "overall similarity"), and the time-frame as evidenced in the fossil record (if there is any). See e.g. the Osmundaceae example and my posts on What is an angiosperm [pt 1] [pt 2] [casus belli].

As far as I can tell we (still, cf. Felsenstein, 2004, chapter 10) only have cladistic doctrine (!)-induced naming problems because neither Hennig's "Kladistik" nor Farris' "cladistics" considered the existence of actual ancestors in the fossil record, not to mention evolutionary stasis and positive selection. If A and B are well-established extant monophyla and I go deep enough into time to stumble across their ancestor(s) (ancestral forms), I just call them C, even though any ancestor with its own name is per se paraphyletic. Hennig's or Farris' only option is to fuse all into A (something neontologists generally don't accept, getting their good taxa destroyed because of a too primitive fossil).

Once you stop trying to classify cladistically, there are no questions anymore, only quick-and-easy answers. Pinpointing common ancestry (monophyla sensu Haeckel) is quite straightforward (especially when you have molecular data), discerning Hennig's monophyla (= holophyla) and eliminating paraphyla is not (why both Farrisians and the PhyloCode failed in producing handy and stable classifications).

David Williams, 12/02/2021, 13:40
If you check with the original post of Morrison's you'll notice I offered a correction to one of his statements:

"Second, parsimony analysis was developed independently of Hennig, by people such as Farris, Nelson and Platnick." Farris, maybe -- but cladistics sensu Nelson has no connection to Wagner Parsimony (Nelson, Gareth G. 1979. Cladistic analysis and synthesis: principles and definitions, with a historical note on Adanson’s Familles des Plantes (1763–1764). Systematic Zoology, 28: 1–21.). This paper may tell you a great deal about cladistics and how it should have been understood.

The key point: cladistics sensu Nelson has no connection to Wagner Parsimony. Morrison never replied.

I am not responsible for your experiences with reviewers or persons you seemingly feel belong to a faction. Maybe I agree with your last comment. Maybe I have suffered too. But it doesn't matter much. I am not interested in factions, schools, disciples etc. or feeling sorry for myself.

I don't want to simply argue with you, it serves no purpose. Let me just comment on one or two things:

"Vertebrate palaeozoologists are sure there are no ancestors in the fossil record". That is a very odd statement and suggests to me, although I could be wrong, you don't understand anything about recognising ancestors.

So, you want to resurrect 'overall similarity'. Good luck, you are not alone.

Actually, you seem to want to resurrect all sorts of things.

Again, with respect, one needs evidence for monophyly, paraphyly is a consequence of discovering monophyly. It is lack of evidence.

Right now I don't care if you don't read my book. You seem to know all sorts of things.

Das Grimm, 12/02/2021, 16:52
Re: Overall similarity
I don't want to "resurrect" overall similarity (Mayr's 2002 paper was convoluted and a bit pointless); but one should not ignore lineage coherence when doing classification. Especially not in an era where species are erected based on 3% rules and environmental bulk DNA samples in order to replace good alpha-taxonomy.

Just to give a concrete example: To classify the angiosperms cladistically sensu Hennig (Nelson?), I don't need to infer any tree at all. It's obvious from a heat-map of genetic distances: members of likely monophyla (holophyla) are much more similar to each other than to any other taxon in the data set. That's why they give us highly-supported (and model-independent) clades. Morphologically, we have such trivial situations, too (an example from the dinosaurs), but often we face much more complex data situations, e.g. when putting up a matrix for extant and extinct angiosperms.

Das Grimm, 12/02/2021, 17:02
Re: Paraphyly; not "just lack of signal"

Paraphyly is the lack of lineage-conserved, uniquely derived character suites—homoiologies and Hennig's rare synapomorphies. What makes paraphyletic groups useful in classification is exactly what you point out: we can define them via (nested) monophyla.

Paraphyly is furthermore also a common consequence of population dynamics (interesting case: individual-based fossil phylogenies), and, subsequently, genetic drift, active population sizes, and effect and frequency of bottleneck situations, in short: asymmetric speciation processes. If I remove one population (or species) from a master population (source species), it may quickly become different but the master population doesn't change. As soon as I give the isolate a name (being beyond doubt monophyletic and visibly different), the still-unchanged remainder becomes a paraphylum and becomes invalid. Fast ancient radiations popping up easy-to-diagnose, highly-supported clades with prominent roots, i.e. easy-to-argue monophyla, quite often keep leftovers that are genetically ambiguous and morphologically poorly differentiated. Which we, using cladistic classifications, solve by erecting monotypic genera of little diagnostic value, or simply not recognising the easy-to-diagnose monophyla within the larger group (see e.g. certain much-inflated angiosperm families).

I guess, in most aspects, we are not that far apart. It's only (pretty dogmatic, sorry) statements like "paraphyly is lack of evidence" that distinguish your (in principle, good-)cladistic (in contrast to Farrisian naive-cladistic) doctrine ("Only monophyla must be named!") from my real-world inspired (as in this post), utterly doctrine-free solutions to complex (non-trivial) cases, hence, different solutions for different sets of data and data situations.

If you have time to waste: Just try one of the many freely accessible datasets I covered in my papers and posts, following the suggestions in your own book to demonstrate that (the original, pre-Farris) cladistics are still the only possible classification system and are applicable also to non-trivial situations (rather than only to those where Mayr's overall similarity would suffice, see example in the response above). Prove me wrong, there are many examples to choose from.

Das Grimm, 12/02/2021, 17:06
Postscriptum and epilogue

Absolutely not sorry for myself; I'm very happy in my early retirement because I no longer have to fight utterly useless fights with anonymous shadows shooting from the dark, who occasionally demonstrated that they knew even less about their own data/papers (one example) than about ours. But, and apparently in contrast to you, I met many people who suffered during peer review without really knowing why having been indoctrinated by cladists. That's why I have such an illustrious collection of co-authors ranging from micropalaeontologists to bioinformaticians: people came to me with data that was not trivial to analyse, and required some open-minded creativity. In my early retirement, I can afford to give all those victims of doctrines a voice, and a helping hand, pointing them to solutions that worked for us. And call out and challenge hindering doctrines, such as cladistic classification (be it Hennig, Nelson or Farris-style).

pnyikos, 13/08/2022, 23:21
I've been trying to find out about the "cladist wars," which produced a revolution in biology, whose fallout you, Das Grimm, are ably dealing with. The essays I've read so far are (1) long on generalities and short on details and (2) have little to do with the issue with which I am most concerned: the demise of the Linnean system of classification, including the banishment of paraphyletic taxa and the resulting forcing of all organisms to the branch tips of phylogenetic trees.

The only specific event of those wars that I have ever read about (from two different sources, one of which I own: Keith S. Thomson's LIVING FOSSIL: The Story of the Coelacanth) is the 1978 event "the lungfish, the salmon, and the cow". It had to do with a corollary of item (2) above, the radical redefinition by cladists of the word "related". The anti-cladist side claimed that it was absurd to regard a lungfish as more closely related to a cow than to a salmon, but that was a naive choice of taxa, and the outcome was an undeserved victory for the cladist side.

Had a competent vertebrate paleontologist been involved, the choice of taxa could have been very different. One choice readily available at that time was "Bos, Ichthyostega, Elpistostege." [Nowadays, the third is better replaced by the more familiar Tiktaalik.]
It does violence to our ordinary idea of human relationships to claim that Ichthyostega is more closely related to us human beings than it is to Elpistostege. It's almost as bad as saying that Mitochondrial Eve is more closely related to everyone alive today than she was to anyone in her family at the time she was born.

Had the victory gone the other way, we might be far advanced in a definition of "more related" that combines phylogeny with measures of disparity. As it is, the theory of macroevolution, to which disparity is an indispensable tool, has made comparatively little progress to date.

Res.I.P. – an unprofessional science (and other things) blog
