What I was not allowed to show #1: A neighbour-net of seed plants

Even as a professional scientist, I always put a lot of effort into enhancing the graphics of our papers. In some cases, the mighty Wizards of the Forest of Review appreciated the effort, but most didn’t bother. In other cases, the circumstances forced me to dump some pretty nice graphs. In this series of posts, I’ll show what has been lost because of the Impermeable Fog, or because my co-authors were wary that it might wake sleeping dogs and more evil things lurking in the Forest of Review.

How I got my first and only co-authorship in Nature

At the end of 2006 (or the start of 2007), I was offered a co-authorship on a paper that had got stuck in Nature’s Impermeable Fog (we never managed to get any paper into that particular fog at all; whatever we dared to submit to Nature or Science was rejected within 24 h). The paper (Friis et al. 2007) was about an enigmatic Cretaceous seed studied with an extremely sophisticated non-destructive imaging method. But this alone, and the substantial fame of the authors already on the paper, was not enough to get it through. One of the Wizards of the Forest (i.e. an anonymous peer) found the paper needed an explicit phylogenetic analysis (it didn't, but thanks to this opinion, I was funded for four years by the Swedish state). The first author, not trusting anyone else, asked a colleague whether he could do it (she knew he had done it in the past for one of his papers). He answered that he leaves this business entirely to his trusted German analyst, me. So, I got a matrix scoring morphological characters that had been used to infer trees of extinct and extant seed plants (spermatophytes; Hilton & Bateman 2006), with the seed included. I ran the required analysis (classic parsimony tree and bootstrapping analysis), boring but enough to get a co-author seat.

In-text figure 3, my visible contribution to the Nature paper

But, already aware of its inherent deficiencies, I didn’t stop there.

Non-trivial data need non-established procedures, right?

I thought so. My contact with phylogenetic networks (in the broadest sense) was relatively fresh, but I immediately thought that a matrix with a lot of signal issues – aside from the usual missing data problems, incompatible signals, many homoplastic characters – is just the right material for them. So, I did not just produce a tree with some branch support annotated along the branches (which you can find in the Supplementary Information to the paper). Instead, I generated consensus networks to summarise the sample of equally (most) parsimonious trees and the bootstrap (pseudoreplicate) tree samples; and I used the matrix of pairwise distances inferred from the character matrix to generate a neighbour-net splits graph, a planar, i.e. 2-dimensional, ‘meta-phylogenetic’ network (Bryant & Moulton 2002; Bryant & Moulton 2004). [Bryant & Moulton called it a phylogenetic network, and I did, too, until David Morrison pointed me to his post, and he’s right that we should not call splits graphs phylogenetic networks.]
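For readers who have never built such a distance matrix, here is a minimal sketch of the first step, going from a character matrix to pairwise distances. It assumes a simple mean character difference (the share of differing states among the characters scored in both taxa) and treats "?" and "-" as missing; the post does not state which distance measure was actually used. The taxon names and the toy matrix are hypothetical.

```python
from itertools import combinations

# Symbols treated as missing or inapplicable (an assumption for this sketch).
MISSING = {"?", "-"}


def mean_character_difference(a, b):
    """Share of differing states among characters scored in both taxa.

    Returns None if the two taxa have no scored character in common.
    """
    compared = 0
    differing = 0
    for x, y in zip(a, b):
        if x in MISSING or y in MISSING:
            continue  # skip characters not scored in both taxa
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        return None
    return differing / compared


def distance_matrix(matrix):
    """Pairwise distances for all taxon pairs, keyed by (taxon1, taxon2)."""
    return {
        (t1, t2): mean_character_difference(matrix[t1], matrix[t2])
        for t1, t2 in combinations(matrix, 2)
    }


# Hypothetical toy matrix: four taxa, six discrete characters.
toy_matrix = {
    "taxon_A": "0010?1",
    "taxon_B": "0110?1",
    "taxon_C": "1?0101",
    "taxon_D": "1101-0",
}

toy_distances = distance_matrix(toy_matrix)
```

A distance matrix like `toy_distances` is what a neighbour-net implementation (for instance the one in SplitsTree) takes as input; the splits graph itself is not computed here.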

A neighbour-net of extinct (ancient) and modern seed plants, possibly the most-viewed unpublished figure I ever made. Main lineages are coloured: the earliest precursors ('progymnosperms') and ancient, very primitive 'seed ferns'; the conifers (needle trees) with their precursors and ancient relatives (Cordaitales); Ginkgo, cycads and more advanced Mesozoic seed ferns; and finally the crown-groups, the now-dominant flowering plants (angiosperms) and the 'BEG clade' including extinct, ± derived groups (Bennettitales, Erdmanithecales), the seed we placed, and the modern-day Gnetales.

Possibly the first of its kind using such data. My first author liked what she saw, and so did essentially every other person I showed it to, then and in the many years to come (I showed it to effectively every scientist I met, and to quite a number of non-scientists I entertained with the background story). But the 2nd author was wary that such openly “phenetic” methods could give the Wizards a handle to turn down the paper (when we discussed it, he used his famous catch-phrase: “I wouldn’t do that”, which translates into “no”). Back in the day, many people toying around with phylogenetics (particularly botanists and, notoriously, palaeontologists) had obviously not read Felsenstein’s (2004) book Inferring Phylogenies and still considered all distance-based analyses “phenetic”, in contrast to the “phylogenetic” parsimony (and maximum likelihood or Bayesian inference, if you must use them…). The same people had lost the ‘Phylogenetic War’ a decade before (see chapter 10 in Felsenstein’s book), but this didn’t mean they could not find someone else to pick on. So, when the revision was submitted to Nature, my beloved figure was not even in the Supplementary Information. [I learned a few other things about high-impact publications, which I may share at some later point in the right context.]

The figure, as it should have appeared in the Supplementary Information, before the veto of the 2nd author

The lost graph: a network based on similarity among extinct and extant spermatophytes

In spite of the data-inherent problems with the underlying matrix (Hilton & Bateman 2006), the graph nicely depicts the actual relationships of the main extinct and extant spermatophyte groups to each other, most of which still stand. It furthermore illustrates the signal strengths and weaknesses of the matrix. The angiosperms, the flowering plants – today’s most dominant and diverse plant group – are recognised as a most distinct group, with only one possible closest fossil relative: Caytonia, a Mesozoic seed fern, whose position is apparently a coding artefact. By severe matrix recoding, Rothwell, Crepet & Stockey (2009) and Rothwell & Stockey (2016) were able to dissociate that little bastard from the long-branching flowering plants. The circular arrangement reflects the evolution from primitive forms (colloquially known as “seed ferns”) to increasingly modern and complex ones (modern conifers, Gnetales and angiosperms). Every relationship in the original tree is found in the graph, as are not-so-unlikely alternatives. Primitive members of each main lineage are placed closer to the centre or putative root of the graph than derived ones. Also, the graph is the most comprehensive and honest depiction of the signal in the underlying data, something that back then my 2nd author, the dangerous peer, and – nearly 10 years later – also Rothwell & Stockey (2016) still refused to realise: the signal is complex, and cannot be captured by a single tree (see also Coiro, Chomicki & Doyle 2017; and this post on the topic).

Some more figures I made in the course of the study. Left, a support consensus network based on parsimony bootstrapping. Right, a consensus network of distance-based trees generated by eliminating one taxon per tree (both going – at the time – too far into uncharted territory to be considered for the supplement).


Two years later, I managed to publish my first morphology-based networks (Denk & Grimm 2009; Friis et al. 2009), naturally in low/mid-impact journals, where they quite smoothly passed the wary eyes of the Wizards of the Forest (and the objections of certain co-authors). One or two pointed out that it is a “phenetic” analysis (which it is not, and I told them so in our responses), but beyond that fundamental critique the graphs were too alien to our peers for them to dare to criticise them in more detail (and the editors had no reason or right to object). And now that I’m free to do what I want to, I’ve written a couple of blog posts on David Morrison’s Genealogical World of Networks blog promoting the use of networks (planar or other) when studying morphological data sets. Our papers including them (Friis et al. 2009, 2015; Schlee et al. 2011; Grímsson et al. 2014; Mendes et al. 2014; Bomfleur, Grimm & McLoughlin 2015, 2017) have all been picked up quite well by the community (judging by the number of downloads and citations), and we will see if this helps to break down the walls in people's heads.

However, I can’t stop thinking: What if my graph had been published back then in high-flying Nature? I possibly would not have to write those posts at all, because it would already be a standard approach that could long since have replaced the usually uninformative, often severely biased strict consensus trees in nearly every morphology-based phylogenetic paper on extinct plant or animal groups published so far...

More posts on morphological data, fossils and networks (including links to CC-BY-licenced figshare material)

Spermatophyte networks

Grimm GW. 2017. Should we infer trees on tree-unlikely matrices? In: Morrison DA, editor. The Genealogical World of Phylogenetic Networks.
Grimm GW. 2017. Morphology-based neighbour-net of seed plants. figshare.
Grimm G. 2017. Morphology-based neighbour-net of seed plants: quick exploratory data analysis of the matrix of Rothwell & Stockey (2016). figshare.

Related networks

Grimm GW. 2017. Stacking neighbour-nets: ancestors and descendants. In: Morrison D, editor. The Genealogical World of Phylogenetic Networks.
Grimm GW. 2017. Stacking neighbour-nets: a real-world example. In: Morrison D, editor. The Genealogical World of Phylogenetic Networks.
Grimm GW. 2017. Osmundales diversity through time: stacking networks. figshare.
Grimm GW. 2017. More non-treelike data forced into trees: a glimpse into the dinosaurs. In: Morrison D, editor. The Genealogical World of Phylogenetic Networks.
Grimm GW. 2017. Networks, not trees, identify “weak spots” in phylogenetic trees. In: Morrison D, editor. The Genealogical World of Phylogenetic Networks.
Grimm G. 2017. Classification of mosasaurs - using networks. figshare.

Bomfleur B, Grimm GW, McLoughlin S. 2015. Osmunda pulchella sp. nov. from the Jurassic of Sweden—reconciling molecular and fossil evidence in the phylogeny of modern royal ferns (Osmundaceae). BMC Evolutionary Biology 15:126.
Bomfleur B, Grimm GW, McLoughlin S. 2017. The fossil Osmundales (Royal Ferns)—a phylogenetic network analysis, revised taxonomy, and evolutionary classification of anatomically preserved trunks and rhizomes. PeerJ 5:e3433.
Bryant D, Moulton V. 2002. NeighborNet: an agglomerative method for the construction of planar phylogenetic networks. In: Guigó R, and Gusfield D, eds. Algorithms in Bioinformatics, Second International Workshop, WABI. Rome, Italy: Springer Verlag, Berlin, Heidelberg, New York, p. 375-391.
Bryant D, Moulton V. 2004. Neighbor-Net: An agglomerative method for the construction of phylogenetic networks. Molecular Biology and Evolution 21:255-265.

Coiro M, Chomicki G, Doyle JA. 2017. Experimental signal dissection and method sensitivity analyses reaffirm the potential of fossils and morphology in the resolution of seed plant phylogeny. bioRxiv DOI:10.1101/134262.
Denk T, Grimm GW. 2005. Phylogeny and biogeography of Zelkova (Ulmaceae sensu stricto) as inferred from leaf morphology, ITS sequence data and the fossil record. Botanical Journal of the Linnéan Society 147:129–157. 

Denk T, Grimm GW. 2009. The biogeographic history of beech trees. Review of Palaeobotany and Palynology 158:83–100.
Friis EM, Crane PR, Pedersen KR, Bengtson S, Donoghue PCJ, Grimm GW, Stampanoni M. 2007. Phase-contrast X-ray microtomography links Cretaceous seeds with Gnetales and Bennettitales. Nature 450:549–552. 

Friis EM, Pedersen KR, von Balthazar M, Grimm GW, Crane PR. 2009. Monetianthus mirus gen. et sp. nov., a nymphaealean flower from the early Cretaceous of Portugal. International Journal of Plant Sciences 170:1086–1101.
Friis EM, Grimm GW, Mendes MM, Pedersen KR. 2015. Canrightiopsis, a new Early Cretaceous fossil with Clavatipollenites-type pollen bridge the gap between extinct Canrightia and extant Chloranthaceae. Grana 54:184–212.
Grímsson F, Zetter R, Halbritter H, Grimm GW. 2014. Aponogeton pollen from the Cretaceous and Paleogene of North America and West Greenland: Implications for the origin and palaeobiogeography of the genus. Review of Palaeobotany and Palynology 200:161–187.
Hilton J, Bateman RM. 2006. Pteridosperms are the backbone of seed-plant phylogeny. Journal of the Torrey Botanical Society 133:119-168.
Mendes MM, Grimm GW, Pais J, Friis EM. 2014. Fossil Kajanthus juncaliensis gen. et sp. nov. from Portugal: Floral evidence for Early Cretaceous Lardizabalaceae (Ranunculales, basal eudicot). Grana 53:283–301.
Rothwell GW, Crepet WL, Stockey RA. 2009. Is the anthophyte hypothesis alive and well? New evidence from the reproductive structures of Bennettitales. American Journal of Botany 96:296–322.
Rothwell GW, Stockey RA. 2016. Phylogenetic diversification of Early Cretaceous seed plants: The compound seed cone of Doylea tetrahedrasperma. American Journal of Botany 103:923–937.
Schlee M, Göker M, Grimm GW, Hemleben V. 2011. Genetic patterns in the Lathyrus pannonicus complex (Fabaceae) reflect ecological differentiation rather than biogeography and traditional subspecific division. Botanical Journal of the Linnéan Society 165:402-421.
