Taxonomic Bias in the Fossil Record: Is it really an issue?

As a palaeontologist, especially one who doesn’t work on vertebrates or shelled invertebrates, I often wrestle with the adequacy of the fossil record for phylogenetics and for working out the evolutionary origins of taxa. Darwin himself famously complained about this issue, devoting an entire chapter to it, and all of his complaints are still valid today. I highlighted the value of sites of exceptional preservation multiple times in my Rise of Animals post, but these sites truly are exceptional and not representative of the entire fossil record. In this post, I want to give my own opinion on bias in the fossil record, specifically as it concerns taxonomic scale. The issue matters not only for every palaeontologist, but also for zoologists, botanists, and anyone else using palaeontological data. This question is part of the field of taphonomy, by the way.

First, it must be said clearly that not every organism that ever lived has been fossilised: whether one was depends on where it lived, how it died, and many other factors. Those that were fossilised might have been destroyed after deposition, either through geology (tectonics, metamorphism, etc.) or through geochemistry. Finally, the fossils that survive have to be found and collected. So the record is, at face value, very incomplete, which is why every single fossil is a valuable resource that has the potential to tell us a lot.

But the incompleteness isn’t the only problem. As Raup (1972) showed, there is significant bias in the fossil record as well. The picture above provides a simple summary flowchart (Kidwell & Flessa, 1996). One such bias seems obvious from common sense: older rocks have probably gone through more geological events and are thus less likely to have preserved their fossils (in studiable condition anyway). This is also true empirically (but with caveats, as we will see): 5 minutes from my house is a 2-4 Ma fossil locality where I can collect tons of wonderfully preserved shells off the ground. Alternatively, I could go to the much older, metamorphically modified Cretaceous rocks on the western side of the island, where I’ll be lucky if I find an ammonite.

But that’s a bit of a blinkered view. Having a million of the same shells is useless for all but palaeoecological studies. If one wants to study evolution and systematics in the fossil record, only a single specimen of a taxon is needed (except in cases of polymorphism, but no need to get into such things now).

The word “taxon” is critical here. “Taxon” is what we refer to when we don’t wish to apply any artificial taxonomic ranks – a taxon can be a species, a family, a phylum, etc. In other words, it does away with the scale aspect.

But scale is very important. I can collect 5 specimens of the same species and learn very little; on the other hand, I can collect 5 specimens from 5 families and learn how diverse that particular ecosystem was, or learn that those 5 families had evolved by at least that point in time.
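To make the scale point concrete, here is a minimal sketch that counts how many distinct taxa a collection contains at a chosen rank. The specimen records, the placeholder taxon names and the `richness` helper are all made up for illustration; they don't come from any real data set.

```python
# Hypothetical collections: placeholder names, not real taxa.
same_species = [{"species": "species_A", "family": "family_1"}] * 5
five_families = [
    {"species": f"species_{letter}", "family": f"family_{number}"}
    for number, letter in enumerate("ABCDE", start=1)
]


def richness(specimens, rank):
    """Count the distinct taxa present at the given rank."""
    return len({specimen[rank] for specimen in specimens})


for label, collection in [("same species", same_species),
                          ("five families", five_families)]:
    print(label, "-> families:", richness(collection, "family"),
          "| species:", richness(collection, "species"))
```

Both collections hold 5 specimens, but only the second one says anything about family-level diversity or about which families existed at that point in time.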

If one reads Raup (1972), what he showed was that older rocks contain fewer fossils and consequently fewer species. However, the fossil record would still be adequate if the remaining fossils represent a wide variety of higher-level taxa. Not all the species may be there, but all one needs is a single representative of an extinct family or order to gain a whole lot of information.

Benton et al. (2000) tested this idea. They took 1000 phylogenetic trees and compared their branching patterns against a temporal scale, to see whether they match the fossil record. What they found was that if one looks at the origin of higher taxa in the fossil record (families, orders, etc.) and compares it with the branching pattern of the phylogenetic tree, the match is very good. In other words, the fossil record is representative at those higher scales. It’s only when one zooms in to species level, or to thousand-year level, that everything becomes fuzzy, and that’s to be expected anyway.
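For readers who want a feel for what "matching branching order against the fossil record" can look like in practice, here is a rough sketch. It is not the set of metrics Benton et al. used; it is a simpler stand-in, a Spearman rank correlation between the order in which taxa branch off a tree and the order in which they first appear as fossils. The branching ranks and first-appearance ages are invented numbers, and the function names are my own.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Invented example: 1 = earliest-branching taxon on the tree.
branching_rank = [1, 2, 3, 4, 5]
# Invented first-appearance ages in Ma (larger = older).
first_appearance_ma = [520, 480, 490, 300, 150]

# Negate the ages so that "older" sorts first, matching the branching order.
rho = spearman(branching_rank, [-age for age in first_appearance_ma])
print(f"tree vs. fossil order: rho = {rho:.2f}")
```

A value near +1 means the taxa turn up in the rocks in roughly the order the tree predicts; a value near 0 means the two orders are unrelated. The real analyses work on full tree topologies rather than a single list of ranks, so treat this only as an illustration of the idea.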

There is another thing to consider, one that I alluded to in the introduction: the nature of the taxa under study. Mammals and molluscs have very reliable fossil records because they’ve got bones and shells (respectively), which are easily fossilisable structures. With both, it’s possible to have stretches of stratigraphy at species-level resolution (as an aside, these localities played a big role in the punctuated equilibrium – stasis debates of the 1980s). Mammalian examples come from the Badlands in the USA, where the ubiquitous teeth allow immediate dating of your formations, as well as the tracing of mammal evolution in that landscape. I described a molluscan example at the beginning of this post.

But even there, taxonomic biases creep in. For example, mollusc shells aren’t all built the same. They may have different amounts of calcium carbonate, or they might be built out of aragonite, which is less stable than calcite. Yet the bias that exists has no net effect on our large-scale evolutionary conclusions, because of the coarser scale we study them at (Kidwell, 2005).

Compare these taxa to things like jellyfish, which are only known from the Konservat-Lagerstätten of the Ediacaran, Cambrian, and Devonian. We can’t trace the evolution of jellyfish at any taxonomic level using the fossil record. This is in stark contrast to their very close relatives, the corals.

However, it can be argued that just the fact that we have fossilised jellyfish is good enough. After all, it is completely unexpected and they do give us information we wouldn’t have otherwise. Or, for another example, take embryos: we have them from the Ediacaran and Cambrian, but their preservation is also biased in every possible way, including taxonomically (Donoghue et al., 2006). My opinion is that this doesn’t matter. Maybe I’m a pessimist, but the default assumption shouldn’t be that “everything should ideally be fossilised, so what we have is an incomplete and biased fossil record”, because that’s just bullshit; it should be “hey, these are the fossils we have, we know all the biases and the limitations, so let’s get all the information we can out of them”. As long as these biases and limitations are studied and taken into account in your analysis, there is no need to push for anything beyond them, because that’s just an exercise in futility. I say this not for fellow palaeontologists, because as far as I know, we all agree. But there are a great number of evolutionary biologists (personal experience here) who dismiss the fossil record out of hand because of its incompleteness and bias, instead of accepting it as it is and using the information it gives. These people are idiots, if only for throwing out what is potentially the most informative source of data on organismal evolution.

So, the summary: the fossil record is biased. Of course it is, and it would be stupid to deny it. But this bias is pretty well-known and can be predicted and tested for in any data set, so its existence is not a problem for palaeontological analyses, as long as it’s acknowledged.


Benton MJ, Wills MA & Hitchin R. 2000. Quality of the fossil record through time. Nature 403, 534-537.

Donoghue PCJ, Kouchinsky A, Waloszek D, Bengtson S, Dong X-P, Val’kov AK, Cunningham JA & Repetski JE. 2006. Fossilized embryos are widespread but the record is temporally and taxonomically biased. Evolution & Development 8, 232-238.

Kidwell SM. 2005. Shell Composition Has No Net Impact on Large-Scale Evolutionary Patterns in Mollusks. Science 307, 914-917.

Kidwell SM & Flessa KW. 1996. The Quality of the Fossil Record: Populations, Species, and Communities. Annual Review of Earth and Planetary Sciences 24, 433-464.

Raup DM. 1972. Taxonomic Diversity during the Phanerozoic. Science 177, 1065-1071.

