My two pesos on the metagenome anomaly controversy

On the 9th of August 2019, a team of researchers from German, Chinese and Indian institutions uploaded a rather controversial preprint to bioRxiv:

Anomalous phylogenetic behavior of ribosomal proteins in metagenome assembled genomes by Garg et al. https://doi.org/10.1101/731091

The team challenged the idea that metagenome-assembled genomes (MAGs) represent real organisms, suggesting instead that these are “unnatural constructs, genome-like patchworks of genes that have been stitched together into computer files by binning”.

Needless to say, it created a little Twitter storm.

The challenge focused on two major groups of prokaryotes, the Asgard archaea and the Candidate Phyla Radiation (CPR) of bacteria. These two clades have been erected mostly on the basis of metagenome projects, although independent evidence for the existence of these groups has been provided, for example, through single-cell genomics (Rinke et al. 2013, full references at the end of the post) or the enrichment of a Loki archaeon culture and the sequencing of its genome (Imachi et al. 2019).

The challenge was grounded on phylogenetic evidence. The authors show that the ribosomal proteins found in MAGs of Asgard archaea and CPR bacteria do not retrieve consistent phylogenies, as if different ribosomal proteins had been “stitched together” from very different sources.

Fig. 3 Neighbor-Nets reconstructed from concatenated alignments of 23 ribosomal proteins for archaeal reference samples and archaeal MAGs.

(a) The Neighbor-Net of a concatenated alignment of 23 ribosomal proteins in the archaeal reference sample ARS3001 shows very little conflict throughout, resulting in a tree-like network with 16 well supported splits (indicated with red dots). (b) A Neighbor net drawn from a concatenated alignment of the same 23 ribosomal proteins from Asgard archaeal MAGs results in a network with a star-like structure. The insets magnify the central area of interest to better highlight the difference of signals of the two networks. Taken from Garg et al. (2019).
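To make "consistent phylogenies" a bit more concrete, here is a minimal sketch of one way to quantify conflict between single-protein trees. It is not the method used by Garg et al., who worked with Neighbor-Nets; it swaps in the simpler Robinson-Foulds distance, which counts the splits found in one tree but not in the other. The function names, the protein labels and the toy trees below are all hypothetical and only serve as illustration, and the parser assumes plain Newick strings without whitespace or quoted names.

```python
def parse_newick(newick):
    """Parse a Newick string into nested tuples of leaf names.

    Branch lengths and internal labels are skipped.
    Assumes no whitespace and no quoted names.
    """
    s = newick.strip().rstrip(";")
    pos = 0

    def skip_annotation():
        # Skip an internal label and/or ":length" up to the next delimiter.
        nonlocal pos
        while pos < len(s) and s[pos] not in ",)":
            pos += 1

    def node():
        nonlocal pos
        if s[pos] == "(":
            pos += 1  # consume "("
            children = [node()]
            while pos < len(s) and s[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            pos += 1  # consume ")"
            skip_annotation()
            return tuple(children)
        start = pos
        while pos < len(s) and s[pos] not in ",):":
            pos += 1
        name = s[start:pos]
        skip_annotation()
        return name

    return node()


def leaf_set(node):
    """Return the set of leaf names below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def splits(newick):
    """Return the non-trivial splits (bipartitions) of an unrooted tree."""
    tree = parse_newick(newick)
    taxa = leaf_set(tree)
    ref = min(taxa)  # reference taxon used to pick one side of each split
    found = set()

    def walk(node):
        below = leaf_set(node)
        # A split is informative only if both sides hold at least two taxa.
        if 2 <= len(below) <= len(taxa) - 2:
            found.add(below if ref not in below else taxa - below)
        if not isinstance(node, str):
            for child in node:
                walk(child)

    walk(tree)
    return found


def rf_distance(newick_a, newick_b):
    """Robinson-Foulds distance: number of splits present in only one tree."""
    if leaf_set(parse_newick(newick_a)) != leaf_set(parse_newick(newick_b)):
        raise ValueError("both trees must contain the same taxa")
    return len(splits(newick_a) ^ splits(newick_b))


if __name__ == "__main__":
    # Toy trees for three hypothetical ribosomal proteins over taxa A-E.
    gene_trees = {
        "protein_1": "((A,B),(C,D),E);",
        "protein_2": "(E,(D,C),(B,A));",   # same topology, written differently
        "protein_3": "((A,C),(B,D),E);",   # conflicting topology
    }
    names = sorted(gene_trees)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            print(a, b, rf_distance(gene_trees[a], gene_trees[b]))
```

A set of proteins that share one evolutionary history should give small pairwise distances, whereas proteins binned together from different sources would tend to disagree. In practice one would use an established phylogenetics library and account for weakly supported branches rather than rely on a toy parser like this one.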

So what is going on?

I have to say that this does not surprise me at all. I am not a metagenomics or bioinformatics specialist, but I do use some basic tools often and regularly search both standard genome and metagenome databases for protein sequences of interest. I always, without exception, find assembly errors, both in standard genomes from supposedly axenic isolates and in metagenomes.

At first it blew my mind; now I just find it a nuisance. I have written about this before, and here I give you some examples:

Trichoplax adhaerens, a weird animal with a piece of Photosystem I

Contamination of genome projects with DNA from other organisms

A cyanobacterium with an anoxygenic Type II reaction center from purple bacteria? (Contamination)

2. Results – First phylogeny of BchC and we have discovered a new phototroph!

But what really blows my mind now is how easily such anomalies could be confused with horizontal gene transfer. I am so used to these types of problems that the work by Garg et al. appears to me only timely. Indeed… I kind of expected something like this would happen at some point.

And yet, I myself have some evidence that the CPR are a real thing! Last year we published a phylogenetic study of FtsH proteins. We collected over 6000 FtsH subunits across all bacteria and eukaryotes, including 247 CPR sequences, and the CPR definitely formed a monophyletic group. However, the CPR FtsH clustered within the standard diversity of life and did not show the level of divergence suggested by Hug et al. (2016). In fact, comparing trees, the phylogenetic position of the CPR FtsH is more similar to that reported in the single-cell genomic study by Rinke et al. (2013) than to the metagenomic study by Hug et al. (2016).

Phylogeny of FtsH proteases. CPR in red.

If the CPR formed a lineage distinct from the one that encompasses the classically known diversity of bacteria, then the position of CPR FtsH is rather anomalous, even if horizontal gene transfer is invoked.

This year, in collaboration with two other early-career scientists, we reported a phylogenomic study of a novel group of phototrophs belonging to an uncultivated phylum of bacteria, the WPS-2 or Candidatus Eremiobacterota. This phylum is entirely made of MAGs, but we found clearly divergent photosynthetic genes that had some distant affinity to those of the Chloroflexi. The ribosomal protein tree did place Eremiobacterota near the Chloroflexi, which seemed like a consistent story. However, we also detected what appears to be significant horizontal gene transfer into this novel clade (Ward et al. 2019).

Interestingly enough, in the preprint by Salcher et al. (2019) an imaging technique was used to spot some Asgard archaea, but the imaged cells look nothing like what Imachi et al. (2019) reported for the isolated Loki archaeon. Is that just a coincidence?

My conclusion is that the truth is perhaps in the middle. Metagenomes do contain strong signals of novel diversity and novel clades, but many, if not most, MAGs (?) do contain a very substantial contribution from foreign sequences that is underestimated by assembly quality control protocols... and which leads to anomalous distances and phylogenetic positions. But that is my non-expert opinion!

I do think the challenge by Garg et al. is entirely valid and justified, and I hope it will trigger a critical revision of metagenomic data and of the conclusions that have emerged from metagenome projects. I look forward to seeing a counter-attack! :)

References 

Di Rienzi, S. C., I. Sharon, K. C. Wrighton, O. Koren, L. A. Hug, B. C. Thomas, J. K. Goodrich, J. T. Bell, T. D. Spector, J. F. Banfield and R. E. Ley (2013). "The human gut and groundwater harbor non-photosynthetic bacteria belonging to a new candidate phylum sibling to Cyanobacteria." Elife 2. DOI: 10.7554/eLife.01102.

Hug, L. A., B. J. Baker, K. Anantharaman, C. T. Brown, A. J. Probst, C. J. Castelle, C. N. Butterfield, A. W. Hernsdorf, Y. Amano, K. Ise, Y. Suzuki, N. Dudek, D. A. Relman, K. M. Finstad, R. Amundson, B. C. Thomas and J. F. Banfield (2016). "A new view of the tree of life." Nat Microbiol 1: 16048. DOI: 10.1038/nmicrobiol.2016.48.

Imachi, H., M. K. Nobu, N. Nakahara, Y. Morono, M. Ogawara, Y. Takaki, Y. Takano, K. Uematsu, T. Ikuta, M. Ito, Y. Matsui, M. Miyazaki, K. Murata, Y. Saito, S. Sakai, C. Song, E. Tasumi, Y. Yamanaka, T. Yamaguchi, Y. Kamagata, H. Tamaki and K. Takai (2019). "Isolation of an archaeon at the prokaryote-eukaryote interface." bioRxiv: 726976. DOI: 10.1101/726976.

Rinke, C., P. Schwientek, A. Sczyrba, N. N. Ivanova, I. J. Anderson, J. F. Cheng, A. Darling, S. Malfatti, B. K. Swan, E. A. Gies, J. A. Dodsworth, B. P. Hedlund, G. Tsiamis, S. M. Sievert, W. T. Liu, J. A. Eisen, S. J. Hallam, N. C. Kyrpides, R. Stepanauskas, E. M. Rubin, P. Hugenholtz and T. Woyke (2013). "Insights into the phylogeny and coding potential of microbial dark matter." Nature 499(7459): 431-437. DOI: 10.1038/Nature12352.

Salcher, M. M., A.-Ş. Andrei, P.-A. Bulzu, Z. G. Keresztes, H. L. Banciu and R. Ghai (2019). "Visualization of Loki- and Heimdallarchaeia (Asgardarchaeota) by fluorescence in situ hybridization and catalyzed reporter deposition (CARD-FISH)." bioRxiv: 580431. DOI: 10.1101/580431.

Ward, L. M., T. Cardona and H. Holland-Moritz (2019). "Evolutionary Implications of Anoxygenic Phototrophy in the Bacterial Phylum Candidatus Eremiobacterota (WPS-2)." Front Microbiol 10: 1658. DOI: 10.3389/fmicb.2019.01658.