Reconstructing the evolutionary history of modern species is a difficult problem complicated by the conceptual and technical limitations of phylogenetic tree-building methods. Here, we propose a comparative proteomic and functionomic inferential framework for genome evolution that allows resolving the tripartite division of cells and sketching their history. Evolutionary inferences were derived from the spread of conserved molecular features, such as molecular structures and functions, in the proteomes and functionomes of contemporary organisms. Patterns of use and reuse of these traits yielded significant insights into the origins of cellular diversification. Results uncovered an unprecedentedly strong evolutionary association between Bacteria and Eukarya while revealing marked evolutionary reductive tendencies in the archaeal genomic repertoires. The effects of nonvertical evolutionary processes (e.g., HGT, convergent evolution) were found to be limited while reductive evolution and molecular innovation appeared to be prevalent during the evolution of cells. Our study revealed a strong vertical trace in the history of proteins and associated molecular functions, which was reliably recovered using the comparative genomics approach. The trace supported the existence of a stem line of descent and the very early appearance of Archaea as a diversified superkingdom, but failed to uncover a hidden canonical pattern in which Bacteria was the first superkingdom to deploy superkingdom-specific structures and functions.

1. Introduction

Tracing the evolution of extant organisms to a common universal cellular ancestor of life is of fundamental biological importance. Modern organisms can be classified into three primary cellular superkingdoms, Archaea, Bacteria, and Eukarya [1]. Molecular, biochemical, and morphological lines of evidence support this trichotomous division.
While the three-superkingdom system is well accepted, establishing which of the three is the most ancient remains problematic. Initial construction of unrooted phylogenies based on the joint evolution of genes linked by an ancient gene duplication event revealed that, for each set of paralogous genes, Archaea and Eukarya were sister groups and diverged from a last archaeal-eukaryal common ancestor [2, 3]. This “canonical” rooting that places Bacteria at the base of the “Tree of Life” (ToL) is still widely accepted despite the fact that many other paralogous gene couples produced discordant topologies and despite known technical artifacts associated with these sequence-based evolutionarily deep
References
[1]
C. R. Woese, O. Kandler, and M. L. Wheelis, “Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya,” Proceedings of the National Academy of Sciences of the United States of America, vol. 87, no. 12, pp. 4576–4579, 1990.
[2]
J. P. Gogarten, H. Kibak, P. Dittrich et al., “Evolution of the vacuolar H+-ATPase: implications for the origin of eukaryotes,” Proceedings of the National Academy of Sciences of the United States of America, vol. 86, no. 17, pp. 6661–6665, 1989.
[3]
N. Iwabe, K. Kuma, M. Hasegawa, S. Osawa, and T. Miyata, “Evolutionary relationship of archaebacteria, eubacteria, and eukaryotes inferred from phylogenetic trees of duplicated genes,” Proceedings of the National Academy of Sciences of the United States of America, vol. 86, no. 23, pp. 9355–9359, 1989.
[4]
S. Gribaldo and H. Philippe, “Ancient phylogenetic relationships,” Theoretical Population Biology, vol. 61, no. 4, pp. 391–408, 2002.
[5]
H. Philippe and P. Forterre, “The rooting of the universal tree of life is not reliable,” Journal of Molecular Evolution, vol. 49, no. 4, pp. 509–523, 1999.
[6]
G. Caetano-Anollés and A. Nasir, “Benefits of using molecular structure and abundance in phylogenomic analysis,” Frontiers in Genetics, vol. 3, article 172, 2012.
[7]
F. Delsuc, H. Brinkmann, and H. Philippe, “Phylogenomics and the reconstruction of the tree of life,” Nature Reviews Genetics, vol. 6, no. 5, pp. 361–375, 2005.
[8]
E. V. Koonin, K. S. Makarova, and L. Aravind, “Horizontal gene transfer in prokaryotes: quantification and classification,” Annual Review of Microbiology, vol. 55, pp. 709–742, 2001.
[9]
O. Popa and T. Dagan, “Trends and barriers to lateral gene transfer in prokaryotes,” Current Opinion in Microbiology, vol. 14, no. 5, pp. 615–623, 2011.
[10]
R. Jain, M. C. Rivera, and J. A. Lake, “Horizontal gene transfer among genomes: the complexity hypothesis,” Proceedings of the National Academy of Sciences of the United States of America, vol. 96, no. 7, pp. 3801–3806, 1999.
[11]
G. Caetano-Anollés and D. Caetano-Anollés, “An evolutionarily structural universe of protein architecture,” Genome Research, vol. 13, no. 7, pp. 1563–1571, 2003.
[12]
M. Wang and G. Caetano-Anollés, “Global phylogeny determined by the combination of protein domains in proteomes,” Molecular Biology and Evolution, vol. 23, no. 12, pp. 2444–2454, 2006.
[13]
F. J. Sun and G. Caetano-Anollés, “Evolutionary patterns in the sequence and structure of transfer RNA: early origins of Archaea and viruses,” PLoS Computational Biology, vol. 4, no. 3, Article ID e1000018, 2008.
[14]
F. J. Sun and G. Caetano-Anollés, “The origin and evolution of tRNA inferred from phylogenetic analysis of structure,” Journal of Molecular Evolution, vol. 66, no. 1, pp. 21–35, 2008.
[15]
F. J. Sun and G. Caetano-Anollés, “The evolutionary history of the structure of 5S ribosomal RNA,” Journal of Molecular Evolution, vol. 69, no. 5, pp. 430–443, 2009.
[16]
F. J. Sun and G. Caetano-Anollés, “The ancient history of the structure of ribonuclease P and the early origins of Archaea,” BMC Bioinformatics, vol. 11, article 153, 2010.
[17]
H. Xue, K. L. Tong, C. Marck, H. Grosjean, and J. T. F. Wong, “Transfer RNA paralogs: evidence for genetic code-amino acid biosynthesis coevolution and an archaeal root of life,” Gene, vol. 310, no. 1-2, pp. 59–66, 2003.
[18]
M. di Giulio, “The tree of life might be rooted in the branch leading to Nanoarchaeota,” Gene, vol. 401, no. 1-2, pp. 108–113, 2007.
[19]
A. G. Murzin, S. E. Brenner, T. Hubbard, and C. Chothia, “SCOP: a structural classification of proteins database for the investigation of sequences and structures,” Journal of Molecular Biology, vol. 247, no. 4, pp. 536–540, 1995.
[20]
A. Andreeva, D. Howorth, J. M. Chandonia et al., “Data growth and its impact on the SCOP database: new developments,” Nucleic Acids Research, vol. 36, no. 1, pp. D419–D425, 2008.
[21]
D. Caetano-Anollés, K. M. Kim, J. E. Mittenthal, and G. Caetano-Anollés, “Proteome evolution and the metabolic origins of translation and cellular life,” Journal of Molecular Evolution, vol. 72, no. 1, pp. 14–33, 2011.
[22]
G. Caetano-Anollés, M. Wang, D. Caetano-Anollés, and J. E. Mittenthal, “The origin, evolution and structure of the protein world,” Biochemical Journal, vol. 417, no. 3, pp. 621–637, 2009.
[23]
M. Ashburner, C. A. Ball, J. A. Blake et al., “Gene ontology: tool for the unification of biology,” Nature Genetics, vol. 25, no. 1, pp. 25–29, 2000.
[24]
M. Harris, J. Clark, A. Ireland et al., “The Gene Ontology (GO) database and informatics resource,” Nucleic Acids Research, vol. 32, pp. D258–D261, 2004.
[25]
K. M. Kim and G. Caetano-Anollés, “Emergence and evolution of modern molecular functions inferred from phylogenomic analysis of ontological data,” Molecular Biology and Evolution, vol. 27, no. 7, pp. 1710–1733, 2010.
[26]
M. Wang, L. S. Yafremava, D. Caetano-Anollés, J. E. Mittenthal, and G. Caetano-Anollés, “Reductive evolution of architectural repertoires in proteomes and the birth of the tripartite world,” Genome Research, vol. 17, no. 11, pp. 1572–1585, 2007.
[27]
K. Illergård, D. H. Ardell, and A. Elofsson, “Structure is three to ten times more conserved than sequence—a study of structural response in protein cores,” Proteins, vol. 77, no. 3, pp. 499–508, 2009.
[28]
K. M. Kim and G. Caetano-Anollés, “The evolutionary history of protein fold families and proteomes confirms that the archaeal ancestor is more ancient than the ancestors of other superkingdoms,” BMC Evolutionary Biology, vol. 12, no. 1, article 13, 2012.
[29]
K. M. Kim and G. Caetano-Anollés, “The proteomic complexity and rise of the primordial ancestor of diversified life,” BMC Evolutionary Biology, vol. 11, no. 1, article 140, 2011.
[30]
M. P. Hoeppner, P. P. Gardner, and A. M. Poole, “Comparative analysis of RNA families reveals distinct repertoires for each domain of life,” PLoS Computational Biology, vol. 8, no. 11, article e1002752, 2012.
[31]
G. J. Olsen, C. R. Woese, and R. Overbeek, “The winds of (evolutionary) change: breathing new life into microbiology,” Journal of Bacteriology, vol. 176, no. 1, pp. 1–6, 1994.
[32]
C. R. Woese, “Bacterial evolution,” Microbiological Reviews, vol. 51, no. 2, pp. 221–271, 1987.
[33]
M. C. Rivera and J. A. Lake, “The ring of life provides evidence for a genome fusion origin of eukaryotes,” Nature, vol. 431, no. 7005, pp. 152–155, 2004.
[34]
W. Martin and M. Müller, “The hydrogen hypothesis for the first eukaryote,” Nature, vol. 392, no. 6671, pp. 37–41, 1998.
[35]
A. Nasir, K. M. Kim, and G. Caetano-Anollés, “Giant viruses coexisted with the cellular ancestors and represent a distinct supergroup along with superkingdoms Archaea, Bacteria and Eukarya,” BMC Evolutionary Biology, vol. 12, article 156, 2012.
[36]
J. Gough and C. Chothia, “SUPERFAMILY: HMMs representing all proteins of known structure. SCOP sequence searches, alignments and genome assignments,” Nucleic Acids Research, vol. 30, no. 1, pp. 268–272, 2002.
[37]
D. Wilson, M. Madera, C. Vogel, C. Chothia, and J. Gough, “The SUPERFAMILY database in 2007: families and functions,” Nucleic Acids Research, vol. 35, no. 1, pp. D308–D313, 2007.
[38]
J. Gough, K. Karplus, R. Hughey, and C. Chothia, “Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure,” Journal of Molecular Biology, vol. 313, no. 4, pp. 903–919, 2001.
[39]
S. Garcia-Vallve, E. Guzman, M. A. Montero, and A. Romeu, “HGT-DB: a database of putative horizontally transferred genes in prokaryotic complete genomes,” Nucleic Acids Research, vol. 31, no. 1, pp. 187–189, 2003.
[40]
J. Gough, “Convergent evolution of domain architectures (is rare),” Bioinformatics, vol. 21, no. 8, pp. 1464–1471, 2005.
[41]
C. Moissl-Eichinger and H. Huber, “Archaeal symbionts and parasites,” Current Opinion in Microbiology, vol. 14, no. 3, pp. 364–370, 2011.
[42]
M. Wang, C. G. Kurland, and G. Caetano-Anollés, “Reductive evolution of proteomes and protein structures,” Proceedings of the National Academy of Sciences of the United States of America, vol. 108, no. 29, pp. 11954–11958, 2011.
[43]
D. Ungar and F. M. Hughson, “SNARE protein structure and function,” Annual Review of Cell and Developmental Biology, vol. 19, pp. 493–517, 2003.
[44]
K. Georgiades, V. Merhej, K. El Karkouri, D. Raoult, and P. Pontarotti, “Gene gain and loss events in Rickettsia and Orientia species,” Biology Direct, vol. 6, article 6, 2011.
[45]
S. Gribaldo, A. M. Poole, V. Daubin, P. Forterre, and C. Brochier-Armanet, “The origin of eukaryotes and their relationship with the Archaea: are we at a phylogenomic impasse?” Nature Reviews Microbiology, vol. 8, no. 10, pp. 743–752, 2010.
[46]
J. Kuriyan and M. O'Donnell, “Sliding clamps of DNA polymerases,” Journal of Molecular Biology, vol. 234, no. 4, pp. 915–925, 1993.
[47]
B. Stillman, “Smart machines at the DNA replication fork,” Cell, vol. 78, no. 5, pp. 725–728, 1994.
[48]
K. Kleman-Leyer, D. W. Armbruster, and C. J. Daniels, “Properties of H. volcanii tRNA intron endonuclease reveal a relationship between the archaeal and eucaryal tRNA intron processing systems,” Cell, vol. 89, no. 6, pp. 839–847, 1997.
[49]
M. Wang and G. Caetano-Anollés, “The evolutionary mechanics of domain organization in proteomes and the rise of modularity in the protein world,” Structure, vol. 17, no. 1, pp. 66–78, 2009.
[50]
L. S. Yafremava, M. Wielgos, S. Thomas et al., “A general framework of persistence strategies for biological systems helps explain domains of life,” Frontiers in Genetics, vol. 4, article 16, 2013.
[51]
E. V. Koonin, T. G. Senkevich, and V. V. Dolja, “Compelling reasons why viruses are relevant for the origin of cells,” Nature Reviews Microbiology, vol. 7, no. 8, article 615, 2009.
[52]
P. Forterre, “The origin of viruses and their possible roles in major evolutionary transitions,” Virus Research, vol. 117, no. 1, pp. 5–16, 2006.
[53]
A. Nasir, K. M. Kim, and G. Caetano-Anollés, “Viral evolution: primordial cellular origins and late adaptation to parasitism,” Mobile Genetic Elements, vol. 2, no. 5, pp. 247–252, 2012.
[54]
C. Brochier-Armanet, P. Forterre, and S. Gribaldo, “Phylogeny and evolution of the Archaea: one hundred genomes later,” Current Opinion in Microbiology, vol. 14, no. 3, pp. 274–281, 2011.
[55]
E. Desmond, C. Brochier-Armanet, P. Forterre, and S. Gribaldo, “On the last common ancestor and early evolution of eukaryotes: reconstructing the history of mitochondrial ribosomes,” Research in Microbiology, vol. 162, no. 1, pp. 53–70, 2011.
[56]
O. Kandler, “Cell wall biochemistry and three-domain concept of life,” Systematic and Applied Microbiology, vol. 16, no. 4, pp. 501–509, 1994.
[57]
C. R. Woese, “On the evolution of cells,” Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 13, pp. 8742–8747, 2002.