Sampling informational properties of codon usage through the tree of life

Bibliographic Details
Published in: PLoS One vol. 20, no. 11 (Nov 2025), p. e0335824
Main Author: Martínez, Octavio
Other Authors: Reyes-Valdés, Manuel Humberto; Ochoa-Alejo, Neftalí
Published: Public Library of Science
Description
Abstract: The genetic code, a unifying principle in biology, ensures that all organisms, stemming from a Last Universal Common Ancestor (LUCA), share fundamental rules for translating DNA into proteins. However, codon usage varies across the tree of life, influenced not only by GC-content and proteome composition but also by complex, often less understood rules dependent on each species’ evolutionary trajectory. To better understand these rules, we segregated codons into their functional parts and applied Shannon’s information-theoretic measures to 1,434 species from eight diverse taxonomic groups. We provide robust evidence that the first codon base plays a central role in amino acid determination, while the third base serves an accessory function. Using conditional entropy measures, we rigorously quantified this relationship, universally confirming the greater informational variability of the third base across all sampled species for the first time at this scale. Our analysis revealed significant heterogeneity in coding strategies across different taxonomic groups. Notably, the unique variability observed in Archaea, in contrast to the more constrained patterns in Eukaryotes and Bacteria, underscores the profound influence of evolutionary pressures and distinct life histories on genetic information processing. The identification of outlier species, exhibiting distinct informational profiles, highlights specific instances where unusual lifestyles or ecological niches may have driven unique adaptations in codon usage and underlying informational dependencies. These informational patterns offer a complementary perspective to traditional phylogenetic analyses, further revealing a hierarchical organization of informational dependencies among codon components that sheds light on the intricate grammar of genetic information.
We also rigorously investigated the relationship between GC-content and our informational measures, concluding that these entropy measures provide valuable insights that cannot be obtained from GC-content alone. This work not only offers a novel framework for quantifying informational properties of codon usage but also reveals previously unappreciated aspects of how genetic information is encoded and processed across life’s domains.
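The abstract mentions Shannon entropy and conditional entropy over codon positions without giving formulas. As a rough illustration only, the sketch below computes the per-position entropies H(B1), H(B2), H(B3) and the conditional entropy H(B3 | B1, B2) = H(B1, B2, B3) - H(B1, B2) from a table of codon counts. The function names and the toy counts are hypothetical, and the measures actually used in the article may be defined differently.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (bits) of a distribution given as outcome -> count."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)


def position_entropies(codon_counts):
    """Return [H(B1), H(B2), H(B3)] for the three codon positions."""
    result = []
    for pos in range(3):
        marginal = Counter()
        for codon, n in codon_counts.items():
            marginal[codon[pos]] += n  # accumulate counts per base at this position
        result.append(entropy(marginal))
    return result


def conditional_entropy_third(codon_counts):
    """H(B3 | B1, B2), computed as H(B1, B2, B3) - H(B1, B2)."""
    prefix = Counter()
    for codon, n in codon_counts.items():
        prefix[codon[:2]] += n  # marginalize out the third base
    return entropy(codon_counts) - entropy(prefix)


# Hypothetical toy data (not from the article): four synonymous
# alanine codons used evenly, plus the single methionine codon.
toy_counts = {"GCU": 10, "GCC": 10, "GCA": 10, "GCG": 10, "AUG": 40}
```

With these toy counts, the third base stays uncertain after the first two are known only for the `GC` prefix, so `conditional_entropy_third(toy_counts)` averages two bits for that half of the data with zero bits for `AUG`.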
ISSN:1932-6203
DOI:10.1371/journal.pone.0335824
Source: Health & Medical Collection