Metagenomic ventures into outer sequence space


Bibliographic Details
Main author: Dutilh, Bas E
Format: Online Article Text
Language: English
Published: Taylor & Francis 2014
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4588555/
https://www.ncbi.nlm.nih.gov/pubmed/26458273
http://dx.doi.org/10.4161/21597081.2014.979664
Description
Summary: Sequencing DNA or RNA directly from the environment often yields many sequencing reads that have no homologs in the database. These are referred to as “unknowns,” and reflect the vast unexplored microbial sequence space of our biosphere, also known as “biological dark matter.” However, unknowns also exist because metagenomic datasets are not optimally mined. Researchers are under pressure to publish and move on, so the unknown sequences are often left as they are and conclusions are drawn based on reads with annotated homologs. This can cause abundant and widespread genomes to be overlooked, such as the recently discovered human gut bacteriophage crAssphage. The unknowns may be enriched for bacteriophage sequences, the most abundant and genetically diverse component of the biosphere and of sequence space. However, the actual size of biological sequence space remains an open question. The de novo assembly of shotgun metagenomes is the most powerful tool to address this question.