A systematic search for discriminating sites in the 16S ribosomal RNA gene

Bibliographic Details
Main authors: Vinje, Hilde; Almøy, Trygve; Liland, Kristian Hovde; Snipen, Lars
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3910680/
https://www.ncbi.nlm.nih.gov/pubmed/24467869
http://dx.doi.org/10.1186/2042-5783-4-2
author Vinje, Hilde
Almøy, Trygve
Liland, Kristian Hovde
Snipen, Lars
collection PubMed
description BACKGROUND: The 16S rRNA is by far the most common genomic marker used for prokaryotic classification, and has been used extensively in metagenomic studies over recent years. Along the 16S gene there are regions with more or less variation across the kingdom of bacteria. Nine variable regions have been identified, flanked by more conserved parts of the sequence. It has been stated that the discriminatory power of the 16S marker lies in these variable regions. In the present study we wanted to examine this more closely, and used a supervised learning method to search systematically for sites that contribute to correct classification at either the phylum or genus level. RESULTS: When classifying phyla the site selection algorithm located 50 discriminative sites. These were scattered over most of the alignments and only around half of them were located in the variable regions. The selected sites did, however, have an entropy significantly larger than expected, meaning they are sites of large variation. We found that the discriminative sites typically have a large entropy compared to their closest neighbours along the alignments. When classifying genera the site selection algorithm needed around 80% of the sites in the 16S gene before the classification error reached a minimum. This means that all variation, in both variable and conserved regions, is needed in order to separate genera. CONCLUSIONS: Our findings do not support the statement that the discriminative power of the 16S gene is located only in the variable regions. Variable regions are important, but just as many discriminative sites are found in the more conserved parts. The discriminative power is typically found in sites of large variation located inside shorter regions of higher conservation.
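The abstract describes discriminative sites as alignment positions with large entropy. As a rough illustration only, the sketch below computes the Shannon entropy of each column in a small alignment. The function names and the toy sequences are hypothetical, not taken from the study, and gap characters would simply be counted as one more symbol here, which may differ from how the authors handled them.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy, in bits, of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def site_entropies(alignment):
    """Entropy of every column in a list of equal-length aligned sequences."""
    # zip(*alignment) transposes the rows (sequences) into columns (sites).
    return [column_entropy(col) for col in zip(*alignment)]


# Hypothetical toy alignment, not data from the study.
alignment = ["ACGT", "ACGA", "ACTC", "ACTG"]
entropies = site_entropies(alignment)
```

In this toy case the first two columns are fully conserved and get an entropy of zero, while the last column, where every sequence differs, gets the largest value. A high-entropy site next to low-entropy neighbours is the pattern the abstract associates with discriminative sites.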
format Online
Article
Text
id pubmed-3910680
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3910680 2014-02-14 A systematic search for discriminating sites in the 16S ribosomal RNA gene Vinje, Hilde Almøy, Trygve Liland, Kristian Hovde Snipen, Lars Microb Inform Exp Research
BioMed Central 2014-01-27 /pmc/articles/PMC3910680/ /pubmed/24467869 http://dx.doi.org/10.1186/2042-5783-4-2 Text en Copyright © 2014 Vinje et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A systematic search for discriminating sites in the 16S ribosomal RNA gene
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3910680/
https://www.ncbi.nlm.nih.gov/pubmed/24467869
http://dx.doi.org/10.1186/2042-5783-4-2