The R package otu2ot for implementing the entropy decomposition of nucleotide variation in sequence data

Oligotyping is a novel, supervised computational method that classifies closely related sequences into “oligotypes” (OTs) based on subtle nucleotide variation (Eren et al., 2013). Its application to microbial datasets has helped reveal ecological patterns which are often hidden by the way sequence d...

Bibliographic Details
Main authors:	Ramette, Alban; Buttigieg, Pier Luigi
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2014
Subjects:	Microbiology
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4231947/ https://www.ncbi.nlm.nih.gov/pubmed/25452747 http://dx.doi.org/10.3389/fmicb.2014.00601

author	Ramette, Alban Buttigieg, Pier Luigi
collection	PubMed
description	Oligotyping is a novel, supervised computational method that classifies closely related sequences into “oligotypes” (OTs) based on subtle nucleotide variation (Eren et al., 2013). Its application to microbial datasets has helped reveal ecological patterns which are often hidden by the way sequence data are currently clustered to define operational taxonomic units (OTUs). Here, we implemented the OT entropy decomposition procedure and its unsupervised version, Minimal Entropy Decomposition (MED; Eren et al., 2014c), in the statistical programming language and environment, R. The aim of this implementation is to facilitate the integration of computational routines, interactive statistical analyses, and visualization into a single framework. In addition, two complementary approaches are implemented: (1) An analytical method (the broken stick model) is proposed to help identify OTs of low abundance that could be generated by chance alone and (2) a one-pass profiling (OP) method, to efficiently identify those OTUs whose subsequent oligotyping would be most promising to be undertaken. These enhancements are especially useful for large datasets, where a manual screening of entropy analysis results and the creation of a full set of OTs may not be feasible. The package and procedures are illustrated by several tutorials and examples.
format	Online Article Text
id	pubmed-4231947
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-4231947 2014-12-01 Front Microbiol Microbiology Frontiers Media S.A. 2014-11-14 /pmc/articles/PMC4231947/ /pubmed/25452747 http://dx.doi.org/10.3389/fmicb.2014.00601 Text en Copyright © 2014 Ramette and Buttigieg. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	The R package otu2ot for implementing the entropy decomposition of nucleotide variation in sequence data
topic	Microbiology
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4231947/ https://www.ncbi.nlm.nih.gov/pubmed/25452747 http://dx.doi.org/10.3389/fmicb.2014.00601
