
Fizzy: feature subset selection for metagenomics

BACKGROUND: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- and β-diversity. Feature subset selection, a sub-field of machine learning, can also provide a unique insight into the differe...


Bibliographic Details
Main authors: Ditzler, Gregory; Morrison, J. Calvin; Lan, Yemin; Rosen, Gail L.
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4634798/
https://www.ncbi.nlm.nih.gov/pubmed/26538306
http://dx.doi.org/10.1186/s12859-015-0793-8
author Ditzler, Gregory
Morrison, J. Calvin
Lan, Yemin
Rosen, Gail L.
collection PubMed
description BACKGROUND: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- and β-diversity. Feature subset selection, a sub-field of machine learning, can also provide a unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can identify the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate between age groups in the human gut microbiome. RESULTS: We have developed a new Python command-line tool for microbial ecologists, compatible with the widely adopted BIOM format, that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets. CONCLUSIONS: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.
format Online
Article
Text
id pubmed-4634798
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4634798 2015-11-06 Fizzy: feature subset selection for metagenomics Ditzler, Gregory Morrison, J. Calvin Lan, Yemin Rosen, Gail L. BMC Bioinformatics Software BioMed Central 2015-11-04 /pmc/articles/PMC4634798/ /pubmed/26538306 http://dx.doi.org/10.1186/s12859-015-0793-8 Text en © Ditzler et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Fizzy: feature subset selection for metagenomics
topic Software