Fizzy: feature subset selection for metagenomics
BACKGROUND: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α– & β–diversity. Feature subset selection – a sub-field of machine learning – can also provide a unique insight into the differe...
Main Authors: | Ditzler, Gregory; Morrison, J. Calvin; Lan, Yemin; Rosen, Gail L. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4634798/ https://www.ncbi.nlm.nih.gov/pubmed/26538306 http://dx.doi.org/10.1186/s12859-015-0793-8 |
_version_ | 1782399420645507072 |
---|---|
author | Ditzler, Gregory Morrison, J. Calvin Lan, Yemin Rosen, Gail L. |
author_facet | Ditzler, Gregory Morrison, J. Calvin Lan, Yemin Rosen, Gail L. |
author_sort | Ditzler, Gregory |
collection | PubMed |
description | BACKGROUND: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- and β-diversity. Feature subset selection – a sub-field of machine learning – can also provide a unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can identify the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate between age groups in the human gut microbiome. RESULTS: We have developed a new Python command line tool for microbial ecologists, compatible with the widely adopted BIOM format, that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets. CONCLUSIONS: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy. |
format | Online Article Text |
id | pubmed-4634798 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-46347982015-11-06 Fizzy: feature subset selection for metagenomics Ditzler, Gregory Morrison, J. Calvin Lan, Yemin Rosen, Gail L. BMC Bioinformatics Software BACKGROUND: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- and β-diversity. Feature subset selection – a sub-field of machine learning – can also provide a unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can identify the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate between age groups in the human gut microbiome. RESULTS: We have developed a new Python command line tool for microbial ecologists, compatible with the widely adopted BIOM format, that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets. CONCLUSIONS: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy. BioMed Central 2015-11-04 /pmc/articles/PMC4634798/ /pubmed/26538306 http://dx.doi.org/10.1186/s12859-015-0793-8 Text en © Ditzler et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Software Ditzler, Gregory Morrison, J. Calvin Lan, Yemin Rosen, Gail L. Fizzy: feature subset selection for metagenomics |
title | Fizzy: feature subset selection for metagenomics |
title_full | Fizzy: feature subset selection for metagenomics |
title_fullStr | Fizzy: feature subset selection for metagenomics |
title_full_unstemmed | Fizzy: feature subset selection for metagenomics |
title_short | Fizzy: feature subset selection for metagenomics |
title_sort | fizzy: feature subset selection for metagenomics |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4634798/ https://www.ncbi.nlm.nih.gov/pubmed/26538306 http://dx.doi.org/10.1186/s12859-015-0793-8 |
work_keys_str_mv | AT ditzlergregory fizzyfeaturesubsetselectionformetagenomics AT morrisonjcalvin fizzyfeaturesubsetselectionformetagenomics AT lanyemin fizzyfeaturesubsetselectionformetagenomics AT rosengaill fizzyfeaturesubsetselectionformetagenomics |