Translational utility of a hierarchical classification strategy in biomolecular data analytics

Hierarchical classification (HC) stratifies and classifies data from broad classes into more specific classes. Unlike commonly used data classification strategies, this enables the probabilistic prediction of unknown classes at different levels, minimizing the burden of incomplete databases. Despite...

Bibliographic Details
Main authors: Galea, Dieter, Inglese, Paolo, Cammack, Lidia, Strittmatter, Nicole, Rebec, Monica, Mirnezami, Reza, Laponogov, Ivan, Kinross, James, Nicholson, Jeremy, Takats, Zoltan, Veselkov, Kirill A.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5670129/
https://www.ncbi.nlm.nih.gov/pubmed/29101330
http://dx.doi.org/10.1038/s41598-017-14092-7
author Galea, Dieter
Inglese, Paolo
Cammack, Lidia
Strittmatter, Nicole
Rebec, Monica
Mirnezami, Reza
Laponogov, Ivan
Kinross, James
Nicholson, Jeremy
Takats, Zoltan
Veselkov, Kirill A.
author_sort Galea, Dieter
collection PubMed
description Hierarchical classification (HC) stratifies and classifies data from broad classes into more specific classes. Unlike commonly used data classification strategies, this enables the probabilistic prediction of unknown classes at different levels, minimizing the burden of incomplete databases. Despite these advantages, its translational application in biomedical sciences has been limited. We describe and demonstrate the implementation of an HC approach for “omics-driven” classification of 15 bacterial species at various taxonomic levels achieving 90–100% accuracy, and 9 cancer types into morphological types and 35 subtypes with 99% and 76% accuracy, respectively. Unknown bacterial species were probabilistically assigned with 100% accuracy to their respective genus or family using mass spectra (n = 284). Cancer types were predicted by mRNA data (n = 1960) for most subtypes with 95–100% accuracy. This has high relevance in clinical practice where complete datasets are difficult to compile with the continuous evolution of diseases and emergence of new strains, yet prediction of unknown classes, such as bacterial species, at upper hierarchy levels may be sufficient to initiate antimicrobial therapy. The algorithms presented here can be directly translated into clinical use with any quantitative data, and have broad application potential, from unlabeled sample identification, to hierarchical feature selection, and discovery of new taxonomic variants.
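The idea in the description, classifying from broad to specific levels and stopping at an upper level when a sample looks like an unknown class, can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes a nearest-centroid classifier at each level, a hypothetical two-level genus/species taxonomy with toy 2-D features, and an uncalibrated confidence score exp(-(d/scale)^2) with hand-picked scales and threshold.

```python
import math

# Hypothetical toy taxonomy (genus -> species -> feature vectors); not data from the article.
TAXONOMY = {
    "GenusA": {"species_a1": [[0.0, 0.0], [0.2, 0.2]],
               "species_a2": [[1.0, 0.0], [1.2, 0.2]]},
    "GenusB": {"species_b1": [[5.0, 5.0], [5.2, 5.2]],
               "species_b2": [[6.0, 5.0], [6.2, 5.2]]},
}

def centroid(rows):
    """Mean feature vector of a list of samples."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def build(tree):
    """Attach a centroid to every node; leaves are sample lists, inner nodes are dicts."""
    if isinstance(tree, dict):
        children = {name: build(sub) for name, sub in tree.items()}
        rows = [r for child in children.values() for r in child["rows"]]
        return {"children": children, "rows": rows, "centroid": centroid(rows)}
    return {"children": {}, "rows": tree, "centroid": centroid(tree)}

def predict(node, x, scales, threshold=0.5):
    """Descend from broad to specific classes, stopping when confidence drops.

    scales[i] is an assumed distance scale for hierarchy level i. The score
    exp(-(d/scale)^2) is an illustrative pseudo-probability, not a calibrated one.
    """
    path = []
    depth = 0
    while node["children"]:
        name, child = min(node["children"].items(),
                          key=lambda kv: math.dist(x, kv[1]["centroid"]))
        d = math.dist(x, child["centroid"])
        conf = math.exp(-(d / scales[depth]) ** 2)
        if conf < threshold:
            break  # sample does not fit any known class at this level
        path.append((name, conf))
        node = child
        depth += 1
    return path

MODEL = build(TAXONOMY)
SCALES = [2.0, 0.3]  # assumed: genus clusters are wide, species clusters are tight

def classify(x):
    """Return the list of class names assigned from the top level down."""
    return [name for name, _ in predict(MODEL, x, SCALES)]

if __name__ == "__main__":
    print(classify([0.1, 0.0]))     # close to a known species
    print(classify([0.5, 0.9]))     # inside GenusA but far from both known species
    print(classify([20.0, -20.0]))  # far from every known genus
```

In this sketch a sample near a known species receives a full genus and species path, a sample that falls within a genus but away from its known species is reported at genus level only, and a sample far from everything receives no label. A real application would replace the centroid rule and the fixed scales with a trained probabilistic classifier per node.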
format Online
Article
Text
id pubmed-5670129
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-5670129 2017-11-15 Translational utility of a hierarchical classification strategy in biomolecular data analytics Sci Rep Article
Nature Publishing Group UK 2017-11-03 /pmc/articles/PMC5670129/ /pubmed/29101330 http://dx.doi.org/10.1038/s41598-017-14092-7 Text en © The Author(s) 2017 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Translational utility of a hierarchical classification strategy in biomolecular data analytics
topic Article