
Unsupervised encoding selection through ensemble pruning for biomedical classification

BACKGROUND: Owing to the rising levels of multi-resistant pathogens, antimicrobial peptides, an alternative strategy to classic antibiotics, got more attention. A crucial part is thereby the costly identification and validation. With the ever-growing amount of annotated peptides, researchers leverag...


Bibliographic Details
Main Authors:	Spänig, Sebastian; Michel, Alexander; Heider, Dominik
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2023
Subjects:	Methodology
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10018861/ https://www.ncbi.nlm.nih.gov/pubmed/36927546 http://dx.doi.org/10.1186/s13040-022-00317-7

author	Spänig, Sebastian; Michel, Alexander; Heider, Dominik
collection	PubMed
description	BACKGROUND: Owing to the rising levels of multi-resistant pathogens, antimicrobial peptides, an alternative strategy to classic antibiotics, got more attention. A crucial part is thereby the costly identification and validation. With the ever-growing amount of annotated peptides, researchers leverage artificial intelligence to circumvent the cumbersome, wet-lab-based identification and automate the detection of promising candidates. However, the prediction of a peptide’s function is not limited to antimicrobial efficiency. To date, multiple studies successfully classified additional properties, e.g., antiviral or cell-penetrating effects. In this light, ensemble classifiers are employed aiming to further improve the prediction. Although we recently presented a workflow to significantly diminish the initial encoding choice, an entire unsupervised encoding selection, considering various machine learning models, is still lacking. RESULTS: We developed a workflow, automatically selecting encodings and generating classifier ensembles by employing sophisticated pruning methods. We observed that the Pareto frontier pruning is a good method to create encoding ensembles for the datasets at hand. In addition, encodings combined with the Decision Tree classifier as the base model are often superior. However, our results also demonstrate that none of the ensemble building techniques is outstanding for all datasets. CONCLUSION: The workflow conducts multiple pruning methods to evaluate ensemble classifiers composed from a wide range of peptide encodings and base models. Consequently, researchers can use the workflow for unsupervised encoding selection and ensemble creation. Ultimately, the extensible workflow can be used as a plugin for the PEPTIDE REACToR, further establishing it as a versatile tool in the domain. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13040-022-00317-7.
format	Online Article Text
id	pubmed-10018861
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-10018861 2023-03-17 Unsupervised encoding selection through ensemble pruning for biomedical classification Spänig, Sebastian; Michel, Alexander; Heider, Dominik BioData Min Methodology BioMed Central 2023-03-16 /pmc/articles/PMC10018861/ /pubmed/36927546 http://dx.doi.org/10.1186/s13040-022-00317-7 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title	Unsupervised encoding selection through ensemble pruning for biomedical classification
topic	Methodology
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10018861/ https://www.ncbi.nlm.nih.gov/pubmed/36927546 http://dx.doi.org/10.1186/s13040-022-00317-7
