Multiple similarly effective solutions exist for biomedical feature selection and classification problems

Bibliographic Details
Main Authors: Liu, Jiamei; Xu, Cheng; Yang, Weifeng; Shu, Yayun; Zheng, Weiwei; Zhou, Fengfeng
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5634418/
https://www.ncbi.nlm.nih.gov/pubmed/28993656
http://dx.doi.org/10.1038/s41598-017-13184-8
_version_ 1783270086158581760
author Liu, Jiamei
Xu, Cheng
Yang, Weifeng
Shu, Yayun
Zheng, Weiwei
Zhou, Fengfeng
author_sort Liu, Jiamei
collection PubMed
description Binary classification is a widely employed problem to facilitate the decisions on various biomedical big data questions, such as clinical drug trials between treated participants and controls, and genome-wide association studies (GWASs) between participants with or without a phenotype. A machine learning model is trained for this purpose by optimizing the power of discriminating samples from two groups. However, most of the classification algorithms tend to generate one locally optimal solution according to the input dataset and the mathematical presumptions of the dataset. Here we demonstrated from the aspects of both disease classification and feature selection that multiple different solutions may have similar classification performances. So the existing machine learning algorithms may have ignored a horde of fishes by catching only a good one. Since most of the existing machine learning algorithms generate a solution by optimizing a mathematical goal, it may be essential for understanding the biological mechanisms for the investigated classification question, by considering both the generated solution and the ignored ones.
format Online
Article
Text
id pubmed-5634418
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-5634418 2017-10-18 Multiple similarly effective solutions exist for biomedical feature selection and classification problems Liu, Jiamei Xu, Cheng Yang, Weifeng Shu, Yayun Zheng, Weiwei Zhou, Fengfeng Sci Rep Article Binary classification is a widely employed problem to facilitate the decisions on various biomedical big data questions, such as clinical drug trials between treated participants and controls, and genome-wide association studies (GWASs) between participants with or without a phenotype. A machine learning model is trained for this purpose by optimizing the power of discriminating samples from two groups. However, most of the classification algorithms tend to generate one locally optimal solution according to the input dataset and the mathematical presumptions of the dataset. Here we demonstrated from the aspects of both disease classification and feature selection that multiple different solutions may have similar classification performances. So the existing machine learning algorithms may have ignored a horde of fishes by catching only a good one. Since most of the existing machine learning algorithms generate a solution by optimizing a mathematical goal, it may be essential for understanding the biological mechanisms for the investigated classification question, by considering both the generated solution and the ignored ones. Nature Publishing Group UK 2017-10-09 /pmc/articles/PMC5634418/ /pubmed/28993656 http://dx.doi.org/10.1038/s41598-017-13184-8 Text en © The Author(s) 2017 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Liu, Jiamei
Xu, Cheng
Yang, Weifeng
Shu, Yayun
Zheng, Weiwei
Zhou, Fengfeng
Multiple similarly effective solutions exist for biomedical feature selection and classification problems
title Multiple similarly effective solutions exist for biomedical feature selection and classification problems
title_sort multiple similarly effective solutions exist for biomedical feature selection and classification problems
topic Article