Multiple similarly effective solutions exist for biomedical feature selection and classification problems
Binary classification is a widely employed problem to facilitate the decisions on various biomedical big data questions, such as clinical drug trials between treated participants and controls, and genome-wide association studies (GWASs) between participants with or without a phenotype. A machine lea...
Main authors: | Liu, Jiamei; Xu, Cheng; Yang, Weifeng; Shu, Yayun; Zheng, Weiwei; Zhou, Fengfeng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5634418/ https://www.ncbi.nlm.nih.gov/pubmed/28993656 http://dx.doi.org/10.1038/s41598-017-13184-8 |
_version_ | 1783270086158581760 |
---|---|
author | Liu, Jiamei; Xu, Cheng; Yang, Weifeng; Shu, Yayun; Zheng, Weiwei; Zhou, Fengfeng |
author_sort | Liu, Jiamei |
collection | PubMed |
description | Binary classification is a widely employed problem to facilitate the decisions on various biomedical big data questions, such as clinical drug trials between treated participants and controls, and genome-wide association studies (GWASs) between participants with or without a phenotype. A machine learning model is trained for this purpose by optimizing the power of discriminating samples from two groups. However, most of the classification algorithms tend to generate one locally optimal solution according to the input dataset and the mathematical presumptions of the dataset. Here we demonstrated from the aspects of both disease classification and feature selection that multiple different solutions may have similar classification performances. So the existing machine learning algorithms may have ignored a horde of fishes by catching only a good one. Since most of the existing machine learning algorithms generate a solution by optimizing a mathematical goal, it may be essential for understanding the biological mechanisms for the investigated classification question, by considering both the generated solution and the ignored ones. |
format | Online Article Text |
id | pubmed-5634418 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-56344182017-10-18 Multiple similarly effective solutions exist for biomedical feature selection and classification problems Liu, Jiamei Xu, Cheng Yang, Weifeng Shu, Yayun Zheng, Weiwei Zhou, Fengfeng Sci Rep Article Binary classification is a widely employed problem to facilitate the decisions on various biomedical big data questions, such as clinical drug trials between treated participants and controls, and genome-wide association studies (GWASs) between participants with or without a phenotype. A machine learning model is trained for this purpose by optimizing the power of discriminating samples from two groups. However, most of the classification algorithms tend to generate one locally optimal solution according to the input dataset and the mathematical presumptions of the dataset. Here we demonstrated from the aspects of both disease classification and feature selection that multiple different solutions may have similar classification performances. So the existing machine learning algorithms may have ignored a horde of fishes by catching only a good one. Since most of the existing machine learning algorithms generate a solution by optimizing a mathematical goal, it may be essential for understanding the biological mechanisms for the investigated classification question, by considering both the generated solution and the ignored ones. Nature Publishing Group UK 2017-10-09 /pmc/articles/PMC5634418/ /pubmed/28993656 http://dx.doi.org/10.1038/s41598-017-13184-8 Text en © The Author(s) 2017 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
title | Multiple similarly effective solutions exist for biomedical feature selection and classification problems |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5634418/ https://www.ncbi.nlm.nih.gov/pubmed/28993656 http://dx.doi.org/10.1038/s41598-017-13184-8 |