ILRC: a hybrid biomarker discovery algorithm based on improved L1 regularization and clustering in microarray data
BACKGROUND: Finding significant genes or proteins from gene chip data for disease diagnosis and drug development is an important task. However, the challenge comes from the curse of the data dimension. It is of great significance to use machine learning methods to find important features from the da...
Main authors: | Yu, Kun; Xie, Weidong; Wang, Linjie; Li, Wei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8532312/ https://www.ncbi.nlm.nih.gov/pubmed/34686127 http://dx.doi.org/10.1186/s12859-021-04443-7 |
_version_ | 1784587041966653440 |
---|---|
author | Yu, Kun Xie, Weidong Wang, Linjie Li, Wei |
author_facet | Yu, Kun Xie, Weidong Wang, Linjie Li, Wei |
author_sort | Yu, Kun |
collection | PubMed |
description | BACKGROUND: Finding significant genes or proteins from gene chip data for disease diagnosis and drug development is an important task. However, the challenge comes from the curse of the data dimension. It is of great significance to use machine learning methods to find important features from the data and build an accurate classification model. RESULTS: The proposed method has proved superior to the published advanced hybrid feature selection method and traditional feature selection method on different public microarray data sets. In addition, the biomarkers selected using our method show a match to those provided by the cooperative hospital in a set of clinical cleft lip and palate data. METHOD: In this paper, a feature selection algorithm ILRC based on clustering and improved L1 regularization is proposed. The features are firstly clustered, and the redundant features in the sub-clusters are deleted. Then all the remaining features are iteratively evaluated using ILR. The final result is given according to the cumulative weight reordering. CONCLUSION: The proposed method can effectively remove redundant features. The algorithm’s output has high stability and classification accuracy, which can potentially select potential biomarkers. |
format | Online Article Text |
id | pubmed-8532312 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8532312 2021-10-25 ILRC: a hybrid biomarker discovery algorithm based on improved L1 regularization and clustering in microarray data Yu, Kun Xie, Weidong Wang, Linjie Li, Wei BMC Bioinformatics Research BACKGROUND: Finding significant genes or proteins from gene chip data for disease diagnosis and drug development is an important task. However, the challenge comes from the curse of the data dimension. It is of great significance to use machine learning methods to find important features from the data and build an accurate classification model. RESULTS: The proposed method has proved superior to the published advanced hybrid feature selection method and traditional feature selection method on different public microarray data sets. In addition, the biomarkers selected using our method show a match to those provided by the cooperative hospital in a set of clinical cleft lip and palate data. METHOD: In this paper, a feature selection algorithm ILRC based on clustering and improved L1 regularization is proposed. The features are firstly clustered, and the redundant features in the sub-clusters are deleted. Then all the remaining features are iteratively evaluated using ILR. The final result is given according to the cumulative weight reordering. CONCLUSION: The proposed method can effectively remove redundant features. The algorithm’s output has high stability and classification accuracy, which can potentially select potential biomarkers. |
BioMed Central 2021-10-22 /pmc/articles/PMC8532312/ /pubmed/34686127 http://dx.doi.org/10.1186/s12859-021-04443-7 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/). The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/ (https://creativecommons.org/publicdomain/zero/1.0/)) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Yu, Kun Xie, Weidong Wang, Linjie Li, Wei ILRC: a hybrid biomarker discovery algorithm based on improved L1 regularization and clustering in microarray data |
title | ILRC: a hybrid biomarker discovery algorithm based on improved L1 regularization and clustering in microarray data |
title_full | ILRC: a hybrid biomarker discovery algorithm based on improved L1 regularization and clustering in microarray data |
title_fullStr | ILRC: a hybrid biomarker discovery algorithm based on improved L1 regularization and clustering in microarray data |
title_full_unstemmed | ILRC: a hybrid biomarker discovery algorithm based on improved L1 regularization and clustering in microarray data |
title_short | ILRC: a hybrid biomarker discovery algorithm based on improved L1 regularization and clustering in microarray data |
title_sort | ilrc: a hybrid biomarker discovery algorithm based on improved l1 regularization and clustering in microarray data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8532312/ https://www.ncbi.nlm.nih.gov/pubmed/34686127 http://dx.doi.org/10.1186/s12859-021-04443-7 |
work_keys_str_mv | AT yukun ilrcahybridbiomarkerdiscoveryalgorithmbasedonimprovedl1regularizationandclusteringinmicroarraydata AT xieweidong ilrcahybridbiomarkerdiscoveryalgorithmbasedonimprovedl1regularizationandclusteringinmicroarraydata AT wanglinjie ilrcahybridbiomarkerdiscoveryalgorithmbasedonimprovedl1regularizationandclusteringinmicroarraydata AT liwei ilrcahybridbiomarkerdiscoveryalgorithmbasedonimprovedl1regularizationandclusteringinmicroarraydata |
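
The METHOD summary in the description field (cluster the features, delete redundant ones, evaluate the rest iteratively with L1 regularization, rank by cumulative weight) can be illustrated with a rough sketch. This is not the authors' ILRC or ILR implementation, which this record does not specify. As loudly stated assumptions: greedy correlation grouping stands in for the clustering step, a plain L1-penalised logistic regression fitted by proximal gradient on bootstrap resamples stands in for ILR, and every function name, threshold, and hyperparameter below is hypothetical.

```python
import numpy as np


def drop_redundant(X, threshold=0.9):
    """Stand-in for the clustering step (assumption, not the paper's method):
    keep a feature only if its absolute correlation with every feature
    already kept is below `threshold`."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in kept):
            kept.append(j)
    return np.array(kept)


def l1_logistic(X, y, lam=0.05, lr=0.1, steps=300):
    """Plain L1-penalised logistic regression fitted by proximal gradient.
    A stand-in for the paper's improved L1 regularization (ILR)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))          # predicted probabilities
        w = w - lr * (X.T @ (p - y)) / len(y)        # gradient step on the log-loss
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # soft threshold
    return w


def rank_features(X, y, rounds=20, seed=0):
    """Return (feature indices, cumulative |weight|), best feature first."""
    rng = np.random.default_rng(seed)
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)  # standardise columns
    kept = drop_redundant(X)
    total = np.zeros(len(kept))
    for _ in range(rounds):
        # Bootstrap resample of the samples, then accumulate absolute weights.
        idx = rng.choice(len(y), size=len(y), replace=True)
        total += np.abs(l1_logistic(X[idx][:, kept], y[idx]))
    order = np.argsort(-total)
    return kept[order], total[order]
```

Calling `rank_features(X, y)` with a samples-by-genes matrix `X` and a 0/1 label vector `y` returns the surviving column indices sorted by cumulative weight. Accumulating weights over resamples is one simple way to stabilise an L1 ranking; the paper's actual iteration scheme may differ.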