Machine Learning Based Computational Gene Selection Models: A Survey, Performance Evaluation, Open Issues, and Future Research Directions
Gene expression is the process by which the information in a gene is used to produce proteins, which in turn determine the physical characteristics of living beings. It takes place in two steps, transcription and translation. Information flows from DNA to RNA with the help of enzymes, and the end products are proteins...
Main authors: | Mahendran, Nivedhitha; Durai Raj Vincent, P. M.; Srinivasan, Kathiravan; Chang, Chuan-Yu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2020 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7758324/ https://www.ncbi.nlm.nih.gov/pubmed/33362861 http://dx.doi.org/10.3389/fgene.2020.603808 |
_version_ | 1783626917407096832 |
---|---|
author | Mahendran, Nivedhitha; Durai Raj Vincent, P. M.; Srinivasan, Kathiravan; Chang, Chuan-Yu |
collection | PubMed |
description | Gene expression is the process by which the information in a gene is used to produce proteins, which in turn determine the physical characteristics of living beings. It takes place in two steps, transcription and translation: information flows from DNA to RNA with the help of enzymes, and the end products are proteins and other biochemical molecules. Many technologies can capture gene expression from DNA or RNA; one of them is the DNA microarray. Besides being expensive, the main issue with DNA microarrays is that they generate high-dimensional data from a small number of samples. A learning model trained on such a dataset tends to overfit, so the dimensionality of the data must be reduced substantially. In recent years, machine learning has gained popularity in genomic studies, and many machine learning-based gene selection approaches have been proposed to make dimensionality reduction more precise. This paper extensively reviews recent work on machine learning-based gene selection, along with its performance analysis. The study categorizes feature selection algorithms as supervised, unsupervised, or semi-supervised. Recent work on reducing the features used for diagnosing tumors is discussed in detail, and the performance of several methods from the literature is analyzed. The study also lists and briefly discusses the open issues in handling high-dimensional, small-sample-size data. |
format | Online Article Text |
id | pubmed-7758324 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7758324 2020-12-25 Front Genet Genetics Frontiers Media S.A. 2020-12-10 /pmc/articles/PMC7758324/ /pubmed/33362861 http://dx.doi.org/10.3389/fgene.2020.603808 Text en Copyright © 2020 Mahendran, Durai Raj Vincent, Srinivasan and Chang. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Machine Learning Based Computational Gene Selection Models: A Survey, Performance Evaluation, Open Issues, and Future Research Directions |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7758324/ https://www.ncbi.nlm.nih.gov/pubmed/33362861 http://dx.doi.org/10.3389/fgene.2020.603808 |