Effects of Multi-Omics Characteristics on Identification of Driver Genes Using Machine Learning Algorithms

Cancer is a complex disease caused by genomic and epigenetic alterations; hence, identifying meaningful cancer drivers is an important and challenging task. Most studies have detected cancer drivers with mutated traits, while few studies consider multiple omics characteristics as important factors....

Bibliographic Details
Main Authors: Li, Feng; Chu, Xin; Dai, Lingyun; Wang, Juan; Liu, Jinxing; Shang, Junliang
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9141966/
https://www.ncbi.nlm.nih.gov/pubmed/35627101
http://dx.doi.org/10.3390/genes13050716
author Li, Feng
Chu, Xin
Dai, Lingyun
Wang, Juan
Liu, Jinxing
Shang, Junliang
collection PubMed
description Cancer is a complex disease caused by genomic and epigenetic alterations; hence, identifying meaningful cancer drivers is an important and challenging task. Most studies have detected cancer drivers with mutated traits, while few studies consider multiple omics characteristics as important factors. In this study, we present a framework to analyze the effects of multi-omics characteristics on the identification of driver genes. We utilize four machine learning algorithms within this framework to detect cancer driver genes in pan-cancer data, including 75 characteristics among 19,636 genes. The 75 features are divided into four types and analyzed using Kullback–Leibler divergence based on CGC genes and non-CGC genes. We detect cancer driver genes in two different ways. One is to detect driver genes from a single feature type, while the other is from the top N features. The first analysis denotes that the mutational features are the best characteristics. The second analysis reveals that the top 45 features are the most effective feature combinations and superior to the mutational features. The top 45 features not only contain mutational features but also three other types of features. Therefore, our study extends the detection of cancer driver genes and provides a more comprehensive understanding of cancer mechanisms.
format Online
Article
Text
id pubmed-9141966
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9141966 2022-05-28 Effects of Multi-Omics Characteristics on Identification of Driver Genes Using Machine Learning Algorithms Li, Feng Chu, Xin Dai, Lingyun Wang, Juan Liu, Jinxing Shang, Junliang Genes (Basel) Article Cancer is a complex disease caused by genomic and epigenetic alterations; hence, identifying meaningful cancer drivers is an important and challenging task. Most studies have detected cancer drivers with mutated traits, while few studies consider multiple omics characteristics as important factors. In this study, we present a framework to analyze the effects of multi-omics characteristics on the identification of driver genes. We utilize four machine learning algorithms within this framework to detect cancer driver genes in pan-cancer data, including 75 characteristics among 19,636 genes. The 75 features are divided into four types and analyzed using Kullback–Leibler divergence based on CGC genes and non-CGC genes. We detect cancer driver genes in two different ways. One is to detect driver genes from a single feature type, while the other is from the top N features. The first analysis denotes that the mutational features are the best characteristics. The second analysis reveals that the top 45 features are the most effective feature combinations and superior to the mutational features. The top 45 features not only contain mutational features but also three other types of features. Therefore, our study extends the detection of cancer driver genes and provides a more comprehensive understanding of cancer mechanisms. MDPI 2022-04-19 /pmc/articles/PMC9141966/ /pubmed/35627101 http://dx.doi.org/10.3390/genes13050716 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Effects of Multi-Omics Characteristics on Identification of Driver Genes Using Machine Learning Algorithms
topic Article