RCCC_Pred: A Novel Method for Sequence-Based Identification of Renal Clear Cell Carcinoma Genes through DNA Mutations and a Blend of Features

To save lives from cancer, it is very crucial to diagnose it at its early stages. One solution to early diagnosis lies in the identification of the cancer driver genes and their mutations. Such diagnostics can substantially minimize the mortality rate of this deadly disease. However, concurrently, t...

Bibliographic Details
Main Authors: Hassan, Arfa, Alkhalifah, Tamim, Alturise, Fahad, Khan, Yaser Daanial
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9776995/
https://www.ncbi.nlm.nih.gov/pubmed/36553042
http://dx.doi.org/10.3390/diagnostics12123036
_version_ 1784855995750547456
author Hassan, Arfa
Alkhalifah, Tamim
Alturise, Fahad
Khan, Yaser Daanial
collection PubMed
description Early diagnosis is crucial to saving lives from cancer. One route to early diagnosis is the identification of cancer driver genes and their mutations, which can substantially reduce the mortality rate of this disease. However, identifying cancer driver gene mutations experimentally can be expensive, slow, and laborious. Computational strategies that can predict cancer growth early, effectively, and accurately are therefore needed to support early diagnosis and reduce mortality. Herein, we aim to predict renal clear cell carcinoma (RCCC) at the gene level using genomic sequences. The dataset was taken from the IntOGen Cancer Mutations Browser, and the standard DNA sequences of all genes were taken from the NCBI database. Using the cancer-associated mutation information from IntOGen, the benchmark dataset was generated by introducing the mutations into the original sequences. After extensive feature extraction, the dataset was used to train an ANN + Hist Gradient Boosting model to classify RCCC genes, other cancer-associated genes, and non-cancerous/unknown (non-tumor driver) genes. On an independent test dataset, the observed accuracy was 83%, whereas 10-fold cross-validation and jackknife validation yielded accuracies of 98% and 100%, respectively. The proposed predictor, RCCC_Pred, identifies RCCC genes with high accuracy and efficiency and can help researchers predict and diagnose cancer at its early stages.
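The benchmark-construction and validation steps named in the description can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`apply_substitution`, `kmer_frequencies`, `k_fold_indices`) are hypothetical, only single-base substitutions are handled, and k-mer frequencies are an assumed stand-in for the feature set, which this record does not describe. Setting the number of folds equal to the number of samples gives the jackknife (leave-one-out) scheme.

```python
from collections import Counter
from itertools import product


def apply_substitution(seq, pos, ref, alt):
    """Return seq with the base at 0-based position pos changed from ref to alt."""
    if seq[pos] != ref:
        raise ValueError(f"expected {ref!r} at position {pos}, found {seq[pos]!r}")
    return seq[:pos] + alt + seq[pos + 1:]


def kmer_frequencies(seq, k=2):
    """Relative frequency of every DNA k-mer, in fixed lexicographic order.

    Assumption: a simple placeholder feature, not the paper's feature blend.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    windows = max(len(seq) - k + 1, 1)
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts[kmer] / windows for kmer in kmers]


def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over n samples.

    With k == n, each test fold holds one sample (jackknife / leave-one-out).
    """
    for fold in range(k):
        test = list(range(fold * n // k, (fold + 1) * n // k))
        test_set = set(test)
        train = [i for i in range(n) if i not in test_set]
        yield train, test


# Hypothetical usage: mutate a toy reference sequence and featurize it.
reference = "ACGTACGT"
mutated = apply_substitution(reference, 2, "G", "T")
features = kmer_frequencies(mutated, k=2)
folds = list(k_fold_indices(n=10, k=10))  # jackknife over 10 samples
```

A classifier would then be trained on the `train` indices of each fold and scored on the `test` indices; the independent-set accuracy reported above corresponds to a separate held-out dataset rather than to any of these folds.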
format Online
Article
Text
id pubmed-9776995
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9776995 2022-12-23 Hassan, Arfa; Alkhalifah, Tamim; Alturise, Fahad; Khan, Yaser Daanial. Diagnostics (Basel). Article.
MDPI 2022-12-03 /pmc/articles/PMC9776995/ /pubmed/36553042 http://dx.doi.org/10.3390/diagnostics12123036 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title RCCC_Pred: A Novel Method for Sequence-Based Identification of Renal Clear Cell Carcinoma Genes through DNA Mutations and a Blend of Features
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9776995/
https://www.ncbi.nlm.nih.gov/pubmed/36553042
http://dx.doi.org/10.3390/diagnostics12123036