RCCC_Pred: A Novel Method for Sequence-Based Identification of Renal Clear Cell Carcinoma Genes through DNA Mutations and a Blend of Features
To save lives from cancer, it is very crucial to diagnose it at its early stages. One solution to early diagnosis lies in the identification of the cancer driver genes and their mutations. Such diagnostics can substantially minimize the mortality rate of this deadly disease. However, concurrently, t...
Main authors: | Hassan, Arfa; Alkhalifah, Tamim; Alturise, Fahad; Khan, Yaser Daanial |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9776995/ https://www.ncbi.nlm.nih.gov/pubmed/36553042 http://dx.doi.org/10.3390/diagnostics12123036 |
_version_ | 1784855995750547456 |
---|---|
author | Hassan, Arfa Alkhalifah, Tamim Alturise, Fahad Khan, Yaser Daanial |
author_facet | Hassan, Arfa Alkhalifah, Tamim Alturise, Fahad Khan, Yaser Daanial |
author_sort | Hassan, Arfa |
collection | PubMed |
description | To save lives from cancer, it is crucial to diagnose it in its early stages. One route to early diagnosis lies in identifying cancer driver genes and their mutations. Such diagnostics can substantially reduce the mortality rate of this disease. However, identifying cancer driver gene mutations through experimental methods can be expensive, slow, and laborious. Computational strategies that can predict cancer growth early, effectively, and accurately are thus highly needed to support early diagnosis and reduce mortality. Herein, we aim to predict clear cell renal carcinoma (RCCC) at the gene level, using genomic sequences. The dataset was taken from the IntOGen Cancer Mutations Browser, and the standard DNA sequences of all genes were taken from the NCBI database. Using cancer-associated mutation information from IntOGen, the benchmark dataset was generated by introducing the mutations into the original sequences. After extensive feature extraction, the dataset was used to train an ANN + Hist Gradient Boosting model to classify RCCC genes, other cancer-associated genes, and non-cancerous/unknown (non-tumor driver) genes. On an independent dataset test, the observed accuracy was 83%, whereas 10-fold cross-validation and jackknife validation yielded 98% and 100% accuracy, respectively. The proposed predictor, RCCC_Pred, identifies RCCC genes with high accuracy and efficiency and can help researchers predict and diagnose cancer in its early stages. |
format | Online Article Text |
id | pubmed-9776995 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-97769952022-12-23 RCCC_Pred: A Novel Method for Sequence-Based Identification of Renal Clear Cell Carcinoma Genes through DNA Mutations and a Blend of Features Hassan, Arfa Alkhalifah, Tamim Alturise, Fahad Khan, Yaser Daanial Diagnostics (Basel) Article To save lives from cancer, it is very crucial to diagnose it at its early stages. One solution to early diagnosis lies in the identification of the cancer driver genes and their mutations. Such diagnostics can substantially minimize the mortality rate of this deadly disease. However, concurrently, the identification of cancer driver gene mutation through experimental mechanisms could be an expensive, slow, and laborious job. The advancement of computational strategies that could help in the early prediction of cancer growth effectively and accurately is thus highly needed towards early diagnoses and a decrease in the mortality rates due to this disease. Herein, we aim to predict clear cell renal carcinoma (RCCC) at the level of the genes, using the genomic sequences. The dataset was taken from IntOgen Cancer Mutations Browser and all genes’ standard DNA sequences were taken from the NCBI database. Using cancer-associated information of mutation from INTOGEN, the benchmark dataset was generated by creating the mutations in original sequences. After extensive feature extraction, the dataset was used to train ANN+ Hist Gradient boosting that could perform the classification of RCCC genes, other cancer-associated genes, and non-cancerous/unknown (non-tumor driver) genes. Through an independent dataset test, the accuracy observed was 83%, whereas the 10-fold cross-validation and Jackknife validation yielded 98% and 100% accurate results, respectively. The proposed predictor RCCC_Pred is able to identify RCCC genes with high accuracy and efficiency and can help scientists/researchers easily predict and diagnose cancer at its early stages. 
MDPI 2022-12-03 /pmc/articles/PMC9776995/ /pubmed/36553042 http://dx.doi.org/10.3390/diagnostics12123036 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Hassan, Arfa Alkhalifah, Tamim Alturise, Fahad Khan, Yaser Daanial RCCC_Pred: A Novel Method for Sequence-Based Identification of Renal Clear Cell Carcinoma Genes through DNA Mutations and a Blend of Features |
title | RCCC_Pred: A Novel Method for Sequence-Based Identification of Renal Clear Cell Carcinoma Genes through DNA Mutations and a Blend of Features |
title_full | RCCC_Pred: A Novel Method for Sequence-Based Identification of Renal Clear Cell Carcinoma Genes through DNA Mutations and a Blend of Features |
title_fullStr | RCCC_Pred: A Novel Method for Sequence-Based Identification of Renal Clear Cell Carcinoma Genes through DNA Mutations and a Blend of Features |
title_full_unstemmed | RCCC_Pred: A Novel Method for Sequence-Based Identification of Renal Clear Cell Carcinoma Genes through DNA Mutations and a Blend of Features |
title_short | RCCC_Pred: A Novel Method for Sequence-Based Identification of Renal Clear Cell Carcinoma Genes through DNA Mutations and a Blend of Features |
title_sort | rccc_pred: a novel method for sequence-based identification of renal clear cell carcinoma genes through dna mutations and a blend of features |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9776995/ https://www.ncbi.nlm.nih.gov/pubmed/36553042 http://dx.doi.org/10.3390/diagnostics12123036 |