A deep learning model predicts the presence of diverse cancer types using circulating tumor cells

Circulating tumor cells (CTCs) are cancer cells that detach from the primary tumor and intravasate into the bloodstream. Thus, non-invasive liquid biopsies are being used to analyze CTC-expressed genes to identify potential cancer biomarkers. In this regard, several studies have used gene expression...

Bibliographic Details
Main Authors: Albaradei, Somayah, Alganmi, Nofe, Albaradie, Abdulrahman, Alharbi, Eaman, Motwalli, Olaa, Thafar, Maha A., Gojobori, Takashi, Essack, Magbubah, Gao, Xin
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10689793/
https://www.ncbi.nlm.nih.gov/pubmed/38036622
http://dx.doi.org/10.1038/s41598-023-47805-2
_version_ 1785152424828207104
author Albaradei, Somayah
Alganmi, Nofe
Albaradie, Abdulrahman
Alharbi, Eaman
Motwalli, Olaa
Thafar, Maha A.
Gojobori, Takashi
Essack, Magbubah
Gao, Xin
author_sort Albaradei, Somayah
collection PubMed
description Circulating tumor cells (CTCs) are cancer cells that detach from the primary tumor and intravasate into the bloodstream. Thus, non-invasive liquid biopsies are being used to analyze CTC-expressed genes to identify potential cancer biomarkers. In this regard, several studies have used gene expression changes in blood to predict the presence of CTC and, consequently, cancer. However, the CTC mRNA data has not been used to develop a generic approach that indicates the presence of multiple cancer types. In this study, we developed such a generic approach. Briefly, we designed two computational workflows, one using the raw mRNA data and deep learning (DL) and the other exploiting five hub gene ranking algorithms (Degree, Maximum Neighborhood Component, Betweenness Centrality, Closeness Centrality, and Stress Centrality) with machine learning (ML). Both workflows aim to determine the top genes that best distinguish cancer types based on the CTC mRNA data. We demonstrate that our automated, robust DL framework (DNNraw) more accurately indicates the presence of multiple cancer types using the CTC gene expression data than multiple ML approaches. The DL approach achieved average precision of 0.9652, recall of 0.9640, f1-score of 0.9638 and overall accuracy of 0.9640. Furthermore, since we designed multiple approaches, we also provide a bioinformatics analysis of the gene commonly identified as top-ranked by the different methods. To our knowledge, this is the first study wherein a generic approach has been developed to predict the presence of multiple cancer types using raw CTC mRNA data, as opposed to other models that require a feature selection step.
format Online
Article
Text
id pubmed-10689793
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10689793 2023-12-02 Sci Rep Article
Nature Publishing Group UK 2023-11-30 /pmc/articles/PMC10689793/ /pubmed/38036622 http://dx.doi.org/10.1038/s41598-023-47805-2 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title A deep learning model predicts the presence of diverse cancer types using circulating tumor cells
topic Article