A Hierarchical Machine Learning Model to Discover Gleason Grade-Specific Biomarkers in Prostate Cancer

Bibliographic Details
Main authors: Hamzeh, Osama; Alkhateeb, Abedalrhman; Zheng, Julia Zhuoran; Kandalam, Srinath; Leung, Crystal; Atikukke, Govindaraja; Cavallo-Medved, Dora; Palanisamy, Nallasivam; Rueda, Luis
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6963340/
https://www.ncbi.nlm.nih.gov/pubmed/31835700
http://dx.doi.org/10.3390/diagnostics9040219
author Hamzeh, Osama
Alkhateeb, Abedalrhman
Zheng, Julia Zhuoran
Kandalam, Srinath
Leung, Crystal
Atikukke, Govindaraja
Cavallo-Medved, Dora
Palanisamy, Nallasivam
Rueda, Luis
collection PubMed
description (1) Background: Prostate cancer is one of the most common cancers affecting men in North America and worldwide. The Gleason score is a pathological grading system used to assess the potential aggressiveness of the disease in prostate tissue. Advancements in computing and next-generation sequencing technology now allow us to study the genomic profiles of patients in association with their different Gleason scores more accurately and effectively.
(2) Methods: In this study, we used a novel machine learning method to analyse gene expression of prostate tumours with different Gleason scores and to identify potential genetic biomarkers for each Gleason group. We obtained a publicly available RNA-Seq dataset of a cohort of 104 prostate cancer patients from the National Center for Biotechnology Information’s (NCBI) Gene Expression Omnibus (GEO) repository, and categorised patients based on their Gleason scores to create a hierarchy of disease progression. A hierarchical model with standard classifiers in different Gleason groups, also known as nodes, was developed to identify and predict nodes based on their mRNA (gene) expression. In each node, patient samples were analysed via class imbalance and hybrid feature selection techniques to build the prediction model. The outcome of the analysis at each node was a set of genes that could differentiate that Gleason group from the remaining groups. To validate the proposed method, the set of identified genes was used to classify a second dataset of 499 prostate cancer patients collected from cBioPortal.
(3) Results: The overall accuracy of this novel method on the first dataset was 93.3%; on the second (validation) dataset, the accuracy was 87%. The method also identified genes that were not previously reported as potential biomarkers for specific Gleason groups. In particular, PIAS3 was identified as a potential biomarker for Gleason score 4 + 3 = 7, and UBE2V2 for Gleason score 6.
(4) Insight: Previous reports show that the genes predicted by this newly proposed method strongly correlate with prostate cancer development and progression. Furthermore, pathway analysis shows that PIAS3 and UBE2V2 share similar protein interaction pathways, namely the JAK/STAT signaling pathway.
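The node-by-node procedure summarised under Methods can be illustrated with a minimal sketch. It is not the authors' implementation, and every choice in it is an assumption made for illustration: the function names (`balance`, `top_features`, `train`, `predict`) are hypothetical, random oversampling stands in for the unspecified class imbalance technique, a simple standardised mean-difference filter stands in for the hybrid feature selection, a nearest-centroid rule stands in for the "standard classifiers", and the cascade (each node separates one group from those still remaining) is only one possible reading of the hierarchy. The data are synthetic toy values, not samples from GEO or cBioPortal, and the group labels "A", "B", "C" are placeholders rather than Gleason groups from the study.

```python
import random
from statistics import mean, pstdev


def balance(pos, neg, rng):
    """Random oversampling (assumed technique): duplicate rows of the smaller class."""
    pos, neg = list(pos), list(neg)
    while len(pos) < len(neg):
        pos.append(rng.choice(pos))
    while len(neg) < len(pos):
        neg.append(rng.choice(neg))
    return pos, neg


def top_features(pos, neg, k):
    """Filter-style ranking (assumed): |mean difference| / pooled standard deviation."""
    scores = []
    for j in range(len(pos[0])):
        a = [row[j] for row in pos]
        b = [row[j] for row in neg]
        spread = pstdev(a + b) or 1.0  # avoid dividing by zero for constant features
        scores.append((abs(mean(a) - mean(b)) / spread, j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def centroid(rows, feats):
    """Mean value of each selected feature over the given rows."""
    return [mean(row[j] for row in rows) for j in feats]


def train(rows, labels, order, k=1, seed=0):
    """Build one binary node per group in `order`; the last group is the fallback."""
    rng = random.Random(seed)
    remaining = list(zip(rows, labels))
    nodes = []
    for group in order[:-1]:
        pos = [r for r, y in remaining if y == group]
        neg = [r for r, y in remaining if y != group]
        pos, neg = balance(pos, neg, rng)
        feats = top_features(pos, neg, k)
        nodes.append((group, feats, centroid(pos, feats), centroid(neg, feats)))
        # Samples of this group are not passed on to deeper nodes.
        remaining = [(r, y) for r, y in remaining if y != group]
    return nodes, order[-1]


def sqdist(row, feats, center):
    """Squared Euclidean distance restricted to the node's selected features."""
    return sum((row[j] - c) ** 2 for j, c in zip(feats, center))


def predict(model, row):
    """Walk the cascade and return the first group whose centroid is nearer."""
    nodes, fallback = model
    for group, feats, c_pos, c_neg in nodes:
        if sqdist(row, feats, c_pos) <= sqdist(row, feats, c_neg):
            return group
    return fallback


# Synthetic toy expression matrix: 4 hypothetical genes, 3 placeholder groups.
rows = [
    [5, 0, 1, 0], [6, 1, 0, 1], [5, 1, 1, 0],                              # group A
    [0, 5, 1, 1], [1, 6, 0, 0], [0, 5, 0, 1], [1, 5, 1, 0],                # group B
    [0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 1, 0], [1, 0, 0, 1], [0, 0, 0, 0],  # group C
]
labels = ["A"] * 3 + ["B"] * 4 + ["C"] * 5
model = train(rows, labels, order=["A", "B", "C"])

for group, feats, _, _ in model[0]:
    print("node", group, "selected feature indices:", feats)
print(predict(model, [6, 0, 0, 1]), predict(model, [0, 6, 1, 0]), predict(model, [1, 0, 0, 0]))
```

In this sketch the features selected at each node play the role of the node's candidate biomarker set; on real RNA-Seq data one would expect to need normalisation, cross-validation, and a stronger classifier than a centroid rule.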
format Online
Article
Text
id pubmed-6963340
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6963340 2020-02-26 Diagnostics (Basel) Article MDPI 2019-12-11 /pmc/articles/PMC6963340/ /pubmed/31835700 http://dx.doi.org/10.3390/diagnostics9040219 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title A Hierarchical Machine Learning Model to Discover Gleason Grade-Specific Biomarkers in Prostate Cancer
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6963340/
https://www.ncbi.nlm.nih.gov/pubmed/31835700
http://dx.doi.org/10.3390/diagnostics9040219