
Novel deep learning-based solution for identification of prognostic subgroups in liver cancer (Hepatocellular carcinoma)

BACKGROUND: Liver cancer (Hepatocellular carcinoma; HCC) prevalence is increasing and with poor clinical outcome expected it means greater understanding of HCC aetiology is urgently required. This study explored a deep learning solution to detect biologically important features that distinguish prog...


Bibliographic Details
Main Authors: Owens, Alice R., McInerney, Caitríona E., Prise, Kevin M., McArt, Darragh G., Jurek-Loughrey, Anna
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8611905/
https://www.ncbi.nlm.nih.gov/pubmed/34819028
http://dx.doi.org/10.1186/s12859-021-04454-4
author Owens, Alice R.
McInerney, Caitríona E.
Prise, Kevin M.
McArt, Darragh G.
Jurek-Loughrey, Anna
collection PubMed
description BACKGROUND: Liver cancer (Hepatocellular carcinoma; HCC) prevalence is increasing and with poor clinical outcome expected it means greater understanding of HCC aetiology is urgently required. This study explored a deep learning solution to detect biologically important features that distinguish prognostic subgroups. A novel architecture of an Artificial Neural Network (ANN) trained with a customised objective function (L(RSC)) was developed. The ANN should discover new data representations, to detect patient subgroups that are biologically homogenous (clustering loss) and similar in survival (survival loss) while removing noise from the data (reconstruction loss). The model was applied to TCGA-HCC multi-omics data and benchmarked against baseline models that only use a reconstruction objective function (BCE, MSE) for learning. With the baseline models, the new features are then filtered based on survival information and used for clustering patients. Different variants of the customised objective function, incorporating only reconstruction and clustering losses (L(RC)); and reconstruction and survival losses (L(RS)) were also evaluated. Robust features consistently detected were compared between models and validated in TCGA and LIRI-JP HCC cohorts.

RESULTS: The combined loss (L(RSC)) discovered highly significant prognostic subgroups (P-value = 1.55E−77) with more accurate sample assignment (Silhouette scores: 0.59–0.7) compared to baseline models (0.18–0.3). All L(RSC) bottleneck features (N = 100) were significant for survival, compared to only 11–21 for baseline models. Prognostic subgroups were not explained by disease grade or risk factors. Instead L(RSC) identified robust features including 377 mRNAs, many of which were novel (61.27%) compared to those identified by the other losses. Some 75 mRNAs were prognostic in TCGA, while 29 were prognostic in LIRI-JP also. L(RSC) also identified 15 robust miRNAs including two novel (hsa-let-7g; hsa-mir-550a-1) and 328 methylation features with 71% being prognostic. Gene-enrichment and Functional Annotation Analysis identified seven pathways differentiating prognostic clusters.

CONCLUSIONS: Combining cluster and survival metrics with the reconstruction objective function facilitated superior prognostic subgroup identification. The hybrid model identified more homogeneous clusters that consequently were more biologically meaningful. The novel and prognostic robust features extracted provide additional information to improve our understanding of a complex disease to help reveal its aetiology. Moreover, the gene features identified may have clinical applications as therapeutic targets.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04454-4.
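The abstract describes L(RSC) as a combination of a reconstruction loss, a survival loss, and a clustering loss, but does not give their exact forms. The following is only an illustrative sketch of how three such terms might be summed. The choice of mean squared error, a Cox-style negative partial log-likelihood, and a nearest-centroid distance, as well as every function name and the weights `w_s` and `w_c`, are assumptions made here for illustration and are not taken from the article.

```python
import math


def reconstruction_loss(x, x_hat):
    """Mean squared error between inputs and their reconstructions (assumed form)."""
    n_values = sum(len(row) for row in x)
    squared = sum((a - b) ** 2
                  for row, row_hat in zip(x, x_hat)
                  for a, b in zip(row, row_hat))
    return squared / n_values


def survival_loss(risk, time, event):
    """Negative Cox partial log-likelihood, averaged over observed events.

    Assumed form. Patients are visited from longest to shortest follow-up so
    that the running sum of exp(risk) approximates each patient's risk set
    (tied times are not handled specially).
    """
    order = sorted(range(len(time)), key=lambda i: -time[i])
    running = 0.0
    total = 0.0
    for i in order:
        running += math.exp(risk[i])
        if event[i]:
            total += risk[i] - math.log(running)
    return -total / max(sum(event), 1)


def clustering_loss(z, centroids):
    """Mean squared distance from each embedding to its nearest centroid (assumed form)."""
    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return sum(min(sqdist(p, c) for c in centroids) for p in z) / len(z)


def combined_loss(x, x_hat, z, centroids, risk, time, event, w_s=1.0, w_c=1.0):
    """Hypothetical weighted sum of the three terms; the weights are illustrative."""
    return (reconstruction_loss(x, x_hat)
            + w_s * survival_loss(risk, time, event)
            + w_c * clustering_loss(z, centroids))
```

In a sketch like this, setting `w_s` or `w_c` to zero would correspond loosely to the reduced variants the abstract calls L(RC) and L(RS); how the published model actually weights or schedules the terms is not stated in this record.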
format Online
Article
Text
id pubmed-8611905
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8611905 2021-11-29 BMC Bioinformatics Methodology Article BioMed Central 2021-11-24 /pmc/articles/PMC8611905/ /pubmed/34819028 http://dx.doi.org/10.1186/s12859-021-04454-4 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Novel deep learning-based solution for identification of prognostic subgroups in liver cancer (Hepatocellular carcinoma)
topic Methodology Article