Improving five-year survival prediction via multitask learning across HPV-related cancers

Bibliographic Details
Main Authors: Goncalves, Andre; Soper, Braden; Nygård, Mari; Nygård, Jan F.; Ray, Priyadip; Widemann, David; Sales, Ana Paula
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7668590/
https://www.ncbi.nlm.nih.gov/pubmed/33196642
http://dx.doi.org/10.1371/journal.pone.0241225
author Goncalves, Andre
Soper, Braden
Nygård, Mari
Nygård, Jan F.
Ray, Priyadip
Widemann, David
Sales, Ana Paula
collection PubMed
description Oncology is a highly siloed field of research in which sub-disciplinary specialization has limited the amount of information shared between researchers of distinct cancer types. This can be attributed to legitimate differences in the physiology and carcinogenesis of cancers affecting distinct anatomical sites. However, underlying processes that are shared across seemingly disparate cancers probably affect prognosis. The objective of the current study is to investigate whether multitask learning improves 5-year survival prediction for cancer patients by leveraging information across anatomically distinct HPV-related cancers. Data were obtained from the Surveillance, Epidemiology, and End Results (SEER) program database. The study cohort consisted of 29,768 primary cancer cases diagnosed in the United States between 2004 and 2015. Ten different cancer diagnoses were selected, all with a known association with HPV risk. In the analysis, the cancer diagnoses were categorized into three distinct topography groupings of varying specificity. The most specific topography grouping consisted of the 10 original cancer diagnoses, differentiated by the first two digits of the ICD-O-3 topography code. The second topography grouping consisted of cancer diagnoses categorized into six distinct organ groups. Finally, the third topography grouping consisted of just two groups, head-neck cancers and ano-genital cancers. The tasks were to predict 5-year survival for patients within the different topography groups using 14 predictive features selected from the descriptive variables available in the SEER database.
The information from the predictive features was shared between tasks in three different ways, resulting in three distinct predictive models: 1) Information was not shared between patients assigned to different tasks (single-task learning); 2) Information was shared between all patients, regardless of task (pooled model); 3) Only relevant information was shared between patients assigned to different tasks (multitask learning). Prediction performance was evaluated with Brier scores. All three models were evaluated against one another on each of the three distinct topography-defined tasks. The results showed that multitask classifiers achieved a relative improvement in the majority of the scenarios studied compared with the single-task learning and pooled baseline methods. In this study, we have demonstrated that sharing information among anatomically distinct cancer types can lead to improved predictive survival models.
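The three information-sharing schemes and the Brier score mentioned in the description can be illustrated with a minimal sketch. This is not the authors' model: it assumes intercept-only predictors (each scheme predicts a single survival probability per task), uses a hypothetical `shrinkage` pseudo-count as a stand-in for multitask sharing, and runs on made-up toy records. The task labels "C53" and "C10" are used only as illustrative ICD-O-3 topography prefixes.

```python
from collections import defaultdict


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)


def survival_rates(records, shrinkage=5.0):
    """Predicted 5-year survival probability per task under three sharing schemes.

    records:   iterable of (task, survived) pairs, with survived in {0, 1}.
    shrinkage: hypothetical pseudo-count that pulls each task's rate toward
               the pooled rate (an assumption, not taken from the paper).
    """
    counts = defaultdict(lambda: [0, 0])  # task -> [survivors, total]
    for task, survived in records:
        counts[task][0] += survived
        counts[task][1] += 1

    total_survivors = sum(s for s, _ in counts.values())
    total_patients = sum(n for _, n in counts.values())
    pooled = total_survivors / total_patients

    rates = {}
    for task, (s, n) in counts.items():
        rates[task] = {
            # 1) No sharing: each task uses only its own patients.
            "single_task": s / n,
            # 2) Full sharing: every task uses the same pooled estimate.
            "pooled": pooled,
            # 3) Partial sharing: task estimate shrunk toward the pooled rate.
            "partial_sharing": (s + shrinkage * pooled) / (n + shrinkage),
        }
    return rates


# Toy records (fabricated for illustration only, not SEER data).
train = [("C53", 1), ("C53", 1), ("C53", 0), ("C53", 1), ("C10", 0), ("C10", 1)]
test = [("C53", 1), ("C53", 0), ("C10", 1), ("C10", 1)]

rates = survival_rates(train)
outcomes = [y for _, y in test]
scores = {}
for scheme in ("single_task", "pooled", "partial_sharing"):
    probs = [rates[task][scheme] for task, _ in test]
    scores[scheme] = brier_score(probs, outcomes)
    print(f"{scheme}: Brier = {scores[scheme]:.4f}")
```

A lower Brier score indicates better-calibrated probability predictions; a score of 0 corresponds to perfect predictions. In this sketch the partially shared estimate always lies between the single-task and pooled estimates, which is the intuition behind sharing only some information across tasks.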
format Online
Article
Text
id pubmed-7668590
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7668590 2020-11-19 PLoS One Research Article Public Library of Science 2020-11-16 /pmc/articles/PMC7668590/ /pubmed/33196642 http://dx.doi.org/10.1371/journal.pone.0241225 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication.
title Improving five-year survival prediction via multitask learning across HPV-related cancers
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7668590/
https://www.ncbi.nlm.nih.gov/pubmed/33196642
http://dx.doi.org/10.1371/journal.pone.0241225