Machine learning analysis using 77,044 genomic and transcriptomic profiles to accurately predict tumor type

Bibliographic Details
Main authors: Abraham, Jim; Heimberger, Amy B.; Marshall, John; Heath, Elisabeth; Drabick, Joseph; Helmstetter, Anthony; Xiu, Joanne; Magee, Daniel; Stafford, Phillip; Nabhan, Chadi; Antani, Sourabh; Johnston, Curtis; Oberley, Matthew; Korn, Wolfgang Michael; Spetzler, David
Format: Online Article Text
Language: English
Published: Neoplasia Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7815805/
https://www.ncbi.nlm.nih.gov/pubmed/33465745
http://dx.doi.org/10.1016/j.tranon.2021.101016
collection PubMed
description Cancer of Unknown Primary (CUP) occurs in 3–5% of patients when standard histological diagnostic tests are unable to determine the origin of metastatic cancer. Typically, a CUP diagnosis is treated empirically and has very poor outcomes, with median overall survival of less than one year. Gene expression profiling alone has been used to identify the tissue of origin but struggles with the low neoplastic percentage in metastatic sites, which is where identification is often most needed. MI GPSai, a Genomic Prevalence Score, uses DNA sequencing and whole transcriptome data coupled with machine learning to aid in the diagnosis of cancer. The algorithm was trained on genomic data from 34,352 cases and on genomic and transcriptomic data from 23,137 cases, and was validated on 19,555 cases. MI GPSai predicted the tumor type in the labeled data set with an accuracy of over 94% on 93% of cases while deliberating amongst 21 possible categories of cancer. When the second highest prediction is also considered, the accuracy increases to 97%. Additionally, MI GPSai rendered a prediction for 71.7% of CUP cases. Pathologist evaluation of discrepancies between the submitted diagnosis and MI GPSai predictions resulted in a change of diagnosis 41.3% of the time. MI GPSai provides clinically meaningful information in a large proportion of CUP cases, and inclusion of MI GPSai in clinical routine could improve diagnostic fidelity. Moreover, all genomic markers essential for therapy selection are assessed in this assay, maximizing the clinical utility for patients within a single test.
id pubmed-7815805
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-7815805 2021-01-26 Transl Oncol Original Research
Neoplasia Press 2021-01-16 /pmc/articles/PMC7815805/ /pubmed/33465745 http://dx.doi.org/10.1016/j.tranon.2021.101016 Text en © 2021 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
topic Original Research