
Predicting cancer type from tumour DNA signatures

BACKGROUND: Establishing the cancer type and site of origin is important in determining the most appropriate course of treatment for cancer patients. Patients with cancer of unknown primary, where the site of origin cannot be established from an examination of the metastatic cancer cells, typically...

Bibliographic Details
Main authors: Soh, Kee Pang; Szczurek, Ewa; Sakoparnig, Thomas; Beerenwinkel, Niko
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects: Research
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5706302/
https://www.ncbi.nlm.nih.gov/pubmed/29183400
http://dx.doi.org/10.1186/s13073-017-0493-2
_version_ 1783282202540244992
author Soh, Kee Pang
Szczurek, Ewa
Sakoparnig, Thomas
Beerenwinkel, Niko
author_facet Soh, Kee Pang
Szczurek, Ewa
Sakoparnig, Thomas
Beerenwinkel, Niko
author_sort Soh, Kee Pang
collection PubMed
description BACKGROUND: Establishing the cancer type and site of origin is important in determining the most appropriate course of treatment for cancer patients. Patients with cancer of unknown primary, where the site of origin cannot be established from an examination of the metastatic cancer cells, typically have poor survival. Here, we evaluate the potential and limitations of utilising gene alteration data from tumour DNA to identify cancer types. METHODS: Using sequenced tumour DNA downloaded via the cBioPortal for Cancer Genomics, we collected the presence or absence of calls for gene alterations for 6640 tumour samples spanning 28 cancer types, as predictive features. We employed three machine-learning techniques, namely linear support vector machines with recursive feature selection, L1-regularised logistic regression and random forest, to select a small subset of gene alterations that are most informative for cancer-type prediction. We then evaluated the predictive performance of the models in a comparative manner. RESULTS: We found the linear support vector machine to be the most predictive model of cancer type from gene alterations. Using only 100 somatic point-mutated genes for prediction, we achieved an overall accuracy of 49.4±0.4 % (95 % confidence interval). We observed a marked increase in the accuracy when copy number alterations are included as predictors. With a combination of somatic point mutations and copy number alterations, a mere 50 genes are enough to yield an overall accuracy of 77.7±0.3 %. CONCLUSIONS: A general cancer diagnostic tool that utilises either only somatic point mutations or only copy number alterations is not sufficient for distinguishing a broad range of cancer types. The combination of both gene alteration types can dramatically improve the performance.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13073-017-0493-2) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5706302
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5706302 2017-12-05 Predicting cancer type from tumour DNA signatures Soh, Kee Pang Szczurek, Ewa Sakoparnig, Thomas Beerenwinkel, Niko Genome Med Research BACKGROUND: Establishing the cancer type and site of origin is important in determining the most appropriate course of treatment for cancer patients. Patients with cancer of unknown primary, where the site of origin cannot be established from an examination of the metastatic cancer cells, typically have poor survival. Here, we evaluate the potential and limitations of utilising gene alteration data from tumour DNA to identify cancer types. METHODS: Using sequenced tumour DNA downloaded via the cBioPortal for Cancer Genomics, we collected the presence or absence of calls for gene alterations for 6640 tumour samples spanning 28 cancer types, as predictive features. We employed three machine-learning techniques, namely linear support vector machines with recursive feature selection, L1-regularised logistic regression and random forest, to select a small subset of gene alterations that are most informative for cancer-type prediction. We then evaluated the predictive performance of the models in a comparative manner. RESULTS: We found the linear support vector machine to be the most predictive model of cancer type from gene alterations. Using only 100 somatic point-mutated genes for prediction, we achieved an overall accuracy of 49.4±0.4 % (95 % confidence interval). We observed a marked increase in the accuracy when copy number alterations are included as predictors. With a combination of somatic point mutations and copy number alterations, a mere 50 genes are enough to yield an overall accuracy of 77.7±0.3 %. CONCLUSIONS: A general cancer diagnostic tool that utilises either only somatic point mutations or only copy number alterations is not sufficient for distinguishing a broad range of cancer types. The combination of both gene alteration types can dramatically improve the performance.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13073-017-0493-2) contains supplementary material, which is available to authorized users. BioMed Central 2017-11-28 /pmc/articles/PMC5706302/ /pubmed/29183400 http://dx.doi.org/10.1186/s13073-017-0493-2 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Soh, Kee Pang
Szczurek, Ewa
Sakoparnig, Thomas
Beerenwinkel, Niko
Predicting cancer type from tumour DNA signatures
title Predicting cancer type from tumour DNA signatures
title_full Predicting cancer type from tumour DNA signatures
title_fullStr Predicting cancer type from tumour DNA signatures
title_full_unstemmed Predicting cancer type from tumour DNA signatures
title_short Predicting cancer type from tumour DNA signatures
title_sort predicting cancer type from tumour dna signatures
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5706302/
https://www.ncbi.nlm.nih.gov/pubmed/29183400
http://dx.doi.org/10.1186/s13073-017-0493-2
work_keys_str_mv AT sohkeepang predictingcancertypefromtumourdnasignatures
AT szczurekewa predictingcancertypefromtumourdnasignatures
AT sakoparnigthomas predictingcancertypefromtumourdnasignatures
AT beerenwinkelniko predictingcancertypefromtumourdnasignatures
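
The METHODS part of the abstract describes using presence or absence calls for gene alterations as predictive features, with somatic point mutations and copy number alterations either used alone or combined. The following is a minimal sketch of how such a binary feature matrix might be assembled. It is not the authors' code: the record layout `(sample_id, gene, kind)`, the labels `"MUT"` and `"CNA"`, the function name `build_feature_matrix`, and the toy calls are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical alteration calls: (sample_id, gene, kind).
# "MUT" stands for a somatic point mutation, "CNA" for a copy number alteration.
# These records are illustrative toy data, not taken from the study.
calls = [
    ("S1", "TP53", "MUT"),
    ("S1", "EGFR", "CNA"),
    ("S2", "KRAS", "MUT"),
    ("S2", "TP53", "MUT"),
    ("S3", "EGFR", "CNA"),
    ("S3", "EGFR", "MUT"),
]


def build_feature_matrix(calls, kinds=("MUT", "CNA")):
    """Return (samples, features, rows) with one 0/1 column per (gene, kind).

    Only alteration kinds listed in `kinds` become features, so the same
    calls can yield a mutation-only, CNA-only, or combined matrix.
    """
    altered = defaultdict(set)
    for sample, gene, kind in calls:
        if kind in kinds:
            altered[sample].add((gene, kind))

    # Keep every sample, even one with no alteration of the selected kinds.
    samples = sorted({sample for sample, _, _ in calls})
    features = sorted({f for fs in altered.values() for f in fs})
    rows = [[1 if f in altered[s] else 0 for f in features] for s in samples]
    return samples, features, rows


if __name__ == "__main__":
    samples, features, rows = build_feature_matrix(calls)
    print(features)
    for sample, row in zip(samples, rows):
        print(sample, row)
```

Passing `kinds=("MUT",)` restricts the columns to point mutations only, which mirrors the abstract's comparison between single alteration types and their combination. A classifier such as a linear support vector machine would then be trained on `rows`, with the cancer type of each sample as the label.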