Predicting cancer type from tumour DNA signatures
BACKGROUND: Establishing the cancer type and site of origin is important in determining the most appropriate course of treatment for cancer patients. Patients with cancer of unknown primary, where the site of origin cannot be established from an examination of the metastatic cancer cells, typically...
Main authors: | Soh, Kee Pang; Szczurek, Ewa; Sakoparnig, Thomas; Beerenwinkel, Niko |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5706302/ https://www.ncbi.nlm.nih.gov/pubmed/29183400 http://dx.doi.org/10.1186/s13073-017-0493-2 |
_version_ | 1783282202540244992 |
---|---|
author | Soh, Kee Pang Szczurek, Ewa Sakoparnig, Thomas Beerenwinkel, Niko |
author_facet | Soh, Kee Pang Szczurek, Ewa Sakoparnig, Thomas Beerenwinkel, Niko |
author_sort | Soh, Kee Pang |
collection | PubMed |
description | BACKGROUND: Establishing the cancer type and site of origin is important in determining the most appropriate course of treatment for cancer patients. Patients with cancer of unknown primary, where the site of origin cannot be established from an examination of the metastatic cancer cells, typically have poor survival. Here, we evaluate the potential and limitations of utilising gene alteration data from tumour DNA to identify cancer types. METHODS: Using sequenced tumour DNA downloaded via the cBioPortal for Cancer Genomics, we collected the presence or absence of calls for gene alterations for 6640 tumour samples spanning 28 cancer types, as predictive features. We employed three machine-learning techniques, namely linear support vector machines with recursive feature selection, L1-regularised logistic regression and random forest, to select a small subset of gene alterations that are most informative for cancer-type prediction. We then evaluated the predictive performance of the models in a comparative manner. RESULTS: We found the linear support vector machine to be the most predictive model of cancer type from gene alterations. Using only 100 somatic point-mutated genes for prediction, we achieved an overall accuracy of 49.4±0.4 % (95 % confidence interval). We observed a marked increase in the accuracy when copy number alterations are included as predictors. With a combination of somatic point mutations and copy number alterations, a mere 50 genes are enough to yield an overall accuracy of 77.7±0.3 %. CONCLUSIONS: A general cancer diagnostic tool that utilises either only somatic point mutations or only copy number alterations is not sufficient for distinguishing a broad range of cancer types. The combination of both gene alteration types can dramatically improve the performance. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13073-017-0493-2) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5706302 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5706302 2017-12-05 Predicting cancer type from tumour DNA signatures Soh, Kee Pang Szczurek, Ewa Sakoparnig, Thomas Beerenwinkel, Niko Genome Med Research BioMed Central 2017-11-28 /pmc/articles/PMC5706302/ /pubmed/29183400 http://dx.doi.org/10.1186/s13073-017-0493-2 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Soh, Kee Pang Szczurek, Ewa Sakoparnig, Thomas Beerenwinkel, Niko Predicting cancer type from tumour DNA signatures |
title | Predicting cancer type from tumour DNA signatures |
title_full | Predicting cancer type from tumour DNA signatures |
title_fullStr | Predicting cancer type from tumour DNA signatures |
title_full_unstemmed | Predicting cancer type from tumour DNA signatures |
title_short | Predicting cancer type from tumour DNA signatures |
title_sort | predicting cancer type from tumour dna signatures |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5706302/ https://www.ncbi.nlm.nih.gov/pubmed/29183400 http://dx.doi.org/10.1186/s13073-017-0493-2 |
work_keys_str_mv | AT sohkeepang predictingcancertypefromtumourdnasignatures AT szczurekewa predictingcancertypefromtumourdnasignatures AT sakoparnigthomas predictingcancertypefromtumourdnasignatures AT beerenwinkelniko predictingcancertypefromtumourdnasignatures |
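
The abstract describes encoding each tumour sample as presence-or-absence calls for gene alterations and reporting accuracy as a mean with a 95 % confidence interval. The following is a minimal Python sketch of those two ideas, not the authors' code. The sample names, the gene calls, and the helper names `binary_matrix` and `mean_with_ci95` are hypothetical, and the normal-approximation interval is an assumption, since the record does not state how the interval was computed.

```python
from statistics import mean, stdev

# Hypothetical alteration calls: sample -> set of (gene, alteration kind).
# "mut" = somatic point mutation, "cna" = copy number alteration.
calls = {
    "sample_A": {("TP53", "mut"), ("MYC", "cna")},
    "sample_B": {("KRAS", "mut")},
    "sample_C": {("TP53", "mut"), ("KRAS", "mut"), ("CDKN2A", "cna")},
}


def binary_matrix(calls, kinds):
    """Return (features, rows) with one 0/1 column per (gene, kind) feature.

    Only alteration kinds listed in `kinds` are used as columns.
    """
    features = sorted({f for altered in calls.values() for f in altered if f[1] in kinds})
    rows = {
        sample: [1 if f in altered else 0 for f in features]
        for sample, altered in calls.items()
    }
    return features, rows


def mean_with_ci95(values):
    """Mean and 95 % half-width, assuming a normal approximation (1.96 * SE)."""
    m = mean(values)
    half_width = 1.96 * stdev(values) / len(values) ** 0.5
    return m, half_width


# Point mutations only, then point mutations combined with copy number alterations.
mut_features, mut_rows = binary_matrix(calls, {"mut"})
all_features, all_rows = binary_matrix(calls, {"mut", "cna"})
```

Either matrix could then be passed to a classifier such as a linear support vector machine. Calling `mean_with_ci95` on accuracies from repeated evaluation runs gives a summary in the same "mean ± half-width" form as the figures quoted in the abstract.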