Cargando…
Predicting cancer type from tumour DNA signatures
BACKGROUND: Establishing the cancer type and site of origin is important in determining the most appropriate course of treatment for cancer patients. Patients with cancer of unknown primary, where the site of origin cannot be established from an examination of the metastatic cancer cells, typically...
Autores principales: | , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
BioMed Central
2017
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5706302/ https://www.ncbi.nlm.nih.gov/pubmed/29183400 http://dx.doi.org/10.1186/s13073-017-0493-2 |
Sumario: | BACKGROUND: Establishing the cancer type and site of origin is important in determining the most appropriate course of treatment for cancer patients. Patients with cancer of unknown primary, where the site of origin cannot be established from an examination of the metastatic cancer cells, typically have poor survival. Here, we evaluate the potential and limitations of utilising gene alteration data from tumour DNA to identify cancer types. METHODS: Using sequenced tumour DNA downloaded via the cBioPortal for Cancer Genomics, we collected the presence or absence of calls for gene alterations for 6640 tumour samples spanning 28 cancer types, as predictive features. We employed three machine-learning techniques, namely linear support vector machines with recursive feature selection, L (1)-regularised logistic regression and random forest, to select a small subset of gene alterations that are most informative for cancer-type prediction. We then evaluated the predictive performance of the models in a comparative manner. RESULTS: We found the linear support vector machine to be the most predictive model of cancer type from gene alterations. Using only 100 somatic point-mutated genes for prediction, we achieved an overall accuracy of 49.4±0.4 % (95 % confidence interval). We observed a marked increase in the accuracy when copy number alterations are included as predictors. With a combination of somatic point mutations and copy number alterations, a mere 50 genes are enough to yield an overall accuracy of 77.7±0.3 %. CONCLUSIONS: A general cancer diagnostic tool that utilises either only somatic point mutations or only copy number alterations is not sufficient for distinguishing a broad range of cancer types. The combination of both gene alteration types can dramatically improve the performance. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13073-017-0493-2) contains supplementary material, which is available to authorized users. |
---|