A machine learning approach to optimizing cell-free DNA sequencing panels: with an application to prostate cancer
BACKGROUND: Cell-free DNA’s (cfDNA) use as a biomarker in cancer is challenging due to genetic heterogeneity of malignancies and rarity of tumor-derived molecules. Here we describe and demonstrate a novel machine-learning guided panel design strategy for improving the detection of tumor variants in...
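Main authors: | Cario, Clinton L.; Chen, Emmalyn; Leong, Lancelote; Emami, Nima C.; Lopez, Karen; Tenggara, Imelda; Simko, Jeffry P.; Friedlander, Terence W.; Li, Patricia S.; Paris, Pamela L.; Carroll, Peter R.; Witte, John S. |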
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7456018/ https://www.ncbi.nlm.nih.gov/pubmed/32859160 http://dx.doi.org/10.1186/s12885-020-07318-x |
_version_ | 1783575737414975488 |
---|---|
author | Cario, Clinton L. Chen, Emmalyn Leong, Lancelote Emami, Nima C. Lopez, Karen Tenggara, Imelda Simko, Jeffry P. Friedlander, Terence W. Li, Patricia S. Paris, Pamela L. Carroll, Peter R. Witte, John S. |
author_facet | Cario, Clinton L. Chen, Emmalyn Leong, Lancelote Emami, Nima C. Lopez, Karen Tenggara, Imelda Simko, Jeffry P. Friedlander, Terence W. Li, Patricia S. Paris, Pamela L. Carroll, Peter R. Witte, John S. |
author_sort | Cario, Clinton L. |
collection | PubMed |
description | BACKGROUND: Cell-free DNA’s (cfDNA) use as a biomarker in cancer is challenging due to genetic heterogeneity of malignancies and rarity of tumor-derived molecules. Here we describe and demonstrate a novel machine-learning guided panel design strategy for improving the detection of tumor variants in cfDNA. Using this approach, we first generated a model to classify and score candidate variants for inclusion on a prostate cancer targeted sequencing panel. We then used this panel to screen tumor variants from prostate cancer patients with localized disease in both in silico and hybrid capture settings. METHODS: Whole Genome Sequence (WGS) data from 550 prostate tumors was analyzed to build a targeted sequencing panel of single point and small (< 200 bp) indel mutations, which was subsequently screened in silico against prostate tumor sequences from 5 patients to assess performance against commonly used alternative panel designs. The panel’s ability to detect tumor-derived cfDNA variants was then assessed using prospectively collected cfDNA and tumor foci from a test set of 18 prostate cancer patients with localized disease undergoing radical prostatectomy. RESULTS: The panel generated from this approach identified as top candidates mutations in known driver genes (e.g. HRAS) and prostate cancer related transcription factor binding sites (e.g. MYC, AR). It outperformed two commonly used designs in detecting somatic mutations found in the cfDNA of 5 prostate cancer patients when analyzed in an in silico setting. Additionally, hybrid capture and 2500X sequencing of cfDNA molecules using the panel resulted in detection of tumor variants in all 18 patients of a test set, where 15 of the 18 patients had detected variants found in multiple foci. CONCLUSION: Machine learning-prioritized targeted sequencing panels may prove useful for broad and sensitive variant detection in the cfDNA of heterogeneous diseases. This strategy has implications for disease detection and monitoring when applied to the cfDNA isolated from prostate cancer patients. |
format | Online Article Text |
id | pubmed-7456018 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7456018 2020-08-31 A machine learning approach to optimizing cell-free DNA sequencing panels: with an application to prostate cancer Cario, Clinton L. Chen, Emmalyn Leong, Lancelote Emami, Nima C. Lopez, Karen Tenggara, Imelda Simko, Jeffry P. Friedlander, Terence W. Li, Patricia S. Paris, Pamela L. Carroll, Peter R. Witte, John S. BMC Cancer Research Article BACKGROUND: Cell-free DNA’s (cfDNA) use as a biomarker in cancer is challenging due to genetic heterogeneity of malignancies and rarity of tumor-derived molecules. Here we describe and demonstrate a novel machine-learning guided panel design strategy for improving the detection of tumor variants in cfDNA. Using this approach, we first generated a model to classify and score candidate variants for inclusion on a prostate cancer targeted sequencing panel. We then used this panel to screen tumor variants from prostate cancer patients with localized disease in both in silico and hybrid capture settings. METHODS: Whole Genome Sequence (WGS) data from 550 prostate tumors was analyzed to build a targeted sequencing panel of single point and small (< 200 bp) indel mutations, which was subsequently screened in silico against prostate tumor sequences from 5 patients to assess performance against commonly used alternative panel designs. The panel’s ability to detect tumor-derived cfDNA variants was then assessed using prospectively collected cfDNA and tumor foci from a test set of 18 prostate cancer patients with localized disease undergoing radical prostatectomy. RESULTS: The panel generated from this approach identified as top candidates mutations in known driver genes (e.g. HRAS) and prostate cancer related transcription factor binding sites (e.g. MYC, AR). It outperformed two commonly used designs in detecting somatic mutations found in the cfDNA of 5 prostate cancer patients when analyzed in an in silico setting. Additionally, hybrid capture and 2500X sequencing of cfDNA molecules using the panel resulted in detection of tumor variants in all 18 patients of a test set, where 15 of the 18 patients had detected variants found in multiple foci. CONCLUSION: Machine learning-prioritized targeted sequencing panels may prove useful for broad and sensitive variant detection in the cfDNA of heterogeneous diseases. This strategy has implications for disease detection and monitoring when applied to the cfDNA isolated from prostate cancer patients. BioMed Central 2020-08-28 /pmc/articles/PMC7456018/ /pubmed/32859160 http://dx.doi.org/10.1186/s12885-020-07318-x Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Article Cario, Clinton L. Chen, Emmalyn Leong, Lancelote Emami, Nima C. Lopez, Karen Tenggara, Imelda Simko, Jeffry P. Friedlander, Terence W. Li, Patricia S. Paris, Pamela L. Carroll, Peter R. Witte, John S. A machine learning approach to optimizing cell-free DNA sequencing panels: with an application to prostate cancer |
title | A machine learning approach to optimizing cell-free DNA sequencing panels: with an application to prostate cancer |
title_full | A machine learning approach to optimizing cell-free DNA sequencing panels: with an application to prostate cancer |
title_fullStr | A machine learning approach to optimizing cell-free DNA sequencing panels: with an application to prostate cancer |
title_full_unstemmed | A machine learning approach to optimizing cell-free DNA sequencing panels: with an application to prostate cancer |
title_short | A machine learning approach to optimizing cell-free DNA sequencing panels: with an application to prostate cancer |
title_sort | machine learning approach to optimizing cell-free dna sequencing panels: with an application to prostate cancer |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7456018/ https://www.ncbi.nlm.nih.gov/pubmed/32859160 http://dx.doi.org/10.1186/s12885-020-07318-x |