
A machine learning approach to optimizing cell-free DNA sequencing panels: with an application to prostate cancer

BACKGROUND: Cell-free DNA’s (cfDNA) use as a biomarker in cancer is challenging due to genetic heterogeneity of malignancies and rarity of tumor-derived molecules. Here we describe and demonstrate a novel machine-learning guided panel design strategy for improving the detection of tumor variants in...


Bibliographic Details
Main authors: Cario, Clinton L., Chen, Emmalyn, Leong, Lancelote, Emami, Nima C., Lopez, Karen, Tenggara, Imelda, Simko, Jeffry P., Friedlander, Terence W., Li, Patricia S., Paris, Pamela L., Carroll, Peter R., Witte, John S.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7456018/
https://www.ncbi.nlm.nih.gov/pubmed/32859160
http://dx.doi.org/10.1186/s12885-020-07318-x
author Cario, Clinton L.
Chen, Emmalyn
Leong, Lancelote
Emami, Nima C.
Lopez, Karen
Tenggara, Imelda
Simko, Jeffry P.
Friedlander, Terence W.
Li, Patricia S.
Paris, Pamela L.
Carroll, Peter R.
Witte, John S.
collection PubMed
description BACKGROUND: The use of cell-free DNA (cfDNA) as a biomarker in cancer is challenging due to the genetic heterogeneity of malignancies and the rarity of tumor-derived molecules. Here we describe and demonstrate a novel machine-learning-guided panel design strategy for improving the detection of tumor variants in cfDNA. Using this approach, we first generated a model to classify and score candidate variants for inclusion on a prostate cancer targeted sequencing panel. We then used this panel to screen tumor variants from prostate cancer patients with localized disease in both in silico and hybrid capture settings. METHODS: Whole-genome sequencing (WGS) data from 550 prostate tumors were analyzed to build a targeted sequencing panel of single point and small (< 200 bp) indel mutations, which was subsequently screened in silico against prostate tumor sequences from 5 patients to assess performance against commonly used alternative panel designs. The panel’s ability to detect tumor-derived cfDNA variants was then assessed using prospectively collected cfDNA and tumor foci from a test set of 18 prostate cancer patients with localized disease undergoing radical prostatectomy. RESULTS: The panel generated from this approach identified, as top candidates, mutations in known driver genes (e.g. HRAS) and prostate cancer-related transcription factor binding sites (e.g. MYC, AR). It outperformed two commonly used designs in detecting somatic mutations found in the cfDNA of 5 prostate cancer patients when analyzed in an in silico setting. Additionally, hybrid capture and 2500X sequencing of cfDNA molecules using the panel resulted in detection of tumor variants in all 18 patients of a test set, where 15 of the 18 patients had detected variants found in multiple foci. CONCLUSION: Machine learning-prioritized targeted sequencing panels may prove useful for broad and sensitive variant detection in the cfDNA of heterogeneous diseases. This strategy has implications for disease detection and monitoring when applied to the cfDNA isolated from prostate cancer patients.
format Online
Article
Text
id pubmed-7456018
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7456018 2020-08-31 BMC Cancer Research Article BioMed Central 2020-08-28 /pmc/articles/PMC7456018/ /pubmed/32859160 http://dx.doi.org/10.1186/s12885-020-07318-x Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A machine learning approach to optimizing cell-free DNA sequencing panels: with an application to prostate cancer
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7456018/
https://www.ncbi.nlm.nih.gov/pubmed/32859160
http://dx.doi.org/10.1186/s12885-020-07318-x