Machine learning enables detection of early-stage colorectal cancer by whole-genome sequencing of plasma cell-free DNA

BACKGROUND: Blood-based methods using cell-free DNA (cfDNA) are under development as an alternative to existing screening tests. However, early-stage detection of cancer using tumor-derived cfDNA has proven challenging because of the small proportion of cfDNA derived from tumor tissue in early-stage...

Bibliographic Details
Main Authors: Wan, Nathan, Weinberg, David, Liu, Tzu-Yu, Niehaus, Katherine, Ariazi, Eric A., Delubac, Daniel, Kannan, Ajay, White, Brandon, Bailey, Mitch, Bertin, Marvin, Boley, Nathan, Bowen, Derek, Cregg, James, Drake, Adam M., Ennis, Riley, Fransen, Signe, Gafni, Erik, Hansen, Loren, Liu, Yaping, Otte, Gabriel L., Pecson, Jennifer, Rice, Brandon, Sanderson, Gabriel E., Sharma, Aarushi, St. John, John, Tang, Catherina, Tzou, Abraham, Young, Leilani, Putcha, Girish, Haque, Imran S.
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6708173/
https://www.ncbi.nlm.nih.gov/pubmed/31443703
http://dx.doi.org/10.1186/s12885-019-6003-8
author Wan, Nathan
Weinberg, David
Liu, Tzu-Yu
Niehaus, Katherine
Ariazi, Eric A.
Delubac, Daniel
Kannan, Ajay
White, Brandon
Bailey, Mitch
Bertin, Marvin
Boley, Nathan
Bowen, Derek
Cregg, James
Drake, Adam M.
Ennis, Riley
Fransen, Signe
Gafni, Erik
Hansen, Loren
Liu, Yaping
Otte, Gabriel L.
Pecson, Jennifer
Rice, Brandon
Sanderson, Gabriel E.
Sharma, Aarushi
St. John, John
Tang, Catherina
Tzou, Abraham
Young, Leilani
Putcha, Girish
Haque, Imran S.
author_sort Wan, Nathan
collection PubMed
description BACKGROUND: Blood-based methods using cell-free DNA (cfDNA) are under development as an alternative to existing screening tests. However, early-stage detection of cancer using tumor-derived cfDNA has proven challenging because of the small proportion of cfDNA derived from tumor tissue in early-stage disease. A machine learning approach to discover signatures in cfDNA, potentially reflective of both tumor and non-tumor contributions, may represent a promising direction for the early detection of cancer. METHODS: Whole-genome sequencing was performed on cfDNA extracted from plasma samples (N = 546 colorectal cancer and 271 non-cancer controls). Reads aligning to protein-coding gene bodies were extracted, and read counts were normalized. cfDNA tumor fraction was estimated using IchorCNA. Machine learning models were trained using k-fold cross-validation and confounder-based cross-validations to assess generalization performance. RESULTS: In a colorectal cancer cohort heavily weighted towards early-stage cancer (80% stage I/II), we achieved a mean AUC of 0.92 (95% CI 0.91–0.93) with a mean sensitivity of 85% (95% CI 83–86%) at 85% specificity. Sensitivity generally increased with tumor stage and increasing tumor fraction. Stratification by age, sequencing batch, and institution demonstrated the impact of these confounders and provided a more accurate assessment of generalization performance. CONCLUSIONS: A machine learning approach using cfDNA achieved high sensitivity and specificity in a large, predominantly early-stage, colorectal cancer cohort. The possibility of systematic technical and institution-specific biases warrants similar confounder analyses in other studies. Prospective validation of this machine learning method and evaluation of a multi-analyte approach are underway. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12885-019-6003-8) contains supplementary material, which is available to authorized users.
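The abstract reports sensitivity at a fixed 85% specificity and mentions confounder-based cross-validation. The following is a minimal Python sketch of those two ideas, not the authors' implementation. It assumes that "confounder-based" means holding out every sample that shares one confounder value (for example one institution or sequencing batch); the function names, the thresholding rule, and the toy inputs are all hypothetical.

```python
import math
from collections import defaultdict


def sensitivity_at_specificity(scores, labels, target_specificity=0.85):
    """Return (sensitivity, specificity, threshold).

    labels: 1 = cancer, 0 = non-cancer control.
    A sample is called positive when its score is strictly above the
    threshold. The threshold is the smallest control score that leaves at
    least `target_specificity` of the controls at or below it.
    (Assumed rule; the source does not say how the operating point was set.)
    """
    controls = sorted(s for s, y in zip(scores, labels) if y == 0)
    cases = [s for s, y in zip(scores, labels) if y == 1]
    if not controls or not cases:
        raise ValueError("need at least one case and one control")
    # Number of controls that must be classified negative.
    k = max(1, math.ceil(target_specificity * len(controls)))
    threshold = controls[k - 1]
    specificity = sum(s <= threshold for s in controls) / len(controls)
    sensitivity = sum(s > threshold for s in cases) / len(cases)
    return sensitivity, specificity, threshold


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices).

    `groups` holds one confounder value per sample (hypothetical example:
    the collecting institution). Each fold tests on one whole group, so the
    model never sees that group during training.
    """
    by_group = defaultdict(list)
    for index, group in enumerate(groups):
        by_group[group].append(index)
    for group in sorted(by_group):
        test = by_group[group]
        held_out = set(test)
        train = [i for i in range(len(groups)) if i not in held_out]
        yield group, train, test
```

Comparing the metric from ordinary k-fold splits with the metric from group-held-out splits gives a rough sense of how much a confounder such as institution inflates the apparent performance.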
format Online
Article
Text
id pubmed-6708173
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6708173 2019-08-28 Machine learning enables detection of early-stage colorectal cancer by whole-genome sequencing of plasma cell-free DNA BMC Cancer Research Article BioMed Central 2019-08-23 /pmc/articles/PMC6708173/ /pubmed/31443703 http://dx.doi.org/10.1186/s12885-019-6003-8 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Machine learning enables detection of early-stage colorectal cancer by whole-genome sequencing of plasma cell-free DNA
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6708173/
https://www.ncbi.nlm.nih.gov/pubmed/31443703
http://dx.doi.org/10.1186/s12885-019-6003-8