NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer
BACKGROUND: The accurate screening of tumor genomic landscapes for somatic mutations using high-throughput sequencing involves a crucial step in precise clinical diagnosis and targeted therapy. However, the complex inherent features of cancer tissue, especially, tumor genetic intra-heterogeneity cou...
Main Authors: | Anzar, Irantzu; Sverchkova, Angelina; Stratford, Richard; Clancy, Trevor |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6524241/ https://www.ncbi.nlm.nih.gov/pubmed/31096972 http://dx.doi.org/10.1186/s12920-019-0508-5 |
_version_ | 1783419516850536448 |
---|---|
author | Anzar, Irantzu; Sverchkova, Angelina; Stratford, Richard; Clancy, Trevor |
author_facet | Anzar, Irantzu; Sverchkova, Angelina; Stratford, Richard; Clancy, Trevor |
author_sort | Anzar, Irantzu |
collection | PubMed |
description | BACKGROUND: The accurate screening of tumor genomic landscapes for somatic mutations using high-throughput sequencing is a crucial step in precise clinical diagnosis and targeted therapy. However, the complex inherent features of cancer tissue, especially tumor genetic intra-heterogeneity, coupled with the problem of sequencing and alignment artifacts, make somatic variant calling a challenging task. Current variant filtering strategies, such as rule-based filtering and consensus voting of different algorithms, have previously helped to increase specificity, although this comes at the cost of sensitivity. METHODS: In light of this, we have developed the NeoMutate framework, which incorporates 7 supervised machine learning (ML) algorithms to exploit the strengths of multiple variant callers, using a non-redundant set of biological and sequence features. We benchmarked NeoMutate by simulating more than 10,000 bona fide cancer-related mutations into three well-characterized Genome in a Bottle (GIAB) reference samples. RESULTS: A robust and exhaustive evaluation of NeoMutate’s performance based on 5-fold cross validation experiments, in addition to 3 independent tests, demonstrated a substantially improved variant detection accuracy compared to any of its individual composite variant callers and consensus calling of multiple tools. CONCLUSIONS: We show here that integrating multiple tools in an ensemble ML layer optimizes somatic variant detection rates, leading to a potentially improved variant selection framework for the diagnosis and treatment of cancer. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12920-019-0508-5) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6524241 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6524241 2019-05-24 NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer Anzar, Irantzu Sverchkova, Angelina Stratford, Richard Clancy, Trevor BMC Med Genomics Research Article BACKGROUND: The accurate screening of tumor genomic landscapes for somatic mutations using high-throughput sequencing is a crucial step in precise clinical diagnosis and targeted therapy. However, the complex inherent features of cancer tissue, especially tumor genetic intra-heterogeneity, coupled with the problem of sequencing and alignment artifacts, make somatic variant calling a challenging task. Current variant filtering strategies, such as rule-based filtering and consensus voting of different algorithms, have previously helped to increase specificity, although this comes at the cost of sensitivity. METHODS: In light of this, we have developed the NeoMutate framework, which incorporates 7 supervised machine learning (ML) algorithms to exploit the strengths of multiple variant callers, using a non-redundant set of biological and sequence features. We benchmarked NeoMutate by simulating more than 10,000 bona fide cancer-related mutations into three well-characterized Genome in a Bottle (GIAB) reference samples. RESULTS: A robust and exhaustive evaluation of NeoMutate’s performance based on 5-fold cross validation experiments, in addition to 3 independent tests, demonstrated a substantially improved variant detection accuracy compared to any of its individual composite variant callers and consensus calling of multiple tools. CONCLUSIONS: We show here that integrating multiple tools in an ensemble ML layer optimizes somatic variant detection rates, leading to a potentially improved variant selection framework for the diagnosis and treatment of cancer. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12920-019-0508-5) contains supplementary material, which is available to authorized users. BioMed Central 2019-05-16 /pmc/articles/PMC6524241/ /pubmed/31096972 http://dx.doi.org/10.1186/s12920-019-0508-5 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Anzar, Irantzu Sverchkova, Angelina Stratford, Richard Clancy, Trevor NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer |
title | NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer |
title_full | NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer |
title_fullStr | NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer |
title_full_unstemmed | NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer |
title_short | NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer |
title_sort | neomutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6524241/ https://www.ncbi.nlm.nih.gov/pubmed/31096972 http://dx.doi.org/10.1186/s12920-019-0508-5 |
work_keys_str_mv | AT anzarirantzu neomutateanensemblemachinelearningframeworkforthepredictionofsomaticmutationsincancer AT sverchkovaangelina neomutateanensemblemachinelearningframeworkforthepredictionofsomaticmutationsincancer AT stratfordrichard neomutateanensemblemachinelearningframeworkforthepredictionofsomaticmutationsincancer AT clancytrevor neomutateanensemblemachinelearningframeworkforthepredictionofsomaticmutationsincancer |
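
Illustrative note on the abstract: the description states that consensus voting across variant callers tends to raise specificity at the cost of sensitivity. The sketch below is a toy illustration of that trade-off only; it is not NeoMutate's method and does not reproduce any result from the article. The caller names, the variant tuples, and the helper functions `consensus_calls` and `sensitivity_precision` are all hypothetical, and precision is used as a stand-in for specificity because the toy data does not enumerate true negatives.

```python
from typing import Dict, Set, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def consensus_calls(calls_by_caller: Dict[str, Set[Variant]], min_votes: int) -> Set[Variant]:
    """Keep variants reported by at least `min_votes` callers."""
    votes: Dict[Variant, int] = {}
    for calls in calls_by_caller.values():
        for variant in calls:
            votes[variant] = votes.get(variant, 0) + 1
    return {variant for variant, n in votes.items() if n >= min_votes}


def sensitivity_precision(predicted: Set[Variant], truth: Set[Variant]) -> Tuple[float, float]:
    """Return (sensitivity, precision) of a predicted call set against a truth set."""
    true_positives = len(predicted & truth)
    sensitivity = true_positives / len(truth) if truth else 0.0
    precision = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, precision


# Hypothetical toy data: four true somatic variants and two false positives.
T1: Variant = ("chr1", 100, "A", "T")
T2: Variant = ("chr1", 200, "G", "C")
T3: Variant = ("chr2", 300, "C", "T")
T4: Variant = ("chr3", 400, "T", "G")
F1: Variant = ("chr4", 500, "A", "G")
F2: Variant = ("chr5", 600, "C", "A")

truth = {T1, T2, T3, T4}
calls = {
    "caller_a": {T1, T2, F1},
    "caller_b": {T1, T3, F2},
    "caller_c": {T1, T2, T4},
}

# min_votes=1 is the union of all callers; higher thresholds are stricter consensus.
results = {
    k: sensitivity_precision(consensus_calls(calls, k), truth)
    for k in (1, 2, 3)
}

for k, (sens, prec) in results.items():
    print(f"min_votes={k}: sensitivity={sens:.2f}, precision={prec:.2f}")
```

In this toy data, raising the vote threshold removes the false positives but also discards true variants that only one caller reported, which is the loss of sensitivity the abstract refers to. A learned ensemble layer, as described in the abstract, is one way to weigh callers and features instead of counting votes.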