Accurate detection of mosaic variants in sequencing data without matched controls

Detection of mosaic mutations that arise in normal development is challenging, as such mutations are typically present in only a minute fraction of cells and there is no clear matched control for removing germline variants and systematic artifacts. We present MosaicForecast, a machine-learning metho...

Bibliographic Details
Main authors: Dou, Yanmei, Kwon, Minseok, Rodin, Rachel E., Cortés-Ciriano, Isidro, Doan, Ryan, Luquette, Lovelace J., Galor, Alon, Bohrson, Craig, Walsh, Christopher A., Park, Peter J.
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7065972/
https://www.ncbi.nlm.nih.gov/pubmed/31907404
http://dx.doi.org/10.1038/s41587-019-0368-8
author Dou, Yanmei
Kwon, Minseok
Rodin, Rachel E.
Cortés-Ciriano, Isidro
Doan, Ryan
Luquette, Lovelace J.
Galor, Alon
Bohrson, Craig
Walsh, Christopher A.
Park, Peter J.
author_sort Dou, Yanmei
collection PubMed
description Detection of mosaic mutations that arise in normal development is challenging, as such mutations are typically present in only a minute fraction of cells and there is no clear matched control for removing germline variants and systematic artifacts. We present MosaicForecast, a machine-learning method that leverages read-based phasing and read-level features to accurately detect mosaic single-nucleotide variants (SNVs) and indels, achieving a multifold increase in specificity compared to existing algorithms. Using single-cell sequencing and targeted sequencing, we validated 80–90% of the mosaic SNVs and 60–80% of the indels detected in human brain whole-genome sequencing data. Our method should help elucidate the contribution of mosaic somatic mutations to the origin and development of disease.
format Online
Article
Text
id pubmed-7065972
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-7065972 2020-07-06 Nat Biotechnol Article 2020-01-06 2020-03 /pmc/articles/PMC7065972/ /pubmed/31907404 http://dx.doi.org/10.1038/s41587-019-0368-8 Text en Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
title Accurate detection of mosaic variants in sequencing data without matched controls
topic Article