
Comparison of genetic variants in matched samples using thesaurus annotation


Bibliographic Details
Main authors: Konopka, Tomasz, Nijman, Sebastian M.B.
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4795618/
https://www.ncbi.nlm.nih.gov/pubmed/26545822
http://dx.doi.org/10.1093/bioinformatics/btv654
author Konopka, Tomasz
Nijman, Sebastian M.B.
collection PubMed
description Motivation: Calling changes in DNA, e.g. as a result of somatic events in cancer, requires analysis of multiple matched sequenced samples. Events in low-mappability regions of the human genome are difficult to encode in variant call files and have been under-reported as a result. However, they can be described accurately through thesaurus annotation, a technique that links multiple genomic loci together to explicate a single variant. Results: Here we describe software and benchmarks for using thesaurus annotation to detect point changes in DNA from matched samples. In benchmarks on matched normal/tumor samples we show that the technique can recover between five and ten percent more true events than conventional approaches, while strictly limiting false discovery and being fully consistent with popular variant analysis workflows. We also demonstrate the utility of the approach for analysis of de novo mutations in parents/child families. Availability and implementation: Software performing thesaurus annotation is implemented in Java; it is available as source code on GitHub at GeneticThesaurus (https://github.com/tkonopka/GeneticThesaurus) and as an executable on SourceForge at geneticthesaurus (https://sourceforge.net/projects/geneticthesaurus). Mutation calling is implemented in an R package available on GitHub at RGeneticThesaurus (https://github.com/tkonopka/RGeneticThesaurus). Supplementary information: Supplementary data are available at Bioinformatics online. Contact: tomasz.konopka@ludwig.ox.ac.uk
format Online
Article
Text
id pubmed-4795618
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4795618 2016-03-21 Comparison of genetic variants in matched samples using thesaurus annotation Konopka, Tomasz Nijman, Sebastian M.B. Bioinformatics Original Papers Oxford University Press 2016-03-01 2015-11-05 /pmc/articles/PMC4795618/ /pubmed/26545822 http://dx.doi.org/10.1093/bioinformatics/btv654 Text en © The Author 2015. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Comparison of genetic variants in matched samples using thesaurus annotation
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4795618/
https://www.ncbi.nlm.nih.gov/pubmed/26545822
http://dx.doi.org/10.1093/bioinformatics/btv654