
vipR: variant identification in pooled DNA using R

Motivation: High-throughput sequencing (HTS) technologies are the method of choice for screening the human genome for rare sequence variants causing susceptibility to complex diseases. Unfortunately, preparation of samples for a large number of individuals is still very cost- and labor-intensive. Th...


Bibliographic Details
Main authors: Altmann, Andre, Weber, Peter, Quast, Carina, Rex-Haffner, Monika, Binder, Elisabeth B., Müller-Myhsok, Bertram
Format: Online Article Text
Language: English
Published: Oxford University Press 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117388/
https://www.ncbi.nlm.nih.gov/pubmed/21685105
http://dx.doi.org/10.1093/bioinformatics/btr205
_version_ 1782206327729160192
author Altmann, Andre
Weber, Peter
Quast, Carina
Rex-Haffner, Monika
Binder, Elisabeth B.
Müller-Myhsok, Bertram
author_facet Altmann, Andre
Weber, Peter
Quast, Carina
Rex-Haffner, Monika
Binder, Elisabeth B.
Müller-Myhsok, Bertram
author_sort Altmann, Andre
collection PubMed
description Motivation: High-throughput sequencing (HTS) technologies are the method of choice for screening the human genome for rare sequence variants causing susceptibility to complex diseases. Unfortunately, preparation of samples for a large number of individuals is still very cost- and labor-intensive. Consequently, recent screens for rare sequence variants have been carried out in samples of pooled DNA, in which equimolar amounts of DNA from multiple individuals are mixed prior to sequencing with HTS. The resulting sequence data, however, pose a bioinformatics challenge: discriminating sequencing errors from real sequence variants present at a low frequency in the DNA pool. Results: Our method vipR uses data from multiple DNA pools in order to compensate for differences in sequencing error rates along the sequenced region. More precisely, instead of aiming at discriminating sequence variants from sequencing errors, vipR identifies sequence positions that exhibit significantly different minor allele frequencies in at least two DNA pools using the Skellam distribution. The performance of vipR was compared with three other models on data from a targeted resequencing study of the TMEM132D locus in 600 individuals distributed over four DNA pools. Performance of the methods was computed on SNPs that were also genotyped individually using a MALDI-TOF technique. On a set of 82 sequence variants, vipR achieved an average sensitivity of 0.80 at an average specificity of 0.92, thus outperforming the reference methods by at least 0.17 in specificity at comparable sensitivity. Availability: The code of vipR is freely available via: http://sourceforge.net/projects/htsvipr/ Contact: altmann@mpipsykl.mpg.de
format Online
Article
Text
id pubmed-3117388
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-31173882011-06-17 vipR: variant identification in pooled DNA using R Altmann, Andre Weber, Peter Quast, Carina Rex-Haffner, Monika Binder, Elisabeth B. Müller-Myhsok, Bertram Bioinformatics Ismb/Eccb 2011 Proceedings Papers Committee July 17 to July 19, 2011, Vienna, Austria Motivation: High-throughput sequencing (HTS) technologies are the method of choice for screening the human genome for rare sequence variants causing susceptibility to complex diseases. Unfortunately, preparation of samples for a large number of individuals is still very cost- and labor-intensive. Consequently, recent screens for rare sequence variants have been carried out in samples of pooled DNA, in which equimolar amounts of DNA from multiple individuals are mixed prior to sequencing with HTS. The resulting sequence data, however, pose a bioinformatics challenge: discriminating sequencing errors from real sequence variants present at a low frequency in the DNA pool. Results: Our method vipR uses data from multiple DNA pools in order to compensate for differences in sequencing error rates along the sequenced region. More precisely, instead of aiming at discriminating sequence variants from sequencing errors, vipR identifies sequence positions that exhibit significantly different minor allele frequencies in at least two DNA pools using the Skellam distribution. The performance of vipR was compared with three other models on data from a targeted resequencing study of the TMEM132D locus in 600 individuals distributed over four DNA pools. Performance of the methods was computed on SNPs that were also genotyped individually using a MALDI-TOF technique. On a set of 82 sequence variants, vipR achieved an average sensitivity of 0.80 at an average specificity of 0.92, thus outperforming the reference methods by at least 0.17 in specificity at comparable sensitivity. 
Availability: The code of vipR is freely available via: http://sourceforge.net/projects/htsvipr/ Contact: altmann@mpipsykl.mpg.de Oxford University Press 2011-07-01 2011-06-14 /pmc/articles/PMC3117388/ /pubmed/21685105 http://dx.doi.org/10.1093/bioinformatics/btr205 Text en © The Author(s) 2011. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Ismb/Eccb 2011 Proceedings Papers Committee July 17 to July 19, 2011, Vienna, Austria
Altmann, Andre
Weber, Peter
Quast, Carina
Rex-Haffner, Monika
Binder, Elisabeth B.
Müller-Myhsok, Bertram
vipR: variant identification in pooled DNA using R
title vipR: variant identification in pooled DNA using R
title_full vipR: variant identification in pooled DNA using R
title_fullStr vipR: variant identification in pooled DNA using R
title_full_unstemmed vipR: variant identification in pooled DNA using R
title_short vipR: variant identification in pooled DNA using R
title_sort vipr: variant identification in pooled dna using r
topic Ismb/Eccb 2011 Proceedings Papers Committee July 17 to July 19, 2011, Vienna, Austria
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117388/
https://www.ncbi.nlm.nih.gov/pubmed/21685105
http://dx.doi.org/10.1093/bioinformatics/btr205
work_keys_str_mv AT altmannandre viprvariantidentificationinpooleddnausingr
AT weberpeter viprvariantidentificationinpooleddnausingr
AT quastcarina viprvariantidentificationinpooleddnausingr
AT rexhaffnermonika viprvariantidentificationinpooleddnausingr
AT binderelisabethb viprvariantidentificationinpooleddnausingr
AT mullermyhsokbertram viprvariantidentificationinpooleddnausingr
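
Illustrative note on the Skellam test mentioned in the abstract: the record states that vipR flags positions whose minor allele frequencies differ significantly between DNA pools "using the Skellam distribution", i.e. the distribution of the difference of two independent Poisson counts. The Python sketch below shows one way such a comparison between two pools could look. It is not vipR's implementation (vipR itself is written in R, and its exact model is not given in this record). The function names `poisson_pmf_table` and `skellam_two_sided_p`, the Poisson approximation of read counts, the pooled rate estimate under the null hypothesis, and the tail-doubling p-value are all assumptions made for illustration.

```python
import math


def poisson_pmf_table(mu, kmax):
    """Return [P(K = 0), ..., P(K = kmax)] for K ~ Poisson(mu), with mu > 0."""
    log_mu = math.log(mu)
    return [math.exp(k * log_mu - mu - math.lgamma(k + 1)) for k in range(kmax + 1)]


def skellam_two_sided_p(k1, n1, k2, n2):
    """Two-sided p-value for a difference in minor allele counts between two pools.

    k1, k2: minor allele read counts at one position in pool 1 and pool 2.
    n1, n2: read coverage at that position in pool 1 and pool 2.

    Assumption (not taken from the source): under the null hypothesis both pools
    share one minor allele rate, and each count is approximately Poisson with
    mean coverage * rate. The difference K1 - K2 then follows a Skellam
    distribution, which is evaluated here by convolving two Poisson tables.
    """
    if k1 + k2 == 0:
        return 1.0  # no minor allele reads in either pool: nothing to test

    rate = (k1 + k2) / (n1 + n2)  # pooled rate estimate under the null
    mu1, mu2 = n1 * rate, n2 * rate

    # Truncate each Poisson far into its upper tail; the omitted mass is negligible.
    kmax1 = max(k1, int(mu1 + 12 * math.sqrt(mu1)) + 30)
    kmax2 = max(k2, int(mu2 + 12 * math.sqrt(mu2)) + 30)
    pmf1 = poisson_pmf_table(mu1, kmax1)
    pmf2 = poisson_pmf_table(mu2, kmax2)

    d_obs = k1 - k2
    lower = 0.0  # P(K1 - K2 <= d_obs)
    upper = 0.0  # P(K1 - K2 >= d_obs)
    for a, pa in enumerate(pmf1):
        for b, pb in enumerate(pmf2):
            d = a - b
            if d <= d_obs:
                lower += pa * pb
            if d >= d_obs:
                upper += pa * pb

    # Tail doubling, capped at 1.
    return min(1.0, 2.0 * min(lower, upper))


if __name__ == "__main__":
    # Hypothetical counts: similar minor allele rates versus clearly different rates.
    print(skellam_two_sided_p(20, 2000, 22, 2000))
    print(skellam_two_sided_p(60, 2000, 5, 2000))
```

A position where every pool shows the same elevated minor allele rate (as expected for a position-specific sequencing error) yields a large p-value in this sketch, whereas a position where one pool carries a real variant yields a small one. With four pools, as in the study described above, a test of this kind would be applied to pairs of pools and corrected for multiple testing.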