vipR: variant identification in pooled DNA using R
Motivation: High-throughput-sequencing (HTS) technologies are the method of choice for screening the human genome for rare sequence variants causing susceptibility to complex diseases. Unfortunately, preparation of samples for a large number of individuals is still very cost- and labor intensive. Th...
Main Authors: | Altmann, Andre; Weber, Peter; Quast, Carina; Rex-Haffner, Monika; Binder, Elisabeth B.; Müller-Myhsok, Bertram |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2011 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117388/ https://www.ncbi.nlm.nih.gov/pubmed/21685105 http://dx.doi.org/10.1093/bioinformatics/btr205 |
_version_ | 1782206327729160192 |
---|---|
author | Altmann, Andre Weber, Peter Quast, Carina Rex-Haffner, Monika Binder, Elisabeth B. Müller-Myhsok, Bertram |
author_facet | Altmann, Andre Weber, Peter Quast, Carina Rex-Haffner, Monika Binder, Elisabeth B. Müller-Myhsok, Bertram |
author_sort | Altmann, Andre |
collection | PubMed |
description | Motivation: High-throughput-sequencing (HTS) technologies are the method of choice for screening the human genome for rare sequence variants causing susceptibility to complex diseases. Unfortunately, preparation of samples for a large number of individuals is still very cost- and labor intensive. Thus, recently, screens for rare sequence variants were carried out in samples of pooled DNA, in which equimolar amounts of DNA from multiple individuals are mixed prior to sequencing with HTS. The resulting sequence data, however, poses a bioinformatics challenge: the discrimination of sequencing errors from real sequence variants present at a low frequency in the DNA pool. Results: Our method vipR uses data from multiple DNA pools in order to compensate for differences in sequencing error rates along the sequenced region. More precisely, instead of aiming at discriminating sequence variants from sequencing errors, vipR identifies sequence positions that exhibit significantly different minor allele frequencies in at least two DNA pools using the Skellam distribution. The performance of vipR was compared with three other models on data from a targeted resequencing study of the TMEM132D locus in 600 individuals distributed over four DNA pools. Performance of the methods was computed on SNPs that were also genotyped individually using a MALDI-TOF technique. On a set of 82 sequence variants, vipR achieved an average sensitivity of 0.80 at an average specificity of 0.92, thus outperforming the reference methods by at least 0.17 in specificity at comparable sensitivity. Availability: The code of vipR is freely available via: http://sourceforge.net/projects/htsvipr/ Contact: altmann@mpipsykl.mpg.de |
format | Online Article Text |
id | pubmed-3117388 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3117388 2011-06-17 vipR: variant identification in pooled DNA using R Altmann, Andre Weber, Peter Quast, Carina Rex-Haffner, Monika Binder, Elisabeth B. Müller-Myhsok, Bertram Bioinformatics Ismb/Eccb 2011 Proceedings Papers Committee July 17 to July 19, 2011, Vienna, Austria Motivation: High-throughput-sequencing (HTS) technologies are the method of choice for screening the human genome for rare sequence variants causing susceptibility to complex diseases. Unfortunately, preparation of samples for a large number of individuals is still very cost- and labor intensive. Thus, recently, screens for rare sequence variants were carried out in samples of pooled DNA, in which equimolar amounts of DNA from multiple individuals are mixed prior to sequencing with HTS. The resulting sequence data, however, poses a bioinformatics challenge: the discrimination of sequencing errors from real sequence variants present at a low frequency in the DNA pool. Results: Our method vipR uses data from multiple DNA pools in order to compensate for differences in sequencing error rates along the sequenced region. More precisely, instead of aiming at discriminating sequence variants from sequencing errors, vipR identifies sequence positions that exhibit significantly different minor allele frequencies in at least two DNA pools using the Skellam distribution. The performance of vipR was compared with three other models on data from a targeted resequencing study of the TMEM132D locus in 600 individuals distributed over four DNA pools. Performance of the methods was computed on SNPs that were also genotyped individually using a MALDI-TOF technique. On a set of 82 sequence variants, vipR achieved an average sensitivity of 0.80 at an average specificity of 0.92, thus outperforming the reference methods by at least 0.17 in specificity at comparable sensitivity. Availability: The code of vipR is freely available via: http://sourceforge.net/projects/htsvipr/ Contact: altmann@mpipsykl.mpg.de Oxford University Press 2011-07-01 2011-06-14 /pmc/articles/PMC3117388/ /pubmed/21685105 http://dx.doi.org/10.1093/bioinformatics/btr205 Text en © The Author(s) 2011. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Ismb/Eccb 2011 Proceedings Papers Committee July 17 to July 19, 2011, Vienna, Austria Altmann, Andre Weber, Peter Quast, Carina Rex-Haffner, Monika Binder, Elisabeth B. Müller-Myhsok, Bertram vipR: variant identification in pooled DNA using R |
title | vipR: variant identification in pooled DNA using R |
title_full | vipR: variant identification in pooled DNA using R |
title_fullStr | vipR: variant identification in pooled DNA using R |
title_full_unstemmed | vipR: variant identification in pooled DNA using R |
title_short | vipR: variant identification in pooled DNA using R |
title_sort | vipr: variant identification in pooled dna using r |
topic | Ismb/Eccb 2011 Proceedings Papers Committee July 17 to July 19, 2011, Vienna, Austria |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117388/ https://www.ncbi.nlm.nih.gov/pubmed/21685105 http://dx.doi.org/10.1093/bioinformatics/btr205 |
work_keys_str_mv | AT altmannandre viprvariantidentificationinpooleddnausingr AT weberpeter viprvariantidentificationinpooleddnausingr AT quastcarina viprvariantidentificationinpooleddnausingr AT rexhaffnermonika viprvariantidentificationinpooleddnausingr AT binderelisabethb viprvariantidentificationinpooleddnausingr AT mullermyhsokbertram viprvariantidentificationinpooleddnausingr |
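
Illustrative note: the description above says that vipR compares minor allele frequencies between DNA pools using the Skellam distribution, which is the distribution of the difference of two independent Poisson counts. The following Python sketch shows one way such a comparison could look for a single sequence position. It is not the vipR implementation (vipR itself is written in R, and this record does not give its test statistic). The function names, the assumption of equal coverage in both pools, and the use of the mean count as the shared Poisson rate are all assumptions made here for illustration.

```python
import math


def poisson_logpmf(k, mu):
    """Log of the Poisson probability P(N = k) for mean mu > 0."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def skellam_pmf(k, mu1, mu2):
    """P(N1 - N2 = k) for independent N1 ~ Poisson(mu1), N2 ~ Poisson(mu2).

    Computed by direct convolution of the two Poisson distributions,
    truncated far into the tail. Both means must be positive.
    """
    start = max(0, -k)  # smallest j with j >= 0 and j + k >= 0
    total_mean = mu1 + mu2
    span = int(total_mean + 20 * math.sqrt(total_mean) + 50)
    return sum(
        math.exp(poisson_logpmf(j + k, mu1) + poisson_logpmf(j, mu2))
        for j in range(start, start + span)
    )


def skellam_two_sided_pvalue(minor_a, minor_b):
    """Two-sided p-value for the difference of minor allele counts.

    Assumption (not from the source): both pools have the same coverage
    at this position, so under the null hypothesis both counts share one
    Poisson rate, estimated here as the mean of the two observed counts.
    """
    diff = abs(minor_a - minor_b)
    if diff == 0:
        return 1.0
    mu = (minor_a + minor_b) / 2.0
    # Probability mass of all differences strictly smaller than the observed one.
    central = sum(skellam_pmf(k, mu, mu) for k in range(-diff + 1, diff))
    return max(0.0, 1.0 - central)


if __name__ == "__main__":
    # Hypothetical counts: similar in both pools vs. clearly different.
    print(skellam_two_sided_pvalue(6, 5))
    print(skellam_two_sided_pvalue(40, 2))
```

Under these assumptions, a position with similar minor allele counts in both pools (for example 6 and 5 reads) yields a large p-value, consistent with a shared error rate, while strongly differing counts (for example 40 and 2 reads) yield a small one. Pools sequenced at different depths would require scaling the two Poisson rates by coverage, which this sketch omits.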