
VARiD: A variation detection framework for color-space and letter-space platforms

Motivation: High-throughput sequencing (HTS) technologies are transforming the study of genomic variation. The various HTS technologies have different sequencing biases and error rates, and while most HTS technologies sequence the residues of the genome directly, generating base calls for each posit...


Bibliographic Details
Main authors: Dalca, Adrian V., Rumble, Stephen M., Levy, Samuel, Brudno, Michael
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881369/
https://www.ncbi.nlm.nih.gov/pubmed/20529926
http://dx.doi.org/10.1093/bioinformatics/btq184
author Dalca, Adrian V.
Rumble, Stephen M.
Levy, Samuel
Brudno, Michael
collection PubMed
description Motivation: High-throughput sequencing (HTS) technologies are transforming the study of genomic variation. The various HTS technologies have different sequencing biases and error rates, and while most HTS technologies sequence the residues of the genome directly, generating base calls for each position, the Applied Biosystems SOLiD platform generates dibase-coded (color space) sequences. While combining data from the various platforms should increase the accuracy of variation detection, to date there are only a few tools that can identify variants from color space data, and none that can analyze color space and regular (letter space) data together. Results: We present VARiD, a probabilistic method for variation detection from both letter- and color-space reads simultaneously. VARiD is based on a hidden Markov model and uses the forward-backward algorithm to accurately identify heterozygous, homozygous and tri-allelic SNPs, as well as micro-indels. Our analysis shows that VARiD performs better than the AB SOLiD toolset at detecting variants from color-space data alone, and improves the calls dramatically when letter- and color-space reads are combined. Availability: The toolset is freely available at http://compbio.cs.utoronto.ca/varid Contact: varid@cs.toronto.edu
format Text
id pubmed-2881369
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2881369 2010-06-08 VARiD: A variation detection framework for color-space and letter-space platforms Dalca, Adrian V. Rumble, Stephen M. Levy, Samuel Brudno, Michael Bioinformatics ISMB 2010 Conference Proceedings, July 11 to July 13, 2010, Boston, MA, USA Oxford University Press 2010-06-15 2010-06-01 /pmc/articles/PMC2881369/ /pubmed/20529926 http://dx.doi.org/10.1093/bioinformatics/btq184 Text en © The Author(s) 2010. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title VARiD: A variation detection framework for color-space and letter-space platforms
topic ISMB 2010 Conference Proceedings, July 11 to July 13, 2010, Boston, MA, USA