
Local alignment of two-base encoded DNA sequence

BACKGROUND: DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare s...


Bibliographic Details
Main Authors: Homer, Nils; Merriman, Barry; Nelson, Stanley F
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2709925/
https://www.ncbi.nlm.nih.gov/pubmed/19508732
http://dx.doi.org/10.1186/1471-2105-10-175
author Homer, Nils
Merriman, Barry
Nelson, Stanley F
collection PubMed
description BACKGROUND: DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity. RESULTS: We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two-base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions. CONCLUSION: The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, thereby facilitating genome re-sequencing efforts based on this form of sequence data.
format Text
id pubmed-2709925
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2709925 2009-07-14 BMC Bioinformatics Research Article BioMed Central 2009-06-09 /pmc/articles/PMC2709925/ /pubmed/19508732 http://dx.doi.org/10.1186/1471-2105-10-175 Text en
Copyright © 2009 Homer et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Research Article
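
The description above notes that two-base decoding is deterministic but sensitive to measurement errors. The following Python sketch illustrates why, under stated assumptions: this record does not spell out the code table, so the sketch uses the XOR mapping commonly associated with two-base (color-space) encoding, with A=0, C=1, G=2, T=3 and each color equal to the XOR of two adjacent base codes. The function names `encode` and `decode` are hypothetical and are not taken from the article.

```python
# Assumed mapping (not stated in this record): 2-bit base codes, color = XOR
# of adjacent base codes, which yields four colors numbered 0..3.
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"


def encode(seq):
    """Return (first_base, colors) for a DNA string of length >= 1."""
    codes = [BASE_TO_CODE[b] for b in seq]
    colors = [a ^ b for a, b in zip(codes, codes[1:])]
    return seq[0], colors


def decode(first_base, colors):
    """Deterministically rebuild the base sequence from a known first base."""
    code = BASE_TO_CODE[first_base]
    bases = [first_base]
    for color in colors:
        code ^= color  # each color maps the previous base to the next one
        bases.append(CODE_TO_BASE[code])
    return "".join(bases)


if __name__ == "__main__":
    first, colors = encode("ATGGCA")
    print(colors)                     # colors of the error-free read
    print(decode(first, colors))      # round trip back to the bases

    # One mis-measured color changes every decoded base after it.
    bad = list(colors)
    bad[1] = 2
    print(decode(first, bad))

    # A true single-base variant changes exactly two adjacent colors.
    print(encode("ATGTCA")[1])
```

In this sketch a single color error corrupts the whole decoded suffix, whereas a genuine single-base variant shows up as two adjacent color changes. That asymmetry is one way to see why decoding first and aligning afterwards is fragile, and why the description speaks of decoding and aligning simultaneously.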