COSINE: non-seeding method for mapping long noisy sequences

Third-generation sequencing (TGS) technologies are highly promising, but their long, noisy reads are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long reads contaminated by a high level of errors. COSINE...

Bibliographic Details
Main authors: Afshar, Pegah Tootoonchi, Wong, Wing Hung
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5737678/
https://www.ncbi.nlm.nih.gov/pubmed/28586438
http://dx.doi.org/10.1093/nar/gkx511
author Afshar, Pegah Tootoonchi
Wong, Wing Hung
collection PubMed
description Third-generation sequencing (TGS) technologies are highly promising, but their long, noisy reads are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long reads contaminated by a high level of errors. COSINE computes the context similarity of two stretches of nucleobases based on the similarity of the distributions of their short k-mers (k = 3–4) along the sequences. Results on simulated and real data show that COSINE achieves high sensitivity and specificity across a wide range of read accuracies. When the error rate is high, COSINE can offer substantial advantages over existing alignment methods.
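The idea in the description, comparing two stretches of sequence through the distributions of their short k-mers rather than through exact seed matches, can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not specify the similarity measure, so cosine similarity over k-mer count vectors is used here as an assumption, and the function names `kmer_counts`, `cosine_similarity`, and `context_similarity` are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int = 3) -> Counter:
    """Count every overlapping k-mer in seq (empty if seq is shorter than k)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors.

    Assumption: cosine similarity is a stand-in measure for illustration;
    the catalog record does not say which measure COSINE actually uses.
    """
    dot = sum(count * b[kmer] for kmer, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def context_similarity(x: str, y: str, k: int = 3) -> float:
    """Similarity of two sequence stretches via their k-mer distributions."""
    return cosine_similarity(kmer_counts(x, k), kmer_counts(y, k))


if __name__ == "__main__":
    reference = "ACGTACGTTAGCCGATACGT"
    noisy_read = "ACGTACTTTAGCGATACGT"  # hypothetical read with a substitution and a deletion
    unrelated = "GGGGCCCCGGGGCCCCGGGG"
    print(context_similarity(reference, noisy_read, k=3))
    print(context_similarity(reference, unrelated, k=3))
```

With short k (3 or 4, as in the description), a single sequencing error disturbs only a few k-mers, so the score for a noisy copy of a region would be expected to stay well above the score for an unrelated region.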
format Online
Article
Text
id pubmed-5737678
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5737678 2018-01-04
COSINE: non-seeding method for mapping long noisy sequences
Afshar, Pegah Tootoonchi
Wong, Wing Hung
Nucleic Acids Res
Methods Online
Oxford University Press 2017-08-21 2017-06-06
/pmc/articles/PMC5737678/
/pubmed/28586438
http://dx.doi.org/10.1093/nar/gkx511
Text en
© The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by-nc/4.0/
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title COSINE: non-seeding method for mapping long noisy sequences
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5737678/
https://www.ncbi.nlm.nih.gov/pubmed/28586438
http://dx.doi.org/10.1093/nar/gkx511