
Jointly aligning a group of DNA reads improves accuracy of identifying large deletions

Performing sequence alignment to identify structural variants, such as large deletions, from genome sequencing data is a fundamental task, but current methods are far from perfect. The current practice is to independently align each DNA read to a reference genome. We show that the propensity of geno...


Bibliographic Details
Main Authors: Shrestha, Anish M S; Frith, Martin C; Asai, Kiyoshi; Richard, Hugues
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5815140/
https://www.ncbi.nlm.nih.gov/pubmed/29182778
http://dx.doi.org/10.1093/nar/gkx1175
_version_ 1783300451540664320
author Shrestha, Anish M S
Frith, Martin C
Asai, Kiyoshi
Richard, Hugues
author_facet Shrestha, Anish M S
Frith, Martin C
Asai, Kiyoshi
Richard, Hugues
author_sort Shrestha, Anish M S
collection PubMed
description Performing sequence alignment to identify structural variants, such as large deletions, from genome sequencing data is a fundamental task, but current methods are far from perfect. The current practice is to independently align each DNA read to a reference genome. We show that the propensity of genomic rearrangements to accumulate in repeat-rich regions imposes severe ambiguities in these alignments, and consequently on the variant calls—with current read lengths, this affects more than one third of known large deletions in the C. Venter genome. We present a method to jointly align reads to a genome, whereby alignment ambiguity of one read can be disambiguated by other reads. We show this leads to a significant improvement in the accuracy of identifying large deletions (≥20 bases), while imposing minimal computational overhead and maintaining an overall running time that is at par with current tools. A software implementation is available as an open-source Python program called JRA at https://bitbucket.org/jointreadalignment/jra-src.
format Online
Article
Text
id pubmed-5815140
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-58151402018-02-23 Jointly aligning a group of DNA reads improves accuracy of identifying large deletions Shrestha, Anish M S Frith, Martin C Asai, Kiyoshi Richard, Hugues Nucleic Acids Res Methods Online Performing sequence alignment to identify structural variants, such as large deletions, from genome sequencing data is a fundamental task, but current methods are far from perfect. The current practice is to independently align each DNA read to a reference genome. We show that the propensity of genomic rearrangements to accumulate in repeat-rich regions imposes severe ambiguities in these alignments, and consequently on the variant calls—with current read lengths, this affects more than one third of known large deletions in the C. Venter genome. We present a method to jointly align reads to a genome, whereby alignment ambiguity of one read can be disambiguated by other reads. We show this leads to a significant improvement in the accuracy of identifying large deletions (≥20 bases), while imposing minimal computational overhead and maintaining an overall running time that is at par with current tools. A software implementation is available as an open-source Python program called JRA at https://bitbucket.org/jointreadalignment/jra-src. Oxford University Press 2018-02-16 2017-11-22 /pmc/articles/PMC5815140/ /pubmed/29182778 http://dx.doi.org/10.1093/nar/gkx1175 Text en © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Methods Online
Shrestha, Anish M S
Frith, Martin C
Asai, Kiyoshi
Richard, Hugues
Jointly aligning a group of DNA reads improves accuracy of identifying large deletions
title Jointly aligning a group of DNA reads improves accuracy of identifying large deletions
title_full Jointly aligning a group of DNA reads improves accuracy of identifying large deletions
title_fullStr Jointly aligning a group of DNA reads improves accuracy of identifying large deletions
title_full_unstemmed Jointly aligning a group of DNA reads improves accuracy of identifying large deletions
title_short Jointly aligning a group of DNA reads improves accuracy of identifying large deletions
title_sort jointly aligning a group of dna reads improves accuracy of identifying large deletions
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5815140/
https://www.ncbi.nlm.nih.gov/pubmed/29182778
http://dx.doi.org/10.1093/nar/gkx1175
work_keys_str_mv AT shresthaanishms jointlyaligningagroupofdnareadsimprovesaccuracyofidentifyinglargedeletions
AT frithmartinc jointlyaligningagroupofdnareadsimprovesaccuracyofidentifyinglargedeletions
AT asaikiyoshi jointlyaligningagroupofdnareadsimprovesaccuracyofidentifyinglargedeletions
AT richardhugues jointlyaligningagroupofdnareadsimprovesaccuracyofidentifyinglargedeletions