Jointly aligning a group of DNA reads improves accuracy of identifying large deletions
Performing sequence alignment to identify structural variants, such as large deletions, from genome sequencing data is a fundamental task, but current methods are far from perfect. The current practice is to independently align each DNA read to a reference genome. We show that the propensity of geno...
Main authors: | Shrestha, Anish M S; Frith, Martin C; Asai, Kiyoshi; Richard, Hugues |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2018 |
Subjects: | Methods Online |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5815140/ https://www.ncbi.nlm.nih.gov/pubmed/29182778 http://dx.doi.org/10.1093/nar/gkx1175 |
_version_ | 1783300451540664320 |
---|---|
author | Shrestha, Anish M S Frith, Martin C Asai, Kiyoshi Richard, Hugues |
author_sort | Shrestha, Anish M S |
collection | PubMed |
description | Performing sequence alignment to identify structural variants, such as large deletions, from genome sequencing data is a fundamental task, but current methods are far from perfect. The current practice is to independently align each DNA read to a reference genome. We show that the propensity of genomic rearrangements to accumulate in repeat-rich regions imposes severe ambiguities in these alignments, and consequently on the variant calls—with current read lengths, this affects more than one third of known large deletions in the C. Venter genome. We present a method to jointly align reads to a genome, whereby alignment ambiguity of one read can be disambiguated by other reads. We show this leads to a significant improvement in the accuracy of identifying large deletions (≥20 bases), while imposing minimal computational overhead and maintaining an overall running time that is at par with current tools. A software implementation is available as an open-source Python program called JRA at https://bitbucket.org/jointreadalignment/jra-src. |
format | Online Article Text |
id | pubmed-5815140 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-58151402018-02-23 Jointly aligning a group of DNA reads improves accuracy of identifying large deletions Shrestha, Anish M S Frith, Martin C Asai, Kiyoshi Richard, Hugues Nucleic Acids Res Methods Online Performing sequence alignment to identify structural variants, such as large deletions, from genome sequencing data is a fundamental task, but current methods are far from perfect. The current practice is to independently align each DNA read to a reference genome. We show that the propensity of genomic rearrangements to accumulate in repeat-rich regions imposes severe ambiguities in these alignments, and consequently on the variant calls—with current read lengths, this affects more than one third of known large deletions in the C. Venter genome. We present a method to jointly align reads to a genome, whereby alignment ambiguity of one read can be disambiguated by other reads. We show this leads to a significant improvement in the accuracy of identifying large deletions (≥20 bases), while imposing minimal computational overhead and maintaining an overall running time that is at par with current tools. A software implementation is available as an open-source Python program called JRA at https://bitbucket.org/jointreadalignment/jra-src. Oxford University Press 2018-02-16 2017-11-22 /pmc/articles/PMC5815140/ /pubmed/29182778 http://dx.doi.org/10.1093/nar/gkx1175 Text en © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Online Shrestha, Anish M S Frith, Martin C Asai, Kiyoshi Richard, Hugues Jointly aligning a group of DNA reads improves accuracy of identifying large deletions |
title | Jointly aligning a group of DNA reads improves accuracy of identifying large deletions |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5815140/ https://www.ncbi.nlm.nih.gov/pubmed/29182778 http://dx.doi.org/10.1093/nar/gkx1175 |