AGORA: Assembly Guided by Optical Restriction Alignment

BACKGROUND: Genome assembly is difficult due to repeated sequences within the genome, which create ambiguities and cause the final assembly to be broken up into many separate sequences (contigs). Long range linking information, such as mate-pairs or mapping data, is necessary to help assembly softwa...

Bibliographic Details
Main authors: Lin, Henry C; Goldstein, Steve; Mendelowitz, Lee; Zhou, Shiguo; Wetzel, Joshua; Schwartz, David C; Pop, Mihai
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431216/
https://www.ncbi.nlm.nih.gov/pubmed/22856673
http://dx.doi.org/10.1186/1471-2105-13-189
author Lin, Henry C
Goldstein, Steve
Mendelowitz, Lee
Zhou, Shiguo
Wetzel, Joshua
Schwartz, David C
Pop, Mihai
author_sort Lin, Henry C
collection PubMed
description BACKGROUND: Genome assembly is difficult due to repeated sequences within the genome, which create ambiguities and cause the final assembly to be broken up into many separate sequences (contigs). Long range linking information, such as mate-pairs or mapping data, is necessary to help assembly software resolve repeats, thereby leading to a more complete reconstruction of genomes. Prior work has used optical maps for validating assemblies and scaffolding contigs, after an initial assembly has been produced. However, optical maps have not previously been used within the genome assembly process. Here, we use optical map information within the popular de Bruijn graph assembly paradigm to eliminate paths in the de Bruijn graph which are not consistent with the optical map and help determine the correct reconstruction of the genome.
RESULTS: We developed a new algorithm called AGORA: Assembly Guided by Optical Restriction Alignment. AGORA is the first algorithm to use optical map information directly within the de Bruijn graph framework to help produce an accurate assembly of a genome that is consistent with the optical map information provided. Our simulations on bacterial genomes show that AGORA is effective at producing assemblies closely matching the reference sequences. Additionally, we show that noise in the optical map can have a strong impact on the final assembly quality for some complex genomes, and we also measure how various characteristics of the starting de Bruijn graph may impact the quality of the final assembly. Lastly, we show that a proper choice of restriction enzyme for the optical map may substantially improve the quality of the final assembly.
CONCLUSIONS: Our work shows that optical maps can be used effectively to assemble genomes within the de Bruijn graph assembly framework. Our experiments also provide insights into the characteristics of the mapping data that most affect the performance of our algorithm, indicating the potential benefit of more accurate optical mapping technologies, such as nano-coding.
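The idea stated in the abstract, discarding candidate paths that are inconsistent with the optical map, can be illustrated with a toy sketch. This is not the AGORA algorithm as published: the function names, the relative tolerance, and the sliding-window comparison are all assumptions made for illustration, and the sketch ignores reverse orientation, missed or spurious cuts, and real sizing-error models. It digests each candidate sequence in silico at a recognition site (here the EcoRI site GAATTC) and keeps a candidate only if its interior fragment lengths fit some contiguous stretch of the map.

```python
def interior_fragments(sequence, site):
    """Lengths between consecutive occurrences of `site` in `sequence`.

    Only fragments bounded by a cut on both sides are returned, because the
    two end pieces of a partial path are not complete restriction fragments.
    """
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = sequence.find(site, pos + 1)
    return [b - a for a, b in zip(cuts, cuts[1:])]


def consistent_with_map(fragments, optical_map, tolerance=0.1):
    """True if `fragments` fits some contiguous window of `optical_map`.

    Each fragment must be within `tolerance` (relative) of the map fragment
    it is paired with. An empty fragment list gives no evidence against the
    path, so it is treated as consistent.
    """
    n = len(fragments)
    for offset in range(len(optical_map) - n + 1):
        window = optical_map[offset:offset + n]
        if all(abs(f - m) <= tolerance * m for f, m in zip(fragments, window)):
            return True
    return False


def filter_paths(candidates, site, optical_map, tolerance=0.1):
    """Keep only candidate sequences whose digest is consistent with the map."""
    return [
        seq for seq in candidates
        if consistent_with_map(interior_fragments(seq, site), optical_map, tolerance)
    ]


# Hypothetical toy data: two ways of ordering the same two repeat-flanked
# segments, and an optical map (fragment sizes in bases) that supports one.
SITE = "GAATTC"
path_a = "AA" + SITE + "A" * 4 + SITE + "A" * 14 + SITE + "AA"   # fragments 10, 20
path_b = "AA" + SITE + "A" * 14 + SITE + "A" * 4 + SITE + "AA"   # fragments 20, 10
optical_map = [50, 10, 21, 40]

kept = filter_paths([path_a, path_b], SITE, optical_map)
```

In this toy case only `path_a` survives, since its fragment order (10, 20) fits the map window (10, 21) within the assumed 10% tolerance while the order (20, 10) fits no window.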
format Online
Article
Text
id pubmed-3431216
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3431216 2012-08-31 AGORA: Assembly Guided by Optical Restriction Alignment Lin, Henry C Goldstein, Steve Mendelowitz, Lee Zhou, Shiguo Wetzel, Joshua Schwartz, David C Pop, Mihai BMC Bioinformatics Methodology Article BioMed Central 2012-08-02 /pmc/articles/PMC3431216/ /pubmed/22856673 http://dx.doi.org/10.1186/1471-2105-13-189 Text en Copyright ©2012 Lin et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title AGORA: Assembly Guided by Optical Restriction Alignment
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431216/
https://www.ncbi.nlm.nih.gov/pubmed/22856673
http://dx.doi.org/10.1186/1471-2105-13-189