
Ragout—a reference-assisted assembly tool for bacterial genomes

Summary: Bacterial genomes are simpler than mammalian ones, and yet assembling the former from the data currently generated by high-throughput short-read sequencing machines still results in hundreds of contigs. To improve assembly quality, recent studies have utilized longer Pacific Biosciences (Pa...


Bibliographic Details
Main Authors: Kolmogorov, Mikhail; Raney, Brian; Paten, Benedict; Pham, Son
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4058940/
https://www.ncbi.nlm.nih.gov/pubmed/24931998
http://dx.doi.org/10.1093/bioinformatics/btu280
_version_ 1782321190581305344
author Kolmogorov, Mikhail
Raney, Brian
Paten, Benedict
Pham, Son
collection PubMed
description Summary: Bacterial genomes are simpler than mammalian ones, and yet assembling the former from the data currently generated by high-throughput short-read sequencing machines still results in hundreds of contigs. To improve assembly quality, recent studies have utilized longer Pacific Biosciences (PacBio) reads or jumping libraries to connect contigs into larger scaffolds or help assemblers resolve ambiguities in repetitive regions of the genome. However, their popularity in contemporary genomic research is still limited by high cost and error rates. In this work, we explore the possibility of improving assemblies by using complete genomes from closely related species/strains. We present Ragout, a genome rearrangement approach, to address this problem. In contrast with most reference-guided algorithms, where only one reference genome is used, Ragout uses multiple references along with the evolutionary relationship among these references in order to determine the correct order of the contigs. Additionally, Ragout uses the assembly graph and multi-scale synteny blocks to reduce assembly gaps caused by small contigs from the input assembly. Based on simulations as well as real datasets, we believe that for common bacterial species, where many complete genome sequences from related strains are available, the current high-throughput short-read sequencing paradigm is sufficient to obtain a single high-quality scaffold for each chromosome. Availability: The Ragout software is freely available at: https://github.com/fenderglass/Ragout. Contact: spham@salk.edu
format Online
Article
Text
id pubmed-4058940
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-40589402014-06-18 Ragout—a reference-assisted assembly tool for bacterial genomes Kolmogorov, Mikhail Raney, Brian Paten, Benedict Pham, Son Bioinformatics Ismb 2014 Proceedings Papers Committee Summary: Bacterial genomes are simpler than mammalian ones, and yet assembling the former from the data currently generated by high-throughput short-read sequencing machines still results in hundreds of contigs. To improve assembly quality, recent studies have utilized longer Pacific Biosciences (PacBio) reads or jumping libraries to connect contigs into larger scaffolds or help assemblers resolve ambiguities in repetitive regions of the genome. However, their popularity in contemporary genomic research is still limited by high cost and error rates. In this work, we explore the possibility of improving assemblies by using complete genomes from closely related species/strains. We present Ragout, a genome rearrangement approach, to address this problem. In contrast with most reference-guided algorithms, where only one reference genome is used, Ragout uses multiple references along with the evolutionary relationship among these references in order to determine the correct order of the contigs. Additionally, Ragout uses the assembly graph and multi-scale synteny blocks to reduce assembly gaps caused by small contigs from the input assembly. In simulations as well as real datasets, we believe that for common bacterial species, where many complete genome sequences from related strains have been available, the current high-throughput short-read sequencing paradigm is sufficient to obtain a single high-quality scaffold for each chromosome. Availability: The Ragout software is freely available at: https://github.com/fenderglass/Ragout. Contact: spham@salk.edu Oxford University Press 2014-06-15 2014-06-11 /pmc/articles/PMC4058940/ /pubmed/24931998 http://dx.doi.org/10.1093/bioinformatics/btu280 Text en © The Author 2014. Published by Oxford University Press. 
http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Ragout—a reference-assisted assembly tool for bacterial genomes
topic Ismb 2014 Proceedings Papers Committee
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4058940/
https://www.ncbi.nlm.nih.gov/pubmed/24931998
http://dx.doi.org/10.1093/bioinformatics/btu280