OMACC: an Optical-Map-Assisted Contig Connector for improving de novo genome assembly

BACKGROUND: Genome sequencing and assembly are essential for revealing the secrets of life hidden in genomes. Because of repeats in most genomes, current programs collate sequencing data into a set of assembled sequences, called contigs, instead of a complete genome. Toward completing a genome, opti...

Bibliographic Details
Main authors: Chen, Yi-Min; Yu, Chun-Hui; Hwang, Chi-Chuan; Liu, Tsunglin
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4029551/
https://www.ncbi.nlm.nih.gov/pubmed/24564959
http://dx.doi.org/10.1186/1752-0509-7-S6-S7
author Chen, Yi-Min
Yu, Chun-Hui
Hwang, Chi-Chuan
Liu, Tsunglin
collection PubMed
description BACKGROUND: Genome sequencing and assembly are essential for revealing the secrets of life hidden in genomes. Because most genomes contain repeats, current programs collate sequencing data into a set of assembled sequences, called contigs, instead of a complete genome. Toward completing a genome, optical mapping is a powerful way to determine the relative order of contigs on the genome, a step called scaffolding. However, connecting neighboring contigs with nucleotide sequences requires further effort. Nagarajan et al. recently proposed a software module, FINISH, that closes the gaps between contigs with other contig sequences after the contigs have been scaffolded using an optical map. The results, however, are not yet satisfactory. RESULTS: To increase the accuracy of contig connections, we develop OMACC, which carefully takes into account the length information in optical maps. Specifically, it rescales the optical map and applies a length constraint to select the correct contig sequences for gap closure. In addition, it uses an advanced graph search algorithm to help estimate the number of repeat copies within the gaps between contigs. On both simulated and real datasets, OMACC achieves a <10% false gap-closing rate, roughly three times lower than the ~27% false rate of FINISH, while maintaining a similar sensitivity. CONCLUSION: As optical mapping is becoming popular and repeats are the bottleneck of assembly, OMACC should benefit various downstream biological studies by accurately connecting contigs into a more complete genome. AVAILABILITY: http://140.116.235.124/~tliu/omacc
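The length-constrained gap closure described in the abstract can be illustrated with a small sketch. This is not OMACC's implementation: the function name `gap_paths`, the adjacency-list graph, the `tol` tolerance, and the toy contig names are all assumptions made for illustration, and overlaps between adjacent contigs are ignored. The gap length is assumed to come from an already rescaled optical map. The search walks a contig graph from the left flanking contig to the right one and keeps only the paths whose interior contigs sum to the expected gap length, so a repeat contig may be traversed several times.

```python
from typing import Dict, List

def gap_paths(graph: Dict[str, List[str]], lengths: Dict[str, int],
              start: str, end: str, gap_len: float,
              tol: float = 0.1) -> List[List[str]]:
    """Return interior contig paths from start to end whose total
    length is within +/- tol of gap_len (hypothetical helper).
    Contig lengths are assumed to be positive, so the search ends."""
    lo, hi = gap_len * (1 - tol), gap_len * (1 + tol)
    results: List[List[str]] = []

    def dfs(node: str, interior: List[str], used: int) -> None:
        for nxt in graph.get(node, []):
            if nxt == end:
                # Reached the right flank: accept only if the length fits.
                if lo <= used <= hi:
                    results.append(interior[:])
                continue
            new_used = used + lengths[nxt]
            if new_used > hi:
                continue  # prune: path is already longer than the gap allows
            interior.append(nxt)
            dfs(nxt, interior, new_used)
            interior.pop()

    dfs(start, [], 0)
    return results

# Toy example: flanking contigs A and B, with a 1 kb repeat contig R
# that links to itself. The optical map suggests a gap of about 3 kb.
graph = {"A": ["R"], "R": ["R", "B"], "B": []}
lengths = {"A": 5000, "R": 1000, "B": 4000}
paths = gap_paths(graph, lengths, "A", "B", gap_len=3000)
repeat_copies = [p.count("R") for p in paths]
```

In this toy case the only path that satisfies the length constraint passes through the repeat contig three times, which is how a length constraint can double as an estimate of the repeat copy number inside a gap.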
format Online
Article
Text
id pubmed-4029551
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4029551 2014-06-06 OMACC: an Optical-Map-Assisted Contig Connector for improving de novo genome assembly Chen, Yi-Min Yu, Chun-Hui Hwang, Chi-Chuan Liu, Tsunglin BMC Syst Biol Research BioMed Central 2013-12-13 /pmc/articles/PMC4029551/ /pubmed/24564959 http://dx.doi.org/10.1186/1752-0509-7-S6-S7 Text en Copyright © 2013 Chen et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title OMACC: an Optical-Map-Assisted Contig Connector for improving de novo genome assembly
topic Research