SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler

BACKGROUND: There is a rapidly increasing amount of de novo genome assembly using next-generation sequencing (NGS) short reads; however, several big challenges remain to be overcome in order for this to be efficient and accurate. SOAPdenovo has been successfully applied to assemble many published ge...

Bibliographic Details
Main Authors: Luo, Ruibang, Liu, Binghang, Xie, Yinlong, Li, Zhenyu, Huang, Weihua, Yuan, Jianying, He, Guangzhu, Chen, Yanxiang, Pan, Qi, Liu, Yunjie, Tang, Jingbo, Wu, Gengxiong, Zhang, Hao, Shi, Yujian, Liu, Yong, Yu, Chang, Wang, Bo, Lu, Yao, Han, Changlei, Cheung, David W, Yiu, Siu-Ming, Peng, Shaoliang, Xiaoqian, Zhu, Liu, Guangming, Liao, Xiangke, Li, Yingrui, Yang, Huanming, Wang, Jian, Lam, Tak-Wah, Wang, Jun
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3626529/
https://www.ncbi.nlm.nih.gov/pubmed/23587118
http://dx.doi.org/10.1186/2047-217X-1-18
_version_ 1782266196885766144
author Luo, Ruibang
Liu, Binghang
Xie, Yinlong
Li, Zhenyu
Huang, Weihua
Yuan, Jianying
He, Guangzhu
Chen, Yanxiang
Pan, Qi
Liu, Yunjie
Tang, Jingbo
Wu, Gengxiong
Zhang, Hao
Shi, Yujian
Liu, Yong
Yu, Chang
Wang, Bo
Lu, Yao
Han, Changlei
Cheung, David W
Yiu, Siu-Ming
Peng, Shaoliang
Xiaoqian, Zhu
Liu, Guangming
Liao, Xiangke
Li, Yingrui
Yang, Huanming
Wang, Jian
Lam, Tak-Wah
Wang, Jun
author_sort Luo, Ruibang
collection PubMed
description BACKGROUND: De novo genome assembly from next-generation sequencing (NGS) short reads is increasing rapidly; however, several major challenges remain before it can be done efficiently and accurately. SOAPdenovo has been successfully applied to assemble many published genomes, but it still needs improvement in continuity, accuracy and coverage, especially in repeat regions. FINDINGS: To overcome these challenges, we have developed its successor, SOAPdenovo2, whose new algorithm design reduces memory consumption in graph construction, resolves more repeat regions in contig assembly, increases coverage and length in scaffold construction, improves gap closing, and is optimized for large genomes. CONCLUSIONS: Benchmarks using the Assemblathon1 and GAGE datasets showed that SOAPdenovo2 greatly surpasses its predecessor SOAPdenovo and is competitive with other assemblers in both assembly length and accuracy. We also provide an updated assembly of the 2008 Asian (YH) genome using SOAPdenovo2. Here, the contig and scaffold N50 of the YH genome were ~20.9 kbp and ~22 Mbp, respectively, which are 3-fold and 50-fold longer than in the first published version. The genome coverage increased from 81.16% to 93.91%, and memory consumption at its peak was ~2/3 lower.
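The abstract reports contig and scaffold N50 values without defining the metric. As a minimal sketch, assuming the common definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length), N50 could be computed as follows. The function name `n50` and the example lengths are illustrative assumptions, not taken from the paper or from SOAPdenovo2 itself.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (0 if empty)."""
    # Walk the sequences from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembled length is the N50.
        if running >= half_total:
            return length
    return 0  # reached only for empty input


# Hypothetical contig lengths in bp (illustrative values, not from the paper).
example_lengths = [80, 70, 50, 40, 30, 20]
print(n50(example_lengths))  # total is 290, half is 145; 80 + 70 = 150, so N50 = 70
```

A larger N50 generally indicates a more contiguous assembly, which is why the abstract compares the YH genome values against the first published version.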
format Online
Article
Text
id pubmed-3626529
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3626529 2013-04-24 SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler Gigascience Technical Note
BioMed Central 2012-12-27 /pmc/articles/PMC3626529/ /pubmed/23587118 http://dx.doi.org/10.1186/2047-217X-1-18 Text en Copyright © 2012 Luo et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3626529/
https://www.ncbi.nlm.nih.gov/pubmed/23587118
http://dx.doi.org/10.1186/2047-217X-1-18