SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler
BACKGROUND: There is a rapidly increasing amount of de novo genome assembly using next-generation sequencing (NGS) short reads; however, several big challenges remain to be overcome in order for this to be efficient and accurate. SOAPdenovo has been successfully applied to assemble many published ge...
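Main Authors: | Luo, Ruibang; Liu, Binghang; Xie, Yinlong; Li, Zhenyu; Huang, Weihua; Yuan, Jianying; He, Guangzhu; Chen, Yanxiang; Pan, Qi; Liu, Yunjie; Tang, Jingbo; Wu, Gengxiong; Zhang, Hao; Shi, Yujian; Liu, Yong; Yu, Chang; Wang, Bo; Lu, Yao; Han, Changlei; Cheung, David W; Yiu, Siu-Ming; Peng, Shaoliang; Xiaoqian, Zhu; Liu, Guangming; Liao, Xiangke; Li, Yingrui; Yang, Huanming; Wang, Jian; Lam, Tak-Wah; Wang, Jun |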
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3626529/ https://www.ncbi.nlm.nih.gov/pubmed/23587118 http://dx.doi.org/10.1186/2047-217X-1-18 |
_version_ | 1782266196885766144 |
---|---|
author | Luo, Ruibang Liu, Binghang Xie, Yinlong Li, Zhenyu Huang, Weihua Yuan, Jianying He, Guangzhu Chen, Yanxiang Pan, Qi Liu, Yunjie Tang, Jingbo Wu, Gengxiong Zhang, Hao Shi, Yujian Liu, Yong Yu, Chang Wang, Bo Lu, Yao Han, Changlei Cheung, David W Yiu, Siu-Ming Peng, Shaoliang Xiaoqian, Zhu Liu, Guangming Liao, Xiangke Li, Yingrui Yang, Huanming Wang, Jian Lam, Tak-Wah Wang, Jun |
author_facet | Luo, Ruibang Liu, Binghang Xie, Yinlong Li, Zhenyu Huang, Weihua Yuan, Jianying He, Guangzhu Chen, Yanxiang Pan, Qi Liu, Yunjie Tang, Jingbo Wu, Gengxiong Zhang, Hao Shi, Yujian Liu, Yong Yu, Chang Wang, Bo Lu, Yao Han, Changlei Cheung, David W Yiu, Siu-Ming Peng, Shaoliang Xiaoqian, Zhu Liu, Guangming Liao, Xiangke Li, Yingrui Yang, Huanming Wang, Jian Lam, Tak-Wah Wang, Jun |
author_sort | Luo, Ruibang |
collection | PubMed |
description | BACKGROUND: There is a rapidly increasing amount of de novo genome assembly using next-generation sequencing (NGS) short reads; however, several big challenges remain to be overcome in order for this to be efficient and accurate. SOAPdenovo has been successfully applied to assemble many published genomes, but it still needs improvement in continuity, accuracy and coverage, especially in repeat regions. FINDINGS: To overcome these challenges, we have developed its successor, SOAPdenovo2, which has the advantage of a new algorithm design that reduces memory consumption in graph construction, resolves more repeat regions in contig assembly, increases coverage and length in scaffold construction, improves gap closing, and optimizes for large genome. CONCLUSIONS: Benchmark using the Assemblathon1 and GAGE datasets showed that SOAPdenovo2 greatly surpasses its predecessor SOAPdenovo and is competitive to other assemblers on both assembly length and accuracy. We also provide an updated assembly version of the 2008 Asian (YH) genome using SOAPdenovo2. Here, the contig and scaffold N50 of the YH genome were ~20.9 kbp and ~22 Mbp, respectively, which is 3-fold and 50-fold longer than the first published version. The genome coverage increased from 81.16% to 93.91%, and memory consumption was ~2/3 lower during the point of largest memory consumption. |
format | Online Article Text |
id | pubmed-3626529 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-36265292013-04-24 SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler Luo, Ruibang Liu, Binghang Xie, Yinlong Li, Zhenyu Huang, Weihua Yuan, Jianying He, Guangzhu Chen, Yanxiang Pan, Qi Liu, Yunjie Tang, Jingbo Wu, Gengxiong Zhang, Hao Shi, Yujian Liu, Yong Yu, Chang Wang, Bo Lu, Yao Han, Changlei Cheung, David W Yiu, Siu-Ming Peng, Shaoliang Xiaoqian, Zhu Liu, Guangming Liao, Xiangke Li, Yingrui Yang, Huanming Wang, Jian Lam, Tak-Wah Wang, Jun Gigascience Technical Note BACKGROUND: There is a rapidly increasing amount of de novo genome assembly using next-generation sequencing (NGS) short reads; however, several big challenges remain to be overcome in order for this to be efficient and accurate. SOAPdenovo has been successfully applied to assemble many published genomes, but it still needs improvement in continuity, accuracy and coverage, especially in repeat regions. FINDINGS: To overcome these challenges, we have developed its successor, SOAPdenovo2, which has the advantage of a new algorithm design that reduces memory consumption in graph construction, resolves more repeat regions in contig assembly, increases coverage and length in scaffold construction, improves gap closing, and optimizes for large genome. CONCLUSIONS: Benchmark using the Assemblathon1 and GAGE datasets showed that SOAPdenovo2 greatly surpasses its predecessor SOAPdenovo and is competitive to other assemblers on both assembly length and accuracy. We also provide an updated assembly version of the 2008 Asian (YH) genome using SOAPdenovo2. Here, the contig and scaffold N50 of the YH genome were ~20.9 kbp and ~22 Mbp, respectively, which is 3-fold and 50-fold longer than the first published version. The genome coverage increased from 81.16% to 93.91%, and memory consumption was ~2/3 lower during the point of largest memory consumption. 
BioMed Central 2012-12-27 /pmc/articles/PMC3626529/ /pubmed/23587118 http://dx.doi.org/10.1186/2047-217X-1-18 Text en Copyright © 2012 Luo et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Technical Note Luo, Ruibang Liu, Binghang Xie, Yinlong Li, Zhenyu Huang, Weihua Yuan, Jianying He, Guangzhu Chen, Yanxiang Pan, Qi Liu, Yunjie Tang, Jingbo Wu, Gengxiong Zhang, Hao Shi, Yujian Liu, Yong Yu, Chang Wang, Bo Lu, Yao Han, Changlei Cheung, David W Yiu, Siu-Ming Peng, Shaoliang Xiaoqian, Zhu Liu, Guangming Liao, Xiangke Li, Yingrui Yang, Huanming Wang, Jian Lam, Tak-Wah Wang, Jun SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler |
title | SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler |
title_full | SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler |
title_fullStr | SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler |
title_full_unstemmed | SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler |
title_short | SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler |
title_sort | soapdenovo2: an empirically improved memory-efficient short-read de novo assembler |
topic | Technical Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3626529/ https://www.ncbi.nlm.nih.gov/pubmed/23587118 http://dx.doi.org/10.1186/2047-217X-1-18 |