De novo meta-assembly of ultra-deep sequencing data
We introduce a new divide and conquer approach to deal with the problem of de novo genome assembly in the presence of ultra-deep sequencing data (i.e. coverage of 1000x or higher). Our proposed meta-assembler Slicembler partitions the input data into optimal-sized ‘slices’ and uses a standard assemb...
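The divide-and-conquer idea in the summary above can be illustrated with a minimal Python sketch. It is not Slicembler's implementation: the round-robin slicing rule, the function names, and the exact-match voting are assumptions made for illustration only. According to the record's full description, the real tool chooses slice sizes itself, runs a base assembler (e.g. Velvet or SPAdes) on each slice, and uses a generalized suffix tree to find frequent contigs or fractions of contigs. Here the per-slice assemblies are simply passed in as lists of contig strings.

```python
from collections import Counter
from typing import List, Sequence


def partition_into_slices(reads: Sequence[str], num_slices: int) -> List[List[str]]:
    """Deal reads round-robin into `num_slices` slices.

    Hypothetical slicing rule: the source does not say how slice
    membership or the 'optimal' slice size is determined.
    """
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    return [list(reads[i::num_slices]) for i in range(num_slices)]


def majority_contigs(assemblies: Sequence[Sequence[str]]) -> List[str]:
    """Return contigs that occur in more than half of the slice assemblies.

    Simplification: only whole, identical contigs are counted. The tool
    described in the record also detects shared fractions of contigs.
    """
    counts: Counter = Counter()
    for contigs in assemblies:
        # Count each contig at most once per slice assembly.
        counts.update(set(contigs))
    threshold = len(assemblies) / 2
    winners = [contig for contig, n in counts.items() if n > threshold]
    # Longest contigs first, ties broken alphabetically for determinism.
    return sorted(winners, key=lambda c: (-len(c), c))


# Toy usage with made-up data standing in for real reads and contigs.
slices = partition_into_slices(["r0", "r1", "r2", "r3", "r4"], num_slices=2)
consensus = majority_contigs([
    ["ACGT", "TTGA"],
    ["ACGT"],
    ["ACGT", "GGCC", "TTGA"],
    ["CCAA"],
])
```

In this toy run, `ACGT` appears in three of the four slice assemblies and is kept, while `TTGA` appears in exactly half and is rejected because the vote requires a strict majority.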
Main authors: | Mirebrahim, Hamid; Close, Timothy J.; Lonardi, Stefano |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2015 |
Subjects: | Ismb/Eccb 2015 Proceedings Papers Committee July 10 to July 14, 2015, Dublin, Ireland |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765875/ https://www.ncbi.nlm.nih.gov/pubmed/26072514 http://dx.doi.org/10.1093/bioinformatics/btv226 |
_version_ | 1782417587848609792 |
---|---|
author | Mirebrahim, Hamid Close, Timothy J. Lonardi, Stefano |
author_facet | Mirebrahim, Hamid Close, Timothy J. Lonardi, Stefano |
author_sort | Mirebrahim, Hamid |
collection | PubMed |
description | We introduce a new divide and conquer approach to deal with the problem of de novo genome assembly in the presence of ultra-deep sequencing data (i.e. coverage of 1000x or higher). Our proposed meta-assembler Slicembler partitions the input data into optimal-sized ‘slices’ and uses a standard assembly tool (e.g. Velvet, SPAdes, IDBA_UD and Ray) to assemble each slice individually. Slicembler uses majority voting among the individual assemblies to identify long contigs that can be merged to the consensus assembly. To improve its efficiency, Slicembler uses a generalized suffix tree to identify these frequent contigs (or fraction thereof). Extensive experimental results on real ultra-deep sequencing data (8000x coverage) and simulated data show that Slicembler significantly improves the quality of the assembly compared with the performance of the base assembler. In fact, most of the times, Slicembler generates error-free assemblies. We also show that Slicembler is much more resistant against high sequencing error rate than the base assembler. Availability and implementation: Slicembler can be accessed at http://slicembler.cs.ucr.edu/. Contact: hamid.mirebrahim@email.ucr.edu |
format | Online Article Text |
id | pubmed-4765875 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4765875 2016-03-04 De novo meta-assembly of ultra-deep sequencing data Mirebrahim, Hamid Close, Timothy J. Lonardi, Stefano Bioinformatics Ismb/Eccb 2015 Proceedings Papers Committee July 10 to July 14, 2015, Dublin, Ireland We introduce a new divide and conquer approach to deal with the problem of de novo genome assembly in the presence of ultra-deep sequencing data (i.e. coverage of 1000x or higher). Our proposed meta-assembler Slicembler partitions the input data into optimal-sized ‘slices’ and uses a standard assembly tool (e.g. Velvet, SPAdes, IDBA_UD and Ray) to assemble each slice individually. Slicembler uses majority voting among the individual assemblies to identify long contigs that can be merged to the consensus assembly. To improve its efficiency, Slicembler uses a generalized suffix tree to identify these frequent contigs (or fraction thereof). Extensive experimental results on real ultra-deep sequencing data (8000x coverage) and simulated data show that Slicembler significantly improves the quality of the assembly compared with the performance of the base assembler. In fact, most of the times, Slicembler generates error-free assemblies. We also show that Slicembler is much more resistant against high sequencing error rate than the base assembler. Availability and implementation: Slicembler can be accessed at http://slicembler.cs.ucr.edu/. Contact: hamid.mirebrahim@email.ucr.edu Oxford University Press 2015-06-15 2015-06-10 /pmc/articles/PMC4765875/ /pubmed/26072514 http://dx.doi.org/10.1093/bioinformatics/btv226 Text en © The Author 2015. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Ismb/Eccb 2015 Proceedings Papers Committee July 10 to July 14, 2015, Dublin, Ireland Mirebrahim, Hamid Close, Timothy J. Lonardi, Stefano De novo meta-assembly of ultra-deep sequencing data |
title | De novo meta-assembly of ultra-deep sequencing data |
title_sort | de novo meta-assembly of ultra-deep sequencing data |
topic | Ismb/Eccb 2015 Proceedings Papers Committee July 10 to July 14, 2015, Dublin, Ireland |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765875/ https://www.ncbi.nlm.nih.gov/pubmed/26072514 http://dx.doi.org/10.1093/bioinformatics/btv226 |
work_keys_str_mv | AT mirebrahimhamid denovometaassemblyofultradeepsequencingdata AT closetimothyj denovometaassemblyofultradeepsequencingdata AT lonardistefano denovometaassemblyofultradeepsequencingdata |