
TIGER: tiled iterative genome assembler

BACKGROUND: With the cost reduction of the next-generation sequencing (NGS) technologies, genomics has provided us with an unprecedented opportunity to understand fundamental questions in biology and elucidate human diseases. De novo genome assembly is one of the most important steps to reconstruct...


Bibliographic Details
Main authors: Wu, Xiao-Long, Heo, Yun, El Hajj, Izzat, Hwu, Wen-Mei, Chen, Deming, Ma, Jian
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3526431/
https://www.ncbi.nlm.nih.gov/pubmed/23281792
http://dx.doi.org/10.1186/1471-2105-13-S19-S18
author Wu, Xiao-Long
Heo, Yun
El Hajj, Izzat
Hwu, Wen-Mei
Chen, Deming
Ma, Jian
collection PubMed
description BACKGROUND: With the cost reduction of next-generation sequencing (NGS) technologies, genomics has provided us with an unprecedented opportunity to understand fundamental questions in biology and elucidate human diseases. De novo genome assembly is one of the most important steps in reconstructing the sequenced genome. However, most de novo assemblers require an enormous amount of computational resources, which is not accessible to most research groups and medical personnel. RESULTS: We have developed a novel de novo assembly framework, called Tiger, which adapts to available computing resources by iteratively decomposing the assembly problem into sub-problems. Our method is also flexible enough to embed different assemblers for various types of target genomes. Using sequence data from a human chromosome, our results show that Tiger can achieve much better NG50s, better genome coverage, and slightly higher errors, as compared to Velvet and SOAPdenovo, using a modest amount of memory that is available in commodity computers today. CONCLUSIONS: Most state-of-the-art assemblers that can achieve relatively high assembly quality need an excessive amount of computing resources (in particular, memory) that is not available to most researchers. Tiger provides the only known viable path to utilize NGS de novo assemblers that require more memory than is present in available computers. Evaluation results demonstrate the feasibility of getting better quality results with a low memory footprint and the scalability of using distributed commodity computers.
format Online
Article
Text
id pubmed-3526431
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3526431 2013-01-10 TIGER: tiled iterative genome assembler Wu, Xiao-Long Heo, Yun El Hajj, Izzat Hwu, Wen-Mei Chen, Deming Ma, Jian BMC Bioinformatics Proceedings BioMed Central 2012-12-19 /pmc/articles/PMC3526431/ /pubmed/23281792 http://dx.doi.org/10.1186/1471-2105-13-S19-S18 Text en Copyright ©2012 Wu et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title TIGER: tiled iterative genome assembler
topic Proceedings