GRASShopPER—An algorithm for de novo assembly based on GPU alignments
Next-generation sequencers produce billions of short DNA sequences in a massively parallel manner, which poses a major computational challenge for accurately reconstructing a genome sequence de novo from these short sequences. Here, we propose the GRASShopPER assembler, which follows an approach of...
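Main authors: | Swiercz, Aleksandra; Frohmberg, Wojciech; Kierzynka, Michal; Wojciechowski, Pawel; Zurkowski, Piotr; Badura, Jan; Laskowski, Artur; Kasprzak, Marta; Blazewicz, Jacek |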
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6095601/ https://www.ncbi.nlm.nih.gov/pubmed/30114279 http://dx.doi.org/10.1371/journal.pone.0202355 |
_version_ | 1783347968994181120 |
---|---|
author | Swiercz, Aleksandra; Frohmberg, Wojciech; Kierzynka, Michal; Wojciechowski, Pawel; Zurkowski, Piotr; Badura, Jan; Laskowski, Artur; Kasprzak, Marta; Blazewicz, Jacek |
author_sort | Swiercz, Aleksandra |
collection | PubMed |
description | Next-generation sequencers produce billions of short DNA sequences in a massively parallel manner, which poses a major computational challenge for accurately reconstructing a genome sequence de novo from these short sequences. Here, we propose the GRASShopPER assembler, which follows the overlap-layout-consensus approach. It uses an efficient GPU implementation for sequence alignment during the graph construction stage and a greedy hyper-heuristic algorithm at the fork detection stage. A two-part fork detection method allows us to identify repeated fragments of a genome and to reconstruct them without misassemblies. The assemblies of data sets for the bacterium Candidatus Microthrix, the nematode Caenorhabditis elegans, and human chromosome 14 were evaluated with the gold-standard tool QUAST. In comparison with other assemblers, GRASShopPER provided contigs that covered the largest part of the genomes while maintaining good values of other metrics, e.g., NG50 and misassembly rate. |
format | Online Article Text |
id | pubmed-6095601 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-6095601; 2018-08-30; GRASShopPER—An algorithm for de novo assembly based on GPU alignments; Swiercz, Aleksandra; Frohmberg, Wojciech; Kierzynka, Michal; Wojciechowski, Pawel; Zurkowski, Piotr; Badura, Jan; Laskowski, Artur; Kasprzak, Marta; Blazewicz, Jacek; PLoS One; Research Article; Next-generation sequencers produce billions of short DNA sequences in a massively parallel manner, which poses a major computational challenge for accurately reconstructing a genome sequence de novo from these short sequences. Here, we propose the GRASShopPER assembler, which follows the overlap-layout-consensus approach. It uses an efficient GPU implementation for sequence alignment during the graph construction stage and a greedy hyper-heuristic algorithm at the fork detection stage. A two-part fork detection method allows us to identify repeated fragments of a genome and to reconstruct them without misassemblies. The assemblies of data sets for the bacterium Candidatus Microthrix, the nematode Caenorhabditis elegans, and human chromosome 14 were evaluated with the gold-standard tool QUAST. In comparison with other assemblers, GRASShopPER provided contigs that covered the largest part of the genomes while maintaining good values of other metrics, e.g., NG50 and misassembly rate.; Public Library of Science; 2018-08-16; /pmc/articles/PMC6095601/ /pubmed/30114279 http://dx.doi.org/10.1371/journal.pone.0202355; Text; en; © 2018 Swiercz et al.; http://creativecommons.org/licenses/by/4.0/; This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | GRASShopPER—An algorithm for de novo assembly based on GPU alignments |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6095601/ https://www.ncbi.nlm.nih.gov/pubmed/30114279 http://dx.doi.org/10.1371/journal.pone.0202355 |