NOVOPlasty: de novo assembly of organelle genomes from whole genome data
The evolution in next-generation sequencing (NGS) technology has led to the development of many different assembly algorithms, but few of them focus on assembling the organelle genomes. These genomes are used in phylogenetic studies, food identification and are the most deposited eukaryotic genomes...
Main authors: | Dierckxsens, Nicolas; Mardulyn, Patrick; Smits, Guillaume |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2017 |
Subjects: | Methods Online |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5389512/ https://www.ncbi.nlm.nih.gov/pubmed/28204566 http://dx.doi.org/10.1093/nar/gkw955 |
_version_ | 1782521281531346944 |
---|---|
author | Dierckxsens, Nicolas Mardulyn, Patrick Smits, Guillaume |
author_facet | Dierckxsens, Nicolas Mardulyn, Patrick Smits, Guillaume |
author_sort | Dierckxsens, Nicolas |
collection | PubMed |
description | The evolution in next-generation sequencing (NGS) technology has led to the development of many different assembly algorithms, but few of them focus on assembling the organelle genomes. These genomes are used in phylogenetic studies, food identification and are the most deposited eukaryotic genomes in GenBank. Producing organelle genome assembly from whole genome sequencing (WGS) data would be the most accurate and least laborious approach, but a tool specifically designed for this task is lacking. We developed a seed-and-extend algorithm that assembles organelle genomes from whole genome sequencing (WGS) data, starting from a related or distant single seed sequence. The algorithm has been tested on several new (Gonioctena intermedia and Avicennia marina) and public (Arabidopsis thaliana and Oryza sativa) whole genome Illumina data sets where it outperforms known assemblers in assembly accuracy and coverage. In our benchmark, NOVOPlasty assembled all tested circular genomes in less than 30 min with a maximum memory requirement of 16 GB and an accuracy over 99.99%. In conclusion, NOVOPlasty is the sole de novo assembler that provides a fast and straightforward extraction of the extranuclear genomes from WGS data in one circular high quality contig. The software is open source and can be downloaded at https://github.com/ndierckx/NOVOPlasty. |
format | Online Article Text |
id | pubmed-5389512 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-5389512 2017-04-24 NOVOPlasty: de novo assembly of organelle genomes from whole genome data Dierckxsens, Nicolas Mardulyn, Patrick Smits, Guillaume Nucleic Acids Res Methods Online The evolution in next-generation sequencing (NGS) technology has led to the development of many different assembly algorithms, but few of them focus on assembling the organelle genomes. These genomes are used in phylogenetic studies, food identification and are the most deposited eukaryotic genomes in GenBank. Producing organelle genome assembly from whole genome sequencing (WGS) data would be the most accurate and least laborious approach, but a tool specifically designed for this task is lacking. We developed a seed-and-extend algorithm that assembles organelle genomes from whole genome sequencing (WGS) data, starting from a related or distant single seed sequence. The algorithm has been tested on several new (Gonioctena intermedia and Avicennia marina) and public (Arabidopsis thaliana and Oryza sativa) whole genome Illumina data sets where it outperforms known assemblers in assembly accuracy and coverage. In our benchmark, NOVOPlasty assembled all tested circular genomes in less than 30 min with a maximum memory requirement of 16 GB and an accuracy over 99.99%. In conclusion, NOVOPlasty is the sole de novo assembler that provides a fast and straightforward extraction of the extranuclear genomes from WGS data in one circular high quality contig. The software is open source and can be downloaded at https://github.com/ndierckx/NOVOPlasty. Oxford University Press 2017-02-28 2016-10-24 /pmc/articles/PMC5389512/ /pubmed/28204566 http://dx.doi.org/10.1093/nar/gkw955 Text en © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Online Dierckxsens, Nicolas Mardulyn, Patrick Smits, Guillaume NOVOPlasty: de novo assembly of organelle genomes from whole genome data |
title | NOVOPlasty: de novo assembly of organelle genomes from whole genome data |
title_full | NOVOPlasty: de novo assembly of organelle genomes from whole genome data |
title_fullStr | NOVOPlasty: de novo assembly of organelle genomes from whole genome data |
title_full_unstemmed | NOVOPlasty: de novo assembly of organelle genomes from whole genome data |
title_short | NOVOPlasty: de novo assembly of organelle genomes from whole genome data |
title_sort | novoplasty: de novo assembly of organelle genomes from whole genome data |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5389512/ https://www.ncbi.nlm.nih.gov/pubmed/28204566 http://dx.doi.org/10.1093/nar/gkw955 |
work_keys_str_mv | AT dierckxsensnicolas novoplastydenovoassemblyoforganellegenomesfromwholegenomedata AT mardulynpatrick novoplastydenovoassemblyoforganellegenomesfromwholegenomedata AT smitsguillaume novoplastydenovoassemblyoforganellegenomesfromwholegenomedata |
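
Illustrative note: the description field above mentions a seed-and-extend algorithm that starts from a single seed sequence and yields one circular contig. The Python sketch below is a toy illustration of that general idea only; it is not NOVOPlasty's implementation. The function name `extend_from_seed`, the parameters `k` and `max_len`, and the majority-vote rule are all assumptions made for this example, and it ignores sequencing errors, reverse complements, paired-end information and repeats.

```python
from collections import Counter


def extend_from_seed(seed, reads, k=8, max_len=10_000):
    """Toy greedy seed-and-extend (hypothetical helper, not NOVOPlasty code).

    Repeatedly looks up the last k bases of the contig in the reads,
    takes a majority vote on the base that follows, and appends it.
    Returns (contig, circular_flag).
    """
    contig = seed
    while len(contig) < max_len:
        suffix = contig[-k:]
        votes = Counter()
        for read in reads:
            pos = read.find(suffix)
            while pos != -1:
                nxt = pos + k
                if nxt < len(read):
                    votes[read[nxt]] += 1  # base that follows the match
                pos = read.find(suffix, pos + 1)
        if not votes:
            break  # no read extends the current contig end
        base, _ = votes.most_common(1)[0]
        contig += base
        # Assumed circularity test: the contig end has come back to its start.
        if len(contig) > 2 * k and contig.endswith(contig[:k]):
            return contig[:-k], True
    return contig, False


if __name__ == "__main__":
    # Made-up 24-base circular "genome" whose 5-mers are all distinct.
    genome = "ATGCGTACCTGAAGTCCATGGCTA"
    doubled = genome * 2
    reads = [doubled[i:i + 12] for i in range(0, len(genome), 2)]
    contig, circular = extend_from_seed(genome[:8], reads, k=5)
    print(contig == genome, circular)
```

With error-free reads and unique k-mers, as in this made-up input, the extension walks once around the circle and stops when the contig end matches its own start. Real whole genome data would need error handling and repeat resolution that this sketch leaves out.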