Fine De Novo Sequencing of a Fungal Genome Using only SOLiD Short Read Data: Verification on Aspergillus oryzae RIB40

The development of next-generation sequencing (NGS) technologies has dramatically increased the throughput, speed, and efficiency of genome sequencing. The short read data generated from NGS platforms, such as SOLiD and Illumina, are quite useful for mapping analysis. However, the SOLiD read data wi...

Bibliographic Details
Main Authors: Umemura, Myco; Koyama, Yoshinori; Takeda, Itaru; Hagiwara, Hiroko; Ikegami, Tsutomu; Koike, Hideaki; Machida, Masayuki
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3646829/
https://www.ncbi.nlm.nih.gov/pubmed/23667655
http://dx.doi.org/10.1371/journal.pone.0063673
author Umemura, Myco
Koyama, Yoshinori
Takeda, Itaru
Hagiwara, Hiroko
Ikegami, Tsutomu
Koike, Hideaki
Machida, Masayuki
collection PubMed
description The development of next-generation sequencing (NGS) technologies has dramatically increased the throughput, speed, and efficiency of genome sequencing. The short read data generated from NGS platforms, such as SOLiD and Illumina, are quite useful for mapping analysis. However, SOLiD reads of <60 bp have been considered too short for de novo genome sequencing. Here, to investigate whether de novo sequencing of fungal genomes is possible using only SOLiD short read sequence data, we performed de novo assembly of the Aspergillus oryzae RIB40 genome using only 50-bp SOLiD reads generated from mate-paired libraries with 2.8- or 1.9-kb insert sizes. The assembled scaffolds had an N50 of 1.6 Mb, 22-fold larger than values obtained using only SOLiD short reads in other published reports. In addition, almost 99% of the reference genome was accurately aligned with long fragments of the assembled scaffolds. The sequences of secondary metabolite biosynthetic genes and clusters, whose products are of considerable interest in fungal studies due to their potential medicinal, agricultural, and cosmetic properties, were also well reconstructed in the assembled scaffolds. Based on these findings, we concluded that de novo genome sequencing using only SOLiD short reads is feasible and practical for molecular biological study of fungi. We also investigated the effect of low-quality data filtering, library insert size, and k-mer size on assembly performance. For assembly, we recommend mildly filtered read data (for which the N50 was not substantially degraded), a library insert size of ∼2.0 kb, and a k-mer size of 33.
format Online
Article
Text
id pubmed-3646829
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3646829 2013-05-10 PLoS One Research Article
Public Library of Science 2013-05-07 /pmc/articles/PMC3646829/ /pubmed/23667655 http://dx.doi.org/10.1371/journal.pone.0063673 Text en © 2013 Umemura et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Fine De Novo Sequencing of a Fungal Genome Using only SOLiD Short Read Data: Verification on Aspergillus oryzae RIB40
topic Research Article