Cost-Effective Sequencing of Full-Length cDNA Clones Powered by a De Novo-Reference Hybrid Assembly

BACKGROUND: Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were ex...

Bibliographic Details
Main authors: Kuroshu, Reginaldo M., Watanabe, Junichi, Sugano, Sumio, Morishita, Shinichi, Suzuki, Yutaka, Kasahara, Masahiro
Format: Text
Language: English
Published: Public Library of Science 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2866332/
https://www.ncbi.nlm.nih.gov/pubmed/20479877
http://dx.doi.org/10.1371/journal.pone.0010517
author Kuroshu, Reginaldo M.
Watanabe, Junichi
Sugano, Sumio
Morishita, Shinichi
Suzuki, Yutaka
Kasahara, Masahiro
collection PubMed
description BACKGROUND: Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. METHODOLOGY: We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence ∼800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. CONCLUSIONS: The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only ∼US$3 per clone, demonstrating a significant advantage over previous approaches.
format Text
id pubmed-2866332
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2866332 2010-05-17 PLoS One Research Article
Public Library of Science 2010-05-07 /pmc/articles/PMC2866332/ /pubmed/20479877 http://dx.doi.org/10.1371/journal.pone.0010517 Text en Kuroshu et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Cost-Effective Sequencing of Full-Length cDNA Clones Powered by a De Novo-Reference Hybrid Assembly
topic Research Article
work_keys_str_mv AT kuroshureginaldom costeffectivesequencingoffulllengthcdnaclonespoweredbyadenovoreferencehybridassembly
AT watanabejunichi costeffectivesequencingoffulllengthcdnaclonespoweredbyadenovoreferencehybridassembly
AT suganosumio costeffectivesequencingoffulllengthcdnaclonespoweredbyadenovoreferencehybridassembly
AT morishitashinichi costeffectivesequencingoffulllengthcdnaclonespoweredbyadenovoreferencehybridassembly
AT suzukiyutaka costeffectivesequencingoffulllengthcdnaclonespoweredbyadenovoreferencehybridassembly
AT kasaharamasahiro costeffectivesequencingoffulllengthcdnaclonespoweredbyadenovoreferencehybridassembly
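
The figures quoted in the abstract can be checked with a little arithmetic. The sketch below is illustrative only and is not part of MuSICA 2: the constant and function names are hypothetical, and the "∼800 clones" and "∼US$3" values are approximations taken from the abstract. The abstract does not state how many of the remaining clones carry a coding sequence annotation, so the ">95% correct" figure is deliberately not converted into a clone count here.

```python
# Figures quoted in the abstract. Names are hypothetical; the values marked
# "approximate" are given only as "~800" and "~US$3" in the source.
POOLED_CLONES = 200          # Sanger-finished clones used for the comparison
EXCLUDED_PCR_FAILURE = 42    # clones excluded as insufficiently represented
CLONES_PER_LANE = 800        # approximate number of clones per flow cell lane
COST_PER_CLONE_USD = 3       # approximate consumables cost per clone


def evaluated_clones(pooled: int, excluded: int) -> int:
    """Number of clones left for evaluation after exclusions."""
    return pooled - excluded


def excluded_fraction(pooled: int, excluded: int) -> float:
    """Share of the pooled clones that was excluded."""
    return excluded / pooled


def lane_consumables_cost(clones: int, cost_per_clone: float) -> float:
    """Approximate consumables cost for one lane's worth of clones."""
    return clones * cost_per_clone


if __name__ == "__main__":
    remaining = evaluated_clones(POOLED_CLONES, EXCLUDED_PCR_FAILURE)
    share = excluded_fraction(POOLED_CLONES, EXCLUDED_PCR_FAILURE)
    cost = lane_consumables_cost(CLONES_PER_LANE, COST_PER_CLONE_USD)
    print(f"Clones evaluated: {remaining}")
    print(f"Excluded share: {share:.0%}")
    print(f"Approximate consumables per lane: US${cost}")
```

Under these assumptions, 158 of the 200 pooled clones remain in the comparison (21% excluded), and one lane of roughly 800 clones corresponds to roughly US$2,400 in consumables.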