
Short clones or long clones? A simulation study on the use of paired reads in metagenomics

BACKGROUND: Metagenomics is the study of environmental samples using sequencing. Rapid advances in sequencing technology are fueling a vast increase in the number and scope of metagenomics projects. Most metagenome sequencing projects so far have been based on Sanger or Roche-454 sequencing, as only...


Bibliographic Details
Main authors: Mitra, Suparna; Schubach, Max; Huson, Daniel H
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009484/
https://www.ncbi.nlm.nih.gov/pubmed/20122183
http://dx.doi.org/10.1186/1471-2105-11-S1-S12
_version_ 1782194688909901824
author Mitra, Suparna
Schubach, Max
Huson, Daniel H
author_facet Mitra, Suparna
Schubach, Max
Huson, Daniel H
author_sort Mitra, Suparna
collection PubMed
description BACKGROUND: Metagenomics is the study of environmental samples using sequencing. Rapid advances in sequencing technology are fueling a vast increase in the number and scope of metagenomics projects. Most metagenome sequencing projects so far have been based on Sanger or Roche-454 sequencing, as only these technologies provide long enough reads, while Illumina sequencing has not been considered suitable for metagenomic studies due to a short read length of only 35 bp. However, now that reads of length 75 bp can be sequenced in pairs, Illumina sequencing has become a viable option for metagenome studies. RESULTS: This paper addresses the problem of taxonomical analysis of paired reads. We describe a new feature of our metagenome analysis software MEGAN that allows one to process sequencing reads in pairs and makes assignments of such reads based on the combined bit scores of their matches to reference sequences. Using this new software in a simulation study, we investigate the use of Illumina paired-sequencing in taxonomical analysis and compare the performance of single reads, short clones and long clones. In addition, we also compare against simulated Roche-454 sequencing runs. CONCLUSION: This work shows that paired reads perform better than single reads, as expected, but also, perhaps slightly less obviously, that long clones allow more specific assignments than short ones. A new version of the program MEGAN that explicitly takes paired reads into account is available from our website.
format Text
id pubmed-3009484
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3009484 2010-12-23 Short clones or long clones? A simulation study on the use of paired reads in metagenomics Mitra, Suparna Schubach, Max Huson, Daniel H BMC Bioinformatics Research BioMed Central 2010-01-18 /pmc/articles/PMC3009484/ /pubmed/20122183 http://dx.doi.org/10.1186/1471-2105-11-S1-S12 Text en Copyright ©2010 Mitra et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Mitra, Suparna
Schubach, Max
Huson, Daniel H
Short clones or long clones? A simulation study on the use of paired reads in metagenomics
title Short clones or long clones? A simulation study on the use of paired reads in metagenomics
title_full Short clones or long clones? A simulation study on the use of paired reads in metagenomics
title_fullStr Short clones or long clones? A simulation study on the use of paired reads in metagenomics
title_full_unstemmed Short clones or long clones? A simulation study on the use of paired reads in metagenomics
title_short Short clones or long clones? A simulation study on the use of paired reads in metagenomics
title_sort short clones or long clones? a simulation study on the use of paired reads in metagenomics
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009484/
https://www.ncbi.nlm.nih.gov/pubmed/20122183
http://dx.doi.org/10.1186/1471-2105-11-S1-S12
work_keys_str_mv AT mitrasuparna shortclonesorlongclonesasimulationstudyontheuseofpairedreadsinmetagenomics
AT schubachmax shortclonesorlongclonesasimulationstudyontheuseofpairedreadsinmetagenomics
AT husondanielh shortclonesorlongclonesasimulationstudyontheuseofpairedreadsinmetagenomics