
Fast and Sensitive Alignment of Microbial Whole Genome Sequencing Reads to Large Sequence Datasets on a Desktop PC: Application to Metagenomic Datasets and Pathogen Identification

Next generation sequencing (NGS) of metagenomic samples is becoming a standard approach to detect individual species or pathogenic strains of microorganisms. Computer programs used in the NGS community have to balance between speed and sensitivity and as a result, species or strain level identificat...


Bibliographic Details
Main authors: Pongor, Lőrinc S., Vera, Roberto, Ligeti, Balázs
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4117525/
https://www.ncbi.nlm.nih.gov/pubmed/25077800
http://dx.doi.org/10.1371/journal.pone.0103441
_version_ 1782328713396879360
author Pongor, Lőrinc S.
Vera, Roberto
Ligeti, Balázs
collection PubMed
description Next generation sequencing (NGS) of metagenomic samples is becoming a standard approach to detect individual species or pathogenic strains of microorganisms. Computer programs used in the NGS community have to balance speed against sensitivity; as a result, species or strain level identification is often inaccurate and low abundance pathogens can sometimes be missed. We have developed Taxoner, an open source taxon assignment pipeline that includes a fast aligner (e.g. Bowtie2) and a comprehensive DNA sequence database. We tested the program on simulated datasets as well as experimental data from Illumina, IonTorrent, and Roche 454 sequencing platforms. We found that Taxoner performs as well as, and often better than, BLAST, but requires two orders of magnitude less running time, meaning that it can be run on desktop or laptop computers. Taxoner is slower than the approaches that use small marker databases but is more sensitive due to the comprehensive reference database. In addition, it can be easily tuned to specific applications using small tailored databases. When applied to metagenomic datasets, Taxoner can provide a functional summary of the genes mapped and can provide strain level identification. Taxoner is written in C for Linux operating systems. The code and documentation are available for research applications at http://code.google.com/p/taxoner.
format Online
Article
Text
id pubmed-4117525
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4117525 2014-08-04 Fast and Sensitive Alignment of Microbial Whole Genome Sequencing Reads to Large Sequence Datasets on a Desktop PC: Application to Metagenomic Datasets and Pathogen Identification Pongor, Lőrinc S. Vera, Roberto Ligeti, Balázs PLoS One Research Article
Public Library of Science 2014-07-31 /pmc/articles/PMC4117525/ /pubmed/25077800 http://dx.doi.org/10.1371/journal.pone.0103441 Text en © 2014 Pongor et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Fast and Sensitive Alignment of Microbial Whole Genome Sequencing Reads to Large Sequence Datasets on a Desktop PC: Application to Metagenomic Datasets and Pathogen Identification
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4117525/
https://www.ncbi.nlm.nih.gov/pubmed/25077800
http://dx.doi.org/10.1371/journal.pone.0103441