
BLAST-based validation of metagenomic sequence assignments

When performing bioforensic casework, it is important to be able to reliably detect the presence of a particular organism in a metagenomic sample, even if the organism is only present in a trace amount. For this task, it is common to use a sequence classification program that determines the taxonomi...


Bibliographic Details
Main Authors: Bazinet, Adam L., Ondov, Brian D., Sommer, Daniel D., Ratnayake, Shashikala
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2018
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5978398/
https://www.ncbi.nlm.nih.gov/pubmed/29868286
http://dx.doi.org/10.7717/peerj.4892
author Bazinet, Adam L.
Ondov, Brian D.
Sommer, Daniel D.
Ratnayake, Shashikala
collection PubMed
description When performing bioforensic casework, it is important to be able to reliably detect the presence of a particular organism in a metagenomic sample, even if the organism is only present in a trace amount. For this task, it is common to use a sequence classification program that determines the taxonomic affiliation of individual sequence reads by comparing them to reference database sequences. As metagenomic data sets often consist of millions or billions of reads that need to be compared to reference databases containing millions of sequences, such sequence classification programs typically use search heuristics and databases with reduced sequence diversity to speed up the analysis, which can lead to incorrect assignments. Thus, in a bioforensic setting where correct assignments are paramount, assignments of interest made by “first-pass” classifiers should be confirmed using the most precise methods and comprehensive databases available. In this study we present a BLAST-based method for validating the assignments made by less precise sequence classification programs, with optimal parameters for filtering of BLAST results determined via simulation of sequence reads from genomes of interest, and we apply the method to the detection of four pathogenic organisms. The software implementing the method is open source and freely available.
format Online
Article
Text
id pubmed-5978398
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-5978398 2018-06-04 BLAST-based validation of metagenomic sequence assignments Bazinet, Adam L. Ondov, Brian D. Sommer, Daniel D. Ratnayake, Shashikala PeerJ Bioinformatics PeerJ Inc. 2018-05-28 /pmc/articles/PMC5978398/ /pubmed/29868286 http://dx.doi.org/10.7717/peerj.4892 Text en © 2018 Bazinet et al.
http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title BLAST-based validation of metagenomic sequence assignments
topic Bioinformatics