
Using populations of human and microbial genomes for organism detection in metagenomes

Identifying causative disease agents in human patients from shotgun metagenomic sequencing (SMS) presents a powerful tool to apply when other targeted diagnostics fail. Numerous technical challenges remain, however, before SMS can move beyond the role of research tool. Accurately separating the know...


Bibliographic Details
Main Authors: Ames, Sasha K., Gardner, Shea N., Marti, Jose Manuel, Slezak, Tom R., Gokhale, Maya B., Allen, Jonathan E.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2015
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4484388/
https://www.ncbi.nlm.nih.gov/pubmed/25926546
http://dx.doi.org/10.1101/gr.184879.114
_version_ 1782378653483532288
author Ames, Sasha K.
Gardner, Shea N.
Marti, Jose Manuel
Slezak, Tom R.
Gokhale, Maya B.
Allen, Jonathan E.
collection PubMed
description Identifying causative disease agents in human patients from shotgun metagenomic sequencing (SMS) presents a powerful tool to apply when other targeted diagnostics fail. Numerous technical challenges remain, however, before SMS can move beyond the role of research tool. Accurately separating the known and unknown organism content remains difficult, particularly when SMS is applied as a last resort. The true amount of human DNA that remains in a sample after screening against the human reference genome and filtering nonbiological components left from library preparation has previously been underreported. In this study, we create the most comprehensive collection of microbial and reference-free human genetic variation available in a database optimized for efficient metagenomic search by extracting sequences from GenBank and the 1000 Genomes Project. The results reveal new human sequences found in individual Human Microbiome Project (HMP) samples. Individual samples contain up to 95% human sequence, and 4% of the individual HMP samples contain 10% or more human reads. Left unidentified, human reads can complicate and slow down further analysis and lead to inaccurately labeled microbial taxa and ultimately lead to privacy concerns as more human genome data is collected.
format Online
Article
Text
id pubmed-4484388
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-4484388 2015-07-02 Genome Res Resource Cold Spring Harbor Laboratory Press 2015-07 /pmc/articles/PMC4484388/ /pubmed/25926546 http://dx.doi.org/10.1101/gr.184879.114 Text en © 2015 Ames et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by/4.0/ This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
title Using populations of human and microbial genomes for organism detection in metagenomes
topic Resource
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4484388/
https://www.ncbi.nlm.nih.gov/pubmed/25926546
http://dx.doi.org/10.1101/gr.184879.114