Natural history bycatch: a pipeline for identifying metagenomic sequences in RADseq data

BACKGROUND: Reduced representation genomic datasets are increasingly becoming available from a variety of organisms. These datasets do not target specific genes, and so may contain sequences from parasites and other organisms present in the target tissue sample. In this paper, we demonstrate that (1...

Bibliographic Details
Main authors: Holmes, Iris, Davis Rabosky, Alison R.
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2018
Subjects: Biodiversity
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5907781/
https://www.ncbi.nlm.nih.gov/pubmed/29682427
http://dx.doi.org/10.7717/peerj.4662
_version_ 1783315606520463360
author Holmes, Iris
Davis Rabosky, Alison R.
author_facet Holmes, Iris
Davis Rabosky, Alison R.
author_sort Holmes, Iris
collection PubMed
description BACKGROUND: Reduced representation genomic datasets are increasingly becoming available from a variety of organisms. These datasets do not target specific genes, and so may contain sequences from parasites and other organisms present in the target tissue sample. In this paper, we demonstrate that (1) RADseq datasets can be used for exploratory analysis of tissue-specific metagenomes, and (2) tissue collections house complete metagenomic communities, which can be investigated and quantified by a variety of techniques. METHODS: We present an exploratory method for mining metagenomic “bycatch” sequences from a range of host tissue types. We use a combination of the pyRAD assembly pipeline, NCBI’s blastn software, and custom R scripts to isolate metagenomic sequences from RADseq-type datasets. RESULTS: When we focus on sequences that align with existing references in NCBI’s GenBank, we find that between three and five percent of identifiable double-digest restriction site associated DNA (ddRAD) sequences from host tissue samples are from phyla known to contain blood parasites. In addition to tissue samples, we examine ddRAD sequences from metagenomic DNA extracted from snake and lizard hind-gut samples. We find that the sequences recovered from these samples match expected bacterial and eukaryotic gut microbiome phyla. DISCUSSION: Our results suggest that (1) museum tissue banks originally collected for host DNA archiving are also preserving valuable parasite and microbiome communities, (2) publicly available RADseq datasets may include metagenomic sequences that could be explored, and (3) restriction site approaches are a useful exploratory technique to identify microbiome lineages that could be missed by primer-based approaches.
format Online
Article
Text
id pubmed-5907781
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-5907781 2018-04-22 Natural history bycatch: a pipeline for identifying metagenomic sequences in RADseq data Holmes, Iris Davis Rabosky, Alison R. PeerJ Biodiversity PeerJ Inc. 2018-04-16 /pmc/articles/PMC5907781/ /pubmed/29682427 http://dx.doi.org/10.7717/peerj.4662 Text en © 2018 Holmes and Davis Rabosky http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle Biodiversity
Holmes, Iris
Davis Rabosky, Alison R.
Natural history bycatch: a pipeline for identifying metagenomic sequences in RADseq data
title Natural history bycatch: a pipeline for identifying metagenomic sequences in RADseq data
topic Biodiversity
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5907781/
https://www.ncbi.nlm.nih.gov/pubmed/29682427
http://dx.doi.org/10.7717/peerj.4662