Identification and Removal of Potential Contaminants in 16S rRNA Gene Sequence Data Sets from Low-Microbial-Biomass Samples: an Example from Mosquito Tissues

Bibliographic Details
Main authors: Díaz, Sebastián; Escobar, Juan S.; Avila, Frank W.
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2021
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8265668/
https://www.ncbi.nlm.nih.gov/pubmed/34133198
http://dx.doi.org/10.1128/mSphere.00506-21
Collection: PubMed
Description: The bacterial microbiota of the mosquito influences numerous physiological processes of the host. As low-microbial-biomass ecosystems, mosquito tissues are prone to contamination from the laboratory environment and from reagents commonly used to isolate DNA from tissue samples. In this report, we analyzed nine 16S rRNA data sets, including new data obtained by us, to gain insight into the impact of potential contaminating sequences on the composition, diversity, and structure of the mosquito tissue microbial community. Using a clustering-free approach based on the relative abundance of amplicon sequence variants (ASVs) in tissue samples and negative controls, we identified candidate contaminating sequences that sometimes differed from, but were consistent with, results found using established methodologies. Some putative contaminating sequences belong to bacterial taxa previously identified as contaminants that are commonly found in metagenomic studies but that have also been identified as part of the mosquito core microbiota, with putative physiological relevance for the host. Using different relative abundance cutoffs, we show that contaminating sequences have a significant impact on tissue microbiota diversity and structure analysis.

IMPORTANCE The study of tissue-associated microbiota from mosquitoes (primarily from the gut) has grown significantly in the last several years. Mosquito tissue samples represent a challenge for researchers because of their low microbial biomass and because their taxonomic composition is similar to that commonly found in the laboratory environment and in molecular reagents. Using new and published data sets that identified mosquito tissue microbiota from gut and reproductive tract tissues (and their respective negative controls), we developed a simple method to identify contaminating microbiota. This approach uses an initial taxonomic identification without operational taxonomic unit (OTU) clustering and evaluates the relative abundance of control sample sequences, allowing the identification and removal of purported contaminating sequences in data sets obtained from low-microbial-biomass samples. While it was exemplified with the analysis of tissue microbiota from mosquitoes, it can be extended to other data sets dealing with similar technical artifacts.
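The idea of screening ASVs by their relative abundance in negative controls can be illustrated with a minimal Python sketch. This is not the authors' published procedure: the flagging rule (mean relative abundance across controls at or above a cutoff), the function names, the cutoff value, and the toy read counts are all assumptions made for illustration.

```python
from typing import Dict, Set

# Hypothetical data layout: sample name -> {ASV id -> read count}
Counts = Dict[str, Dict[str, int]]


def relative_abundance(sample: Dict[str, int]) -> Dict[str, float]:
    """Convert raw read counts for one sample into fractions of the sample total."""
    total = sum(sample.values())
    if total == 0:
        return {asv: 0.0 for asv in sample}
    return {asv: n / total for asv, n in sample.items()}


def flag_contaminants(controls: Counts, cutoff: float) -> Set[str]:
    """Flag ASVs whose mean relative abundance across negative controls is >= cutoff.

    The rule is an assumed stand-in for a relative-abundance cutoff, not the
    criterion used in the article.
    """
    if not controls:
        return set()
    sums: Dict[str, float] = {}
    for sample in controls.values():
        for asv, frac in relative_abundance(sample).items():
            sums[asv] = sums.get(asv, 0.0) + frac
    n_controls = len(controls)
    return {asv for asv, s in sums.items() if s / n_controls >= cutoff}


def remove_contaminants(tissues: Counts, flagged: Set[str]) -> Counts:
    """Drop flagged ASVs from every tissue sample, keeping raw counts for the rest."""
    return {
        name: {asv: n for asv, n in sample.items() if asv not in flagged}
        for name, sample in tissues.items()
    }


# Toy example with made-up counts
controls = {
    "blank_1": {"ASV_1": 90, "ASV_2": 10},
    "blank_2": {"ASV_1": 70, "ASV_3": 30},
}
tissues = {
    "gut_1": {"ASV_1": 50, "ASV_2": 400, "ASV_4": 550},
}

flagged = flag_contaminants(controls, cutoff=0.10)
cleaned = remove_contaminants(tissues, flagged)
print(sorted(flagged))
print(cleaned)
```

Rerunning the sketch with several cutoff values shows how the set of removed ASVs, and therefore any downstream diversity estimate, depends on that choice.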
Record ID: pubmed-8265668
Institution: National Center for Biotechnology Information
Record format: MEDLINE/PubMed
Journal: mSphere
Publication date: 2021-06-16
License: Copyright © 2021 Díaz et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).