Identification and Removal of Contaminant Sequences From Ribosomal Gene Databases: Lessons From the Census of Deep Life

Earth’s subsurface environment is one of the largest, yet least studied, biomes on Earth, and many questions remain regarding what microorganisms are indigenous to the subsurface. Through the activity of the Census of Deep Life (CoDL) and the Deep Carbon Observatory, an open access 16S ribosomal RNA...

Bibliographic Details
Main authors: Sheik, Cody S., Reese, Brandi Kiel, Twing, Katrina I., Sylvan, Jason B., Grim, Sharon L., Schrenk, Matthew O., Sogin, Mitchell L., Colwell, Frederick S.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5945997/
https://www.ncbi.nlm.nih.gov/pubmed/29780369
http://dx.doi.org/10.3389/fmicb.2018.00840
_version_ 1783322101733654528
author Sheik, Cody S.
Reese, Brandi Kiel
Twing, Katrina I.
Sylvan, Jason B.
Grim, Sharon L.
Schrenk, Matthew O.
Sogin, Mitchell L.
Colwell, Frederick S.
author_sort Sheik, Cody S.
collection PubMed
description Earth’s subsurface environment is one of the largest, yet least studied, biomes on Earth, and many questions remain regarding what microorganisms are indigenous to the subsurface. Through the activity of the Census of Deep Life (CoDL) and the Deep Carbon Observatory, an open-access 16S ribosomal RNA gene sequence database from diverse subsurface environments has been compiled. However, due to low quantities of biomass in the deep subsurface, the potential for incorporation of contaminants from reagents used during sample collection, processing, and/or sequencing is high. Thus, to understand the ecology of subsurface microorganisms (i.e., the distribution, richness, or survival), it is necessary to minimize, identify, and remove contaminant sequences that will skew the relative abundances of all taxa in the sample. In this meta-analysis, we identify putative contaminants associated with the CoDL dataset, recommend best practices for removing contaminants from samples, and propose a series of best practices for subsurface microbiology sampling. The most abundant putative contaminant genera observed, independent of evenness across samples, were Propionibacterium, Aquabacterium, Ralstonia, and Acinetobacter. The top five most frequently observed genera were Pseudomonas, Propionibacterium, Acinetobacter, Ralstonia, and Sphingomonas. The majority of the most frequently observed genera (high evenness) were associated with reagent or potential human contamination. Additionally, in DNA extraction blanks, we observed potential archaeal contaminants, including methanogens, which have not been discussed in previous contamination studies. Such contaminants would directly affect the interpretation of subsurface molecular studies, as methanogenesis is an important subsurface biogeochemical process. Utilizing previously identified contaminant genera, we found that ∼27% of the total dataset were identified as contaminant sequences that likely originate from DNA extraction and DNA cleanup methods. Thus, controls must be taken at every step of the collection and processing procedure when working with low-biomass environments such as, but not limited to, portions of Earth’s deep subsurface. Taken together, we stress that the CoDL dataset is an incredible resource for the broader research community interested in subsurface life, and steps to remove contamination-derived sequences must be taken prior to using this dataset.
format Online
Article
Text
id pubmed-5945997
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-5945997 2018-05-18 Identification and Removal of Contaminant Sequences From Ribosomal Gene Databases: Lessons From the Census of Deep Life Sheik, Cody S. Reese, Brandi Kiel Twing, Katrina I. Sylvan, Jason B. Grim, Sharon L. Schrenk, Matthew O. Sogin, Mitchell L. Colwell, Frederick S. Front Microbiol Microbiology Frontiers Media S.A. 2018-04-30 /pmc/articles/PMC5945997/ /pubmed/29780369 http://dx.doi.org/10.3389/fmicb.2018.00840 Text en Copyright © 2018 Sheik, Reese, Twing, Sylvan, Grim, Schrenk, Sogin and Colwell. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Microbiology
Sheik, Cody S.
Reese, Brandi Kiel
Twing, Katrina I.
Sylvan, Jason B.
Grim, Sharon L.
Schrenk, Matthew O.
Sogin, Mitchell L.
Colwell, Frederick S.
Identification and Removal of Contaminant Sequences From Ribosomal Gene Databases: Lessons From the Census of Deep Life
title Identification and Removal of Contaminant Sequences From Ribosomal Gene Databases: Lessons From the Census of Deep Life
title_sort identification and removal of contaminant sequences from ribosomal gene databases: lessons from the census of deep life
topic Microbiology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5945997/
https://www.ncbi.nlm.nih.gov/pubmed/29780369
http://dx.doi.org/10.3389/fmicb.2018.00840