Large-scale contamination of microbial isolate genomes by Illumina PhiX control

Bibliographic Details
Main authors: Mukherjee, Supratim; Huntemann, Marcel; Ivanova, Natalia; Kyrpides, Nikos C; Pati, Amrita
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects: Commentary
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4511556/
https://www.ncbi.nlm.nih.gov/pubmed/26203331
http://dx.doi.org/10.1186/1944-3277-10-18
author Mukherjee, Supratim
Huntemann, Marcel
Ivanova, Natalia
Kyrpides, Nikos C
Pati, Amrita
collection PubMed
description With the rapid growth and development of sequencing technologies, genomes have become the go-to resource for exploring solutions to some of the world’s biggest challenges, such as the search for alternative energy sources and the exploration of genomic dark matter. However, progress in sequencing has been accompanied by its share of errors, which can occur during template or library preparation, sequencing, imaging, or data analysis. In this study we screened over 18,000 publicly available microbial isolate genome sequences in the Integrated Microbial Genomes database and identified more than 1000 genomes that are contaminated with PhiX, a control frequently used during Illumina sequencing runs. Approximately 10% of these genomes have been published in the literature, and 129 contaminated genomes were sequenced under the Human Microbiome Project. Raw sequence reads are prone to contamination from various sources, and such contaminants are usually eliminated during downstream quality control steps. Detection of PhiX-contaminated genomes indicates a lapse in either the application or the effectiveness of proper quality control measures. The presence of PhiX contamination in several publicly available isolate genomes can result in additional errors when such data are used in comparative genomics analyses. Such contamination of public databases has far-reaching consequences in the form of erroneous data interpretation and analyses, and necessitates better measures to proofread raw sequences before releasing them to the broader scientific community.
format Online
Article
Text
id pubmed-4511556
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4511556
2015-07-23
Large-scale contamination of microbial isolate genomes by Illumina PhiX control
Mukherjee, Supratim
Huntemann, Marcel
Ivanova, Natalia
Kyrpides, Nikos C
Pati, Amrita
Stand Genomic Sci
Commentary
BioMed Central
2015-03-30
/pmc/articles/PMC4511556/
/pubmed/26203331
http://dx.doi.org/10.1186/1944-3277-10-18
Text
en
Copyright © 2015 Mukherjee et al.; licensee BioMed Central.
http://creativecommons.org/licenses/by/4.0
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Large-scale contamination of microbial isolate genomes by Illumina PhiX control
topic Commentary
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4511556/
https://www.ncbi.nlm.nih.gov/pubmed/26203331
http://dx.doi.org/10.1186/1944-3277-10-18