
Exploring Pandora's Box: Potential and Pitfalls of Low Coverage Genome Surveys for Evolutionary Biology

High throughput sequencing technologies are revolutionizing genetic research. With this “rise of the machines”, genomic sequences can be obtained even for unknown genomes within a short time and for reasonable costs. This has enabled evolutionary biologists studying genetically unexplored species to...

Bibliographic Details
Main Authors: Leese, Florian, Brand, Philipp, Rozenberg, Andrey, Mayer, Christoph, Agrawal, Shobhit, Dambach, Johannes, Dietz, Lars, Doemel, Jana S., Goodall-Copstake, William P., Held, Christoph, Jackson, Jennifer A., Lampert, Kathrin P., Linse, Katrin, Macher, Jan N., Nolzen, Jennifer, Raupach, Michael J., Rivera, Nicole T., Schubart, Christoph D., Striewski, Sebastian, Tollrian, Ralph, Sands, Chester J.
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3504011/
https://www.ncbi.nlm.nih.gov/pubmed/23185309
http://dx.doi.org/10.1371/journal.pone.0049202
author Leese, Florian
Brand, Philipp
Rozenberg, Andrey
Mayer, Christoph
Agrawal, Shobhit
Dambach, Johannes
Dietz, Lars
Doemel, Jana S.
Goodall-Copstake, William P.
Held, Christoph
Jackson, Jennifer A.
Lampert, Kathrin P.
Linse, Katrin
Macher, Jan N.
Nolzen, Jennifer
Raupach, Michael J.
Rivera, Nicole T.
Schubart, Christoph D.
Striewski, Sebastian
Tollrian, Ralph
Sands, Chester J.
collection PubMed
description High throughput sequencing technologies are revolutionizing genetic research. With this “rise of the machines”, genomic sequences can be obtained even for unknown genomes within a short time and for reasonable costs. This has enabled evolutionary biologists studying genetically unexplored species to identify molecular markers or genomic regions of interest (e.g. micro- and minisatellites, mitochondrial and nuclear genes) by sequencing only a fraction of the genome. However, when using such datasets from non-model species, it is possible that DNA from non-target contaminant species such as bacteria, viruses, fungi, or other eukaryotic organisms may complicate the interpretation of the results. In this study we analysed 14 genomic pyrosequencing libraries of aquatic non-model taxa from four major evolutionary lineages. We quantified the amount of suitable micro- and minisatellites, mitochondrial genomes, known nuclear genes and transposable elements and searched for contamination from various sources using bioinformatic approaches. Our results show that in all sequence libraries with estimated coverage of about 0.02–25%, many appropriate micro- and minisatellites, mitochondrial gene sequences and nuclear genes from different KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways could be identified and characterized. These can serve as markers for phylogenetic and population genetic analyses. A central finding of our study is that several genomic libraries suffered from different biases owing to non-target DNA or mobile elements. In particular, viruses, bacteria or eukaryote endosymbionts contributed significantly (up to 10%) to some of the libraries analysed. If not identified as such, genetic markers developed from high-throughput sequencing data for non-model organisms may bias evolutionary studies or fail completely in experimental tests. 
In conclusion, our study demonstrates the enormous potential of low-coverage genome survey sequences and suggests bioinformatic analysis workflows. The results also advise a more sophisticated filtering for problematic sequences and non-target genome sequences prior to developing markers.
format Online
Article
Text
id pubmed-3504011
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3504011 2012-11-26 PLoS One Research Article Public Library of Science 2012-11-21 /pmc/articles/PMC3504011/ /pubmed/23185309 http://dx.doi.org/10.1371/journal.pone.0049202 Text en © 2012 Leese et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Exploring Pandora's Box: Potential and Pitfalls of Low Coverage Genome Surveys for Evolutionary Biology
topic Research Article