
Metazoan mitochondrial gene sequence reference datasets for taxonomic assignment of environmental samples

Mitochondrial-encoded genes are increasingly targeted in studies using high-throughput sequencing approaches for characterizing metazoan communities from environmental samples (e.g., plankton, meiofauna, filtered water). Yet, unlike nuclear ribosomal RNA markers, there is to date no high-quality ref...


Bibliographic Details
Main Authors: Machida, Ryuji J., Leray, Matthieu, Ho, Shian-Lei, Knowlton, Nancy
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5349245/
https://www.ncbi.nlm.nih.gov/pubmed/28291235
http://dx.doi.org/10.1038/sdata.2017.27
_version_ 1782514438798049280
author Machida, Ryuji J.
Leray, Matthieu
Ho, Shian-Lei
Knowlton, Nancy
collection PubMed
description Mitochondrial-encoded genes are increasingly targeted in studies using high-throughput sequencing approaches for characterizing metazoan communities from environmental samples (e.g., plankton, meiofauna, filtered water). Yet, unlike nuclear ribosomal RNA markers, there is to date no high-quality reference dataset available for taxonomic assignments. Here, we retrieved all metazoan mitochondrial gene sequences from GenBank, and then quality filtered and formatted the datasets for taxonomic assignments using taxonomic assignment tools. The reference datasets—‘Midori references’—are available for download at www.reference-midori.info. Two versions are provided: (I) Midori-UNIQUE that contains all unique haplotypes associated with each species and (II) Midori-LONGEST that contains a single sequence, the longest, for each species. Overall, the mitochondrial Cytochrome oxidase subunit I gene was the most sequence-rich gene. However, sequences of the mitochondrial large ribosomal subunit RNA and Cytochrome b apoenzyme genes were observed for a large number of species in some phyla. The Midori reference is compatible with some taxonomic assignment software. Therefore, automated high-throughput sequence taxonomic assignments can be particularly effective using these datasets.
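The difference between the two versions named in the description can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the `(species, sequence)` record layout, and the toy sequences are assumptions made for illustration, and the quality filtering and formatting steps mentioned in the description are not modelled.

```python
from collections import defaultdict


def build_unique(records):
    """Keep each distinct sequence once per species (a UNIQUE-style set).

    `records` is an iterable of (species, sequence) pairs; this layout is a
    hypothetical stand-in for parsed GenBank entries.
    """
    seen = defaultdict(set)
    unique = []
    for species, seq in records:
        seq = seq.upper()
        if seq not in seen[species]:
            seen[species].add(seq)
            unique.append((species, seq))
    return unique


def build_longest(records):
    """Keep one sequence per species, the longest (a LONGEST-style set).

    On ties, the first sequence encountered is kept.
    """
    longest = {}
    for species, seq in records:
        seq = seq.upper()
        if species not in longest or len(seq) > len(longest[species]):
            longest[species] = seq
    return longest


# Toy input: the sequences are made up and far shorter than real genes.
records = [
    ("Calanus sinicus", "ATGCATGC"),
    ("Calanus sinicus", "ATGCATGC"),    # exact duplicate haplotype
    ("Calanus sinicus", "ATGCATGCAA"),  # longer, distinct haplotype
    ("Acartia tonsa", "GGCCTTAA"),
]

unique = build_unique(records)
longest = build_longest(records)
```

In this sketch the UNIQUE-style set keeps every distinct haplotype per species, while the LONGEST-style set keeps exactly one entry per species, which makes for a smaller reference at the cost of within-species variation.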
format Online
Article
Text
id pubmed-5349245
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-5349245 2017-03-17 Sci Data Data Descriptor Nature Publishing Group 2017-03-14 /pmc/articles/PMC5349245/ /pubmed/28291235 http://dx.doi.org/10.1038/sdata.2017.27 Text en Copyright © 2017, The Author(s) http://creativecommons.org/licenses/by/4.0 This work is licensed under a Creative Commons Attribution 4.0 International License.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0 Metadata associated with this Data Descriptor is available at http://www.nature.com/sdata/ and is released under the CC0 waiver to maximize reuse.
title Metazoan mitochondrial gene sequence reference datasets for taxonomic assignment of environmental samples
topic Data Descriptor