MetaCOXI: an integrated collection of metazoan mitochondrial cytochrome oxidase subunit-I DNA sequences

Nucleotide sequences reference collections or databases are fundamental components in DNA barcoding and metabarcoding data analyses pipelines. In such analyses, the accurate taxonomic assignment is a crucial aspect, relying directly on the availability of comprehensive and curated reference sequence...

Bibliographic Details
Main authors: Balech, Bachir; Sandionigi, Anna; Marzano, Marinella; Pesole, Graziano; Santamaria, Monica
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9216479/
https://www.ncbi.nlm.nih.gov/pubmed/35134858
http://dx.doi.org/10.1093/database/baab084
author Balech, Bachir
Sandionigi, Anna
Marzano, Marinella
Pesole, Graziano
Santamaria, Monica
collection PubMed
description Nucleotide sequence reference collections or databases are fundamental components of DNA barcoding and metabarcoding data analysis pipelines. In such analyses, accurate taxonomic assignment is crucial and relies directly on the availability of a comprehensive, curated reference sequence collection and its taxonomy information. The currently wide use of mitochondrial cytochrome oxidase subunit-I (COXI) as a standard DNA barcode marker in metazoan biodiversity studies highlights the need to assess what relevant information is available from different data sources and how it can be integrated. To address data integration adequately, several aspects must be considered: DNA sequence curation, taxonomy alignment with a solid reference backbone, and metadata harmonization according to universal standards. Here, we present MetaCOXI, an integrated collection of curated metazoan COXI DNA sequences with their associated harmonized taxonomy and metadata. This collection was built from the two most extensive available data resources, namely the European Nucleotide Archive (ENA) and the Barcode of Life Data System (BOLD). The current release contains more than 5.6 million entries (39.1% unique to BOLD, 3.6% unique to ENA, and 57.2% shared between both), their taxonomic classification based on the NCBI reference taxonomy, and their main available metadata relevant to environmental DNA studies, such as geographical coordinates, sampling country and host species. MetaCOXI is available in standard universal formats (‘fasta’ for sequences and ‘tsv’ for taxonomy and metadata), which can be easily incorporated into standard or specific DNA barcoding and/or metabarcoding data analysis pipelines. Database URL: https://github.com/bachob5/MetaCOXI
format Online
Article
Text
id pubmed-9216479
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9216479 2022-06-23 MetaCOXI: an integrated collection of metazoan mitochondrial cytochrome oxidase subunit-I DNA sequences Balech, Bachir; Sandionigi, Anna; Marzano, Marinella; Pesole, Graziano; Santamaria, Monica Database (Oxford) Original Article Oxford University Press 2022-02-05 /pmc/articles/PMC9216479/ /pubmed/35134858 http://dx.doi.org/10.1093/database/baab084 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title MetaCOXI: an integrated collection of metazoan mitochondrial cytochrome oxidase subunit-I DNA sequences
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9216479/
https://www.ncbi.nlm.nih.gov/pubmed/35134858
http://dx.doi.org/10.1093/database/baab084