
Strategies to Avoid Wrongly Labelled Genomes Using as Example the Detected Wrong Taxonomic Affiliation for Aeromonas Genomes in the GenBank Database

Around 27,000 prokaryote genomes are presently deposited in the Genome database of GenBank at the National Center for Biotechnology Information (NCBI) and this number is exponentially growing. However, it is not known how many of these genomes correspond correctly to their designated taxon. The taxo...


Bibliographic Details
Main authors: Beaz-Hidalgo, Roxana; Hossain, Mohammad J.; Liles, Mark R.; Figueras, Maria-Jose
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4301921/
https://www.ncbi.nlm.nih.gov/pubmed/25607802
http://dx.doi.org/10.1371/journal.pone.0115813
author Beaz-Hidalgo, Roxana
Hossain, Mohammad J.
Liles, Mark R.
Figueras, Maria-Jose
collection PubMed
description Around 27,000 prokaryote genomes are presently deposited in the Genome database of GenBank at the National Center for Biotechnology Information (NCBI), and this number is growing exponentially. However, it is not known how many of these genomes correspond correctly to their designated taxon. The taxonomic affiliation of 44 Aeromonas genomes (only five of which are type strains) deposited at the NCBI was determined by a multilocus phylogenetic analysis (MLPA) and by pairwise average nucleotide identity (ANI). Discordant taxon assignments were found for 14 (35.9%) of the 39 non-type strain genomes on the basis of both the MLPA and ANI results. The data presented in this study also demonstrated that, if the genome of the type strain is not available, a correctly identified genome of the same species can be used as a reference for ANI calculations. Of the three ANI calculation tools compared (ANI calculator, EzGenome and JSpecies), EzGenome and JSpecies provided very similar results. However, the ANI calculator provided higher intra- and inter-species values than the other two tools (differences within the ranges 0.06–0.82% and 0.92–3.38%, respectively). Nevertheless, each of these tools produced the same species classification for the studied Aeromonas genomes. To avoid possible misinterpretations with the ANI calculator, particularly when values are at the borderline of the 95% cutoff, one of the other calculation tools (EzGenome or JSpecies) should be used in combination with it. It is recommended that, once a genome sequence is obtained, its taxonomic affiliation be verified using ANI or an MLPA before it is submitted to the NCBI, and that researchers amend the existing taxonomic errors present in databases.
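The decision rule described in the abstract (compare a pairwise ANI value against the 95% cutoff, and confirm borderline values with a second tool) can be sketched as follows. This is only an illustration, not the authors' procedure: the function names, the choice to treat values at or above the cutoff as "same species", and the 1.0-point borderline margin are all assumptions introduced here. The last helper simply reproduces the 14-of-39 percentage quoted in the abstract.

```python
from dataclasses import dataclass

ANI_SPECIES_CUTOFF = 95.0  # percent; the cutoff named in the abstract
BORDERLINE_MARGIN = 1.0    # percent; hypothetical margin, not from the source


@dataclass
class AniCall:
    same_species: bool       # ANI at or above the cutoff (assumed convention)
    needs_cross_check: bool  # value lies close enough to the cutoff to re-check


def classify_ani(ani_percent: float,
                 cutoff: float = ANI_SPECIES_CUTOFF,
                 margin: float = BORDERLINE_MARGIN) -> AniCall:
    """Classify one pairwise ANI value against a species cutoff."""
    if not 0.0 <= ani_percent <= 100.0:
        raise ValueError("ANI must be a percentage between 0 and 100")
    return AniCall(
        same_species=ani_percent >= cutoff,
        needs_cross_check=abs(ani_percent - cutoff) <= margin,
    )


def percent_discordant(discordant: int, total: int) -> float:
    """Share of discordant genomes, as a percentage rounded to one decimal."""
    return round(100.0 * discordant / total, 1)
```

A value flagged with needs_cross_check would be the case in which the abstract advises recalculating with EzGenome or JSpecies before accepting the species call.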
format Online
Article
Text
id pubmed-4301921
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4301921 2015-01-30 PLoS One Research Article Public Library of Science 2015-01-21 /pmc/articles/PMC4301921/ /pubmed/25607802 http://dx.doi.org/10.1371/journal.pone.0115813 Text en © 2015 Beaz-Hidalgo et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Strategies to Avoid Wrongly Labelled Genomes Using as Example the Detected Wrong Taxonomic Affiliation for Aeromonas Genomes in the GenBank Database
topic Research Article