
Improving the annotation of the Heterorhabditis bacteriophora genome

BACKGROUND: Genome assembly and annotation remain exacting tasks. As the tools available for these tasks improve, it is useful to return to data produced with earlier techniques to assess their credibility and correctness. The entomopathogenic nematode Heterorhabditis bacteriophora is widely used to...


Bibliographic Details
Main Authors: McLean, Florence, Berger, Duncan, Laetsch, Dominik R, Schwartz, Hillel T, Blaxter, Mark
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5906903/
https://www.ncbi.nlm.nih.gov/pubmed/29617768
http://dx.doi.org/10.1093/gigascience/giy034
author McLean, Florence
Berger, Duncan
Laetsch, Dominik R
Schwartz, Hillel T
Blaxter, Mark
collection PubMed
description BACKGROUND: Genome assembly and annotation remain exacting tasks. As the tools available for these tasks improve, it is useful to return to data produced with earlier techniques to assess their credibility and correctness. The entomopathogenic nematode Heterorhabditis bacteriophora is widely used to control insect pests in horticulture. The genome sequence for this species was reported to encode an unusually high proportion of unique proteins and a paucity of secreted proteins compared to other related nematodes. FINDINGS: We revisited the H. bacteriophora genome assembly and gene predictions to determine whether these unusual characteristics were biological or methodological in origin. We mapped an independent resequencing dataset to the genome and used the blobtools pipeline to identify potential contaminants. While present (0.2% of the genome span, 0.4% of predicted proteins), assembly contamination was not significant. CONCLUSIONS: Re-prediction of the gene set using BRAKER1 and published transcriptome data generated a predicted proteome that was very different from the published one. The new gene set had a much reduced complement of unique proteins, better completeness values that were in line with other related species’ genomes, and an increased number of proteins predicted to be secreted. It is thus likely that methodological issues drove the apparent uniqueness of the initial H. bacteriophora genome annotation and that similar contamination and misannotation issues affect other published genome assemblies.
format Online
Article
Text
id pubmed-5906903
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5906903 2018-04-24 Improving the annotation of the Heterorhabditis bacteriophora genome McLean, Florence Berger, Duncan Laetsch, Dominik R Schwartz, Hillel T Blaxter, Mark Gigascience Data Note Oxford University Press 2018-04-02 /pmc/articles/PMC5906903/ /pubmed/29617768 http://dx.doi.org/10.1093/gigascience/giy034 Text en © The Author(s) 2018. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Improving the annotation of the Heterorhabditis bacteriophora genome
topic Data Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5906903/
https://www.ncbi.nlm.nih.gov/pubmed/29617768
http://dx.doi.org/10.1093/gigascience/giy034