Open Issues for Protein Function Assignment in Haloferax volcanii and Other Halophilic Archaea

Background: Annotation ambiguities and annotation errors are a general challenge in genomics. While a reliable protein function assignment can be obtained by experimental characterization, this is expensive and time-consuming, and the number of such Gold Standard Proteins (GSP) with experimental sup...


Bibliographic Details
Main authors: Pfeiffer, Friedhelm; Dyall-Smith, Mike
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8305020/
https://www.ncbi.nlm.nih.gov/pubmed/34202810
http://dx.doi.org/10.3390/genes12070963
_version_ 1783727474422579200
author Pfeiffer, Friedhelm
Dyall-Smith, Mike
collection PubMed
description Background: Annotation ambiguities and annotation errors are a general challenge in genomics. While a reliable protein function assignment can be obtained by experimental characterization, this is expensive and time-consuming, and the number of such Gold Standard Proteins (GSP) with experimental support remains very low compared to proteins annotated by sequence homology, usually through automated pipelines. Even a GSP may give a misleading assignment when used as a reference: the homolog may be close enough to support isofunctionality, but the substrate of the GSP is absent from the species being annotated. In such cases, the enzymes cannot be isofunctional. Here, we examined a variety of such issues in halophilic archaea (class Halobacteria), with a strong focus on the model haloarchaeon Haloferax volcanii. Results: Annotated proteins of Hfx. volcanii were identified for which public databases tend to assign a function that is probably incorrect. In some cases, an alternative, probably correct, function can be predicted or inferred from the available evidence, but this has not been adopted by public databases because experimental validation is lacking. In other cases, a probably invalid specific function is predicted by homology, and while there is evidence that this assigned function is unlikely, the true function remains elusive. We listed 50 of those cases, each with detailed background information, so that a conclusion about the most likely biological function can be drawn. For reasons of brevity and comprehension, only the key aspects are listed in the main text, with detailed information being provided in a corresponding section of the Supplementary Materials. Conclusions: Compiling, describing and summarizing these open annotation issues and functional predictions will benefit the scientific community in the general effort to improve the evaluation of protein function assignments and more thoroughly detail them. 
By highlighting the gaps and likely annotation errors currently in the databases, we hope this study will provide a framework for experimentalists to systematically confirm (or disprove) our function predictions or to uncover yet more unexpected functions.
format Online
Article
Text
id pubmed-8305020
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8305020 2021-07-25 Open Issues for Protein Function Assignment in Haloferax volcanii and Other Halophilic Archaea Pfeiffer, Friedhelm Dyall-Smith, Mike Genes (Basel) Article MDPI 2021-06-24 /pmc/articles/PMC8305020/ /pubmed/34202810 http://dx.doi.org/10.3390/genes12070963 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Open Issues for Protein Function Assignment in Haloferax volcanii and Other Halophilic Archaea
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8305020/
https://www.ncbi.nlm.nih.gov/pubmed/34202810
http://dx.doi.org/10.3390/genes12070963