Profiling the orphan enzymes

The emergence of Next Generation Sequencing generates an incredible amount of sequence and great potential for new enzyme discovery. Despite this huge amount of data and the profusion of bioinformatic methods for function prediction, a large part of known enzyme activities is still lacking an associ...

Bibliographic Details
Main authors: Sorokina, Maria; Stam, Mark; Médigue, Claudine; Lespinet, Olivier; Vallenet, David
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4084501/
https://www.ncbi.nlm.nih.gov/pubmed/24906382
http://dx.doi.org/10.1186/1745-6150-9-10
_version_ 1782324547139141632
author Sorokina, Maria
Stam, Mark
Médigue, Claudine
Lespinet, Olivier
Vallenet, David
author_facet Sorokina, Maria
Stam, Mark
Médigue, Claudine
Lespinet, Olivier
Vallenet, David
author_sort Sorokina, Maria
collection PubMed
description The emergence of Next Generation Sequencing generates an incredible amount of sequence and great potential for new enzyme discovery. Despite this huge amount of data and the profusion of bioinformatic methods for function prediction, a large part of known enzyme activities is still lacking an associated protein sequence. These particular activities are called “orphan enzymes”. The present review proposes an update of previous surveys on orphan enzymes by mining the current content of public databases. While the percentage of orphan enzyme activities has decreased from 38% to 22% in ten years, there are still more than 1,000 orphans among the 5,000 entries of the Enzyme Commission (EC) classification. Taking into account all the reactions present in metabolic databases, this proportion dramatically increases to reach nearly 50% of orphans, and many of them are not associated with a known pathway. We extended our survey to “local orphan enzymes”, that is, activities that have no representative sequence in a given clade but have at least one in organisms belonging to other clades. We observe an important bias in Archaea and find that in general more than 30% of the EC activities have incomplete sequence information in at least one superkingdom. To estimate whether candidate proteins for local orphans could be retrieved by homology search, we applied a simple strategy based on the PRIAM software and noticed that candidates may be proposed for an important fraction of local orphan enzymes. Finally, by studying the relation between protein domains and catalyzed activities, it appears that newly discovered enzymes are mostly associated with already known enzyme domains. Thus, the exploration of the promiscuity and the multifunctional aspect of known enzyme families may solve part of the orphan enzyme issue.
We conclude this review with a presentation of recent initiatives in finding proteins for orphan enzymes and in extending the enzyme world by the discovery of new activities. REVIEWERS: This article was reviewed by Michael Galperin, Daniel Haft and Daniel Kahn.
format Online
Article
Text
id pubmed-4084501
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-40845012014-07-08 Profiling the orphan enzymes Sorokina, Maria Stam, Mark Médigue, Claudine Lespinet, Olivier Vallenet, David Biol Direct Review The emergence of Next Generation Sequencing generates an incredible amount of sequence and great potential for new enzyme discovery. Despite this huge amount of data and the profusion of bioinformatic methods for function prediction, a large part of known enzyme activities is still lacking an associated protein sequence. These particular activities are called “orphan enzymes”. The present review proposes an update of previous surveys on orphan enzymes by mining the current content of public databases. While the percentage of orphan enzyme activities has decreased from 38% to 22% in ten years, there are still more than 1,000 orphans among the 5,000 entries of the Enzyme Commission (EC) classification. Taking into account all the reactions present in metabolic databases, this proportion dramatically increases to reach nearly 50% of orphans and many of them are not associated to a known pathway. We extended our survey to “local orphan enzymes” that are activities which have no representative sequence in a given clade, but have at least one in organisms belonging to other clades. We observe an important bias in Archaea and find that in general more than 30% of the EC activities have incomplete sequence information in at least one superkingdom. To estimate if candidate proteins for local orphans could be retrieved by homology search, we applied a simple strategy based on the PRIAM software and noticed that candidates may be proposed for an important fraction of local orphan enzymes. Finally, by studying relation between protein domains and catalyzed activities, it appears that newly discovered enzymes are mostly associated with already known enzyme domains. Thus, the exploration of the promiscuity and the multifunctional aspect of known enzyme families may solve part of the orphan enzyme issue. 
We conclude this review with a presentation of recent initiatives in finding proteins for orphan enzymes and in extending the enzyme world by the discovery of new activities. REVIEWERS: This article was reviewed by Michael Galperin, Daniel Haft and Daniel Kahn. BioMed Central 2014-06-06 /pmc/articles/PMC4084501/ /pubmed/24906382 http://dx.doi.org/10.1186/1745-6150-9-10 Text en Copyright © 2014 Sorokina et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Review
Sorokina, Maria
Stam, Mark
Médigue, Claudine
Lespinet, Olivier
Vallenet, David
Profiling the orphan enzymes
title Profiling the orphan enzymes
title_full Profiling the orphan enzymes
title_fullStr Profiling the orphan enzymes
title_full_unstemmed Profiling the orphan enzymes
title_short Profiling the orphan enzymes
title_sort profiling the orphan enzymes
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4084501/
https://www.ncbi.nlm.nih.gov/pubmed/24906382
http://dx.doi.org/10.1186/1745-6150-9-10
work_keys_str_mv AT sorokinamaria profilingtheorphanenzymes
AT stammark profilingtheorphanenzymes
AT medigueclaudine profilingtheorphanenzymes
AT lespinetolivier profilingtheorphanenzymes
AT vallenetdavid profilingtheorphanenzymes