mebipred: identifying metal-binding potential in protein sequence

MOTIVATION: Metal-binding proteins have a central role in maintaining life processes. Nearly one-third of known protein structures contain metal ions that are used for a variety of needs, such as catalysis, DNA/RNA binding, protein structure stability, etc. Identifying metal-binding proteins is thus...

Bibliographic Details
Main Authors: Aptekmann, A A, Buongiorno, J, Giovannelli, D, Glamoclija, M, Ferreiro, D U, Bromberg, Y
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Original Papers
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9272798/
https://www.ncbi.nlm.nih.gov/pubmed/35639953
http://dx.doi.org/10.1093/bioinformatics/btac358
_version_ 1784744946568265728
author Aptekmann, A A
Buongiorno, J
Giovannelli, D
Glamoclija, M
Ferreiro, D U
Bromberg, Y
collection PubMed
description MOTIVATION: Metal-binding proteins have a central role in maintaining life processes. Nearly one-third of known protein structures contain metal ions that are used for a variety of needs, such as catalysis, DNA/RNA binding, protein structure stability, etc. Identifying metal-binding proteins is thus crucial for understanding the mechanisms of cellular activity. However, experimental annotation of protein metal-binding potential is severely lacking, while computational techniques are often imprecise and of limited applicability.
RESULTS: We developed a novel machine learning-based method, mebipred, for identifying metal-binding proteins from sequence-derived features. This method is over 80% accurate in recognizing proteins that bind metal ion-containing ligands; the specific identity of 11 ubiquitously present metal ions can also be annotated. mebipred is reference-free, i.e. no sequence alignments are involved, and is thus faster than alignment-based methods; it is also more accurate than other sequence-based prediction methods. Additionally, mebipred can identify protein metal-binding capabilities from short sequence stretches, e.g. translated sequencing reads, and, thus, may be useful for the annotation of metal requirements of metagenomic samples. We performed an analysis of available microbiome data and found that ocean, hot spring sediments and soil microbiomes use a more diverse set of metals than human host-related ones. For human microbiomes, physiological conditions explain the observed metal preferences. Similarly, subtle changes in ocean sample ion concentration affect the abundance of relevant metal-binding proteins. These results highlight mebipred’s utility in analyzing microbiome metal requirements.
AVAILABILITY AND IMPLEMENTATION: mebipred is available as a web server at services.bromberglab.org/mebipred and as a standalone package at https://pypi.org/project/mymetal/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
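The abstract mentions applying the predictor to "translated sequencing reads". As a rough illustration of what that preprocessing step could look like, the sketch below translates a DNA read in all six reading frames and keeps the peptide stretches between stop codons. It is not taken from mebipred or the mymetal package: the function names, the six-frame strategy, the use of the standard genetic code, and the `min_length` cutoff are all assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed here;
# the record does not say which translation table the authors used).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_peptides(read, min_length=5):
    """Hypothetical helper: peptide stretches from all six reading frames.

    Each frame is split at stop codons ('*'), and only fragments with at
    least min_length residues are kept. The cutoff is an arbitrary
    illustrative value, not one reported in the source.
    """
    read = read.upper()
    peptides = []
    for strand in (read, reverse_complement(read)):
        for offset in range(3):
            for fragment in translate(strand[offset:]).split("*"):
                if len(fragment) >= min_length:
                    peptides.append(fragment)
    return peptides


if __name__ == "__main__":
    for peptide in six_frame_peptides("ATGGCCAAATTTGGG"):
        print(peptide)
```

Short peptide fragments produced this way would then be passed to a sequence-based predictor; how mebipred itself consumes such fragments is documented with the tool, not in this record.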
format Online
Article
Text
id pubmed-9272798
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9272798 2022-07-11 mebipred: identifying metal-binding potential in protein sequence Aptekmann, A A Buongiorno, J Giovannelli, D Glamoclija, M Ferreiro, D U Bromberg, Y Bioinformatics Original Papers Oxford University Press 2022-05-27 /pmc/articles/PMC9272798/ /pubmed/35639953 http://dx.doi.org/10.1093/bioinformatics/btac358 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title mebipred: identifying metal-binding potential in protein sequence
topic Original Papers