LEGO-CSM: a tool for functional characterization of proteins

MOTIVATION: With the development of sequencing techniques, the discovery of new proteins significantly exceeds the human capacity and resources for experimentally characterizing protein functions. Localization, EC numbers, and GO terms with the structure-based Cutoff Scanning Matrix (LEGO-CSM) is a...

Bibliographic Details
Main authors: Nguyen, Thanh Binh; de Sá, Alex G C; Rodrigues, Carlos H M; Pires, Douglas E V; Ascher, David B
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10329489/
https://www.ncbi.nlm.nih.gov/pubmed/37382560
http://dx.doi.org/10.1093/bioinformatics/btad402
author Nguyen, Thanh Binh
de Sá, Alex G C
Rodrigues, Carlos H M
Pires, Douglas E V
Ascher, David B
collection PubMed
description MOTIVATION: With the development of sequencing techniques, the discovery of new proteins significantly exceeds the human capacity and resources for experimentally characterizing protein functions. Localization, EC numbers, and GO terms with the structure-based Cutoff Scanning Matrix (LEGO-CSM) is a comprehensive web-based resource that fills this gap by applying well-established, robust graph-based signatures to supervised learning models, using both protein sequence and structure information to accurately model protein function in terms of subcellular localization, Enzyme Commission (EC) numbers, and Gene Ontology (GO) terms. RESULTS: We show that our models perform as well as or better than alternative approaches, achieving an area under the receiver operating characteristic curve of up to 0.93 for subcellular localization, up to 0.93 for EC numbers, and up to 0.81 for GO terms on independent blind tests. AVAILABILITY AND IMPLEMENTATION: LEGO-CSM’s web server is freely available at https://biosig.lab.uq.edu.au/lego_csm. In addition, all datasets used to train and test LEGO-CSM’s models can be downloaded at https://biosig.lab.uq.edu.au/lego_csm/data.
format Online
Article
Text
id pubmed-10329489
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10329489 2023-07-09 Bioinformatics Applications Note Oxford University Press 2023-06-29 /pmc/articles/PMC10329489/ /pubmed/37382560 http://dx.doi.org/10.1093/bioinformatics/btad402 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title LEGO-CSM: a tool for functional characterization of proteins
topic Applications Note