LEGO-CSM: a tool for functional characterization of proteins
MOTIVATION: With the development of sequencing techniques, the discovery of new proteins significantly exceeds the human capacity and resources for experimentally characterizing protein functions. Localization, EC numbers, and GO terms with the structure-based Cutoff Scanning Matrix (LEGO-CSM) is a...
Main authors: | Nguyen, Thanh Binh; de Sá, Alex G C; Rodrigues, Carlos H M; Pires, Douglas E V; Ascher, David B |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | Applications Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10329489/ https://www.ncbi.nlm.nih.gov/pubmed/37382560 http://dx.doi.org/10.1093/bioinformatics/btad402 |
_version_ | 1785070028767363072 |
---|---|
author | Nguyen, Thanh Binh de Sá, Alex G C Rodrigues, Carlos H M Pires, Douglas E V Ascher, David B |
author_sort | Nguyen, Thanh Binh |
collection | PubMed |
description | MOTIVATION: With the development of sequencing techniques, the discovery of new proteins significantly exceeds the human capacity and resources for experimentally characterizing protein functions. Localization, EC numbers, and GO terms with the structure-based Cutoff Scanning Matrix (LEGO-CSM) is a comprehensive web-based resource that fills this gap by leveraging the well-established and robust graph-based signatures to supervised learning models using both protein sequence and structure information to accurately model protein function in terms of Subcellular Localization, Enzyme Commission (EC) numbers, and Gene Ontology (GO) terms. RESULTS: We show our models perform as well as or better than alternative approaches, achieving area under the receiver operating characteristic curve of up to 0.93 for subcellular localization, up to 0.93 for EC, and up to 0.81 for GO terms on independent blind tests. AVAILABILITY AND IMPLEMENTATION: LEGO-CSM’s web server is freely available at https://biosig.lab.uq.edu.au/lego_csm. In addition, all datasets used to train and test LEGO-CSM’s models can be downloaded at https://biosig.lab.uq.edu.au/lego_csm/data. |
format | Online Article Text |
id | pubmed-10329489 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10329489 2023-07-09 LEGO-CSM: a tool for functional characterization of proteins Nguyen, Thanh Binh de Sá, Alex G C Rodrigues, Carlos H M Pires, Douglas E V Ascher, David B Bioinformatics Applications Note MOTIVATION: With the development of sequencing techniques, the discovery of new proteins significantly exceeds the human capacity and resources for experimentally characterizing protein functions. Localization, EC numbers, and GO terms with the structure-based Cutoff Scanning Matrix (LEGO-CSM) is a comprehensive web-based resource that fills this gap by leveraging the well-established and robust graph-based signatures to supervised learning models using both protein sequence and structure information to accurately model protein function in terms of Subcellular Localization, Enzyme Commission (EC) numbers, and Gene Ontology (GO) terms. RESULTS: We show our models perform as well as or better than alternative approaches, achieving area under the receiver operating characteristic curve of up to 0.93 for subcellular localization, up to 0.93 for EC, and up to 0.81 for GO terms on independent blind tests. AVAILABILITY AND IMPLEMENTATION: LEGO-CSM’s web server is freely available at https://biosig.lab.uq.edu.au/lego_csm. In addition, all datasets used to train and test LEGO-CSM’s models can be downloaded at https://biosig.lab.uq.edu.au/lego_csm/data. Oxford University Press 2023-06-29 /pmc/articles/PMC10329489/ /pubmed/37382560 http://dx.doi.org/10.1093/bioinformatics/btad402 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | LEGO-CSM: a tool for functional characterization of proteins |
topic | Applications Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10329489/ https://www.ncbi.nlm.nih.gov/pubmed/37382560 http://dx.doi.org/10.1093/bioinformatics/btad402 |