
Wei2GO: weighted sequence similarity-based protein function prediction

BACKGROUND: Protein function prediction is an important part of bioinformatics and genomics studies. Many predictors are available; however, most are offered as web servers rather than as open-source, locally installable software. Such local versions are necessary to perform l...


Bibliographic Details
Main author: Reijnders, Maarten J.M.F.
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2022
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8855713/
https://www.ncbi.nlm.nih.gov/pubmed/35186498
http://dx.doi.org/10.7717/peerj.12931
_version_ 1784653705679732736
author Reijnders, Maarten J.M.F.
author_facet Reijnders, Maarten J.M.F.
author_sort Reijnders, Maarten J.M.F.
collection PubMed
description BACKGROUND: Protein function prediction is an important part of bioinformatics and genomics studies. Many predictors are available; however, most are offered as web servers rather than as open-source, locally installable software. Local versions are necessary for large-scale genomics studies because web servers impose limitations such as queues, slow prediction speed, and databases that users cannot update. METHODS: This paper describes Wei2GO, an open-source, Python-based protein function prediction tool that relies on weighted sequence similarity. It runs DIAMOND and HMMScan sequence alignment searches against the UniProtKB and Pfam databases respectively, transfers Gene Ontology terms from the reference protein to the query protein, and uses a weighting algorithm to calculate a score for each Gene Ontology annotation. RESULTS: Wei2GO is compared against the Argot2 and Argot2.5 web servers, which use a similar concept, and against DeepGOPlus, which acts as a reference. Wei2GO shows improved performance according to precision-recall curves, F(max) scores, and S(min) scores for the biological process and molecular function ontologies. Compared with Argot2 and Argot2.5, computation time drops from several hours to several minutes. AVAILABILITY: Wei2GO is written in Python 3 and can be found at https://gitlab.com/mreijnders/Wei2GO.
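The record does not give the actual Wei2GO weighting formula, so the following is only a minimal sketch of the general idea described under METHODS: Gene Ontology terms are copied from reference hits to the query and scored by a similarity-derived weight. The function name `score_go_terms`, the input layout, and the choice of `-log10(e-value)` as the weight are all assumptions made for illustration, not the tool's implementation.

```python
import math
from collections import defaultdict


def score_go_terms(hits, annotations):
    """Score GO terms transferred from reference hits to one query protein.

    hits:        list of (reference_id, evalue) pairs, e.g. parsed from
                 tabular alignment output (hypothetical input layout).
    annotations: dict mapping reference_id -> set of GO term ids.
    Returns a dict mapping GO term id -> score between 0 and 1.
    """
    term_weight = defaultdict(float)
    total_weight = 0.0
    for reference_id, evalue in hits:
        # Assumed weight: -log10(e-value). The e-value is floored to avoid
        # log(0), and hits with e-value >= 1 contribute nothing.
        weight = max(0.0, -math.log10(max(evalue, 1e-300)))
        total_weight += weight
        for go_id in annotations.get(reference_id, ()):
            term_weight[go_id] += weight
    if total_weight == 0.0:
        return {}
    # Normalise so a term supported by every weighted hit scores 1.0.
    return {go_id: w / total_weight for go_id, w in term_weight.items()}


if __name__ == "__main__":
    example_hits = [("P1", 1e-10), ("P2", 1e-5)]
    example_annotations = {
        "P1": {"GO:0003677"},
        "P2": {"GO:0003677", "GO:0005634"},
    }
    for go_id, score in sorted(score_go_terms(example_hits, example_annotations).items()):
        print(go_id, round(score, 3))
```

In this sketch a term found on all hits receives the maximum score, while a term supported only by a weaker hit receives a proportionally lower one. A real pipeline would also need to combine evidence from the two search tools, which is not shown here.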
format Online
Article
Text
id pubmed-8855713
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8855713 2022-02-19 Wei2GO: weighted sequence similarity-based protein function prediction Reijnders, Maarten J.M.F. PeerJ Bioinformatics BACKGROUND: Protein function prediction is an important part of bioinformatics and genomics studies. Many predictors are available; however, most are offered as web servers rather than as open-source, locally installable software. Local versions are necessary for large-scale genomics studies because web servers impose limitations such as queues, slow prediction speed, and databases that users cannot update. METHODS: This paper describes Wei2GO, an open-source, Python-based protein function prediction tool that relies on weighted sequence similarity. It runs DIAMOND and HMMScan sequence alignment searches against the UniProtKB and Pfam databases respectively, transfers Gene Ontology terms from the reference protein to the query protein, and uses a weighting algorithm to calculate a score for each Gene Ontology annotation. RESULTS: Wei2GO is compared against the Argot2 and Argot2.5 web servers, which use a similar concept, and against DeepGOPlus, which acts as a reference. Wei2GO shows improved performance according to precision-recall curves, F(max) scores, and S(min) scores for the biological process and molecular function ontologies. Compared with Argot2 and Argot2.5, computation time drops from several hours to several minutes. AVAILABILITY: Wei2GO is written in Python 3 and can be found at https://gitlab.com/mreijnders/Wei2GO. PeerJ Inc. 2022-02-15 /pmc/articles/PMC8855713/ /pubmed/35186498 http://dx.doi.org/10.7717/peerj.12931 Text en © 2022 Reijnders https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
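The F(max) score named in the results is commonly defined as the best protein-centric F-measure over all score thresholds, with precision averaged over proteins that have at least one prediction at the threshold and recall averaged over all benchmark proteins. The sketch below follows that common definition. The function name `fmax`, the input layout, and the 0.01 threshold grid are assumptions, and the code expects GO terms to be already propagated up the ontology; it is not taken from the article's evaluation scripts.

```python
def fmax(predictions, truth, thresholds=None):
    """Protein-centric F(max) over a grid of score thresholds.

    predictions: dict protein -> dict of GO term id -> score in [0, 1].
    truth:       dict protein -> non-empty set of true GO term ids.
    """
    if thresholds is None:
        # Assumed grid: 0.01, 0.02, ..., 1.00
        thresholds = [t / 100 for t in range(1, 101)]
    best = 0.0
    for threshold in thresholds:
        precisions = []
        recall_sum = 0.0
        for protein, true_terms in truth.items():
            predicted = {
                go_id
                for go_id, score in predictions.get(protein, {}).items()
                if score >= threshold
            }
            true_positives = len(predicted & true_terms)
            # Precision counts only proteins with at least one prediction.
            if predicted:
                precisions.append(true_positives / len(predicted))
            # Recall is averaged over every benchmark protein.
            recall_sum += true_positives / len(true_terms)
        if not precisions:
            continue
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(truth)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


if __name__ == "__main__":
    example_truth = {"A": {"g1", "g2"}, "B": {"g3"}}
    example_predictions = {"A": {"g1": 0.9, "g4": 0.4}, "B": {"g3": 0.8}}
    print(round(fmax(example_predictions, example_truth), 4))
```

Taking the maximum over thresholds lets predictors with differently calibrated scores be compared on equal terms, which is why the measure suits comparisons across tools such as those listed in the results.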
spellingShingle Bioinformatics
Reijnders, Maarten J.M.F.
Wei2GO: weighted sequence similarity-based protein function prediction
title Wei2GO: weighted sequence similarity-based protein function prediction
title_full Wei2GO: weighted sequence similarity-based protein function prediction
title_fullStr Wei2GO: weighted sequence similarity-based protein function prediction
title_full_unstemmed Wei2GO: weighted sequence similarity-based protein function prediction
title_short Wei2GO: weighted sequence similarity-based protein function prediction
title_sort wei2go: weighted sequence similarity-based protein function prediction
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8855713/
https://www.ncbi.nlm.nih.gov/pubmed/35186498
http://dx.doi.org/10.7717/peerj.12931
work_keys_str_mv AT reijndersmaartenjmf wei2goweightedsequencesimilaritybasedproteinfunctionprediction