CrowdGO: Machine learning and semantic similarity guided consensus Gene Ontology annotation

Bibliographic Details
Main authors: Reijnders, Maarten J. M. F., Waterhouse, Robert M.
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9132264/
https://www.ncbi.nlm.nih.gov/pubmed/35560159
http://dx.doi.org/10.1371/journal.pcbi.1010075
author Reijnders, Maarten J. M. F.
Waterhouse, Robert M.
collection PubMed
description Characterising gene function for the ever-increasing number and diversity of species with annotated genomes relies almost entirely on computational prediction methods. These software tools are also numerous and diverse, each with different strengths and weaknesses as revealed through community benchmarking efforts. Meta-predictors that assess consensus and conflict among individual algorithms should deliver enhanced functional annotations. To exploit the benefits of meta-approaches, we developed CrowdGO, an open-source consensus-based Gene Ontology (GO) term meta-predictor that employs machine learning models with GO term semantic similarities and information contents. By re-evaluating each gene-term annotation, CrowdGO produces a consensus dataset with high-scoring confident annotations and low-scoring rejected annotations. Applying CrowdGO to results from one deep learning-based, one sequence similarity-based, and two protein domain-based methods delivers consensus annotations with improved precision and recall. Furthermore, using standard evaluation measures, CrowdGO performance matches that of the community’s best-performing individual methods. CrowdGO therefore offers a model-informed approach to leverage the strengths of individual predictors and produce comprehensive and accurate gene functional annotations.
format Online
Article
Text
id pubmed-9132264
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9132264 2022-05-26 PLoS Comput Biol Research Article
Public Library of Science 2022-05-13 /pmc/articles/PMC9132264/ /pubmed/35560159 http://dx.doi.org/10.1371/journal.pcbi.1010075 Text en © 2022 Reijnders, Waterhouse https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title CrowdGO: Machine learning and semantic similarity guided consensus Gene Ontology annotation
topic Research Article