Automatic Gene Function Prediction in the 2020’s
The current rate at which new DNA and protein sequences are being generated is too fast to experimentally discover the functions of those sequences, emphasizing the need for accurate Automatic Function Prediction (AFP) methods. AFP has been an active and growing research field for decades and has ma...
Main authors: | Makrodimitris, Stavros; van Ham, Roeland C. H. J.; Reinders, Marcel J. T. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2020 |
Subjects: | Review |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7692357/ https://www.ncbi.nlm.nih.gov/pubmed/33120976 http://dx.doi.org/10.3390/genes11111264 |
_version_ | 1783614492187295744 |
---|---|
author | Makrodimitris, Stavros van Ham, Roeland C. H. J. Reinders, Marcel J. T. |
author_facet | Makrodimitris, Stavros van Ham, Roeland C. H. J. Reinders, Marcel J. T. |
author_sort | Makrodimitris, Stavros |
collection | PubMed |
description | The current rate at which new DNA and protein sequences are being generated is too fast to experimentally discover the functions of those sequences, emphasizing the need for accurate Automatic Function Prediction (AFP) methods. AFP has been an active and growing research field for decades and has made considerable progress in that time. However, it is certainly not solved. In this paper, we describe challenges that the AFP field still has to overcome in the future to increase its applicability. The challenges we consider are how to: (1) include condition-specific functional annotation, (2) predict functions for non-model species, (3) include new informative data sources, (4) deal with the biases of Gene Ontology (GO) annotations, and (5) maximally exploit the GO to obtain performance gains. We also provide recommendations for addressing those challenges, by adapting (1) the way we represent proteins and genes, (2) the way we represent gene functions, and (3) the algorithms that perform the prediction from gene to function. Together, we show that AFP is still a vibrant research area that can benefit from continuing advances in machine learning with which AFP in the 2020s can again take a large step forward reinforcing the power of computational biology. |
format | Online Article Text |
id | pubmed-7692357 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-76923572020-11-28 Automatic Gene Function Prediction in the 2020’s Makrodimitris, Stavros van Ham, Roeland C. H. J. Reinders, Marcel J. T. Genes (Basel) Review The current rate at which new DNA and protein sequences are being generated is too fast to experimentally discover the functions of those sequences, emphasizing the need for accurate Automatic Function Prediction (AFP) methods. AFP has been an active and growing research field for decades and has made considerable progress in that time. However, it is certainly not solved. In this paper, we describe challenges that the AFP field still has to overcome in the future to increase its applicability. The challenges we consider are how to: (1) include condition-specific functional annotation, (2) predict functions for non-model species, (3) include new informative data sources, (4) deal with the biases of Gene Ontology (GO) annotations, and (5) maximally exploit the GO to obtain performance gains. We also provide recommendations for addressing those challenges, by adapting (1) the way we represent proteins and genes, (2) the way we represent gene functions, and (3) the algorithms that perform the prediction from gene to function. Together, we show that AFP is still a vibrant research area that can benefit from continuing advances in machine learning with which AFP in the 2020s can again take a large step forward reinforcing the power of computational biology. MDPI 2020-10-27 /pmc/articles/PMC7692357/ /pubmed/33120976 http://dx.doi.org/10.3390/genes11111264 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Review Makrodimitris, Stavros van Ham, Roeland C. H. J. Reinders, Marcel J. T. Automatic Gene Function Prediction in the 2020’s |
title | Automatic Gene Function Prediction in the 2020’s |
title_sort | automatic gene function prediction in the 2020’s |
topic | Review |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7692357/ https://www.ncbi.nlm.nih.gov/pubmed/33120976 http://dx.doi.org/10.3390/genes11111264 |