Functional Knowledge Transfer for High-accuracy Prediction of Under-studied Biological Processes

A key challenge in genetics is identifying the functional roles of genes in pathways. Numerous functional genomics techniques (e.g. machine learning) that predict protein function have been developed to address this question. These methods generally build from existing annotations of genes to pathwa...

Bibliographic Details
Main authors: Park, Christopher Y., Wong, Aaron K., Greene, Casey S., Rowland, Jessica, Guan, Yuanfang, Bongo, Lars A., Burdine, Rebecca D., Troyanskaya, Olga G.
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3597527/
https://www.ncbi.nlm.nih.gov/pubmed/23516347
http://dx.doi.org/10.1371/journal.pcbi.1002957
_version_ 1782262641770627072
author Park, Christopher Y.
Wong, Aaron K.
Greene, Casey S.
Rowland, Jessica
Guan, Yuanfang
Bongo, Lars A.
Burdine, Rebecca D.
Troyanskaya, Olga G.
author_sort Park, Christopher Y.
collection PubMed
description A key challenge in genetics is identifying the functional roles of genes in pathways. Numerous functional genomics techniques (e.g. machine learning) that predict protein function have been developed to address this question. These methods generally build from existing annotations of genes to pathways and thus are often unable to identify additional genes participating in processes that are not already well studied. Many of these processes are well studied in some organism, but not necessarily in an investigator's organism of interest. Sequence-based search methods (e.g. BLAST) have been used to transfer such annotation information between organisms. We demonstrate that functional genomics can complement traditional sequence similarity to improve the transfer of gene annotations between organisms. Our method transfers annotations only when functionally appropriate as determined by genomic data and can be used with any prediction algorithm to combine transferred gene function knowledge with organism-specific high-throughput data to enable accurate function prediction. We show that diverse state-of-the-art machine learning algorithms leveraging functional knowledge transfer (FKT) dramatically improve their accuracy in predicting gene-pathway membership, particularly for processes with little experimental knowledge in an organism. We also show that our method compares favorably to annotation transfer by sequence similarity. Next, we deploy FKT with a state-of-the-art SVM classifier to predict novel genes for 11,000 biological processes across six diverse organisms and expand the coverage of accurate function predictions to processes that are often ignored because of a dearth of annotated genes in an organism. Finally, we perform an in vivo experimental investigation in Danio rerio and confirm the regulatory role of our top predicted novel gene, wnt5b, in leftward cell migration during heart development. FKT is immediately applicable to many bioinformatics techniques and will help biologists systematically integrate prior knowledge from diverse systems to direct targeted experiments in their organism of study.
format Online
Article
Text
id pubmed-3597527
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3597527 2013-03-20 Functional Knowledge Transfer for High-accuracy Prediction of Under-studied Biological Processes Park, Christopher Y. Wong, Aaron K. Greene, Casey S. Rowland, Jessica Guan, Yuanfang Bongo, Lars A. Burdine, Rebecca D. Troyanskaya, Olga G. PLoS Comput Biol Research Article Public Library of Science 2013-03-14 /pmc/articles/PMC3597527/ /pubmed/23516347 http://dx.doi.org/10.1371/journal.pcbi.1002957 Text en © 2013 Park et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Functional Knowledge Transfer for High-accuracy Prediction of Under-studied Biological Processes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3597527/
https://www.ncbi.nlm.nih.gov/pubmed/23516347
http://dx.doi.org/10.1371/journal.pcbi.1002957