
Fast integration of heterogeneous data sources for predicting gene function with limited annotation

Motivation: Many algorithms that integrate multiple functional association networks for predicting gene function construct a composite network as a weighted sum of the individual networks and then use the composite network to predict gene function. The weight assigned to an individual network repres...
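As a rough illustration of the composite network mentioned in the abstract, the sketch below forms a weighted sum of two small adjacency matrices. It is not the authors' code: the function name `composite_network`, the toy matrices, and the hand-picked weights are all assumptions made for this example, and the sketch does not attempt to show how the weights are fitted.

```python
from typing import List

Matrix = List[List[float]]


def composite_network(networks: List[Matrix], weights: List[float]) -> Matrix:
    """Return the weighted sum of equally sized adjacency matrices.

    Hypothetical helper for illustration only; it assumes every network
    is a square matrix over the same ordered set of genes.
    """
    if not networks:
        raise ValueError("need at least one network")
    if len(networks) != len(weights):
        raise ValueError("need exactly one weight per network")
    n = len(networks[0])
    composite = [[0.0] * n for _ in range(n)]
    for net, w in zip(networks, weights):
        for i in range(n):
            for j in range(n):
                composite[i][j] += w * net[i][j]
    return composite


# Two toy 3-gene association networks (made-up values, not from the paper).
coexpression = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.5],
    [0.0, 0.5, 0.0],
]
interaction = [
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]

# Weights chosen by hand for the example; in the approach the abstract
# describes, they would instead be learned from annotation data.
weights = [0.5, 0.25]
combined = composite_network([coexpression, interaction], weights)
```

Each entry of `combined` is the per-edge weighted sum, so a network with a larger weight contributes more to the association score between any pair of genes.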

Bibliographic Details
Main authors: Mostafavi, Sara; Morris, Quaid
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2894508/
https://www.ncbi.nlm.nih.gov/pubmed/20507895
http://dx.doi.org/10.1093/bioinformatics/btq262
_version_ 1782183196218097664
author Mostafavi, Sara
Morris, Quaid
author_facet Mostafavi, Sara
Morris, Quaid
author_sort Mostafavi, Sara
collection PubMed
description Motivation: Many algorithms that integrate multiple functional association networks for predicting gene function construct a composite network as a weighted sum of the individual networks and then use the composite network to predict gene function. The weight assigned to an individual network represents the usefulness of that network in predicting a given gene function. However, because many categories of gene function have a small number of annotations, the process of assigning these network weights is prone to overfitting. Results: Here, we address this problem by proposing a novel approach to combining multiple functional association networks. In particular, we present a method where network weights are simultaneously optimized on sets of related function categories. The method is simpler and faster than existing approaches. Further, we show that it produces composite networks with improved function prediction accuracy using five example species (yeast, mouse, fly, Escherichia coli and human). Availability: Networks and code are available from: http://morrislab.med.utoronto.ca/~sara/SW Contact: smostafavi@cs.toronto.edu; quaid.morris@utoronto.ca Supplementary information: Supplementary data are available at Bioinformatics online.
format Text
id pubmed-2894508
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-28945082010-07-01 Fast integration of heterogeneous data sources for predicting gene function with limited annotation Mostafavi, Sara Morris, Quaid Bioinformatics Original Papers Motivation: Many algorithms that integrate multiple functional association networks for predicting gene function construct a composite network as a weighted sum of the individual networks and then use the composite network to predict gene function. The weight assigned to an individual network represents the usefulness of that network in predicting a given gene function. However, because many categories of gene function have a small number of annotations, the process of assigning these network weights is prone to overfitting. Results: Here, we address this problem by proposing a novel approach to combining multiple functional association networks. In particular, we present a method where network weights are simultaneously optimized on sets of related function categories. The method is simpler and faster than existing approaches. Further, we show that it produces composite networks with improved function prediction accuracy using five example species (yeast, mouse, fly, Escherichia coli and human). Availability: Networks and code are available from: http://morrislab.med.utoronto.ca/~sara/SW Contact: smostafavi@cs.toronto.edu; quaid.morris@utoronto.ca Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2010-07-15 2010-05-27 /pmc/articles/PMC2894508/ /pubmed/20507895 http://dx.doi.org/10.1093/bioinformatics/btq262 Text en © The Author(s) 2010. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Papers
Mostafavi, Sara
Morris, Quaid
Fast integration of heterogeneous data sources for predicting gene function with limited annotation
title Fast integration of heterogeneous data sources for predicting gene function with limited annotation
title_full Fast integration of heterogeneous data sources for predicting gene function with limited annotation
title_fullStr Fast integration of heterogeneous data sources for predicting gene function with limited annotation
title_full_unstemmed Fast integration of heterogeneous data sources for predicting gene function with limited annotation
title_short Fast integration of heterogeneous data sources for predicting gene function with limited annotation
title_sort fast integration of heterogeneous data sources for predicting gene function with limited annotation
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2894508/
https://www.ncbi.nlm.nih.gov/pubmed/20507895
http://dx.doi.org/10.1093/bioinformatics/btq262
work_keys_str_mv AT mostafavisara fastintegrationofheterogeneousdatasourcesforpredictinggenefunctionwithlimitedannotation
AT morrisquaid fastintegrationofheterogeneousdatasourcesforpredictinggenefunctionwithlimitedannotation