On protocols and measures for the validation of supervised methods for the inference of biological networks

Networks provide a natural representation of molecular biology knowledge, in particular to model relationships between biological entities such as genes, proteins, drugs, or diseases. Because of the effort, the cost, or the lack of the experiments necessary for the elucidation of these networks, com...

Bibliographic Details
Main authors: Schrynemackers, Marie, Küffner, Robert, Geurts, Pierre
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2013
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3848415/
https://www.ncbi.nlm.nih.gov/pubmed/24348517
http://dx.doi.org/10.3389/fgene.2013.00262
author Schrynemackers, Marie
Küffner, Robert
Geurts, Pierre
collection PubMed
description Networks provide a natural representation of molecular biology knowledge, in particular to model relationships between biological entities such as genes, proteins, drugs, or diseases. Because of the effort, the cost, or the lack of the experiments necessary for the elucidation of these networks, computational approaches for network inference have been frequently investigated in the literature. In this paper, we examine the assessment of supervised network inference. Supervised inference is based on machine learning techniques that infer the network from a training sample of known interacting and possibly non-interacting entities and additional measurement data. While these methods are very effective, their reliable validation in silico poses a challenge, since both prediction and validation need to be performed on the basis of the same partially known network. Cross-validation techniques need to be specifically adapted to classification problems on pairs of objects. We perform a critical review and assessment of protocols and measures proposed in the literature and derive specific guidelines how to best exploit and evaluate machine learning techniques for network inference. Through theoretical considerations and in silico experiments, we analyze in depth how important factors influence the outcome of performance estimation. These factors include the amount of information available for the interacting entities, the sparsity and topology of biological networks, and the lack of experimentally verified non-interacting pairs.
format Online
Article
Text
id pubmed-3848415
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-3848415 2013-12-17 On protocols and measures for the validation of supervised methods for the inference of biological networks Schrynemackers, Marie Küffner, Robert Geurts, Pierre Front Genet Genetics Frontiers Media S.A. 2013-12-03 /pmc/articles/PMC3848415/ /pubmed/24348517 http://dx.doi.org/10.3389/fgene.2013.00262 Text en Copyright © 2013 Schrynemackers, Küffner and Geurts.
http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title On protocols and measures for the validation of supervised methods for the inference of biological networks
topic Genetics