On protocols and measures for the validation of supervised methods for the inference of biological networks
Networks provide a natural representation of molecular biology knowledge, in particular to model relationships between biological entities such as genes, proteins, drugs, or diseases. Because of the effort, the cost, or the lack of the experiments necessary for the elucidation of these networks, com...
Main Authors: | Schrynemackers, Marie, Küffner, Robert, Geurts, Pierre |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2013 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3848415/ https://www.ncbi.nlm.nih.gov/pubmed/24348517 http://dx.doi.org/10.3389/fgene.2013.00262 |
_version_ | 1782293754568245248 |
---|---|
author | Schrynemackers, Marie Küffner, Robert Geurts, Pierre |
author_facet | Schrynemackers, Marie Küffner, Robert Geurts, Pierre |
author_sort | Schrynemackers, Marie |
collection | PubMed |
description | Networks provide a natural representation of molecular biology knowledge, in particular to model relationships between biological entities such as genes, proteins, drugs, or diseases. Because of the effort, the cost, or the lack of the experiments necessary for the elucidation of these networks, computational approaches for network inference have been frequently investigated in the literature. In this paper, we examine the assessment of supervised network inference. Supervised inference is based on machine learning techniques that infer the network from a training sample of known interacting and possibly non-interacting entities and additional measurement data. While these methods are very effective, their reliable validation in silico poses a challenge, since both prediction and validation need to be performed on the basis of the same partially known network. Cross-validation techniques need to be specifically adapted to classification problems on pairs of objects. We perform a critical review and assessment of protocols and measures proposed in the literature and derive specific guidelines how to best exploit and evaluate machine learning techniques for network inference. Through theoretical considerations and in silico experiments, we analyze in depth how important factors influence the outcome of performance estimation. These factors include the amount of information available for the interacting entities, the sparsity and topology of biological networks, and the lack of experimentally verified non-interacting pairs. |
format | Online Article Text |
id | pubmed-3848415 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-3848415 2013-12-17 On protocols and measures for the validation of supervised methods for the inference of biological networks Schrynemackers, Marie Küffner, Robert Geurts, Pierre Front Genet Genetics Networks provide a natural representation of molecular biology knowledge, in particular to model relationships between biological entities such as genes, proteins, drugs, or diseases. Because of the effort, the cost, or the lack of the experiments necessary for the elucidation of these networks, computational approaches for network inference have been frequently investigated in the literature. In this paper, we examine the assessment of supervised network inference. Supervised inference is based on machine learning techniques that infer the network from a training sample of known interacting and possibly non-interacting entities and additional measurement data. While these methods are very effective, their reliable validation in silico poses a challenge, since both prediction and validation need to be performed on the basis of the same partially known network. Cross-validation techniques need to be specifically adapted to classification problems on pairs of objects. We perform a critical review and assessment of protocols and measures proposed in the literature and derive specific guidelines how to best exploit and evaluate machine learning techniques for network inference. Through theoretical considerations and in silico experiments, we analyze in depth how important factors influence the outcome of performance estimation. These factors include the amount of information available for the interacting entities, the sparsity and topology of biological networks, and the lack of experimentally verified non-interacting pairs. Frontiers Media S.A. 2013-12-03 /pmc/articles/PMC3848415/ /pubmed/24348517 http://dx.doi.org/10.3389/fgene.2013.00262 Text en Copyright © 2013 Schrynemackers, Küffner and Geurts. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Schrynemackers, Marie Küffner, Robert Geurts, Pierre On protocols and measures for the validation of supervised methods for the inference of biological networks |
title | On protocols and measures for the validation of supervised methods for the inference of biological networks |
title_full | On protocols and measures for the validation of supervised methods for the inference of biological networks |
title_fullStr | On protocols and measures for the validation of supervised methods for the inference of biological networks |
title_full_unstemmed | On protocols and measures for the validation of supervised methods for the inference of biological networks |
title_short | On protocols and measures for the validation of supervised methods for the inference of biological networks |
title_sort | on protocols and measures for the validation of supervised methods for the inference of biological networks |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3848415/ https://www.ncbi.nlm.nih.gov/pubmed/24348517 http://dx.doi.org/10.3389/fgene.2013.00262 |
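
The description above states that cross-validation has to be adapted when the objects being classified are pairs. The following is a minimal sketch of one reason why, not the article's protocol: the helper names `pair_folds` and `node_folds` and the toy node labels are assumptions made for illustration. Splitting pairs at random ignores which nodes a test pair shares with the training pairs, whereas splitting nodes makes it possible to group test pairs by how many of their endpoints were held out.

```python
import itertools
import random


def pair_folds(nodes, k, seed=0):
    """Pair-level split: shuffle all unordered node pairs and deal them into k folds."""
    pairs = list(itertools.combinations(nodes, 2))
    random.Random(seed).shuffle(pairs)
    return [pairs[i::k] for i in range(k)]


def node_folds(nodes, k, seed=0):
    """Node-level split: hold out one group of nodes per fold.

    For each fold, pairs are grouped by the number of held-out endpoints:
    0 -> both nodes remain available for training,
    1 -> exactly one node is held out,
    2 -> both nodes are held out.
    """
    shuffled = list(nodes)
    random.Random(seed).shuffle(shuffled)
    held_out_groups = [set(shuffled[i::k]) for i in range(k)]

    result = []
    for held_out in held_out_groups:
        groups = {0: [], 1: [], 2: []}
        for a, b in itertools.combinations(nodes, 2):
            # Booleans add up as integers, giving 0, 1, or 2 held-out endpoints.
            groups[(a in held_out) + (b in held_out)].append((a, b))
        result.append(groups)
    return result
```

With six hypothetical nodes and three folds, every node-level fold holds out two nodes, so it contains 6 pairs with no held-out endpoint, 8 pairs with one, and 1 pair with two. Reporting performance separately for the last two groups is one way to keep these different prediction settings from being mixed into a single score.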