ML2Motif—Reliable extraction of discriminative sequence motifs from learning machines

High prediction accuracies are not the only objective to consider when solving problems using machine learning. Instead, particular scientific applications require some explanation of the learned prediction function. For computational biology, positional oligomer importance matrices (POIMs) have bee...

Bibliographic Details
Main authors:	Vidovic, Marina M. -C., Kloft, Marius, Müller, Klaus-Robert, Görnitz, Nico
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2017
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5367830/ https://www.ncbi.nlm.nih.gov/pubmed/28346487 http://dx.doi.org/10.1371/journal.pone.0174392

_version_	1782517842077286400
author	Vidovic, Marina M. -C. Kloft, Marius Müller, Klaus-Robert Görnitz, Nico
author_facet	Vidovic, Marina M. -C. Kloft, Marius Müller, Klaus-Robert Görnitz, Nico
author_sort	Vidovic, Marina M. -C.
collection	PubMed
description	High prediction accuracies are not the only objective to consider when solving problems using machine learning. Instead, particular scientific applications require some explanation of the learned prediction function. For computational biology, positional oligomer importance matrices (POIMs) have been successfully applied to explain the decision of support vector machines (SVMs) using weighted-degree (WD) kernels. To extract relevant biological motifs from POIMs, the motifPOIM method has been devised and showed promising results on real-world data. Our contribution in this paper is twofold: as an extension to POIMs, we propose gPOIM, a general measure of feature importance for arbitrary learning machines and feature sets (including, but not limited to, SVMs and CNNs) and devise a sampling strategy for efficient computation. As a second contribution, we derive a convex formulation of motifPOIMs that leads to more reliable motif extraction from gPOIMs. Empirical evaluations confirm the usefulness of our approach on artificially generated data as well as on real-world datasets.
format	Online Article Text
id	pubmed-5367830
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-5367830 2017-04-06 ML2Motif—Reliable extraction of discriminative sequence motifs from learning machines Vidovic, Marina M. -C. Kloft, Marius Müller, Klaus-Robert Görnitz, Nico PLoS One Research Article High prediction accuracies are not the only objective to consider when solving problems using machine learning. Instead, particular scientific applications require some explanation of the learned prediction function. For computational biology, positional oligomer importance matrices (POIMs) have been successfully applied to explain the decision of support vector machines (SVMs) using weighted-degree (WD) kernels. To extract relevant biological motifs from POIMs, the motifPOIM method has been devised and showed promising results on real-world data. Our contribution in this paper is twofold: as an extension to POIMs, we propose gPOIM, a general measure of feature importance for arbitrary learning machines and feature sets (including, but not limited to, SVMs and CNNs) and devise a sampling strategy for efficient computation. As a second contribution, we derive a convex formulation of motifPOIMs that leads to more reliable motif extraction from gPOIMs. Empirical evaluations confirm the usefulness of our approach on artificially generated data as well as on real-world datasets. Public Library of Science 2017-03-27 /pmc/articles/PMC5367830/ /pubmed/28346487 http://dx.doi.org/10.1371/journal.pone.0174392 Text en © 2017 Vidovic et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle	Research Article Vidovic, Marina M. -C. Kloft, Marius Müller, Klaus-Robert Görnitz, Nico ML2Motif—Reliable extraction of discriminative sequence motifs from learning machines
title	ML2Motif—Reliable extraction of discriminative sequence motifs from learning machines
title_full	ML2Motif—Reliable extraction of discriminative sequence motifs from learning machines
title_fullStr	ML2Motif—Reliable extraction of discriminative sequence motifs from learning machines
title_full_unstemmed	ML2Motif—Reliable extraction of discriminative sequence motifs from learning machines
title_short	ML2Motif—Reliable extraction of discriminative sequence motifs from learning machines
title_sort	ml2motif—reliable extraction of discriminative sequence motifs from learning machines
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5367830/ https://www.ncbi.nlm.nih.gov/pubmed/28346487 http://dx.doi.org/10.1371/journal.pone.0174392
work_keys_str_mv	AT vidovicmarinamc ml2motifreliableextractionofdiscriminativesequencemotifsfromlearningmachines AT kloftmarius ml2motifreliableextractionofdiscriminativesequencemotifsfromlearningmachines AT mullerklausrobert ml2motifreliableextractionofdiscriminativesequencemotifsfromlearningmachines AT gornitznico ml2motifreliableextractionofdiscriminativesequencemotifsfromlearningmachines
