Finding semantic patterns in omics data using concept rule learning with an ontology-based refinement operator
BACKGROUND: Identification of non-trivial and meaningful patterns in omics data is one of the most important biological tasks. The patterns help to better understand biological systems and interpret experimental outcomes. A well-established method serving to explain such biological data is Gene Set...
Main Authors: | Malinka, František; Železný, Filip; Kléma, Jiří |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7466824/ https://www.ncbi.nlm.nih.gov/pubmed/32905086 http://dx.doi.org/10.1186/s13040-020-00219-6 |
_version_ | 1783577898799595520 |
---|---|
author | Malinka, František; Železný, Filip; Kléma, Jiří |
author_facet | Malinka, František; Železný, Filip; Kléma, Jiří |
author_sort | Malinka, František |
collection | PubMed |
description | BACKGROUND: Identification of non-trivial and meaningful patterns in omics data is one of the most important biological tasks. The patterns help to better understand biological systems and interpret experimental outcomes. A well-established method for explaining such biological data is Gene Set Enrichment Analysis. However, this type of analysis is restricted to a specific type of evaluation. Abstracting from details, the analyst provides a sorted list of genes and ontological annotations of the individual genes; the method outputs a subset of ontological terms enriched in the gene list. Here, in contrast to enrichment analysis, we introduce a new tool/framework that allows for the induction of more complex patterns in 2-dimensional binary omics data. This extension makes it possible to discover and describe semantically coherent biclusters. RESULTS: We present a new rapid method called sem1R that reveals interpretable hidden rules in omics data. These rules capture semantic differences between two classes: a target class as a collection of positive examples and a non-target class containing negative examples. The method is inspired by the CN2 rule learner and introduces a new refinement operator that exploits prior knowledge in the form of ontologies. In our work this knowledge serves to create accurate and interpretable rules. The novel refinement operator uses two reduction procedures, Redundant Generalization and Redundant Non-potential, both of which help to dramatically prune the rule space and consequently speed up the entire process of rule induction in comparison with the traditional refinement operator as presented in CN2. CONCLUSIONS: The efficiency and effectiveness of the novel refinement operator were tested on three different real gene expression datasets: the Dresden Ovary Dataset, DISC, and m2816. The experiments show that the ontology-based refinement operator speeds up the pattern induction drastically. The algorithm is written in C++ and is published as an R package available at http://github.com/fmalinka/sem1r. |
format | Online Article Text |
id | pubmed-7466824 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7466824 2020-09-03 Finding semantic patterns in omics data using concept rule learning with an ontology-based refinement operator Malinka, František; Železný, Filip; Kléma, Jiří BioData Min Research BioMed Central 2020-09-01 /pmc/articles/PMC7466824/ /pubmed/32905086 http://dx.doi.org/10.1186/s13040-020-00219-6 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Malinka, František; Železný, Filip; Kléma, Jiří Finding semantic patterns in omics data using concept rule learning with an ontology-based refinement operator |
title | Finding semantic patterns in omics data using concept rule learning with an ontology-based refinement operator |
title_full | Finding semantic patterns in omics data using concept rule learning with an ontology-based refinement operator |
title_fullStr | Finding semantic patterns in omics data using concept rule learning with an ontology-based refinement operator |
title_full_unstemmed | Finding semantic patterns in omics data using concept rule learning with an ontology-based refinement operator |
title_short | Finding semantic patterns in omics data using concept rule learning with an ontology-based refinement operator |
title_sort | finding semantic patterns in omics data using concept rule learning with an ontology-based refinement operator |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7466824/ https://www.ncbi.nlm.nih.gov/pubmed/32905086 http://dx.doi.org/10.1186/s13040-020-00219-6 |
work_keys_str_mv | AT malinkafrantisek findingsemanticpatternsinomicsdatausingconceptrulelearningwithanontologybasedrefinementoperator AT zeleznyfilip findingsemanticpatternsinomicsdatausingconceptrulelearningwithanontologybasedrefinementoperator AT klemajiri findingsemanticpatternsinomicsdatausingconceptrulelearningwithanontologybasedrefinementoperator |
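
The abstract in this record describes a refinement operator that uses an ontology to prune the rule space. The following is a minimal Python sketch of that general idea, not the sem1R implementation (which the record says is written in C++ and shipped as an R package). The toy ontology, the gene annotations, the function names, and the two pruning checks are all assumptions made for illustration; in particular, the checks are only one plausible reading of the names "Redundant Generalization" and "Redundant Non-potential", not the paper's definitions.

```python
# Hypothetical toy ontology: term -> set of direct parents (is-a edges).
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "signaling": {"root"},
    "lipid_metabolism": {"metabolism"},
}

# Hypothetical annotations: example (gene) -> set of ontology terms.
EXAMPLES = {
    "g1": {"lipid_metabolism"},
    "g2": {"metabolism"},
    "g3": {"signaling"},
}
POSITIVES = ["g1", "g2"]  # assumed target class


def ancestors(term):
    """Return all strict ancestors of a term in the toy ontology."""
    seen, stack = set(), list(PARENTS[term])
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(PARENTS[t])
    return seen


def covers(rule, annotations):
    """A rule (conjunction of terms) covers an example if every term in the
    rule is one of the example's annotations or an ancestor of one."""
    closure = set(annotations)
    for a in annotations:
        closure |= ancestors(a)
    return rule <= closure


def refine(rule, positives, examples):
    """Yield one-term specializations of `rule`, skipping redundant ones."""
    covered_pos = [e for e in positives if covers(rule, examples[e])]
    for term in sorted(PARENTS):
        if term in rule:
            continue
        # Assumed "redundant generalization" check: an ancestor of a term
        # already in the rule cannot change which examples are covered.
        if any(term in ancestors(t) for t in rule):
            continue
        # Drop rule terms that the new, more specific term already implies.
        candidate = frozenset(t for t in rule if t not in ancestors(term)) | {term}
        # Assumed "non-potential" check: a candidate that covers no positive
        # example still covered by the current rule is discarded.
        if not any(covers(candidate, examples[e]) for e in covered_pos):
            continue
        yield candidate
```

Calling `list(refine(frozenset(), POSITIVES, EXAMPLES))` on this toy data yields the three single-term rules for `lipid_metabolism`, `metabolism`, and `root`, and skips `signaling` because it covers no positive example. Starting from the rule `{"lipid_metabolism"}` yields nothing, since every remaining term is either an ancestor of it or covers no positive example.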