SegMine workflows for semantic microarray data analysis in Orange4WS

BACKGROUND: In experimental data analysis, bioinformatics researchers increasingly rely on tools that enable the composition and reuse of scientific workflows. The utility of current bioinformatics workflow environments can be significantly increased by offering advanced data mining services as work...

Bibliographic Details
Main authors: Podpečan, Vid; Lavrač, Nada; Mozetič, Igor; Novak, Petra Kralj; Trajkovski, Igor; Langohr, Laura; Kulovesi, Kimmo; Toivonen, Hannu; Petek, Marko; Motaln, Helena; Gruden, Kristina
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3216973/
https://www.ncbi.nlm.nih.gov/pubmed/22029475
http://dx.doi.org/10.1186/1471-2105-12-416
_version_ 1782216570907394048
author Podpečan, Vid
Lavrač, Nada
Mozetič, Igor
Novak, Petra Kralj
Trajkovski, Igor
Langohr, Laura
Kulovesi, Kimmo
Toivonen, Hannu
Petek, Marko
Motaln, Helena
Gruden, Kristina
author_facet Podpečan, Vid
Lavrač, Nada
Mozetič, Igor
Novak, Petra Kralj
Trajkovski, Igor
Langohr, Laura
Kulovesi, Kimmo
Toivonen, Hannu
Petek, Marko
Motaln, Helena
Gruden, Kristina
author_sort Podpečan, Vid
collection PubMed
description BACKGROUND: In experimental data analysis, bioinformatics researchers increasingly rely on tools that enable the composition and reuse of scientific workflows. The utility of current bioinformatics workflow environments can be significantly increased by offering advanced data mining services as workflow components. Such services can support, for instance, knowledge discovery from diverse distributed data and knowledge sources (such as GO, KEGG, PubMed, and experimental databases). Specifically, cutting-edge data analysis approaches, such as semantic data mining, link discovery, and visualization, have not yet been made available to researchers investigating complex biological datasets. RESULTS: We present a new methodology, SegMine, for semantic analysis of microarray data by exploiting general biological knowledge, and a new workflow environment, Orange4WS, with integrated support for web services in which the SegMine methodology is implemented. The SegMine methodology consists of two main steps. First, the semantic subgroup discovery algorithm is used to construct elaborate rules that identify enriched gene sets. Then, a link discovery service is used for the creation and visualization of new biological hypotheses. The utility of SegMine, implemented as a set of workflows in Orange4WS, is demonstrated in two microarray data analysis applications. In the analysis of senescence in human stem cells, the use of SegMine resulted in three novel research hypotheses that could improve understanding of the underlying mechanisms of senescence and identification of candidate marker genes. CONCLUSIONS: Compared to the available data analysis systems, SegMine offers improved hypothesis generation and data interpretation for bioinformatics in an easy-to-use integrated workflow environment.
format Online
Article
Text
id pubmed-3216973
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-32169732011-11-16 SegMine workflows for semantic microarray data analysis in Orange4WS Podpečan, Vid Lavrač, Nada Mozetič, Igor Novak, Petra Kralj Trajkovski, Igor Langohr, Laura Kulovesi, Kimmo Toivonen, Hannu Petek, Marko Motaln, Helena Gruden, Kristina BMC Bioinformatics Methodology Article BACKGROUND: In experimental data analysis, bioinformatics researchers increasingly rely on tools that enable the composition and reuse of scientific workflows. The utility of current bioinformatics workflow environments can be significantly increased by offering advanced data mining services as workflow components. Such services can support, for instance, knowledge discovery from diverse distributed data and knowledge sources (such as GO, KEGG, PubMed, and experimental databases). Specifically, cutting-edge data analysis approaches, such as semantic data mining, link discovery, and visualization, have not yet been made available to researchers investigating complex biological datasets. RESULTS: We present a new methodology, SegMine, for semantic analysis of microarray data by exploiting general biological knowledge, and a new workflow environment, Orange4WS, with integrated support for web services in which the SegMine methodology is implemented. The SegMine methodology consists of two main steps. First, the semantic subgroup discovery algorithm is used to construct elaborate rules that identify enriched gene sets. Then, a link discovery service is used for the creation and visualization of new biological hypotheses. The utility of SegMine, implemented as a set of workflows in Orange4WS, is demonstrated in two microarray data analysis applications. In the analysis of senescence in human stem cells, the use of SegMine resulted in three novel research hypotheses that could improve understanding of the underlying mechanisms of senescence and identification of candidate marker genes. 
CONCLUSIONS: Compared to the available data analysis systems, SegMine offers improved hypothesis generation and data interpretation for bioinformatics in an easy-to-use integrated workflow environment. BioMed Central 2011-10-26 /pmc/articles/PMC3216973/ /pubmed/22029475 http://dx.doi.org/10.1186/1471-2105-12-416 Text en Copyright ©2011 Podpečan et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Podpečan, Vid
Lavrač, Nada
Mozetič, Igor
Novak, Petra Kralj
Trajkovski, Igor
Langohr, Laura
Kulovesi, Kimmo
Toivonen, Hannu
Petek, Marko
Motaln, Helena
Gruden, Kristina
SegMine workflows for semantic microarray data analysis in Orange4WS
title SegMine workflows for semantic microarray data analysis in Orange4WS
title_full SegMine workflows for semantic microarray data analysis in Orange4WS
title_fullStr SegMine workflows for semantic microarray data analysis in Orange4WS
title_full_unstemmed SegMine workflows for semantic microarray data analysis in Orange4WS
title_short SegMine workflows for semantic microarray data analysis in Orange4WS
title_sort segmine workflows for semantic microarray data analysis in orange4ws
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3216973/
https://www.ncbi.nlm.nih.gov/pubmed/22029475
http://dx.doi.org/10.1186/1471-2105-12-416
work_keys_str_mv AT podpecanvid segmineworkflowsforsemanticmicroarraydataanalysisinorange4ws
AT lavracnada segmineworkflowsforsemanticmicroarraydataanalysisinorange4ws
AT mozeticigor segmineworkflowsforsemanticmicroarraydataanalysisinorange4ws
AT novakpetrakralj segmineworkflowsforsemanticmicroarraydataanalysisinorange4ws
AT trajkovskiigor segmineworkflowsforsemanticmicroarraydataanalysisinorange4ws
AT langohrlaura segmineworkflowsforsemanticmicroarraydataanalysisinorange4ws
AT kulovesikimmo segmineworkflowsforsemanticmicroarraydataanalysisinorange4ws
AT toivonenhannu segmineworkflowsforsemanticmicroarraydataanalysisinorange4ws
AT petekmarko segmineworkflowsforsemanticmicroarraydataanalysisinorange4ws
AT motalnhelena segmineworkflowsforsemanticmicroarraydataanalysisinorange4ws
AT grudenkristina segmineworkflowsforsemanticmicroarraydataanalysisinorange4ws