A semantic proteomics dashboard (SemPoD) for data management in translational research

Bibliographic Details
Main Authors: Jayapandian, Catherine P, Zhao, Meng, Ewing, Rob M, Zhang, Guo-Qiang, Sahoo, Satya S
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3524316/
https://www.ncbi.nlm.nih.gov/pubmed/23282161
http://dx.doi.org/10.1186/1752-0509-6-S3-S20
author Jayapandian, Catherine P
Zhao, Meng
Ewing, Rob M
Zhang, Guo-Qiang
Sahoo, Satya S
collection PubMed
description BACKGROUND: One of the primary challenges in translational research data management is breaking down the barriers between the multiple data silos and integrating 'omics data with clinical information to complete the cycle from the bench to the bedside. Contextual metadata, also called provenance information, is a key factor in effective data integration, reproducibility of results, correct attribution of original source, and answering research queries involving "What", "Where", "When", "Which", "Who", "How", and "Why" (also known as the W7 model). At present, however, there is limited or no effective approach to managing and leveraging provenance information for integrating data across studies or projects. Hence, there is an urgent need for a paradigm shift in creating a "provenance-aware" informatics platform to address this challenge. We introduce an ontology-driven, intuitive Semantic Proteomics Dashboard (SemPoD) that uses provenance together with domain information (semantic provenance) to enable researchers to query, compare, and correlate different types of data across multiple projects, and to integrate legacy data to support their ongoing research. RESULTS: The SemPoD platform, currently in use at the Case Center for Proteomics and Bioinformatics (CPB), consists of three components: (a) Ontology-driven Visual Query Composer, (b) Result Explorer, and (c) Query Manager. Currently, SemPoD allows provenance-aware querying of 1153 mass-spectrometry experiments from 20 different projects. SemPoD uses the systems molecular biology provenance ontology (SysPro) to support a dynamic query composition interface, which automatically updates the components of the query interface based on previous user selections and efficiently prunes the result set using a "smart filtering" approach. The SysPro ontology re-uses terms from the PROV ontology (PROV-O) being developed by the World Wide Web Consortium (W3C) provenance working group, the minimum information required for reporting a molecular interaction experiment (MIMIx) guidelines, and the minimum information about a proteomics experiment (MIAPE) guidelines. SemPoD was evaluated in terms of both user feedback and system scalability. CONCLUSIONS: SemPoD is an intuitive and powerful provenance ontology-driven data access and query platform that uses the MIAPE and MIMIx metadata guidelines to create an integrated view over large-scale systems molecular biology datasets. SemPoD leverages the SysPro ontology to create an intuitive dashboard for biologists to compose queries, explore the results, and use a query manager for storing queries for later use. SemPoD can be deployed over many existing database applications storing 'omics data, including, as illustrated here, the LabKey data-management system. The initial user feedback evaluating the usability and functionality of SemPoD has been very positive, and it is being considered for wider deployment beyond the proteomics domain and in other 'omics centers.
format Online
Article
Text
id pubmed-3524316
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3524316 2012-12-21 A semantic proteomics dashboard (SemPoD) for data management in translational research Jayapandian, Catherine P Zhao, Meng Ewing, Rob M Zhang, Guo-Qiang Sahoo, Satya S BMC Syst Biol Research BioMed Central 2012-12-17 /pmc/articles/PMC3524316/ /pubmed/23282161 http://dx.doi.org/10.1186/1752-0509-6-S3-S20 Text en Copyright ©2012 Jayapandian et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A semantic proteomics dashboard (SemPoD) for data management in translational research
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3524316/
https://www.ncbi.nlm.nih.gov/pubmed/23282161
http://dx.doi.org/10.1186/1752-0509-6-S3-S20