An ICT infrastructure to integrate clinical and molecular data in oncology research

BACKGROUND: The ONCO-i2b2 platform is a bioinformatics tool designed to integrate clinical and research data and support translational research in oncology. It is implemented by the University of Pavia and the IRCCS Fondazione Maugeri hospital (FSM), and grounded on the software developed by the Inf...

Bibliographic Details
Main Authors: Segagni, Daniele; Tibollo, Valentina; Dagliati, Arianna; Zambelli, Alberto; Priori, Silvia G; Bellazzi, Riccardo
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3303735/
https://www.ncbi.nlm.nih.gov/pubmed/22536972
http://dx.doi.org/10.1186/1471-2105-13-S4-S5
author Segagni, Daniele
Tibollo, Valentina
Dagliati, Arianna
Zambelli, Alberto
Priori, Silvia G
Bellazzi, Riccardo
collection PubMed
description BACKGROUND: The ONCO-i2b2 platform is a bioinformatics tool designed to integrate clinical and research data and support translational research in oncology. It is implemented by the University of Pavia and the IRCCS Fondazione Maugeri hospital (FSM), and grounded on the software developed by the Informatics for Integrating Biology and the Bedside (i2b2) research center. I2b2 has delivered an open source suite based on a data warehouse, which is efficiently interrogated to find sets of interesting patients through a query tool interface.
METHODS: Onco-i2b2 integrates data coming from multiple sources and allows users to query them jointly. I2b2 data are then stored in a data warehouse, where facts are hierarchically structured as ontologies. Onco-i2b2 gathers data from the FSM pathology unit (PU) database and from the hospital biobank and merges them with the clinical information from the hospital information system. Our main effort was to provide a robust integrated research environment, with particular emphasis on the integration process, which posed the following challenges: privacy and anonymization of biospecimen samples; synchronization of the biobank database with the i2b2 data warehouse through a series of Extract, Transform, Load (ETL) operations; development and integration of a Natural Language Processing (NLP) module to retrieve coded information, such as SNOMED terms and TNM classifications of malignant tumors, and clinical test results from unstructured medical records. Furthermore, we have developed an internal SNOMED ontology based on the NCBO BioPortal web services.
RESULTS: Onco-i2b2 manages data on more than 6,500 patients with a breast cancer diagnosis collected between 2001 and 2011 (over 390 of them have at least one biological sample in the cancer biobank), more than 47,000 visits and 96,000 observations over 960 medical concepts.
CONCLUSIONS: Onco-i2b2 is a concrete example of how an integrated Information and Communication Technology architecture can be implemented to support translational research. The next steps of our project will involve the extension of its capabilities by implementing new plug-ins devoted to bioinformatics data analysis as well as a temporal query module.
format Online
Article
Text
id pubmed-3303735
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3303735 2012-03-15 An ICT infrastructure to integrate clinical and molecular data in oncology research Segagni, Daniele Tibollo, Valentina Dagliati, Arianna Zambelli, Alberto Priori, Silvia G Bellazzi, Riccardo BMC Bioinformatics Research BioMed Central 2012-03-28 /pmc/articles/PMC3303735/ /pubmed/22536972 http://dx.doi.org/10.1186/1471-2105-13-S4-S5 Text en Copyright ©2012 Segagni et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title An ICT infrastructure to integrate clinical and molecular data in oncology research
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3303735/
https://www.ncbi.nlm.nih.gov/pubmed/22536972
http://dx.doi.org/10.1186/1471-2105-13-S4-S5