BigQ: a NoSQL based framework to handle genomic variants in i2b2

BACKGROUND: Precision medicine requires the tight integration of clinical and molecular data. To this end, it is mandatory to define proper technological solutions able to manage the overwhelming amount of high throughput genomic data needed to test associations between genomic signatures and human...

Bibliographic Details
Main authors: Gabetta, Matteo; Limongelli, Ivan; Rizzo, Ettore; Riva, Alberto; Segagni, Daniele; Bellazzi, Riccardo
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4696314/
https://www.ncbi.nlm.nih.gov/pubmed/26714792
http://dx.doi.org/10.1186/s12859-015-0861-0
_version_ 1782407775062589440
author Gabetta, Matteo
Limongelli, Ivan
Rizzo, Ettore
Riva, Alberto
Segagni, Daniele
Bellazzi, Riccardo
author_facet Gabetta, Matteo
Limongelli, Ivan
Rizzo, Ettore
Riva, Alberto
Segagni, Daniele
Bellazzi, Riccardo
author_sort Gabetta, Matteo
collection PubMed
description BACKGROUND: Precision medicine requires the tight integration of clinical and molecular data. To this end, it is mandatory to define proper technological solutions able to manage the overwhelming amount of high throughput genomic data needed to test associations between genomic signatures and human phenotypes. The i2b2 Center (Informatics for Integrating Biology and the Bedside) has developed a widely internationally adopted framework to use existing clinical data for discovery research that can help the definition of precision medicine interventions when coupled with genetic data. i2b2 can be significantly advanced by designing efficient management solutions of Next Generation Sequencing data. RESULTS: We developed BigQ, an extension of the i2b2 framework, which integrates patient clinical phenotypes with genomic variant profiles generated by Next Generation Sequencing. A visual programming i2b2 plugin allows retrieving variants belonging to the patients in a cohort by applying filters on genomic variant annotations. We report an evaluation of the query performance of our system on more than 11 million variants, showing that the implemented solution scales linearly in terms of query time and disk space with the number of variants. CONCLUSIONS: In this paper we describe a new i2b2 web service composed of an efficient and scalable document-based database that manages annotations of genomic variants and of a visual programming plug-in designed to dynamically perform queries on clinical and genetic data. The system therefore allows managing the fast growing volume of genomic variants and can be used to integrate heterogeneous genomic annotations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0861-0) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4696314
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4696314 2015-12-31 BigQ: a NoSQL based framework to handle genomic variants in i2b2 Gabetta, Matteo Limongelli, Ivan Rizzo, Ettore Riva, Alberto Segagni, Daniele Bellazzi, Riccardo BMC Bioinformatics Software BACKGROUND: Precision medicine requires the tight integration of clinical and molecular data. To this end, it is mandatory to define proper technological solutions able to manage the overwhelming amount of high throughput genomic data needed to test associations between genomic signatures and human phenotypes. The i2b2 Center (Informatics for Integrating Biology and the Bedside) has developed a widely internationally adopted framework to use existing clinical data for discovery research that can help the definition of precision medicine interventions when coupled with genetic data. i2b2 can be significantly advanced by designing efficient management solutions of Next Generation Sequencing data. RESULTS: We developed BigQ, an extension of the i2b2 framework, which integrates patient clinical phenotypes with genomic variant profiles generated by Next Generation Sequencing. A visual programming i2b2 plugin allows retrieving variants belonging to the patients in a cohort by applying filters on genomic variant annotations. We report an evaluation of the query performance of our system on more than 11 million variants, showing that the implemented solution scales linearly in terms of query time and disk space with the number of variants. CONCLUSIONS: In this paper we describe a new i2b2 web service composed of an efficient and scalable document-based database that manages annotations of genomic variants and of a visual programming plug-in designed to dynamically perform queries on clinical and genetic data. The system therefore allows managing the fast growing volume of genomic variants and can be used to integrate heterogeneous genomic annotations.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0861-0) contains supplementary material, which is available to authorized users. BioMed Central 2015-12-29 /pmc/articles/PMC4696314/ /pubmed/26714792 http://dx.doi.org/10.1186/s12859-015-0861-0 Text en © Gabetta et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software
Gabetta, Matteo
Limongelli, Ivan
Rizzo, Ettore
Riva, Alberto
Segagni, Daniele
Bellazzi, Riccardo
BigQ: a NoSQL based framework to handle genomic variants in i2b2
title BigQ: a NoSQL based framework to handle genomic variants in i2b2
title_full BigQ: a NoSQL based framework to handle genomic variants in i2b2
title_fullStr BigQ: a NoSQL based framework to handle genomic variants in i2b2
title_full_unstemmed BigQ: a NoSQL based framework to handle genomic variants in i2b2
title_short BigQ: a NoSQL based framework to handle genomic variants in i2b2
title_sort bigq: a nosql based framework to handle genomic variants in i2b2
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4696314/
https://www.ncbi.nlm.nih.gov/pubmed/26714792
http://dx.doi.org/10.1186/s12859-015-0861-0
work_keys_str_mv AT gabettamatteo bigqanosqlbasedframeworktohandlegenomicvariantsini2b2
AT limongelliivan bigqanosqlbasedframeworktohandlegenomicvariantsini2b2
AT rizzoettore bigqanosqlbasedframeworktohandlegenomicvariantsini2b2
AT rivaalberto bigqanosqlbasedframeworktohandlegenomicvariantsini2b2
AT segagnidaniele bigqanosqlbasedframeworktohandlegenomicvariantsini2b2
AT bellazziriccardo bigqanosqlbasedframeworktohandlegenomicvariantsini2b2