Toward a Literature-Driven Definition of Big Data in Healthcare

Objective. The aim of this study was to provide a definition of big data in healthcare. Methods. A systematic search of PubMed literature published until May 9, 2014, was conducted. We noted the number of statistical individuals (n) and the number of variables (p) for all papers describing a dataset...

Bibliographic Details
Main Authors: Baro, Emilie, Degoul, Samuel, Beuscart, Régis, Chazard, Emmanuel
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4468280/
https://www.ncbi.nlm.nih.gov/pubmed/26137488
http://dx.doi.org/10.1155/2015/639021
author Baro, Emilie
Degoul, Samuel
Beuscart, Régis
Chazard, Emmanuel
collection PubMed
description Objective. The aim of this study was to provide a definition of big data in healthcare. Methods. A systematic search of PubMed literature published until May 9, 2014, was conducted. We noted the number of statistical individuals (n) and the number of variables (p) for all papers describing a dataset. These papers were classified into fields of study. Characteristics attributed to big data by authors were also considered. Based on this analysis, a definition of big data was proposed. Results. A total of 196 papers were included. Big data can be defined as datasets with Log(n × p) ≥ 7. Properties of big data are its great variety and high velocity. Big data raises challenges on veracity, on all aspects of the workflow, on extracting meaningful information, and on sharing information. Big data requires new computational methods that optimize data management. Related concepts are data reuse, false knowledge discovery, and privacy issues. Conclusion. Big data is defined by volume. Big data should not be confused with data reuse: data can be big without being reused for another purpose, for example, in omics. Inversely, data can be reused without being necessarily big, for example, secondary use of Electronic Medical Records (EMR) data.
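The volume criterion stated in the abstract can be illustrated with a short sketch. It assumes the logarithm is base 10, which the abstract does not state explicitly; the function names `log_volume` and `is_big_data` are hypothetical and are not taken from the paper.

```python
import math

# Assumption: "Log" in the abstract is the base-10 logarithm, so the
# criterion Log(n x p) >= 7 is equivalent to n * p >= 10**7.
BIG_DATA_CELLS = 10 ** 7


def log_volume(n: int, p: int) -> float:
    """Return log10(n * p) for n statistical individuals and p variables."""
    if n <= 0 or p <= 0:
        raise ValueError("n and p must be positive integers")
    # n * p is computed with exact integer arithmetic before the logarithm.
    return math.log10(n * p)


def is_big_data(n: int, p: int) -> bool:
    """Apply the abstract's volume criterion under the base-10 assumption."""
    if n <= 0 or p <= 0:
        raise ValueError("n and p must be positive integers")
    # Comparing integers avoids floating-point rounding at the threshold.
    return n * p >= BIG_DATA_CELLS


if __name__ == "__main__":
    # Hypothetical dataset sizes, chosen only for illustration.
    for n, p in [(1_000, 100), (10_000, 1_000), (500, 50_000)]:
        print(n, p, round(log_volume(n, p), 2), is_big_data(n, p))
```

Under this reading, a dataset of 10,000 individuals with 1,000 variables sits exactly on the threshold, while 1,000 individuals with 100 variables falls below it.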
format Online
Article
Text
id pubmed-4468280
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-4468280 2015-07-01 Toward a Literature-Driven Definition of Big Data in Healthcare Baro, Emilie Degoul, Samuel Beuscart, Régis Chazard, Emmanuel Biomed Res Int Review Article Objective. The aim of this study was to provide a definition of big data in healthcare. Methods. A systematic search of PubMed literature published until May 9, 2014, was conducted. We noted the number of statistical individuals (n) and the number of variables (p) for all papers describing a dataset. These papers were classified into fields of study. Characteristics attributed to big data by authors were also considered. Based on this analysis, a definition of big data was proposed. Results. A total of 196 papers were included. Big data can be defined as datasets with Log(n × p) ≥ 7. Properties of big data are its great variety and high velocity. Big data raises challenges on veracity, on all aspects of the workflow, on extracting meaningful information, and on sharing information. Big data requires new computational methods that optimize data management. Related concepts are data reuse, false knowledge discovery, and privacy issues. Conclusion. Big data is defined by volume. Big data should not be confused with data reuse: data can be big without being reused for another purpose, for example, in omics. Inversely, data can be reused without being necessarily big, for example, secondary use of Electronic Medical Records (EMR) data. Hindawi Publishing Corporation 2015 2015-06-02 /pmc/articles/PMC4468280/ /pubmed/26137488 http://dx.doi.org/10.1155/2015/639021 Text en Copyright © 2015 Emilie Baro et al. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Toward a Literature-Driven Definition of Big Data in Healthcare
topic Review Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4468280/
https://www.ncbi.nlm.nih.gov/pubmed/26137488
http://dx.doi.org/10.1155/2015/639021