Privacy-by-Design: Understanding Data Access Models for Secondary Data

Today there is a constant flow of data into, out of, and between ever-larger and ever-more complex databases about people. Together, these digital traces collectively capture our social genome, the footprints of our society. The burgeoning field of population informatics is the systematic study of...

Bibliographic Details
Main authors: Kum, Hye-Chung, Ahalt, Stanley
Format: Online Article Text
Language: English
Published: American Medical Informatics Association 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3845756/
https://www.ncbi.nlm.nih.gov/pubmed/24303251
_version_ 1782293361472831488
author Kum, Hye-Chung
Ahalt, Stanley
author_facet Kum, Hye-Chung
Ahalt, Stanley
author_sort Kum, Hye-Chung
collection PubMed
description Today there is a constant flow of data into, out of, and between ever-larger and ever-more complex databases about people. Together, these digital traces collectively capture our social genome, the footprints of our society. The burgeoning field of population informatics is the systematic study of populations via secondary analysis of such massive data collections (termed “big data”) about people. In particular, health informatics analyzes electronic health records to improve health outcomes for a population. Privacy protection in such secondary data analysis research is complex and requires a holistic approach which combines technology, statistics, policy and a shift in culture of information accountability through transparency rather than secrecy. We review state of the art in privacy protection technology and policy frameworks from widely different fields, and synthesize the findings to present a comprehensive system of privacy protection in population informatics research using the privacy-by-design approach. Based on common activities in the workflow, we describe the pros and cons of four different data access models – restricted access, controlled access, monitored access, and open access – that minimize risk and maximize usability of data. We then evaluate the system by analyzing the risk and usability of data through a realistic example. We conclude that deployed together the four data access models can provide a comprehensive system for privacy protection, balancing the risk and usability of secondary data in population informatics research.
format Online
Article
Text
id pubmed-3845756
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher American Medical Informatics Association
record_format MEDLINE/PubMed
spelling pubmed-3845756 2013-12-03 Privacy-by-Design: Understanding Data Access Models for Secondary Data Kum, Hye-Chung Ahalt, Stanley AMIA Jt Summits Transl Sci Proc Articles Today there is a constant flow of data into, out of, and between ever-larger and ever-more complex databases about people. Together, these digital traces collectively capture our social genome, the footprints of our society. The burgeoning field of population informatics is the systematic study of populations via secondary analysis of such massive data collections (termed “big data”) about people. In particular, health informatics analyzes electronic health records to improve health outcomes for a population. Privacy protection in such secondary data analysis research is complex and requires a holistic approach which combines technology, statistics, policy and a shift in culture of information accountability through transparency rather than secrecy. We review state of the art in privacy protection technology and policy frameworks from widely different fields, and synthesize the findings to present a comprehensive system of privacy protection in population informatics research using the privacy-by-design approach. Based on common activities in the workflow, we describe the pros and cons of four different data access models – restricted access, controlled access, monitored access, and open access – that minimize risk and maximize usability of data. We then evaluate the system by analyzing the risk and usability of data through a realistic example. We conclude that deployed together the four data access models can provide a comprehensive system for privacy protection, balancing the risk and usability of secondary data in population informatics research. American Medical Informatics Association 2013-03-18 /pmc/articles/PMC3845756/ /pubmed/24303251 Text en ©2013 AMIA - All rights reserved.
spellingShingle Articles
Kum, Hye-Chung
Ahalt, Stanley
Privacy-by-Design: Understanding Data Access Models for Secondary Data
title Privacy-by-Design: Understanding Data Access Models for Secondary Data
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3845756/
https://www.ncbi.nlm.nih.gov/pubmed/24303251