
Development of an Integrated Biospecimen Database among the Regional Biobanks in Korea

OBJECTIVES: This study developed an integrated database for 15 regional biobanks that provides large quantities of high-quality bio-data to researchers to be used for the prevention of disease, for the development of personalized medicines, and in genetics studies. METHODS: We collected raw data, ma...


Bibliographic Details
Main Authors: Park, Hyun Sang; Cho, Hune; Kim, Hwa Sun
Format: Online Article Text
Language: English
Published: Korean Society of Medical Informatics 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4871843/
https://www.ncbi.nlm.nih.gov/pubmed/27200223
http://dx.doi.org/10.4258/hir.2016.22.2.129
author Park, Hyun Sang
Cho, Hune
Kim, Hwa Sun
collection PubMed
description OBJECTIVES: This study developed an integrated database for 15 regional biobanks that provides large quantities of high-quality bio-data to researchers to be used for the prevention of disease, for the development of personalized medicines, and in genetics studies. METHODS: We collected raw data, managed independently by 15 regional biobanks, for database modeling and analyzed and defined the metadata of the items. We also built a three-step (high, middle, and low) classification system for classifying the item concepts based on the metadata. To generate clear meanings of the items, clinical items were defined using the Systematized Nomenclature of Medicine Clinical Terms, and specimen items were defined using the Logical Observation Identifiers Names and Codes. To optimize database performance, we set up a multi-column index based on the classification system and the international standard code. RESULTS: As a result of subdividing 7,197,252 raw data items collected, we refined the metadata into 1,796 clinical items and 1,792 specimen items. The classification system consists of 15 high, 163 middle, and 3,588 low class items. International standard codes were linked to 69.9% of the clinical items and 71.7% of the specimen items. The database consists of 18 tables based on a table from MySQL Server 5.6. As a result of the performance evaluation, the multi-column index shortened query time by as much as nine times. CONCLUSIONS: The database developed was based on an international standard terminology system, providing an infrastructure that can integrate the 7,197,252 raw data items managed by the 15 regional biobanks. In particular, it resolved the inevitable interoperability issues in the exchange of information among the biobanks, and provided a solution to the synonym problem, which arises when the same concept is expressed in a variety of ways.
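The multi-column index mentioned in the abstract can be illustrated with a minimal sketch. The record does not give the actual schema, so the table name `item`, its columns, the index name `idx_item_class_code`, and the sample rows (including the `PLACEHOLDER-…` codes, which are not real SNOMED CT or LOINC codes) are all assumptions made for illustration. The sketch also uses Python's built-in `sqlite3` module rather than MySQL Server 5.6 so that it runs without a database server; it only shows the general idea of indexing the three classification levels together with the standard code.

```python
import sqlite3

# Hypothetical lookup: all items filed under one high/middle/low class path.
LOOKUP_SQL = (
    "SELECT low_class, standard_code, value FROM item "
    "WHERE high_class = ? AND middle_class = ? AND low_class = ?"
)


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with an assumed layout and a composite index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE item (
            item_id       INTEGER PRIMARY KEY,
            high_class    TEXT NOT NULL,
            middle_class  TEXT NOT NULL,
            low_class     TEXT NOT NULL,
            standard_code TEXT,          -- NULL when no standard code is linked
            value         TEXT
        )
        """
    )
    # One index over the classification levels plus the standard code.
    conn.execute(
        "CREATE INDEX idx_item_class_code "
        "ON item (high_class, middle_class, low_class, standard_code)"
    )
    rows = [
        ("Clinical", "Diagnosis", "Hypertension", "PLACEHOLDER-C1", "yes"),
        ("Clinical", "Diagnosis", "Hypertension", "PLACEHOLDER-C1", "no"),
        ("Clinical", "Diagnosis", "Diabetes", "PLACEHOLDER-C2", "yes"),
        ("Specimen", "Blood", "Serum volume", "PLACEHOLDER-S1", "0.5"),
        ("Specimen", "Blood", "Plasma volume", None, "1.0"),
    ]
    conn.executemany(
        "INSERT INTO item "
        "(high_class, middle_class, low_class, standard_code, value) "
        "VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    return conn


def find_items(conn: sqlite3.Connection, high: str, middle: str, low: str):
    """Return (low_class, standard_code, value) rows for one class path."""
    return conn.execute(LOOKUP_SQL, (high, middle, low)).fetchall()


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple):
    """Return the human-readable detail column of SQLite's query plan."""
    cursor = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return [row[3] for row in cursor]


if __name__ == "__main__":
    db = build_demo_db()
    path = ("Clinical", "Diagnosis", "Hypertension")
    print(find_items(db, *path))
    print(query_plan(db, LOOKUP_SQL, path))
```

Because the filter constrains the leading columns of the index, the database can seek directly to the matching entries instead of scanning the whole table. Any speedup on real data depends on table size and query shape; the nine-fold figure in the abstract is the authors' measurement and is not reproduced by this toy example.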
format Online
Article
Text
id pubmed-4871843
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Korean Society of Medical Informatics
record_format MEDLINE/PubMed
spelling pubmed-4871843 2016-05-19 Healthc Inform Res Original Article Korean Society of Medical Informatics 2016-04 2016-04-30 /pmc/articles/PMC4871843/ /pubmed/27200223 http://dx.doi.org/10.4258/hir.2016.22.2.129 Text en © 2016 The Korean Society of Medical Informatics http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Development of an Integrated Biospecimen Database among the Regional Biobanks in Korea
topic Original Article