Development of an Integrated Biospecimen Database among the Regional Biobanks in Korea
OBJECTIVES: This study developed an integrated database for 15 regional biobanks that provides large quantities of high-quality bio-data to researchers to be used for the prevention of disease, for the development of personalized medicines, and in genetics studies. METHODS: We collected raw data, ma...
Main Authors: | Park, Hyun Sang; Cho, Hune; Kim, Hwa Sun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Korean Society of Medical Informatics, 2016 |
Subjects: | Original Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4871843/ https://www.ncbi.nlm.nih.gov/pubmed/27200223 http://dx.doi.org/10.4258/hir.2016.22.2.129 |
_version_ | 1782432635363000320 |
---|---|
author | Park, Hyun Sang; Cho, Hune; Kim, Hwa Sun |
author_facet | Park, Hyun Sang; Cho, Hune; Kim, Hwa Sun |
author_sort | Park, Hyun Sang |
collection | PubMed |
description | OBJECTIVES: This study developed an integrated database for 15 regional biobanks that provides large quantities of high-quality bio-data to researchers to be used for the prevention of disease, for the development of personalized medicines, and in genetics studies. METHODS: We collected raw data, managed independently by 15 regional biobanks, for database modeling and analyzed and defined the metadata of the items. We also built a three-step (high, middle, and low) classification system for classifying the item concepts based on the metadata. To generate clear meanings of the items, clinical items were defined using the Systematized Nomenclature of Medicine Clinical Terms, and specimen items were defined using the Logical Observation Identifiers Names and Codes. To optimize database performance, we set up a multi-column index based on the classification system and the international standard code. RESULTS: As a result of subdividing 7,197,252 raw data items collected, we refined the metadata into 1,796 clinical items and 1,792 specimen items. The classification system consists of 15 high, 163 middle, and 3,588 low class items. International standard codes were linked to 69.9% of the clinical items and 71.7% of the specimen items. The database consists of 18 tables based on a table from MySQL Server 5.6. As a result of the performance evaluation, the multi-column index shortened query time by as much as nine times. CONCLUSIONS: The database developed was based on an international standard terminology system, providing an infrastructure that can integrate the 7,197,252 raw data items managed by the 15 regional biobanks. In particular, it resolved the inevitable interoperability issues in the exchange of information among the biobanks, and provided a solution to the synonym problem, which arises when the same concept is expressed in a variety of ways. |
format | Online Article Text |
id | pubmed-4871843 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Korean Society of Medical Informatics |
record_format | MEDLINE/PubMed |
spelling | pubmed-4871843 2016-05-19 Development of an Integrated Biospecimen Database among the Regional Biobanks in Korea Park, Hyun Sang Cho, Hune Kim, Hwa Sun Healthc Inform Res Original Article OBJECTIVES: This study developed an integrated database for 15 regional biobanks that provides large quantities of high-quality bio-data to researchers to be used for the prevention of disease, for the development of personalized medicines, and in genetics studies. METHODS: We collected raw data, managed independently by 15 regional biobanks, for database modeling and analyzed and defined the metadata of the items. We also built a three-step (high, middle, and low) classification system for classifying the item concepts based on the metadata. To generate clear meanings of the items, clinical items were defined using the Systematized Nomenclature of Medicine Clinical Terms, and specimen items were defined using the Logical Observation Identifiers Names and Codes. To optimize database performance, we set up a multi-column index based on the classification system and the international standard code. RESULTS: As a result of subdividing 7,197,252 raw data items collected, we refined the metadata into 1,796 clinical items and 1,792 specimen items. The classification system consists of 15 high, 163 middle, and 3,588 low class items. International standard codes were linked to 69.9% of the clinical items and 71.7% of the specimen items. The database consists of 18 tables based on a table from MySQL Server 5.6. As a result of the performance evaluation, the multi-column index shortened query time by as much as nine times. CONCLUSIONS: The database developed was based on an international standard terminology system, providing an infrastructure that can integrate the 7,197,252 raw data items managed by the 15 regional biobanks. In particular, it resolved the inevitable interoperability issues in the exchange of information among the biobanks, and provided a solution to the synonym problem, which arises when the same concept is expressed in a variety of ways. Korean Society of Medical Informatics 2016-04 2016-04-30 /pmc/articles/PMC4871843/ /pubmed/27200223 http://dx.doi.org/10.4258/hir.2016.22.2.129 Text en © 2016 The Korean Society of Medical Informatics http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Article Park, Hyun Sang Cho, Hune Kim, Hwa Sun Development of an Integrated Biospecimen Database among the Regional Biobanks in Korea |
title | Development of an Integrated Biospecimen Database among the Regional Biobanks in Korea |
title_full | Development of an Integrated Biospecimen Database among the Regional Biobanks in Korea |
title_fullStr | Development of an Integrated Biospecimen Database among the Regional Biobanks in Korea |
title_full_unstemmed | Development of an Integrated Biospecimen Database among the Regional Biobanks in Korea |
title_short | Development of an Integrated Biospecimen Database among the Regional Biobanks in Korea |
title_sort | development of an integrated biospecimen database among the regional biobanks in korea |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4871843/ https://www.ncbi.nlm.nih.gov/pubmed/27200223 http://dx.doi.org/10.4258/hir.2016.22.2.129 |
work_keys_str_mv | AT parkhyunsang developmentofanintegratedbiospecimendatabaseamongtheregionalbiobanksinkorea AT chohune developmentofanintegratedbiospecimendatabaseamongtheregionalbiobanksinkorea AT kimhwasun developmentofanintegratedbiospecimendatabaseamongtheregionalbiobanksinkorea |
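
The abstract mentions a multi-column index built on the three-level classification system and the international standard code. The following is only a minimal sketch of that general idea, not the authors' schema: it uses Python's built-in `sqlite3` module in place of the MySQL Server 5.6 named in the abstract, and the table name `item`, the column names, the index name `idx_item_class_code`, and the `CODE-00x` values are all hypothetical placeholders.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with a hypothetical layout and a multi-column index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE item (
            id            INTEGER PRIMARY KEY,
            high_class    TEXT NOT NULL,  -- hypothetical: top classification level
            middle_class  TEXT NOT NULL,  -- hypothetical: middle classification level
            low_class     TEXT NOT NULL,  -- hypothetical: lowest classification level
            standard_code TEXT,           -- placeholder for a standard terminology code
            value         TEXT
        )
        """
    )
    # Placeholder rows; the codes are made up and are not real SNOMED CT or LOINC codes.
    rows = [
        ("clinical", "diagnosis", "item_a", "CODE-001", "v1"),
        ("clinical", "diagnosis", "item_b", "CODE-002", "v2"),
        ("clinical", "history", "item_c", None, "v3"),
        ("specimen", "blood", "item_d", "CODE-003", "v4"),
    ]
    conn.executemany(
        "INSERT INTO item (high_class, middle_class, low_class, standard_code, value) "
        "VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    # One index over several columns, ordered from the broadest level to the code.
    conn.execute(
        "CREATE INDEX idx_item_class_code "
        "ON item (high_class, middle_class, low_class, standard_code)"
    )
    return conn


def query_plan(conn, sql, params):
    """Return the human-readable detail column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


if __name__ == "__main__":
    conn = build_demo_db()
    sql = "SELECT value FROM item WHERE high_class = ? AND middle_class = ?"
    for detail in query_plan(conn, sql, ("clinical", "diagnosis")):
        print(detail)
```

Because the filter uses the leading columns of the index, the planner can search the index instead of scanning the whole table. The nine-fold speed-up reported in the abstract refers to the authors' own evaluation and is not something this sketch reproduces.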