Linked Entity Attribute Pair (LEAP): A Harmonization Framework for Data Pooling
PURPOSE: As data-sharing projects become increasingly frequent, so does the need to map data elements between multiple classification systems. A generic, robust, shareable architecture will result in increased efficiency and transparency of the mapping process, while upholding the integrity of the d...
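Main Authors: | Thomas, Stacy; Lichtenberg, Tara; Dang, Kristen; Fitzsimons, Michael; Grossman, Robert L.; Kundra, Ritika; Lavery, Jessica A.; Lenoue-Newton, Michele L.; Panageas, Katherine S.; Sawyers, Charles; Schultz, Nikolaus D.; Sirintrapun, Sahussapont J.; Topaloglu, Umit; Welch, Angelica; Yu, Thomas; Zehir, Ahmet; Gardos, Stuart |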
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Society of Clinical Oncology 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7469618/ https://www.ncbi.nlm.nih.gov/pubmed/32755461 http://dx.doi.org/10.1200/CCI.20.00037 |
_version_ | 1783578440604057600 |
---|---|
author | Thomas, Stacy; Lichtenberg, Tara; Dang, Kristen; Fitzsimons, Michael; Grossman, Robert L.; Kundra, Ritika; Lavery, Jessica A.; Lenoue-Newton, Michele L.; Panageas, Katherine S.; Sawyers, Charles; Schultz, Nikolaus D.; Sirintrapun, Sahussapont J.; Topaloglu, Umit; Welch, Angelica; Yu, Thomas; Zehir, Ahmet; Gardos, Stuart |
author_facet | Thomas, Stacy; Lichtenberg, Tara; Dang, Kristen; Fitzsimons, Michael; Grossman, Robert L.; Kundra, Ritika; Lavery, Jessica A.; Lenoue-Newton, Michele L.; Panageas, Katherine S.; Sawyers, Charles; Schultz, Nikolaus D.; Sirintrapun, Sahussapont J.; Topaloglu, Umit; Welch, Angelica; Yu, Thomas; Zehir, Ahmet; Gardos, Stuart |
author_sort | Thomas, Stacy |
collection | PubMed |
description | PURPOSE: As data-sharing projects become increasingly frequent, so does the need to map data elements between multiple classification systems. A generic, robust, shareable architecture will result in increased efficiency and transparency of the mapping process, while upholding the integrity of the data. MATERIALS AND METHODS: The American Association for Cancer Research’s Genomics Evidence Neoplasia Information Exchange (GENIE) collects clinical and genomic data for precision cancer medicine. As part of its commitment to open science, GENIE has partnered with the National Cancer Institute’s Genomic Data Commons (GDC) as a secondary repository. After initial efforts to submit data from GENIE to GDC failed, we realized the need for a solution to allow for the iterative mapping of data elements between dynamic classification systems. We developed the Linked Entity Attribute Pair (LEAP) database framework to store and manage the term mappings used to submit data from GENIE to GDC. RESULTS: After creating and populating the LEAP framework, we identified 195 mappings from GENIE to GDC requiring remediation and observed a 28% reduction in effort to resolve these issues, as well as a reduction in inadvertent errors. These results led to a decrease in the time to map between OncoTree, the cancer type ontology used by GENIE, and International Classification of Diseases for Oncology, 3rd Edition, used by GDC, from several months to less than 1 week. CONCLUSION: The LEAP framework provides a streamlined mapping process among various classification systems and allows for reusability so that efforts to create or adjust mappings are straightforward. The ability of the framework to track changes over time streamlines the process to map data elements across various dynamic classification systems. |
format | Online Article Text |
id | pubmed-7469618 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | American Society of Clinical Oncology |
record_format | MEDLINE/PubMed |
spelling | pubmed-7469618 2021-08-05 Linked Entity Attribute Pair (LEAP): A Harmonization Framework for Data Pooling JCO Clin Cancer Inform Original Reports American Society of Clinical Oncology 2020-08-05 /pmc/articles/PMC7469618/ /pubmed/32755461 http://dx.doi.org/10.1200/CCI.20.00037 Text en © 2020 by American Society of Clinical Oncology Licensed under the Creative Commons Attribution 4.0 License: https://creativecommons.org/licenses/by/4.0/ |
spellingShingle | Original Reports Thomas, Stacy Lichtenberg, Tara Dang, Kristen Fitzsimons, Michael Grossman, Robert L. Kundra, Ritika Lavery, Jessica A. Lenoue-Newton, Michele L. Panageas, Katherine S. Sawyers, Charles Schultz, Nikolaus D. Sirintrapun, Sahussapont J. Topaloglu, Umit Welch, Angelica Yu, Thomas Zehir, Ahmet Gardos, Stuart Linked Entity Attribute Pair (LEAP): A Harmonization Framework for Data Pooling |
title | Linked Entity Attribute Pair (LEAP): A Harmonization Framework for Data Pooling |
title_full | Linked Entity Attribute Pair (LEAP): A Harmonization Framework for Data Pooling |
title_fullStr | Linked Entity Attribute Pair (LEAP): A Harmonization Framework for Data Pooling |
title_full_unstemmed | Linked Entity Attribute Pair (LEAP): A Harmonization Framework for Data Pooling |
title_short | Linked Entity Attribute Pair (LEAP): A Harmonization Framework for Data Pooling |
title_sort | linked entity attribute pair (leap): a harmonization framework for data pooling |
topic | Original Reports |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7469618/ https://www.ncbi.nlm.nih.gov/pubmed/32755461 http://dx.doi.org/10.1200/CCI.20.00037 |
work_keys_str_mv | AT thomasstacy linkedentityattributepairleapaharmonizationframeworkfordatapooling AT lichtenbergtara linkedentityattributepairleapaharmonizationframeworkfordatapooling AT dangkristen linkedentityattributepairleapaharmonizationframeworkfordatapooling AT fitzsimonsmichael linkedentityattributepairleapaharmonizationframeworkfordatapooling AT grossmanrobertl linkedentityattributepairleapaharmonizationframeworkfordatapooling AT kundraritika linkedentityattributepairleapaharmonizationframeworkfordatapooling AT laveryjessicaa linkedentityattributepairleapaharmonizationframeworkfordatapooling AT lenouenewtonmichelel linkedentityattributepairleapaharmonizationframeworkfordatapooling AT panageaskatherines linkedentityattributepairleapaharmonizationframeworkfordatapooling AT sawyerscharles linkedentityattributepairleapaharmonizationframeworkfordatapooling AT schultznikolausd linkedentityattributepairleapaharmonizationframeworkfordatapooling AT sirintrapunsahussapontj linkedentityattributepairleapaharmonizationframeworkfordatapooling AT topalogluumit linkedentityattributepairleapaharmonizationframeworkfordatapooling AT welchangelica linkedentityattributepairleapaharmonizationframeworkfordatapooling AT yuthomas linkedentityattributepairleapaharmonizationframeworkfordatapooling AT zehirahmet linkedentityattributepairleapaharmonizationframeworkfordatapooling AT gardosstuart linkedentityattributepairleapaharmonizationframeworkfordatapooling |
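
The description above says the LEAP framework stores term mappings between classification systems and tracks how they change over time. The record does not reproduce the actual LEAP database schema, so the following is only a minimal in-memory sketch of that idea. The names `Term`, `Mapping`, and `MappingStore` are hypothetical, and the codes `SRC_A`, `TGT_1`, and `TGT_2` are placeholders, not real OncoTree or ICD-O-3 values.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Term:
    """Hypothetical (system, code) pair naming one term in a classification system."""
    system: str
    code: str


@dataclass
class Mapping:
    """One versioned link from a source term to a target term."""
    source: Term
    target: Term
    valid_from: date
    valid_to: Optional[date] = None  # None means the mapping is still current


class MappingStore:
    """Illustrative store that retires old mappings instead of overwriting them."""

    def __init__(self) -> None:
        self._rows: List[Mapping] = []

    def set_mapping(self, source: Term, target: Term, on: date) -> None:
        # Close any current mapping from this source into the same target system.
        for row in self._rows:
            if (row.source == source
                    and row.target.system == target.system
                    and row.valid_to is None):
                row.valid_to = on
        self._rows.append(Mapping(source, target, valid_from=on))

    def lookup(self, source: Term, target_system: str, as_of: date) -> Optional[Term]:
        # Return the target term that was in force on the given date, if any.
        for row in self._rows:
            if (row.source == source
                    and row.target.system == target_system
                    and row.valid_from <= as_of
                    and (row.valid_to is None or as_of < row.valid_to)):
                return row.target
        return None

    def history(self, source: Term) -> List[Mapping]:
        # Every mapping ever recorded for this source term, oldest first.
        return [row for row in self._rows if row.source == source]


# Placeholder codes only; these are not real ontology values.
store = MappingStore()
src = Term("OncoTree", "SRC_A")
store.set_mapping(src, Term("ICD-O-3", "TGT_1"), date(2019, 1, 1))
store.set_mapping(src, Term("ICD-O-3", "TGT_2"), date(2020, 1, 1))
```

Keeping retired rows rather than updating them in place is one way to make remapping iterative: a submission prepared against an earlier version of either classification system can still be resolved by passing its date to `lookup`.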