Linked Entity Attribute Pair (LEAP): A Harmonization Framework for Data Pooling

Bibliographic Details
Main Authors: Thomas, Stacy, Lichtenberg, Tara, Dang, Kristen, Fitzsimons, Michael, Grossman, Robert L., Kundra, Ritika, Lavery, Jessica A., Lenoue-Newton, Michele L., Panageas, Katherine S., Sawyers, Charles, Schultz, Nikolaus D., Sirintrapun, Sahussapont J., Topaloglu, Umit, Welch, Angelica, Yu, Thomas, Zehir, Ahmet, Gardos, Stuart
Format: Online Article Text
Language: English
Published: American Society of Clinical Oncology 2020
Subjects: Original Reports
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7469618/
https://www.ncbi.nlm.nih.gov/pubmed/32755461
http://dx.doi.org/10.1200/CCI.20.00037
author Thomas, Stacy
Lichtenberg, Tara
Dang, Kristen
Fitzsimons, Michael
Grossman, Robert L.
Kundra, Ritika
Lavery, Jessica A.
Lenoue-Newton, Michele L.
Panageas, Katherine S.
Sawyers, Charles
Schultz, Nikolaus D.
Sirintrapun, Sahussapont J.
Topaloglu, Umit
Welch, Angelica
Yu, Thomas
Zehir, Ahmet
Gardos, Stuart
author_sort Thomas, Stacy
collection PubMed
description PURPOSE: As data-sharing projects become increasingly frequent, so does the need to map data elements between multiple classification systems. A generic, robust, shareable architecture will result in increased efficiency and transparency of the mapping process, while upholding the integrity of the data. MATERIALS AND METHODS: The American Association for Cancer Research’s Genomics Evidence Neoplasia Information Exchange (GENIE) collects clinical and genomic data for precision cancer medicine. As part of its commitment to open science, GENIE has partnered with the National Cancer Institute’s Genomic Data Commons (GDC) as a secondary repository. After initial efforts to submit data from GENIE to GDC failed, we realized the need for a solution to allow for the iterative mapping of data elements between dynamic classification systems. We developed the Linked Entity Attribute Pair (LEAP) database framework to store and manage the term mappings used to submit data from GENIE to GDC. RESULTS: After creating and populating the LEAP framework, we identified 195 mappings from GENIE to GDC requiring remediation and observed a 28% reduction in effort to resolve these issues, as well as a reduction in inadvertent errors. These results led to a decrease in the time to map between OncoTree, the cancer type ontology used by GENIE, and International Classification of Diseases for Oncology, 3rd Edition, used by GDC, from several months to less than 1 week. CONCLUSION: The LEAP framework provides a streamlined mapping process among various classification systems and allows for reusability so that efforts to create or adjust mappings are straightforward. The ability of the framework to track changes over time streamlines the process to map data elements across various dynamic classification systems.
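The abstract describes a store of term mappings between classification systems that keeps track of changes over time. The record does not give the LEAP schema, so the following is only a minimal sketch of that general idea, not the authors' design: the names `Term`, `MappingVersion`, and `MappingStore` are hypothetical, and the system names and codes are placeholders rather than real OncoTree or ICD-O-3 values.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    """A term in one classification system (hypothetical structure)."""
    system: str  # placeholder system name, e.g. "SYSTEM_A"
    code: str    # placeholder code, not a real ontology value


@dataclass(frozen=True)
class MappingVersion:
    """One recorded version of a source-to-target mapping."""
    source: Term
    target: Term
    version: int
    note: str


class MappingStore:
    """Keeps every version of each mapping so changes stay traceable."""

    def __init__(self) -> None:
        # Key: (source term, target system); value: versions, oldest first.
        self._history: dict[tuple[Term, str], list[MappingVersion]] = {}

    def set_mapping(self, source: Term, target: Term, note: str = "") -> MappingVersion:
        """Append a new version instead of overwriting the previous one."""
        versions = self._history.setdefault((source, target.system), [])
        entry = MappingVersion(source, target, len(versions) + 1, note)
        versions.append(entry)
        return entry

    def current(self, source: Term, target_system: str) -> Optional[Term]:
        """Return the latest target for a source term, or None if unmapped."""
        versions = self._history.get((source, target_system))
        return versions[-1].target if versions else None

    def history(self, source: Term, target_system: str) -> list[MappingVersion]:
        """Return all recorded versions for a source term, oldest first."""
        return list(self._history.get((source, target_system), []))


# Usage with placeholder terms: map once, then remediate the mapping.
store = MappingStore()
src = Term("SYSTEM_A", "A-001")
store.set_mapping(src, Term("SYSTEM_B", "B-100"), note="initial mapping")
store.set_mapping(src, Term("SYSTEM_B", "B-200"), note="remediated after review")
```

Appending versions rather than updating in place is one simple way to make a remediation auditable; a real deployment would more likely hold these records in database tables, as the abstract's "database framework" suggests.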
format Online
Article
Text
id pubmed-7469618
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher American Society of Clinical Oncology
record_format MEDLINE/PubMed
spelling pubmed-7469618 2021-08-05 JCO Clin Cancer Inform Original Reports American Society of Clinical Oncology 2020-08-05 /pmc/articles/PMC7469618/ /pubmed/32755461 http://dx.doi.org/10.1200/CCI.20.00037 Text en © 2020 by American Society of Clinical Oncology. Licensed under the Creative Commons Attribution 4.0 License: https://creativecommons.org/licenses/by/4.0/
title Linked Entity Attribute Pair (LEAP): A Harmonization Framework for Data Pooling
topic Original Reports