GENETEX—a GENomics Report TEXt mining R package and Shiny application designed to capture real-world clinico-genomic data

OBJECTIVES: Clinico-genomic data (CGD) acquired through routine clinical practice has the potential to improve our understanding of clinical oncology. However, these data often reside in heterogeneous and semistructured data, resulting in prolonged time-to-analyses. MATERIALS AND METHODS: We created...

Bibliographic Details
Main Authors: Miller, David M; Shalhout, Sophia Z
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8476929/
https://www.ncbi.nlm.nih.gov/pubmed/34595403
http://dx.doi.org/10.1093/jamiaopen/ooab082
_version_ 1784575727974219776
author Miller, David M
Shalhout, Sophia Z
collection PubMed
description OBJECTIVES: Clinico-genomic data (CGD) acquired through routine clinical practice has the potential to improve our understanding of clinical oncology. However, these data often reside in heterogeneous and semistructured data, resulting in prolonged time-to-analyses. MATERIALS AND METHODS: We created GENETEX: an R package and Shiny application for text mining genomic reports from electronic health record (EHR) and direct import into Research Electronic Data Capture (REDCap). RESULTS: GENETEX facilitates the abstraction of CGD from EHR and streamlines the capture of structured data into REDCap. Its functions include natural language processing of key genomic information, transformation of semistructured data into structured data, and importation into REDCap. When evaluated with manual abstraction, GENETEX had >99% agreement and captured CGD in approximately one-fifth the time. CONCLUSIONS: GENETEX is freely available under the Massachusetts Institute of Technology license and can be obtained from GitHub (https://github.com/TheMillerLab/genetex). GENETEX is executed in R and deployed as a Shiny application for non-R users. It produces high-fidelity abstraction of CGD in a fraction of the time.
format Online
Article
Text
id pubmed-8476929
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8476929
2021-09-29
GENETEX—a GENomics Report TEXt mining R package and Shiny application designed to capture real-world clinico-genomic data
Miller, David M
Shalhout, Sophia Z
JAMIA Open
Application Notes
OBJECTIVES: Clinico-genomic data (CGD) acquired through routine clinical practice has the potential to improve our understanding of clinical oncology. However, these data often reside in heterogeneous and semistructured data, resulting in prolonged time-to-analyses. MATERIALS AND METHODS: We created GENETEX: an R package and Shiny application for text mining genomic reports from electronic health record (EHR) and direct import into Research Electronic Data Capture (REDCap). RESULTS: GENETEX facilitates the abstraction of CGD from EHR and streamlines the capture of structured data into REDCap. Its functions include natural language processing of key genomic information, transformation of semistructured data into structured data, and importation into REDCap. When evaluated with manual abstraction, GENETEX had >99% agreement and captured CGD in approximately one-fifth the time. CONCLUSIONS: GENETEX is freely available under the Massachusetts Institute of Technology license and can be obtained from GitHub (https://github.com/TheMillerLab/genetex). GENETEX is executed in R and deployed as a Shiny application for non-R users. It produces high-fidelity abstraction of CGD in a fraction of the time.
Oxford University Press
2021-09-28
/pmc/articles/PMC8476929/
/pubmed/34595403
http://dx.doi.org/10.1093/jamiaopen/ooab082
Text
en
© The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association.
https://creativecommons.org/licenses/by-nc/4.0/
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title GENETEX—a GENomics Report TEXt mining R package and Shiny application designed to capture real-world clinico-genomic data
topic Application Notes