
2166: Semantic characterization of clinical trial descriptions from ClincalTrials.gov and patient notes from MIMIC-III

OBJECTIVES/SPECIFIC AIMS: This poster presents a detailed characterization of the distribution of semantic concepts used in the text describing eligibility criteria of clinical trials reported to ClinicalTrials.gov and patient notes from MIMIC-III. The final goal of this study is to find a minimal se...


Bibliographic Details
Main authors: Shao, Jianyin, Gouripeddi, Ram, Facelli, Julio C.
Format: Online Article Text
Language: English
Published: Cambridge University Press 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6799800/
http://dx.doi.org/10.1017/cts.2017.59
_version_ 1783460368023027712
author Shao, Jianyin
Gouripeddi, Ram
Facelli, Julio C.
author_facet Shao, Jianyin
Gouripeddi, Ram
Facelli, Julio C.
author_sort Shao, Jianyin
collection PubMed
description OBJECTIVES/SPECIFIC AIMS: This poster presents a detailed characterization of the distribution of semantic concepts used in the text describing eligibility criteria of clinical trials reported to ClinicalTrials.gov and in patient notes from MIMIC-III. The final goal of this study is to find a minimal set of semantic concepts that can describe clinical trials and patients for efficient computational matching of clinical trial descriptions to potential participants at large scale. METHODS/STUDY POPULATION: We downloaded the free text describing the eligibility criteria of all clinical trials reported to ClinicalTrials.gov as of July 28, 2015 (~195,000 trials), and ~2,000,000 clinical notes from MIMIC-III. Using MetaMap 2014, we extracted UMLS concepts (CUIs) from the collected text. We calculated how frequently each semantic concept occurs in the texts describing the clinical trial eligibility criteria and in the patient notes. RESULTS/ANTICIPATED RESULTS: The results show a classical power-law distribution, $Y = 2 \times 10^{10}\, X^{-2.043}$ ($R^2 = 0.9599$) for clinical trial eligibility criteria and $Y = 5 \times 10^{13}\, X^{-2.684}$ ($R^2 = 0.9477$) for MIMIC patient notes, where Y is the number of documents in which a concept appears and X is the rank of the concept when concepts are ordered from most to least frequent. From this distribution, it follows that, of the over 100,000 concepts in UMLS, there are only ~60,000 and ~50,000 concepts that appear in fewer than 10 clinical trial eligibility descriptions and MIMIC-III patient clinical notes, respectively. This indicates that it would be possible to describe clinical trials and patient notes with a relatively small number of concepts, making the search space for matching patients to clinical trials a relatively small sub-space of the overall UMLS search space.
DISCUSSION/SIGNIFICANCE OF IMPACT: Our finding that the concepts used to describe clinical trial eligibility criteria and patient clinical notes follow a power-law distribution can lead to tractable computational approaches for automatically matching patients to clinical trials at large scale by considerably reducing the search space. While automatic patient matching is not a panacea for improving clinical trial recruitment, better low-cost computational preselection can allow the limited human resources assigned to patient recruitment to be redirected to the most promising recruitment targets.
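The rank-frequency analysis summarized in the description could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: the function names (`document_frequencies`, `fit_power_law`, `rank_at_frequency`) are hypothetical, the input is assumed to be one collection of CUIs per document (for example, the output of a separate MetaMap run), the fit is an ordinary least-squares line in log-log space (the abstract does not say how the fit was obtained), and the coefficients in the usage lines assume the reported fits read as 2×10^10 and 5×10^13.

```python
import math
from collections import Counter


def document_frequencies(docs):
    """Count, for each concept, the number of documents it appears in.

    `docs` is an iterable of CUI collections, one per document (assumed input).
    """
    counts = Counter()
    for concepts in docs:
        counts.update(set(concepts))  # each concept counts once per document
    return counts


def fit_power_law(frequencies):
    """Fit Y = a * X**b, where X is the rank (1 = most frequent concept).

    Uses least squares on log10(Y) versus log10(X).
    Needs at least two positive frequencies. Returns (a, b, r_squared).
    """
    ys = sorted(frequencies, reverse=True)
    log_x = [math.log10(rank) for rank in range(1, len(ys) + 1)]
    log_y = [math.log10(y) for y in ys]
    n = len(ys)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    syy = sum((y - mean_y) ** 2 for y in log_y)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    b = sxy / sxx
    a = 10 ** (mean_y - b * mean_x)
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return a, b, r_squared


def rank_at_frequency(a, b, threshold):
    """Rank X at which the fitted curve a * X**b equals `threshold` (b < 0)."""
    return (threshold / a) ** (1.0 / b)


if __name__ == "__main__":
    # Toy corpus of three "documents"; the CUIs are placeholders.
    toy_docs = [["C0001", "C0002"], ["C0001"], ["C0001", "C0003", "C0003"]]
    freqs = document_frequencies(toy_docs)
    print(freqs.most_common())
    print(fit_power_law(freqs.values()))

    # Ranks beyond which the reported fits predict fewer than 10 documents,
    # assuming coefficients of 2e10 (trials) and 5e13 (MIMIC-III notes).
    print(rank_at_frequency(2e10, -2.043, 10))
    print(rank_at_frequency(5e13, -2.684, 10))
```

Concepts ranked beyond the value returned by `rank_at_frequency` are predicted by the fitted curve to occur in fewer than `threshold` documents, which is one way to see how steeply the tail falls off.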
format Online
Article
Text
id pubmed-6799800
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Cambridge University Press
record_format MEDLINE/PubMed
spelling pubmed-67998002019-10-28 2166: Semantic characterization of clinical trial descriptions from ClincalTrials.gov and patient notes from MIMIC-III Shao, Jianyin Gouripeddi, Ram Facelli, Julio C. J Clin Transl Sci Biomedical Informatics/Health Informatics Cambridge University Press 2018-05-10 /pmc/articles/PMC6799800/ http://dx.doi.org/10.1017/cts.2017.59 Text en © The Association for Clinical and Translational Science 2018 http://creativecommons.org/licenses/by/4.0/ This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Biomedical Informatics/Health Informatics
Shao, Jianyin
Gouripeddi, Ram
Facelli, Julio C.
2166: Semantic characterization of clinical trial descriptions from ClincalTrials.gov and patient notes from MIMIC-III
title 2166: Semantic characterization of clinical trial descriptions from ClincalTrials.gov and patient notes from MIMIC-III
title_full 2166: Semantic characterization of clinical trial descriptions from ClincalTrials.gov and patient notes from MIMIC-III
title_fullStr 2166: Semantic characterization of clinical trial descriptions from ClincalTrials.gov and patient notes from MIMIC-III
title_full_unstemmed 2166: Semantic characterization of clinical trial descriptions from ClincalTrials.gov and patient notes from MIMIC-III
title_short 2166: Semantic characterization of clinical trial descriptions from ClincalTrials.gov and patient notes from MIMIC-III
title_sort 2166: semantic characterization of clinical trial descriptions from clincaltrials.gov and patient notes from mimic-iii
topic Biomedical Informatics/Health Informatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6799800/
http://dx.doi.org/10.1017/cts.2017.59
work_keys_str_mv AT shaojianyin 2166semanticcharacterizationofclinicaltrialdescriptionsfromclincaltrialsgovandpatientnotesfrommimiciii
AT gouripeddiram 2166semanticcharacterizationofclinicaltrialdescriptionsfromclincaltrialsgovandpatientnotesfrommimiciii
AT facellijulioc 2166semanticcharacterizationofclinicaltrialdescriptionsfromclincaltrialsgovandpatientnotesfrommimiciii