2166: Semantic characterization of clinical trial descriptions from ClinicalTrials.gov and patient notes from MIMIC-III
OBJECTIVES/SPECIFIC AIMS: This poster presents a detailed characterization of the distribution of semantic concepts used in the text describing the eligibility criteria of clinical trials reported to ClinicalTrials.gov and in patient notes from MIMIC-III. The final goal of this study is to find a minimal se...
Main authors: | Shao, Jianyin; Gouripeddi, Ram; Facelli, Julio C. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cambridge University Press, 2018 |
Subjects: | Biomedical Informatics/Health Informatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6799800/ http://dx.doi.org/10.1017/cts.2017.59 |
_version_ | 1783460368023027712 |
---|---|
author | Shao, Jianyin Gouripeddi, Ram Facelli, Julio C. |
author_facet | Shao, Jianyin Gouripeddi, Ram Facelli, Julio C. |
author_sort | Shao, Jianyin |
collection | PubMed |
description | OBJECTIVES/SPECIFIC AIMS: This poster presents a detailed characterization of the distribution of semantic concepts used in the text describing the eligibility criteria of clinical trials reported to ClinicalTrials.gov and in patient notes from MIMIC-III. The final goal of this study is to find a minimal set of semantic concepts that can describe clinical trials and patients, enabling efficient computational matching of clinical trial descriptions to potential participants at large scale. METHODS/STUDY POPULATION: We downloaded the free text describing the eligibility criteria of all clinical trials reported to ClinicalTrials.gov as of July 28, 2015 (~195,000 trials), together with ~2,000,000 clinical notes from MIMIC-III. Using MetaMap 2014, we extracted UMLS concepts (CUIs) from the collected text. We then calculated, for each semantic concept, the number of clinical trial eligibility descriptions and patient notes in which it appears. RESULTS/ANTICIPATED RESULTS: The results show a classical power-law distribution, $Y = 2\times10^{10}\,X^{-2.043}$ ($R^2 = 0.9599$) for clinical trial eligibility criteria and $Y = 5\times10^{13}\,X^{-2.684}$ ($R^2 = 0.9477$) for MIMIC-III patient notes, where Y is the number of documents in which a concept appears and X is the rank of the concept when concepts are ordered from most to least frequent. From this distribution, of the more than 100,000 concepts in UMLS, ~60,000 and ~50,000 appear in fewer than 10 clinical trial eligibility descriptions and MIMIC-III patient clinical notes, respectively. This indicates that it would be possible to describe clinical trials and patient notes with a relatively small number of concepts, making the search space for matching patients to clinical trials a relatively small sub-space of the overall UMLS search space.
DISCUSSION/SIGNIFICANCE OF IMPACT: Our finding that the concepts used to describe clinical trial eligibility criteria and patient clinical notes follow a power-law distribution can lead to tractable computational approaches for automatically matching patients to clinical trials at large scale, because it considerably reduces the search space. While automatic patient matching is not a panacea for improving clinical trial recruitment, better low-cost computational preselection processes can allow the limited human resources assigned to patient recruitment to be redirected to the most promising recruitment targets. |
format | Online Article Text |
id | pubmed-6799800 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Cambridge University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6799800 2019-10-28 2166: Semantic characterization of clinical trial descriptions from ClinicalTrials.gov and patient notes from MIMIC-III Shao, Jianyin Gouripeddi, Ram Facelli, Julio C. J Clin Transl Sci Biomedical Informatics/Health Informatics Cambridge University Press 2018-05-10 /pmc/articles/PMC6799800/ http://dx.doi.org/10.1017/cts.2017.59 Text en © The Association for Clinical and Translational Science 2018 http://creativecommons.org/licenses/by/4.0/ This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Biomedical Informatics/Health Informatics Shao, Jianyin Gouripeddi, Ram Facelli, Julio C. 2166: Semantic characterization of clinical trial descriptions from ClinicalTrials.gov and patient notes from MIMIC-III |
title | 2166: Semantic characterization of clinical trial descriptions from ClinicalTrials.gov and patient notes from MIMIC-III |
title_sort | 2166: semantic characterization of clinical trial descriptions from clinicaltrials.gov and patient notes from mimic-iii |
topic | Biomedical Informatics/Health Informatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6799800/ http://dx.doi.org/10.1017/cts.2017.59 |
work_keys_str_mv | AT shaojianyin 2166semanticcharacterizationofclinicaltrialdescriptionsfromclincaltrialsgovandpatientnotesfrommimiciii AT gouripeddiram 2166semanticcharacterizationofclinicaltrialdescriptionsfromclincaltrialsgovandpatientnotesfrommimiciii AT facellijulioc 2166semanticcharacterizationofclinicaltrialdescriptionsfromclincaltrialsgovandpatientnotesfrommimiciii |
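
The analysis summarized in the description field (count the documents in which each concept appears, rank the concepts, fit a power law) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the concept identifiers are placeholders in CUI format, the MetaMap extraction step is assumed to have already produced one list of CUIs per document, and an ordinary least-squares fit in log-log space is an assumption, since the record does not state how the fit was obtained.

```python
import math
from collections import Counter


def document_frequencies(docs):
    """Count, for each concept, the number of documents that mention it."""
    counts = Counter()
    for concepts in docs:
        counts.update(set(concepts))  # set(): count a concept once per document
    return counts


def fit_power_law(frequencies):
    """Fit Y = a * X**b by least squares on log10(Y) versus log10(X).

    X is the rank (1 = most frequent) and Y the document frequency.
    Assumes at least two positive frequencies. Returns (a, b, r_squared).
    """
    ys = sorted(frequencies, reverse=True)
    n = len(ys)
    log_x = [math.log10(rank) for rank in range(1, n + 1)]
    log_y = [math.log10(y) for y in ys]
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    syy = sum((y - mean_y) ** 2 for y in log_y)
    b = sxy / sxx                      # slope = power-law exponent
    a = 10 ** (mean_y - b * mean_x)    # intercept = prefactor
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return a, b, r_squared


def rare_concepts(counts, threshold=10):
    """Number of concepts that appear in fewer than `threshold` documents."""
    return sum(1 for c in counts.values() if c < threshold)


# Toy input: each inner list stands for the CUIs extracted from one document.
docs = [
    ["C0000001", "C0000002", "C0000001"],
    ["C0000001", "C0000003"],
    ["C0000001", "C0000002"],
    ["C0000001"],
]
counts = document_frequencies(docs)
a, b, r2 = fit_power_law(counts.values())
print(counts.most_common(), round(b, 3), rare_concepts(counts, threshold=2))
```

With real data, `docs` would hold one CUI list per eligibility description or clinical note, and `rare_concepts(counts)` would give the number of concepts that fall under the 10-document threshold discussed in the description.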