TopEx: topic exploration of COVID-19 corpora - Results from the BioCreative VII Challenge Track 4

TopEx is a natural language processing application developed to facilitate the exploration of topics and key words in a set of texts through a user interface that requires no programming or natural language processing knowledge, thus enhancing the ability of nontechnical researchers to explore and a...

Bibliographic Details
Main authors: Olex, Amy L; French, Evan; Burdette, Peter; Sagiraju, Srilakshmi; Neumann, Thomas; Gal, Tamas S; McInnes, Bridget T
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9369716/
https://www.ncbi.nlm.nih.gov/pubmed/35951425
http://dx.doi.org/10.1093/database/baac063
author Olex, Amy L
French, Evan
Burdette, Peter
Sagiraju, Srilakshmi
Neumann, Thomas
Gal, Tamas S
McInnes, Bridget T
collection PubMed
description TopEx is a natural language processing application developed to facilitate the exploration of topics and key words in a set of texts through a user interface that requires no programming or natural language processing knowledge, thus enhancing the ability of nontechnical researchers to explore and analyze textual data. The underlying algorithm groups semantically similar sentences together and then performs a topic analysis on each group to identify the key topics discussed in a collection of texts. Implementation is achieved via a Python library back end and a web application front end built with React and D3.js for visualizations. TopEx has been successfully used to identify themes, topics and key words in a variety of corpora, including Coronavirus disease 2019 (COVID-19) discharge summaries and tweets. Feedback from the BioCreative VII Challenge Track 4 concludes that TopEx is a useful tool for text exploration for a variety of users and tasks. Database URL: http://topex.cctr.vcu.edu
format Online
Article
Text
id pubmed-9369716
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9369716 2022-08-12 TopEx: topic exploration of COVID-19 corpora - Results from the BioCreative VII Challenge Track 4 Olex, Amy L French, Evan Burdette, Peter Sagiraju, Srilakshmi Neumann, Thomas Gal, Tamas S McInnes, Bridget T Database (Oxford) Database Tool TopEx is a natural language processing application developed to facilitate the exploration of topics and key words in a set of texts through a user interface that requires no programming or natural language processing knowledge, thus enhancing the ability of nontechnical researchers to explore and analyze textual data. The underlying algorithm groups semantically similar sentences together and then performs a topic analysis on each group to identify the key topics discussed in a collection of texts. Implementation is achieved via a Python library back end and a web application front end built with React and D3.js for visualizations. TopEx has been successfully used to identify themes, topics and key words in a variety of corpora, including Coronavirus disease 2019 (COVID-19) discharge summaries and tweets. Feedback from the BioCreative VII Challenge Track 4 concludes that TopEx is a useful tool for text exploration for a variety of users and tasks. Database URL: http://topex.cctr.vcu.edu Oxford University Press 2022-08-11 /pmc/articles/PMC9369716/ /pubmed/35951425 http://dx.doi.org/10.1093/database/baac063 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title TopEx: topic exploration of COVID-19 corpora - Results from the BioCreative VII Challenge Track 4
topic Database Tool
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9369716/
https://www.ncbi.nlm.nih.gov/pubmed/35951425
http://dx.doi.org/10.1093/database/baac063