TopEx: topic exploration of COVID-19 corpora - Results from the BioCreative VII Challenge Track 4

Bibliographic Details
Main Authors: Olex, Amy L; French, Evan; Burdette, Peter; Sagiraju, Srilakshmi; Neumann, Thomas; Gal, Tamas S; McInnes, Bridget T
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9369716/
https://www.ncbi.nlm.nih.gov/pubmed/35951425
http://dx.doi.org/10.1093/database/baac063
Description
Summary: TopEx is a natural language processing application developed to facilitate the exploration of topics and key words in a set of texts through a user interface that requires no programming or natural language processing knowledge, thus enhancing the ability of nontechnical researchers to explore and analyze textual data. The underlying algorithm groups semantically similar sentences together and then runs a topic analysis on each group to identify the key topics discussed in a collection of texts. The implementation consists of a Python library back end and a web application front end built with React, with D3.js used for visualizations. TopEx has been successfully used to identify themes, topics and key words in a variety of corpora, including Coronavirus disease 2019 (COVID-19) discharge summaries and tweets. Feedback from the BioCreative VII Challenge Track 4 indicates that TopEx is a useful tool for text exploration for a variety of users and tasks. Database URL: http://topex.cctr.vcu.edu
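
The two-stage idea in the summary (group similar sentences, then extract key terms per group) can be illustrated with a minimal sketch. This is not the TopEx implementation or its API: the function names, the stopword list, the bag-of-words cosine similarity, the greedy grouping, and the 0.3 threshold are all assumptions chosen to keep the example dependency-free. The record does not say which similarity measure or topic method TopEx actually uses.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stopword list used only for this illustration.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "was", "is", "in", "with", "for", "on"}


def tokenize(sentence):
    """Lowercase, split on non-alphanumerics, and drop stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", sentence.lower()) if w not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def group_sentences(sentences, threshold=0.3):
    """Stage 1 (assumed method): greedy single-pass grouping.

    Each sentence joins the first group whose summed term vector is
    similar enough; otherwise it starts a new group.
    """
    groups = []  # list of (centroid Counter, list of sentence indices)
    for i, sentence in enumerate(sentences):
        vec = Counter(tokenize(sentence))
        for centroid, members in groups:
            if cosine(vec, centroid) >= threshold:
                centroid.update(vec)  # fold the sentence into the group vector
                members.append(i)
                break
        else:
            groups.append((vec, [i]))
    return [members for _, members in groups]


def top_terms(sentences, groups, k=3):
    """Stage 2 (assumed method): rank each group's terms by a TF-IDF-style score,
    treating every group as one document."""
    group_counts = [Counter(t for i in g for t in tokenize(sentences[i])) for g in groups]
    n = len(group_counts)
    result = []
    for counts in group_counts:
        def score(term):
            df = sum(1 for c in group_counts if term in c)
            return counts[term] * (math.log((1 + n) / (1 + df)) + 1)
        # Sort by descending score, then alphabetically so ties are deterministic.
        ranked = sorted(counts, key=lambda t: (-score(t), t))
        result.append(ranked[:k])
    return result


if __name__ == "__main__":
    corpus = [
        "Patient reported fever and cough.",
        "Fever and cough persisted for three days.",
        "Vaccine appointment scheduled next week.",
        "Second vaccine appointment confirmed.",
    ]
    groups = group_sentences(corpus)
    for members, terms in zip(groups, top_terms(corpus, groups, k=2)):
        print(members, terms)
```

A production system would more plausibly use learned sentence embeddings and a standard clustering algorithm in place of the word-overlap similarity and greedy grouping shown here; the sketch only makes the "group, then summarize each group" structure concrete.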