
Use and validation of text mining and cluster algorithms to derive insights from Corona Virus Disease-2019 (COVID-19) medical literature


Bibliographic Details
Main authors: Reddy, Sandeep; Bhaskar, Ravi; Padmanabhan, Sandosh; Verspoor, Karin; Mamillapalli, Chaitanya; Lahoti, Rani; Makinen, Ville-Petteri; Pradhan, Smitan; Kushwah, Puru; Sinha, Saumya
Format: Online Article Text
Language: English
Published: The Authors. Published by Elsevier B.V. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8050406/
https://www.ncbi.nlm.nih.gov/pubmed/34337589
http://dx.doi.org/10.1016/j.cmpbup.2021.100010
Description
Summary: The emergence of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in late 2019 has led not only to the worldwide coronavirus disease 2019 (COVID-19) pandemic but also to a deluge of biomedical literature. Following the release of the COVID-19 open research dataset (CORD-19), comprising over 200,000 scholarly articles, we, a multidisciplinary team of data scientists, clinicians, medical researchers and software engineers, developed an innovative natural language processing (NLP) platform that combines an advanced search engine with a biomedical named entity recognition extraction package. In particular, the platform was developed to extract information relating to clinical risk factors for COVID-19 and to present the results in a cluster format to support knowledge discovery. Here we describe the principles behind the development, the model and the results we obtained.