
Text mining for identification of biological entities related to antibiotic resistant organisms

Antimicrobial resistance is a significant public health problem worldwide. In recent years, the scientific community has intensified its efforts to combat this problem; many experiments have been conducted, and many articles have been published in this area. However, the growing volume of biological lit...


Bibliographic Details
Main authors: Fortunato Costa, Kelle, Almeida Araújo, Fabrício, Morais, Jefferson, Lisboa Frances, Carlos Renato, Ramos, Rommel T. J.
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2022
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9080439/
https://www.ncbi.nlm.nih.gov/pubmed/35539017
http://dx.doi.org/10.7717/peerj.13351
author Fortunato Costa, Kelle
Almeida Araújo, Fabrício
Morais, Jefferson
Lisboa Frances, Carlos Renato
Ramos, Rommel T. J.
collection PubMed
description Antimicrobial resistance is a significant public health problem worldwide. In recent years, the scientific community has intensified its efforts to combat this problem; many experiments have been conducted, and many articles have been published in this area. However, the growing volume of biological literature makes biocuration more difficult because of the cost and time it requires. Modern text mining tools that adopt artificial intelligence technology can help research advance. In this article, we propose a text mining model capable of identifying and prioritizing scientific articles in the context of antimicrobial resistance. We retrieved scientific articles from the PubMed database, adopted machine learning techniques to generate vector representations of the retrieved articles, and measured their similarity to the context. As a result of this process, we obtained a dataset labeled “Relevant” and “Irrelevant” and used this dataset to train a supervised learning algorithm to classify new records. The model reached 90% overall accuracy, and the F-measure (the harmonic mean of precision and recall) reached 82% for the positive class and 93% for the negative class, indicating that the model identifies scientific articles relevant to the context well. The dataset, scripts and models are available at https://github.com/engbiopct/TextMiningAMR.
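The labeling and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation (their scripts are in the repository cited above). It assumes a plain bag-of-words vector, cosine similarity against a hypothetical context string, and an arbitrary threshold of 0.2; the function names and the sample abstracts are invented for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (assumed stand-in for a learned vector representation)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def label(abstract, context, threshold=0.2):
    """Label an abstract by its similarity to the context (threshold is an assumption)."""
    score = cosine(vectorize(abstract), vectorize(context))
    return "Relevant" if score >= threshold else "Irrelevant"


def f_measure(true_labels, predicted, positive):
    """Harmonic mean of precision and recall for the class named by `positive`."""
    pairs = list(zip(true_labels, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical context and abstracts, invented for this example.
context = "antimicrobial resistance antibiotic resistant bacteria resistance genes"
abstracts = [
    "We report antibiotic resistance genes in resistant bacteria isolated from hospitals.",
    "A survey of bird migration routes across the Amazon basin.",
]
labels = [label(a, context) for a in abstracts]
```

Computing `f_measure` once with `positive="Relevant"` and once with `positive="Irrelevant"` gives the per-class scores of the kind the abstract reports separately for the positive and negative classes.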
format Online
Article
Text
id pubmed-9080439
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-9080439 2022-05-09 PeerJ Bioinformatics PeerJ Inc. 2022-05-05 /pmc/articles/PMC9080439/ /pubmed/35539017 http://dx.doi.org/10.7717/peerj.13351 Text en © 2022 Fortunato Costa et al.
This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title Text mining for identification of biological entities related to antibiotic resistant organisms
topic Bioinformatics