
Automated extraction of genes associated with antibiotic resistance from the biomedical literature

The detection of bacterial antibiotic resistance phenotypes is important when making clinical decisions for patient treatment. Conventional phenotypic testing involves culturing bacteria, which requires a significant amount of time and work. Whole-genome sequencing is emerging as a fast alterna...


Bibliographic Details
Main Authors: Brincat, Andre; Hofmann, Markus
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9263533/
https://www.ncbi.nlm.nih.gov/pubmed/35134132
http://dx.doi.org/10.1093/database/baab077
author Brincat, Andre
Hofmann, Markus
collection PubMed
description The detection of bacterial antibiotic resistance phenotypes is important when making clinical decisions for patient treatment. Conventional phenotypic testing involves culturing bacteria, which requires a significant amount of time and work. Whole-genome sequencing is emerging as a fast alternative for resistance prediction, based on the presence or absence of certain genes. Much research has focused on determining which bacterial genes cause antibiotic resistance, and efforts are being made to consolidate these facts in knowledge bases (KBs). KBs are usually manually curated by domain experts to be of the highest quality. However, this limits the pace at which new facts are added. Automated extraction of gene-antibiotic resistance relations from the biomedical literature is one solution that can simplify the curation process. This paper reports on the development of a text mining pipeline that takes in English biomedical abstracts and outputs genes that are predicted to cause resistance to antibiotics. To test the generalisability of this pipeline, it was then applied to predict genes associated with Helicobacter pylori antibiotic resistance that are not present in common antibiotic resistance KBs or publications studying H. pylori. These genes would be candidates for further lab-based antibiotic research and inclusion in these KBs. For relation extraction, state-of-the-art deep learning models were used. These models were trained on a newly developed silver corpus, which was generated by distant supervision of abstracts using the facts obtained from KBs. The top-performing model was superior to a co-occurrence model, achieving a recall of 95%, a precision of 60% and an F1-score of 74% on a manually annotated holdout dataset. To our knowledge, this project was the first attempt at developing a complete text mining pipeline that incorporates deep learning models to extract gene-antibiotic resistance relations from the literature.
Additional related data can be found at https://github.com/AndreBrincat/Gene-Antibiotic-Resistance-Relation-Extraction
format Online
Article
Text
id pubmed-9263533
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9263533 2022-07-08 Database (Oxford) Original Article Oxford University Press 2022-01-20 /pmc/articles/PMC9263533/ /pubmed/35134132 http://dx.doi.org/10.1093/database/baab077 Text en © The Author(s) 2022. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Automated extraction of genes associated with antibiotic resistance from the biomedical literature
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9263533/
https://www.ncbi.nlm.nih.gov/pubmed/35134132
http://dx.doi.org/10.1093/database/baab077
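
The description above states that the training corpus was built by distant supervision against KB facts and that the best model reached a recall of 95%, a precision of 60% and an F1-score of 74%. The following is a minimal sketch of those two ideas, not the authors' implementation. The names `KB_FACTS`, `GENES`, `ANTIBIOTICS`, `distant_labels` and `f1_score` are hypothetical, the gene and drug names are placeholders rather than real entities, and pairing entities at sentence level with a simple token match is an assumption made for illustration.

```python
import re

# Hypothetical toy knowledge base of (gene, antibiotic) resistance facts.
# Placeholder names only; these are not facts taken from the paper or any KB.
KB_FACTS = {("geneA", "drugX"), ("geneB", "drugY")}
GENES = {"geneA", "geneB", "geneC"}
ANTIBIOTICS = {"drugX", "drugY"}


def distant_labels(sentence):
    """Label every co-occurring (gene, antibiotic) pair in one sentence.

    A pair gets label 1 if it is listed in KB_FACTS and 0 otherwise.
    This is the usual distant-supervision heuristic, so the resulting
    "silver" labels are noisy: the sentence itself is never inspected
    for an actual statement of resistance.
    """
    tokens = set(re.findall(r"[A-Za-z0-9]+", sentence))
    genes = sorted(tokens & GENES)
    drugs = sorted(tokens & ANTIBIOTICS)
    return [(g, d, int((g, d) in KB_FACTS)) for g in genes for d in drugs]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    text = "Mutations in geneA and geneC were found in drugX-resistant isolates."
    print(distant_labels(text))
    # Consistency check of the reported figures: P = 0.60, R = 0.95.
    print(round(f1_score(0.60, 0.95), 2))
```

With the reported precision and recall, the harmonic mean is 2 × 0.60 × 0.95 / 1.55 ≈ 0.735, which is consistent with the stated F1-score of 74%.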