ViMRT: a text-mining tool and search engine for automated virus mutation recognition

MOTIVATION: Virus mutation plays a critical role in disease progression and has prompted a substantial body of scientific literature. Extracting mutations from published literature has become an increasingly important task, benefiting many downstream applica...

Bibliographic Details
Main Authors: Tong, Yuantao, Tan, Fanglin, Huang, Honglian, Zhang, Zeyu, Zong, Hui, Xie, Yujia, Huang, Danqi, Cheng, Shiyang, Wei, Ziyi, Fang, Meng, Crabbe, M James C, Wang, Ying, Zhang, Xiaoyan
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9805560/
https://www.ncbi.nlm.nih.gov/pubmed/36342236
http://dx.doi.org/10.1093/bioinformatics/btac721
_version_ 1784862353278369792
author Tong, Yuantao
Tan, Fanglin
Huang, Honglian
Zhang, Zeyu
Zong, Hui
Xie, Yujia
Huang, Danqi
Cheng, Shiyang
Wei, Ziyi
Fang, Meng
Crabbe, M James C
Wang, Ying
Zhang, Xiaoyan
author_sort Tong, Yuantao
collection PubMed
description MOTIVATION: Virus mutation plays a critical role in disease progression and has prompted a substantial body of scientific literature. Extracting mutations from published literature has become an increasingly important task that benefits downstream applications such as vaccine design and drug usage. However, most existing approaches perform poorly on virus mutations, both because precise virus mutation information is lacking and because they were developed for human gene mutations. RESULTS: We developed ViMRT, a text-mining tool and search engine for automated virus mutation recognition using natural language processing. ViMRT relies mainly on 8 optimized rules and 12 regular expressions derived from a development dataset of 830 papers on 5 viruses related to severe human disease. It outperformed other tools on a test dataset (1662 papers, F1-score of 99.17%) and also performed well on two other viruses, influenza virus and severe acute respiratory syndrome coronavirus-2 (212 papers, F1-score of 96.99%). These results indicate that ViMRT is a high-performance method for extracting virus mutations from the biomedical literature. In addition, we present a search engine that lets researchers quickly and accurately find virus mutation-related information, including virus genes and related diseases. AVAILABILITY AND IMPLEMENTATION: ViMRT software is freely available at http://bmtongji.cn:1225/mutation/index.
format Online
Article
Text
id pubmed-9805560
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9805560 2023-01-03 Bioinformatics Original Paper Oxford University Press 2022-11-07 /pmc/articles/PMC9805560/ /pubmed/36342236 http://dx.doi.org/10.1093/bioinformatics/btac721 Text en © The Author(s) 2022. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title ViMRT: a text-mining tool and search engine for automated virus mutation recognition
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9805560/
https://www.ncbi.nlm.nih.gov/pubmed/36342236
http://dx.doi.org/10.1093/bioinformatics/btac721
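
The description above says that ViMRT recognizes virus mutations with rules and regular expressions and reports its performance as an F1-score. The sketch below illustrates that general idea only. The two patterns, the names `extract_mutations` and `f1_score`, and the sample sentence are hypothetical and are not taken from ViMRT; the tool's actual 8 rules and 12 regular expressions are not given in this record.

```python
import re

# Hypothetical patterns for illustration; NOT ViMRT's own regular expressions.
ONE_LETTER = "ACDEFGHIKLMNPQRSTVWY"
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
THREE = "|".join(THREE_TO_ONE)

# One-letter substitutions such as D614G.
SHORT = re.compile(rf"\b([{ONE_LETTER}])(\d{{1,4}})([{ONE_LETTER}])\b")
# Three-letter substitutions such as Glu484Lys or p.Glu484Lys.
LONG = re.compile(rf"\b(?:p\.)?({THREE})(\d{{1,4}})({THREE})\b")


def extract_mutations(text):
    """Return substitutions found in text, normalized to one-letter form,
    in order of first appearance and without duplicates."""
    hits = []
    for match in SHORT.finditer(text):
        wild, pos, mutant = match.groups()
        hits.append((match.start(), f"{wild}{pos}{mutant}"))
    for match in LONG.finditer(text):
        wild, pos, mutant = match.groups()
        hits.append((match.start(), f"{THREE_TO_ONE[wild]}{pos}{THREE_TO_ONE[mutant]}"))
    seen, ordered = set(), []
    for _, mutation in sorted(hits):
        if mutation not in seen:
            seen.add(mutation)
            ordered.append(mutation)
    return ordered


def f1_score(predicted, gold):
    """Harmonic mean of precision and recall over two collections of mentions."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    sentence = "The D614G and N501Y substitutions and p.Glu484Lys were studied."
    found = extract_mutations(sentence)
    print(found)
    print(f1_score(found, ["D614G", "N501Y", "K417N"]))
```

Normalizing every mention to the one-letter form before scoring is a design choice of this sketch, made so that D614G and Asp614Gly count as the same mutation when precision and recall are computed.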