Protocol for the automatic extraction of epidemiological information via a pre-trained language model

The lack of systems to automatically extract epidemiological fields from open-access COVID-19 cases restricts the timeliness of formulating prevention measures. Here we present a protocol for using CCIE, a COVID-19 Cases Information Extraction system based on the pre-trained language model.(1) We de...

Bibliographic Details
Main authors: Wang, Zhizheng; Liu, Xiao Fan; Du, Zhanwei; Wang, Lin; Wu, Ye; Holme, Petter; Lachmann, Michael; Lin, Hongfei; Wang, Zhuoyue; Cao, Yu; Wong, Zoie S.Y.; Xu, Xiao-Ke; Sun, Yuanyuan
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10328978/
https://www.ncbi.nlm.nih.gov/pubmed/37393610
http://dx.doi.org/10.1016/j.xpro.2023.102392
author Wang, Zhizheng
Liu, Xiao Fan
Du, Zhanwei
Wang, Lin
Wu, Ye
Holme, Petter
Lachmann, Michael
Lin, Hongfei
Wang, Zhuoyue
Cao, Yu
Wong, Zoie S.Y.
Xu, Xiao-Ke
Sun, Yuanyuan
collection PubMed
description The lack of systems to automatically extract epidemiological fields from open-access COVID-19 cases restricts the timeliness of formulating prevention measures. Here we present a protocol for using CCIE, a COVID-19 Cases Information Extraction system based on the pre-trained language model.(1) We describe steps for preparing supervised training data and executing Python scripts for named entity recognition and text category classification. We then detail the use of machine evaluation and manual validation to illustrate the effectiveness of CCIE. For complete details on the use and execution of this protocol, please refer to Wang et al.(2)
format Online
Article
Text
id pubmed-10328978
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10328978 2023-07-09 Protocol for the automatic extraction of epidemiological information via a pre-trained language model Wang, Zhizheng Liu, Xiao Fan Du, Zhanwei Wang, Lin Wu, Ye Holme, Petter Lachmann, Michael Lin, Hongfei Wang, Zhuoyue Cao, Yu Wong, Zoie S.Y. Xu, Xiao-Ke Sun, Yuanyuan STAR Protoc Protocol The lack of systems to automatically extract epidemiological fields from open-access COVID-19 cases restricts the timeliness of formulating prevention measures. Here we present a protocol for using CCIE, a COVID-19 Cases Information Extraction system based on the pre-trained language model.(1) We describe steps for preparing supervised training data and executing Python scripts for named entity recognition and text category classification. We then detail the use of machine evaluation and manual validation to illustrate the effectiveness of CCIE. For complete details on the use and execution of this protocol, please refer to Wang et al.(2) Elsevier 2023-07-01 /pmc/articles/PMC10328978/ /pubmed/37393610 http://dx.doi.org/10.1016/j.xpro.2023.102392 Text en © 2023 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Protocol for the automatic extraction of epidemiological information via a pre-trained language model
topic Protocol
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10328978/
https://www.ncbi.nlm.nih.gov/pubmed/37393610
http://dx.doi.org/10.1016/j.xpro.2023.102392