Application of BERT to Enable Gene Classification Based on Clinical Evidence

The identification of profiled cancer-related genes plays an essential role in cancer diagnosis and treatment. Based on literature research, the classification of genetic mutations continues to be done manually nowadays. Manual classification of genetic mutations is pathologist-dependent, subjective...

Bibliographic Details
Main authors: Su, Yuhan, Xiang, Hongxin, Xie, Haotian, Yu, Yong, Dong, Shiyan, Yang, Zhaogang, Zhao, Na
Format: Online Article Text
Language: English
Published: Hindawi 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7563092/
https://www.ncbi.nlm.nih.gov/pubmed/33083472
http://dx.doi.org/10.1155/2020/5491963
author Su, Yuhan
Xiang, Hongxin
Xie, Haotian
Yu, Yong
Dong, Shiyan
Yang, Zhaogang
Zhao, Na
author_sort Su, Yuhan
collection PubMed
description The identification of profiled cancer-related genes plays an essential role in cancer diagnosis and treatment. Based on literature research, the classification of genetic mutations continues to be done manually nowadays. Manual classification of genetic mutations is pathologist-dependent, subjective, and time-consuming. To improve the accuracy of clinical interpretation, scientists have proposed computational-based approaches for automatic analysis of mutations with the advent of next-generation sequencing technologies. Nevertheless, some challenges, such as multiple classifications, the complexity of texts, redundant descriptions, and inconsistent interpretation, have limited the development of algorithms. To overcome these difficulties, we have adapted a deep learning method named Bidirectional Encoder Representations from Transformers (BERT) to classify genetic mutations based on text evidence from an annotated database. During the training, three challenging features such as the extreme length of texts, biased data presentation, and high repeatability were addressed. Finally, the BERT+abstract demonstrates satisfactory results with 0.80 logarithmic loss, 0.6837 recall, and 0.705 F-measure. It is feasible for BERT to classify the genomic mutation text within literature-based datasets. Consequently, BERT is a practical tool for facilitating and significantly speeding up cancer research towards tumor progression, diagnosis, and the design of more precise and effective treatments.
format Online
Article
Text
id pubmed-7563092
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-7563092 2020-10-19 Application of BERT to Enable Gene Classification Based on Clinical Evidence Su, Yuhan Xiang, Hongxin Xie, Haotian Yu, Yong Dong, Shiyan Yang, Zhaogang Zhao, Na Biomed Res Int Research Article Hindawi 2020-10-07 /pmc/articles/PMC7563092/ /pubmed/33083472 http://dx.doi.org/10.1155/2020/5491963 Text en Copyright © 2020 Yuhan Su et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Application of BERT to Enable Gene Classification Based on Clinical Evidence
topic Research Article