
REDBot: Natural language process methods for clinical copy number variation reporting in prenatal and products of conception diagnosis

BACKGROUND: Current copy number variation (CNV) identification methods have rapidly become mature. However, the postdetection processes such as variant interpretation or reporting are inefficient. To overcome this situation, we developed REDBot as an automated software package for accurate and direc...

Bibliographic Details
Main Authors: Liu, Mengmeng; Zhong, Yunshan; Liu, Hongqian; Liang, Desheng; Liu, Erhong; Zhang, Yu; Tian, Feng; Liang, Qiaowei; Cram, David S.; Wang, Hua; Wu, Lingqian; Yu, Fuli
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7667294/
https://www.ncbi.nlm.nih.gov/pubmed/32961042
http://dx.doi.org/10.1002/mgg3.1488
_version_ 1783610280666726400
author Liu, Mengmeng
Zhong, Yunshan
Liu, Hongqian
Liang, Desheng
Liu, Erhong
Zhang, Yu
Tian, Feng
Liang, Qiaowei
Cram, David S.
Wang, Hua
Wu, Lingqian
Yu, Fuli
author_sort Liu, Mengmeng
collection PubMed
description BACKGROUND: Current copy number variation (CNV) identification methods have rapidly become mature. However, the postdetection processes such as variant interpretation or reporting are inefficient. To overcome this situation, we developed REDBot as an automated software package for accurate and direct generation of clinical diagnostic reports for prenatal and products of conception (POC) samples. METHODS: We applied natural language process (NLP) methods for analyzing 30,235 in‐house historical clinical reports through active learning, and then, developed clinical knowledge bases, evidence‐based interpretation methods and reporting criteria to support the whole postdetection pipeline. RESULTS: Of the 30,235 reports, we obtained 37,175 CNV‐paragraph pairs. For these pairs, the active learning approaches achieved a 0.9466 average F1‐score in sentence classification. The overall accuracy for variant classification was 95.7%, 95.2%, and 100.0% in retrospective, prospective, and clinical utility experiments, respectively. CONCLUSION: By integrating NLP methods in CNVs postdetection pipeline, REDBot is a robust and rapid tool with clinical utility for prenatal and POC diagnosis.
format Online
Article
Text
id pubmed-7667294
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-76672942020-11-20 REDBot: Natural language process methods for clinical copy number variation reporting in prenatal and products of conception diagnosis Liu, Mengmeng Zhong, Yunshan Liu, Hongqian Liang, Desheng Liu, Erhong Zhang, Yu Tian, Feng Liang, Qiaowei Cram, David S. Wang, Hua Wu, Lingqian Yu, Fuli Mol Genet Genomic Med Method BACKGROUND: Current copy number variation (CNV) identification methods have rapidly become mature. However, the postdetection processes such as variant interpretation or reporting are inefficient. To overcome this situation, we developed REDBot as an automated software package for accurate and direct generation of clinical diagnostic reports for prenatal and products of conception (POC) samples. METHODS: We applied natural language process (NLP) methods for analyzing 30,235 in‐house historical clinical reports through active learning, and then, developed clinical knowledge bases, evidence‐based interpretation methods and reporting criteria to support the whole postdetection pipeline. RESULTS: Of the 30,235 reports, we obtained 37,175 CNV‐paragraph pairs. For these pairs, the active learning approaches achieved a 0.9466 average F1‐score in sentence classification. The overall accuracy for variant classification was 95.7%, 95.2%, and 100.0% in retrospective, prospective, and clinical utility experiments, respectively. CONCLUSION: By integrating NLP methods in CNVs postdetection pipeline, REDBot is a robust and rapid tool with clinical utility for prenatal and POC diagnosis. John Wiley and Sons Inc. 2020-09-22 /pmc/articles/PMC7667294/ /pubmed/32961042 http://dx.doi.org/10.1002/mgg3.1488 Text en © 2020 The Authors. Molecular Genetics & Genomic Medicine published by Wiley Periodicals LLC This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title REDBot: Natural language process methods for clinical copy number variation reporting in prenatal and products of conception diagnosis
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7667294/
https://www.ncbi.nlm.nih.gov/pubmed/32961042
http://dx.doi.org/10.1002/mgg3.1488