Word Embedding Reveals Cyfra 21-1 as a Biomarker for Chronic Obstructive Pulmonary Disease

BACKGROUND: Although patients with chronic obstructive pulmonary disease (COPD) experience high morbidity and mortality worldwide, few biomarkers are available for COPD. Here, we analyzed potential biomarkers for the diagnosis of COPD by using word embedding. METHODS: To determine which biomarkers a...

Bibliographic Details
Main Authors: Heo, Jeongwon; Moon, Da Hye; Hong, Yoonki; Bak, So Hyeon; Kim, Jeeyoung; Park, Joo Hyun; Oh, Byoung-Doo; Kim, Yu-Seop; Kim, Woo Jin
Format: Online Article Text
Language: English
Published: The Korean Academy of Medical Sciences 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8422037/
https://www.ncbi.nlm.nih.gov/pubmed/34490754
http://dx.doi.org/10.3346/jkms.2021.36.e224
author Heo, Jeongwon
Moon, Da Hye
Hong, Yoonki
Bak, So Hyeon
Kim, Jeeyoung
Park, Joo Hyun
Oh, Byoung-Doo
Kim, Yu-Seop
Kim, Woo Jin
collection PubMed
description BACKGROUND: Although patients with chronic obstructive pulmonary disease (COPD) experience high morbidity and mortality worldwide, few biomarkers are available for COPD. Here, we analyzed potential biomarkers for the diagnosis of COPD by using word embedding. METHODS: To determine which biomarkers are likely to be associated with COPD, we selected 26 respiratory disease-related biomarkers and measured the degree of similarity between each of them and COPD by word embedding. Similarity with COPD was inferred from word embedding models trained on a large medical corpus, and the biomarkers with the highest similarity were then identified. We used Word2Vec, Canonical Correlation Analysis, and Global Vectors (GloVe) for word embedding. We evaluated the associations of the selected biomarkers with COPD parameters in a cohort of patients with COPD. RESULTS: Cytokeratin 19 fragment (Cyfra 21-1) was selected because of its high similarity and its significant correlation with the COPD phenotype. Serum Cyfra 21-1 levels were measured in patients with COPD and in controls (4.3 ± 5.9 vs. 3.9 ± 3.6 ng/mL, P = 0.611). The emphysema index was significantly correlated with the serum Cyfra 21-1 level (correlation coefficient = 0.219, P = 0.015). CONCLUSION: Word embedding may be used for the discovery of biomarkers for COPD, and Cyfra 21-1 may be used as a biomarker for emphysema. Additional studies are needed to validate Cyfra 21-1 as a biomarker for COPD.
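The similarity step described in METHODS can be illustrated with a minimal sketch. It assumes that similarity between a biomarker term and COPD is scored as the cosine similarity of their embedding vectors, which is a common choice for Word2Vec- and GloVe-style models but is not stated in the abstract. The function names, the placeholder marker names, and the three-dimensional toy vectors are all hypothetical; they are not the study's data or code.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def rank_by_similarity(target, candidates):
    """Return (name, score) pairs sorted from most to least similar to target."""
    scored = [(name, cosine_similarity(target, vec))
              for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy vectors invented for illustration only (assumption, not study data).
# In practice these would come from an embedding model trained on a medical corpus.
toy_target = [1.0, 0.0, 1.0]           # stands in for the "COPD" term vector
toy_candidates = {
    "marker_a": [1.0, 0.0, 1.0],       # same direction as the target
    "marker_b": [1.0, 1.0, 0.0],       # partially aligned
    "marker_c": [0.0, 1.0, 0.0],       # orthogonal to the target
}

ranking = rank_by_similarity(toy_target, toy_candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With real embeddings, the candidate dictionary would hold one vector per biomarker term, and the top of the ranking would give the candidates to carry forward into the cohort analysis.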
format Online
Article
Text
id pubmed-8422037
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher The Korean Academy of Medical Sciences
record_format MEDLINE/PubMed
spelling pubmed-8422037 2021-09-15 J Korean Med Sci Original Article The Korean Academy of Medical Sciences 2021-08-02 /pmc/articles/PMC8422037/ /pubmed/34490754 http://dx.doi.org/10.3346/jkms.2021.36.e224 Text en © 2021 The Korean Academy of Medical Sciences.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Word Embedding Reveals Cyfra 21-1 as a Biomarker for Chronic Obstructive Pulmonary Disease
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8422037/
https://www.ncbi.nlm.nih.gov/pubmed/34490754
http://dx.doi.org/10.3346/jkms.2021.36.e224