Identifying cancer tissue-of-origin by a novel machine learning method based on expression quantitative trait loci

Cancer of unknown primary (CUP) refers to cancer whose primary lesion cannot be identified by routine pathological and clinical diagnostic methods. This kind of cancer is extremely difficult to treat, and patients with CUP usually have a very short survival time. Recent studies have suggested that cancer t...

Bibliographic Details
Main Authors: Miao, Yongchang; Zhang, Xueliang; Chen, Sijie; Zhou, Wenjing; Xu, Dalai; Shi, Xiaoli; Li, Jian; Tu, Jinhui; Yuan, Xuelian; Lv, Kebo; Tian, Geng
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Oncology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9396384/
https://www.ncbi.nlm.nih.gov/pubmed/36016607
http://dx.doi.org/10.3389/fonc.2022.946552
author Miao, Yongchang
Zhang, Xueliang
Chen, Sijie
Zhou, Wenjing
Xu, Dalai
Shi, Xiaoli
Li, Jian
Tu, Jinhui
Yuan, Xuelian
Lv, Kebo
Tian, Geng
author_sort Miao, Yongchang
collection PubMed
description Cancer of unknown primary (CUP) refers to cancer whose primary lesion cannot be identified by routine pathological and clinical diagnostic methods. This kind of cancer is extremely difficult to treat, and patients with CUP usually have a very short survival time. Recent studies have suggested that treatment targeting the primary lesion will significantly improve the survival of CUP patients. Thus, it is critical to develop accurate yet fast methods to infer the tissue-of-origin (TOO) of CUP. In recent years, a few computational methods have been developed to infer TOO from single-omics data such as gene expression, methylation, or somatic mutation. However, tumor metastasis involves interactions among multiple levels of biological molecules. In this study, we developed a novel computational method to predict the TOO of CUP patients by explicitly integrating expression quantitative trait loci (eQTL) into an XGBoost classification model. We trained our model with The Cancer Genome Atlas (TCGA) data involving over 7,000 samples across 20 types of solid tumors. In 10-fold cross-validation, the prediction accuracy of the model with eQTL was over 0.96, better than that of the model without eQTL. In addition, we tested our model on an independent dataset downloaded from Gene Expression Omnibus (GEO) consisting of 87 samples across 4 cancer types. The model achieved an F1-score of 0.7–1, depending on the cancer type. In summary, eQTL provides important information for inferring cancer TOO, and the model might be applied in routine clinical testing for CUP patients in the future.
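The abstract reports two kinds of numbers: an overall accuracy under 10-fold cross-validation and a per-cancer-type F1-score on an independent set. The sketch below is not the authors' pipeline and does not train XGBoost; it only illustrates, in plain Python, how samples might be split into class-balanced folds and how accuracy and per-class F1 are computed from predictions. The function names and the toy labels are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def stratified_folds(labels, k):
    """Assign sample indices to k folds, spreading each class evenly."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal the members of each class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_per_class(y_true, y_pred):
    """Per-class F1 = 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Toy labels (hypothetical, not data from the study).
toy_true = ["lung", "lung", "lung", "breast", "breast", "colon"]
toy_pred = ["lung", "lung", "breast", "breast", "breast", "colon"]
toy_accuracy = accuracy(toy_true, toy_pred)
toy_f1 = f1_per_class(toy_true, toy_pred)
```

In a full cross-validation run, each fold would serve once as the held-out set while a classifier is fitted on the remaining folds; the metrics above would then be computed on the held-out predictions.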
format Online
Article
Text
id pubmed-9396384
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9396384 2022-08-24 Identifying cancer tissue-of-origin by a novel machine learning method based on expression quantitative trait loci Miao, Yongchang Zhang, Xueliang Chen, Sijie Zhou, Wenjing Xu, Dalai Shi, Xiaoli Li, Jian Tu, Jinhui Yuan, Xuelian Lv, Kebo Tian, Geng Front Oncol Oncology Frontiers Media S.A.
2022-08-09 /pmc/articles/PMC9396384/ /pubmed/36016607 http://dx.doi.org/10.3389/fonc.2022.946552 Text en Copyright © 2022 Miao, Zhang, Chen, Zhou, Xu, Shi, Li, Tu, Yuan, Lv and Tian https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Identifying cancer tissue-of-origin by a novel machine learning method based on expression quantitative trait loci
topic Oncology