Identifying cancer tissue-of-origin by a novel machine learning method based on expression quantitative trait loci
Cancer of unknown primary (CUP) refers to cancer whose primary lesion cannot be identified by routine pathological and clinical diagnostic methods. This kind of cancer is extremely difficult to treat, and patients with CUP usually have a very short survival time. Recent studies have suggested that cancer t...
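Main Authors: | Miao, Yongchang; Zhang, Xueliang; Chen, Sijie; Zhou, Wenjing; Xu, Dalai; Shi, Xiaoli; Li, Jian; Tu, Jinhui; Yuan, Xuelian; Lv, Kebo; Tian, Geng |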
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Oncology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9396384/ https://www.ncbi.nlm.nih.gov/pubmed/36016607 http://dx.doi.org/10.3389/fonc.2022.946552 |
_version_ | 1784771919990489088 |
---|---|
author | Miao, Yongchang Zhang, Xueliang Chen, Sijie Zhou, Wenjing Xu, Dalai Shi, Xiaoli Li, Jian Tu, Jinhui Yuan, Xuelian Lv, Kebo Tian, Geng |
author_sort | Miao, Yongchang |
collection | PubMed |
description | Cancer of unknown primary (CUP) refers to cancer whose primary lesion cannot be identified by routine pathological and clinical diagnostic methods. This kind of cancer is extremely difficult to treat, and patients with CUP usually have a very short survival time. Recent studies have suggested that treatment targeting the primary lesion will significantly improve the survival of CUP patients. Thus, it is critical to develop accurate yet fast methods to infer the tissue-of-origin (TOO) of CUP. In recent years, a few computational methods have been proposed to infer TOO from single-omics data such as gene expression, methylation, or somatic mutation. However, tumor metastasis involves interactions among multiple levels of biological molecules. In this study, we developed a novel computational method to predict the TOO of CUP patients by explicitly integrating expression quantitative trait loci (eQTL) into an XGBoost classification model. We trained our model on The Cancer Genome Atlas (TCGA) data comprising over 7,000 samples across 20 types of solid tumors. In 10-fold cross-validation, the prediction accuracy of the model with eQTL was over 0.96, better than that of the model without eQTL. We also tested our model on an independent dataset downloaded from the Gene Expression Omnibus (GEO), consisting of 87 samples across 4 cancer types, where it achieved an F1-score of 0.7–1 depending on the cancer type. In summary, eQTL provided important information for inferring cancer TOO, and the model might be applied in routine clinical testing for CUP patients in the future. |
format | Online Article Text |
id | pubmed-9396384 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9396384 2022-08-24 Identifying cancer tissue-of-origin by a novel machine learning method based on expression quantitative trait loci Miao, Yongchang Zhang, Xueliang Chen, Sijie Zhou, Wenjing Xu, Dalai Shi, Xiaoli Li, Jian Tu, Jinhui Yuan, Xuelian Lv, Kebo Tian, Geng Front Oncol Oncology Frontiers Media S.A. 2022-08-09 /pmc/articles/PMC9396384/ /pubmed/36016607 http://dx.doi.org/10.3389/fonc.2022.946552 Text en Copyright © 2022 Miao, Zhang, Chen, Zhou, Xu, Shi, Li, Tu, Yuan, Lv and Tian https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Identifying cancer tissue-of-origin by a novel machine learning method based on expression quantitative trait loci |
topic | Oncology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9396384/ https://www.ncbi.nlm.nih.gov/pubmed/36016607 http://dx.doi.org/10.3389/fonc.2022.946552 |