Can we infer tumor presence of single cell transcriptomes and their tumor of origin from bulk transcriptomes by machine learning?
There is a growing need to build a model that uses single cell RNA-seq (scRNA-seq) to separate malignant cells from nonmalignant cells and to identify tumor of origin of single cells and/or circulating tumor cells (CTCs). Currently, it is infeasible to build a tumor of origin model learnt from scRNA...
Main authors: | Liu, Hua-Ping; Wang, Dongwen; Lai, Hung-Ming |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Research Network of Computational and Structural Biotechnology 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9162953/ https://www.ncbi.nlm.nih.gov/pubmed/35685355 http://dx.doi.org/10.1016/j.csbj.2022.05.035 |
_version_ | 1784719823470592000 |
---|---|
author | Liu, Hua-Ping; Wang, Dongwen; Lai, Hung-Ming |
collection | PubMed |
description | There is a growing need for models that use single-cell RNA-seq (scRNA-seq) to separate malignant from nonmalignant cells and to identify the tumor of origin of single cells and/or circulating tumor cells (CTCs). Currently, it is infeasible to train a tumor-of-origin model on scRNA-seq data by machine learning (ML). We therefore asked whether an ML model trained on bulk transcriptomes can be applied to scRNA-seq to infer tumor presence in single cells and to indicate their tumor of origin. We used k-nearest neighbors, one-versus-all support vector machine, one-versus-one support vector machine, and random forest, and introduced scTumorTrace, in a pioneering experiment covering leukocytes and seven major cancer types for which both bulk RNA-seq and scRNA-seq data were available. All 13 ML models trained on bulk RNA-seq were reliable (F-score > 96%) on a validation set of bulk transcriptomes, but none except scTumorTrace was applicable to scRNA-seq. Inference from bulk RNA-seq to scRNA-seq was impaired by feature selection and improved by log2-transformed TPM units. scTumorTrace with transcriptome-wide 2-tuples achieved F-scores above 98.74% and 94.29% in inferring tumor presence and tumor of origin, respectively, at single-cell resolution, and correctly identified as leukocytes 45 single cells that were candidate prostate CTCs but lineage-confirmed non-CTCs. We concluded that modern ML techniques are quantitative and could hardly address the questions raised. scTumorTrace with transcriptome-wide 2-tuples is qualitative, standardization-free, and not subject to log2-transformed quantities, enabling us to infer the tumor presence of single-cell transcriptomes and their tumor of origin from bulk transcriptomes. |
format | Online Article Text |
id | pubmed-9162953 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Research Network of Computational and Structural Biotechnology |
record_format | MEDLINE/PubMed |
spelling | pubmed-9162953 2022-06-08 Comput Struct Biotechnol J Research Article Research Network of Computational and Structural Biotechnology 2022-05-23 /pmc/articles/PMC9162953/ /pubmed/35685355 http://dx.doi.org/10.1016/j.csbj.2022.05.035 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
title | Can we infer tumor presence of single cell transcriptomes and their tumor of origin from bulk transcriptomes by machine learning? |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9162953/ https://www.ncbi.nlm.nih.gov/pubmed/35685355 http://dx.doi.org/10.1016/j.csbj.2022.05.035 |
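
The description contrasts quantitative models, which depend on expression units such as log2-transformed TPM, with a qualitative approach built on transcriptome-wide 2-tuples. The record does not define a 2-tuple, so the sketch below assumes it means a pair of genes encoded by which of the two is more highly expressed within a sample. Under that assumption, it illustrates why such features would not change under log2 transformation or rescaling. The function names, the pseudocount, and the toy expression values are hypothetical; this is not the scTumorTrace implementation.

```python
from itertools import combinations

import numpy as np


def log2_tpm(tpm, pseudocount=1.0):
    """Return log2(TPM + pseudocount); the pseudocount of 1 is an assumed convention."""
    return np.log2(np.asarray(tpm, dtype=float) + pseudocount)


def pairwise_order_features(expr):
    """Encode each gene pair (i, j) with i < j as 1 if expr[i] > expr[j], else 0.

    This within-sample ordering is an assumed reading of "2-tuples",
    not a definition taken from the record.
    """
    expr = np.asarray(expr, dtype=float)
    pairs = combinations(range(expr.size), 2)
    return np.array([int(expr[i] > expr[j]) for i, j in pairs])


# Hypothetical TPM values for four genes in one sample.
tpm = np.array([120.0, 3.5, 0.0, 48.0])

raw = pairwise_order_features(tpm)
logged = pairwise_order_features(log2_tpm(tpm))   # after log2 transformation
scaled = pairwise_order_features(tpm * 0.01)      # after a change of scale

print(raw.tolist())
print(np.array_equal(raw, logged), np.array_equal(raw, scaled))
```

Because log2 and positive rescaling are strictly increasing, the ordering of any two genes within a sample is preserved, so the binary features are identical in all three cases. Quantitative classifiers such as k-nearest neighbors or support vector machines operate on the expression values themselves and therefore do see a difference between these representations.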