
Identifying and analyzing different cancer subtypes using RNA-seq data of blood platelets

Detection and diagnosis of cancer are especially important for early prevention and effective treatments. Traditional methods of cancer detection are usually time-consuming and expensive. Liquid biopsy, a newly proposed noninvasive detection approach, can promote the accuracy and decrease the cost o...


Bibliographic Details
Main authors: Zhang, Yu-Hang, Huang, Tao, Chen, Lei, Xu, YaoChen, Hu, Yu, Hu, Lan-Dian, Cai, Yudong, Kong, Xiangyin
Format: Online Article Text
Language: English
Published: Impact Journals LLC 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5675649/
https://www.ncbi.nlm.nih.gov/pubmed/29152097
http://dx.doi.org/10.18632/oncotarget.20903
author Zhang, Yu-Hang
Huang, Tao
Chen, Lei
Xu, YaoChen
Hu, Yu
Hu, Lan-Dian
Cai, Yudong
Kong, Xiangyin
collection PubMed
description Detection and diagnosis of cancer are especially important for early prevention and effective treatment. Traditional methods of cancer detection are usually time-consuming and expensive. Liquid biopsy, a recently proposed noninvasive detection approach, can improve the accuracy and reduce the cost of detection by using a personalized expression profile. However, few studies have analyzed this type of data, and such analysis could lead to more effective methods for detecting different cancer subtypes. In this study, we applied reliable machine learning algorithms to data from patients who had one of six cancer subtypes (breast cancer, colorectal cancer, glioblastoma, hepatobiliary cancer, lung cancer and pancreatic cancer) and from healthy persons. Each sample was encoded by its quantitative gene expression profile. The samples were then analyzed with the maximum relevance minimum redundancy (mRMR) method, which produced two ranked gene lists: the MaxRel feature list and the mRMR feature list. The incremental feature selection method was applied to the mRMR feature list to extract the optimal feature subset, on which a support vector machine gave the best performance for distinguishing cancer subtypes and healthy controls. Ten-fold cross-validation of the resulting optimal classification model yielded an overall accuracy of 0.751. In addition, we extracted the top eighteen features (genes) of the MaxRel feature list, including TTN, RHOH, RPS20 and TRBC2, and analyzed them in detail. The results indicated that these genes could be important biomarkers for discriminating different cancer subtypes and healthy controls.
format Online
Article
Text
id pubmed-5675649
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Impact Journals LLC
record_format MEDLINE/PubMed
spelling pubmed-5675649 2017-11-18 Oncotarget Research Paper Impact Journals LLC 2017-09-15 /pmc/articles/PMC5675649/ /pubmed/29152097 http://dx.doi.org/10.18632/oncotarget.20903 Text en
Copyright: © 2017 Zhang et al. This article is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/) (CC-BY), which permits unrestricted use and redistribution provided that the original author and source are credited.
title Identifying and analyzing different cancer subtypes using RNA-seq data of blood platelets
topic Research Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5675649/
https://www.ncbi.nlm.nih.gov/pubmed/29152097
http://dx.doi.org/10.18632/oncotarget.20903
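The description field above outlines a pipeline in which a ranked feature list feeds incremental feature selection (IFS), with each candidate subset scored by ten-fold cross-validation. The Python sketch below illustrates only that IFS loop, under stated assumptions. The record gives no implementation details, so every function name, the toy data, and the ranking are hypothetical. A nearest-centroid classifier stands in for the support vector machine purely to keep the example dependency-free; it is not the classifier the study used.

```python
import random
from statistics import mean


def nearest_centroid_predict(train_X, train_y, test_X):
    """Stand-in classifier (assumption: NOT the SVM named in the abstract)."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def predict(x):
        # Assign the class whose centroid is closest in squared Euclidean distance.
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return [predict(x) for x in test_X]


def cv_accuracy(X, y, k=10, seed=0):
    """Overall accuracy under k-fold cross-validation (k=10 mirrors the abstract)."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        preds = nearest_centroid_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in fold]
        )
        correct += sum(p == y[i] for p, i in zip(preds, fold))
    return correct / len(X)


def incremental_feature_selection(X, y, ranked, k=10):
    """Try the top-1, top-2, ... ranked features; keep the best-scoring prefix."""
    best_n, best_acc = 0, -1.0
    for n in range(1, len(ranked) + 1):
        cols = ranked[:n]
        subset = [[row[j] for j in cols] for row in X]
        acc = cv_accuracy(subset, y, k)
        if acc > best_acc:  # ties keep the smaller subset
            best_n, best_acc = n, acc
    return ranked[:best_n], best_acc


# Hypothetical toy data: 40 samples, 2 classes, 3 features.
# Feature 0 separates the classes; features 1 and 2 are deterministic noise.
y = [i % 2 for i in range(40)]
X = [
    [10.0 * (i % 2) + 0.1 * (i % 5), float(i * 7 % 11), float(i * 3 % 13)]
    for i in range(40)
]
ranked = [0, 1, 2]  # hypothetical ranking, standing in for an mRMR feature list

selected, acc = incremental_feature_selection(X, y, ranked)
print(selected, acc)
```

On real expression data the ranking would come from an mRMR implementation and the scoring function would wrap an SVM; both can be swapped in without changing the loop in `incremental_feature_selection`.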