Identifying and analyzing different cancer subtypes using RNA-seq data of blood platelets
Detection and diagnosis of cancer are especially important for early prevention and effective treatments. Traditional methods of cancer detection are usually time-consuming and expensive. Liquid biopsy, a newly proposed noninvasive detection approach, can promote the accuracy and decrease the cost o...
Main Authors: | Zhang, Yu-Hang; Huang, Tao; Chen, Lei; Xu, YaoChen; Hu, Yu; Hu, Lan-Dian; Cai, Yudong; Kong, Xiangyin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Impact Journals LLC 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5675649/ https://www.ncbi.nlm.nih.gov/pubmed/29152097 http://dx.doi.org/10.18632/oncotarget.20903 |
_version_ | 1783276947161219072 |
---|---|
author | Zhang, Yu-Hang Huang, Tao Chen, Lei Xu, YaoChen Hu, Yu Hu, Lan-Dian Cai, Yudong Kong, Xiangyin |
author_sort | Zhang, Yu-Hang |
collection | PubMed |
description | Detection and diagnosis of cancer are especially important for early prevention and effective treatments. Traditional methods of cancer detection are usually time-consuming and expensive. Liquid biopsy, a newly proposed noninvasive detection approach, can promote the accuracy and decrease the cost of detection according to a personalized expression profile. However, few studies have been performed to analyze this type of data, which can promote more effective methods for detection of different cancer subtypes. In this study, we applied some reliable machine learning algorithms to analyze data retrieved from patients who had one of six cancer subtypes (breast cancer, colorectal cancer, glioblastoma, hepatobiliary cancer, lung cancer and pancreatic cancer) as well as healthy persons. Quantitative gene expression profiles were used to encode each sample. Then, they were analyzed by the maximum relevance minimum redundancy method. Two feature lists were obtained in which genes were ranked rigorously. The incremental feature selection method was applied to the mRMR feature list to extract the optimal feature subset, which can be used in the support vector machine algorithm to determine the best performance for the detection of cancer subtypes and healthy controls. The ten-fold cross-validation for the constructed optimal classification model yielded an overall accuracy of 0.751. On the other hand, we extracted the top eighteen features (genes), including TTN, RHOH, RPS20, TRBC2, in another feature list, the MaxRel feature list, and performed a detailed analysis of them. The results indicated that these genes could be important biomarkers for discriminating different cancer subtypes and healthy controls. |
format | Online Article Text |
id | pubmed-5675649 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Impact Journals LLC |
record_format | MEDLINE/PubMed |
spelling | pubmed-5675649 2017-11-18 Oncotarget Research Paper Impact Journals LLC 2017-09-15 /pmc/articles/PMC5675649/ /pubmed/29152097 http://dx.doi.org/10.18632/oncotarget.20903 Text en Copyright: © 2017 Zhang et al. http://creativecommons.org/licenses/by/3.0/ This article is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/) (CC-BY), which permits unrestricted use and redistribution provided that the original author and source are credited. |
title | Identifying and analyzing different cancer subtypes using RNA-seq data of blood platelets |
topic | Research Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5675649/ https://www.ncbi.nlm.nih.gov/pubmed/29152097 http://dx.doi.org/10.18632/oncotarget.20903 |