Determining Cutoff Point of Ensemble Trees Based on Sample Size in Predicting Clinical Dose with DNA Microarray Data

Background/Aim. Evaluating the success of dose prediction based on genetic or clinical data has substantially advanced recently. The aim of this study is to predict various clinical dose values from DNA gene expression datasets using data mining techniques. Materials and Methods. Eleven real gene ex...

Bibliographic Details
Main authors: Yılmaz Isıkhan, Selen, Karabulut, Erdem, Alpar, Celal Reha
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2016
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5206477/
https://www.ncbi.nlm.nih.gov/pubmed/28096893
http://dx.doi.org/10.1155/2016/6794916
_version_ 1782490265674579968
author Yılmaz Isıkhan, Selen
Karabulut, Erdem
Alpar, Celal Reha
collection PubMed
description Background/Aim. Evaluating the success of dose prediction based on genetic or clinical data has substantially advanced recently. The aim of this study is to predict various clinical dose values from DNA gene expression datasets using data mining techniques. Materials and Methods. Eleven real gene expression datasets containing dose values were included. First, important genes for dose prediction were selected using iterative sure independence screening. Then, the performances of regression trees (RTs), support vector regression (SVR), RT bagging, SVR bagging, and RT boosting were examined. Results. The results demonstrated that a regression-based feature selection method substantially reduced the number of irrelevant genes from raw datasets. Overall, the best prediction performance in nine of 11 datasets was achieved using SVR; the second most accurate performance was provided using a gradient-boosting machine (GBM). Conclusion. Analysis of various dose values based on microarray gene expression data identified common genes found in our study and the referenced studies. According to our findings, SVR and GBM can be good predictors of dose-gene datasets. Another result of the study was to identify the sample size of n = 25 as a cutoff point for RT bagging to outperform a single RT.
format Online
Article
Text
id pubmed-5206477
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-5206477 2017-01-17 Comput Math Methods Med Research Article Hindawi Publishing Corporation 2016 2016-12-20 /pmc/articles/PMC5206477/ /pubmed/28096893 http://dx.doi.org/10.1155/2016/6794916 Text en Copyright © 2016 Selen Yılmaz Isıkhan et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Determining Cutoff Point of Ensemble Trees Based on Sample Size in Predicting Clinical Dose with DNA Microarray Data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5206477/
https://www.ncbi.nlm.nih.gov/pubmed/28096893
http://dx.doi.org/10.1155/2016/6794916