
RNA-Seq Count Data Modelling by Grey Relational Analysis and Nonparametric Gaussian Process

This paper introduces an approach to classification of RNA-seq read counts using grey relational analysis (GRA) and Bayesian Gaussian process (GP) models. Read counts are transformed to microarray-like data to facilitate normal-based statistical methods. GRA is designed to select differentially expr...


Bibliographic Details
Main authors: Nguyen, Thanh, Bhatti, Asim, Yang, Samuel, Nahavandi, Saeid
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5082617/
https://www.ncbi.nlm.nih.gov/pubmed/27783633
http://dx.doi.org/10.1371/journal.pone.0164766
author Nguyen, Thanh
Bhatti, Asim
Yang, Samuel
Nahavandi, Saeid
collection PubMed
description This paper introduces an approach to the classification of RNA-seq read counts using grey relational analysis (GRA) and Bayesian Gaussian process (GP) models. Read counts are transformed to microarray-like data to facilitate normal-based statistical methods. GRA is designed to select differentially expressed genes by integrating the outcomes of five individual feature selection methods: the two-sample t-test, the entropy test, the Bhattacharyya distance, the Wilcoxon test and the receiver operating characteristic curve. GRA acts as an aggregate filter method, combining the advantages of the individual methods to produce significant feature subsets that are then fed into a nonparametric GP model for classification. The proposed approach is verified on two real benchmark datasets using five-fold cross-validation. Experimental results show that the GRA-based feature selection method and the GP classifier outperform their competing methods. Moreover, the results demonstrate that GRA-GP considerably outperforms the sparse Poisson linear discriminant analysis classifiers, which were introduced specifically for read counts, across different numbers of features. The proposed approach can therefore be implemented effectively in practice for read count data analysis, which is useful in many applications including understanding disease pathogenesis, diagnosis and treatment monitoring at the molecular level.
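The aggregation step described above can be illustrated with a minimal sketch. It uses the textbook form of grey relational analysis (min-max normalisation, an ideal reference of 1.0, and a distinguishing coefficient zeta = 0.5) and is not taken from the paper, whose exact formulation this record does not give. The function names, the zeta default and the toy scores are all assumptions for illustration; each row holds one gene's scores from the individual selection methods, with larger values assumed to mean more discriminative.

```python
def grey_relational_grades(scores, zeta=0.5):
    """Return one grey relational grade per gene (hypothetical helper).

    scores: list of rows, one row per gene, one column per feature
    selection method; larger values are assumed to be better.
    zeta: distinguishing coefficient, conventionally 0.5 (an assumption).
    """
    n_methods = len(scores[0])

    # Min-max normalise each method's column to [0, 1].
    normalised_cols = []
    for col in zip(*scores):
        lo, hi = min(col), max(col)
        span = hi - lo
        normalised_cols.append(
            [(v - lo) / span if span > 0 else 1.0 for v in col]
        )
    normalised = list(zip(*normalised_cols))

    # Deviation of every score from the ideal reference value 1.0.
    deltas = [[1.0 - v for v in row] for row in normalised]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every gene matches the reference on every method.
        return [1.0] * len(scores)

    # Grey relational coefficient per score, averaged into a grade per gene.
    grades = []
    for row in deltas:
        coeffs = [(d_min + zeta * d_max) / (d + zeta * d_max) for d in row]
        grades.append(sum(coeffs) / n_methods)
    return grades


def select_top_genes(scores, k, zeta=0.5):
    """Return the indices of the k genes with the highest grades."""
    grades = grey_relational_grades(scores, zeta)
    order = sorted(range(len(grades)), key=lambda i: grades[i], reverse=True)
    return order[:k]


# Toy example: 3 genes scored by 3 methods (made-up numbers).
toy_scores = [
    [5.0, 0.9, 2.0],  # gene 0: best on every method
    [1.0, 0.5, 0.5],  # gene 1: worst on every method
    [3.0, 0.7, 1.0],  # gene 2: in between
]
top_two = select_top_genes(toy_scores, k=2)
```

In a pipeline of the kind the abstract outlines, the selected gene indices would define the feature subset passed to the classifier; the GP classification step is not sketched here.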
format Online
Article
Text
id pubmed-5082617
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5082617 2016-11-04 RNA-Seq Count Data Modelling by Grey Relational Analysis and Nonparametric Gaussian Process Nguyen, Thanh Bhatti, Asim Yang, Samuel Nahavandi, Saeid PLoS One Research Article
Public Library of Science 2016-10-26 /pmc/articles/PMC5082617/ /pubmed/27783633 http://dx.doi.org/10.1371/journal.pone.0164766 Text en © 2016 Nguyen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title RNA-Seq Count Data Modelling by Grey Relational Analysis and Nonparametric Gaussian Process
topic Research Article