AUCTSP: an improved biomarker gene pair class predictor

BACKGROUND: The Top Scoring Pair (TSP) classifier, based on the concept of relative ranking reversals in the expressions of pairs of genes, has been proposed as a simple, accurate, and easily interpretable decision rule for classification and class prediction of gene expression profiles. The idea th...

Bibliographic Details
Main Authors: Kagaris, Dimitri, Khamesipour, Alireza, Yiannoutsos, Constantin T.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6020231/
https://www.ncbi.nlm.nih.gov/pubmed/29940833
http://dx.doi.org/10.1186/s12859-018-2231-1
_version_ 1783335249342627840
author Kagaris, Dimitri
Khamesipour, Alireza
Yiannoutsos, Constantin T.
collection PubMed
description BACKGROUND: The Top Scoring Pair (TSP) classifier, based on the concept of relative ranking reversals in the expressions of pairs of genes, has been proposed as a simple, accurate, and easily interpretable decision rule for classification and class prediction of gene expression profiles. The idea that differences in gene expression ranking are associated with presence or absence of disease is compelling and has strong biological plausibility. Nevertheless, the TSP formulation ignores significant available information which can improve classification accuracy and is vulnerable to selecting genes which do not have differential expression in the two conditions (“pivot” genes). RESULTS: We introduce the AUCTSP classifier as an alternative rank-based estimator of the magnitude of the ranking reversals involved in the original TSP. The proposed estimator is based on the Area Under the Receiver Operating Characteristic (ROC) Curve (AUC) and, as such, takes into account the separation of the entire distribution of gene expression levels in gene pairs under the conditions considered, as opposed to comparing gene rankings within individual subjects as in the original TSP formulation. Through extensive simulations and case studies involving classification in ovarian, leukemia, colon, breast and prostate cancers and diffuse large B-cell lymphoma, we show the superiority of the proposed approach in terms of improving classification accuracy, avoiding overfitting and being less prone to selecting non-informative (pivot) genes. CONCLUSIONS: The proposed AUCTSP is a simple yet reliable and robust rank-based classifier for gene expression classification. While the AUCTSP works by the same principle as TSP, its ability to determine the top scoring gene pair based on the relative rankings of two marker genes across all subjects, as opposed to each individual subject, results in significant performance gains in classification accuracy.
In addition, the proposed method tends to avoid selection of non-informative (pivot) genes as members of the top-scoring pair. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2231-1) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6020231
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6020231 2018-07-06 AUCTSP: an improved biomarker gene pair class predictor Kagaris, Dimitri Khamesipour, Alireza Yiannoutsos, Constantin T. BMC Bioinformatics Research Article BACKGROUND: The Top Scoring Pair (TSP) classifier, based on the concept of relative ranking reversals in the expressions of pairs of genes, has been proposed as a simple, accurate, and easily interpretable decision rule for classification and class prediction of gene expression profiles. The idea that differences in gene expression ranking are associated with presence or absence of disease is compelling and has strong biological plausibility. Nevertheless, the TSP formulation ignores significant available information which can improve classification accuracy and is vulnerable to selecting genes which do not have differential expression in the two conditions (“pivot” genes). RESULTS: We introduce the AUCTSP classifier as an alternative rank-based estimator of the magnitude of the ranking reversals involved in the original TSP. The proposed estimator is based on the Area Under the Receiver Operating Characteristic (ROC) Curve (AUC) and, as such, takes into account the separation of the entire distribution of gene expression levels in gene pairs under the conditions considered, as opposed to comparing gene rankings within individual subjects as in the original TSP formulation. Through extensive simulations and case studies involving classification in ovarian, leukemia, colon, breast and prostate cancers and diffuse large B-cell lymphoma, we show the superiority of the proposed approach in terms of improving classification accuracy, avoiding overfitting and being less prone to selecting non-informative (pivot) genes. CONCLUSIONS: The proposed AUCTSP is a simple yet reliable and robust rank-based classifier for gene expression classification.
While the AUCTSP works by the same principle as TSP, its ability to determine the top scoring gene pair based on the relative rankings of two marker genes across all subjects, as opposed to each individual subject, results in significant performance gains in classification accuracy. In addition, the proposed method tends to avoid selection of non-informative (pivot) genes as members of the top-scoring pair. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2231-1) contains supplementary material, which is available to authorized users. BioMed Central 2018-06-26 /pmc/articles/PMC6020231/ /pubmed/29940833 http://dx.doi.org/10.1186/s12859-018-2231-1 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title AUCTSP: an improved biomarker gene pair class predictor
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6020231/
https://www.ncbi.nlm.nih.gov/pubmed/29940833
http://dx.doi.org/10.1186/s12859-018-2231-1
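
The abstract in this record contrasts two ways of scoring a gene pair: the original TSP compares the two genes within each individual subject, while AUCTSP compares them across all subjects in a class. The record does not give the formulas, so the following is only a minimal sketch under stated assumptions: the TSP score is taken as the absolute difference between classes of the within-subject frequency of X_i < X_j, and the AUC-style score replaces that frequency with a Mann-Whitney estimate of P(X_i < X_j) over all subject-to-subject comparisons within a class. The function names, the tie handling, and the toy data are hypothetical and are not taken from the article.

```python
import numpy as np

def tsp_score(x_i, x_j, labels):
    """Within-subject score: |P(X_i < X_j | class 0) - P(X_i < X_j | class 1)|."""
    less = x_i < x_j  # one comparison per subject
    return abs(less[labels == 0].mean() - less[labels == 1].mean())

def cross_subject_prob(a, b):
    """Mann-Whitney (AUC) estimate of P(a < b) over all subject pairs; ties count 1/2."""
    diff = b[None, :] - a[:, None]  # every value of a against every value of b
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

def auc_pair_score(x_i, x_j, labels):
    """Assumed AUC-style analogue of the TSP score (illustrative reading only)."""
    p0 = cross_subject_prob(x_i[labels == 0], x_j[labels == 0])
    p1 = cross_subject_prob(x_i[labels == 1], x_j[labels == 1])
    return abs(p0 - p1)

# Toy data (made up): three subjects per class, two genes.
labels = np.array([0, 0, 0, 1, 1, 1])
x_i = np.array([1.0, 5.0, 9.0, 2.0, 6.0, 10.0])
x_j = np.array([2.0, 6.0, 10.0, 1.0, 5.0, 9.0])

tsp = tsp_score(x_i, x_j, labels)
auc_s = auc_pair_score(x_i, x_j, labels)
print(f"TSP score: {tsp:.3f}, AUC-style score: {auc_s:.3f}")
```

On this toy data the within-subject ordering flips perfectly between the classes, so the TSP score is 1.0, yet the two genes' value distributions overlap heavily in both classes and the AUC-style score is only 1/3. This illustrates the kind of distribution-level information the abstract says the original TSP ignores; it makes no claim about the article's actual results.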