The top-scoring ‘N’ algorithm: a generalized relative expression classification method from small numbers of biomolecules

BACKGROUND: Relative expression algorithms such as the top-scoring pair (TSP) and the top-scoring triplet (TST) have several strengths that distinguish them from other classification methods, including resistance to overfitting, invariance to most data normalization methods, and biological interpret...

Bibliographic Details
Main Authors:	Magis, Andrew T, Price, Nathan D
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2012
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3663421/ https://www.ncbi.nlm.nih.gov/pubmed/22966958 http://dx.doi.org/10.1186/1471-2105-13-227

author	Magis, Andrew T Price, Nathan D
collection	PubMed
description	BACKGROUND: Relative expression algorithms such as the top-scoring pair (TSP) and the top-scoring triplet (TST) have several strengths that distinguish them from other classification methods, including resistance to overfitting, invariance to most data normalization methods, and biological interpretability. The top-scoring ‘N’ (TSN) algorithm is a generalized form of other relative expression algorithms which uses generic permutations and a dynamic classifier size to control both the permutation and combination space available for classification. RESULTS: TSN was tested on nine cancer datasets, showing statistically significant differences in classification accuracy between different classifier sizes (choices of N). TSN also performed competitively against a wide variety of different classification methods, including artificial neural networks, classification trees, discriminant analysis, k-Nearest neighbor, naïve Bayes, and support vector machines, when tested on the Microarray Quality Control II datasets. Furthermore, TSN exhibits low levels of overfitting on training data compared to other methods, giving confidence that results obtained during cross validation will be more generally applicable to external validation sets. CONCLUSIONS: TSN preserves the strengths of other relative expression algorithms while allowing a much larger permutation and combination space to be explored, potentially improving classification accuracies when fewer numbers of measured features are available.
format	Online Article Text
id	pubmed-3663421
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3663421 2013-05-24 The top-scoring ‘N’ algorithm: a generalized relative expression classification method from small numbers of biomolecules Magis, Andrew T Price, Nathan D BMC Bioinformatics Methodology Article BioMed Central 2012-09-11 /pmc/articles/PMC3663421/ /pubmed/22966958 http://dx.doi.org/10.1186/1471-2105-13-227 Text en Copyright © 2012 Magis and Price; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	The top-scoring ‘N’ algorithm: a generalized relative expression classification method from small numbers of biomolecules
topic	Methodology Article
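
The description above summarizes relative expression classification only at a high level. As a rough illustration of the idea, here is a minimal sketch of a top-scoring pair (TSP) classifier, the N = 2 special case that TSN is said to generalize. It is not the authors' implementation: the function names `tsp_train` and `tsp_predict`, the toy data, and the pair score (the absolute difference between the two classes in how often feature i is measured below feature j) are assumptions based on the commonly described TSP formulation, which this record does not spell out.

```python
import math
from itertools import combinations

def tsp_train(samples, labels):
    """Find the feature pair whose ordering best separates classes 0 and 1.

    Returns (score, i, j, class_if_less), where class_if_less is the class
    predicted when sample[i] < sample[j]. Hypothetical helper, not from the paper.
    """
    class0 = [s for s, y in zip(samples, labels) if y == 0]
    class1 = [s for s, y in zip(samples, labels) if y == 1]
    best = None
    for i, j in combinations(range(len(samples[0])), 2):
        # Fraction of each class in which feature i is below feature j.
        p0 = sum(s[i] < s[j] for s in class0) / len(class0)
        p1 = sum(s[i] < s[j] for s in class1) / len(class1)
        score = abs(p0 - p1)
        if best is None or score > best[0]:
            best = (score, i, j, 0 if p0 > p1 else 1)
    return best

def tsp_predict(model, sample):
    """Classify one sample from the relative order of the chosen pair only."""
    _, i, j, class_if_less = model
    return class_if_less if sample[i] < sample[j] else 1 - class_if_less

# Toy expression values (made up for illustration): rows are samples.
samples = [[1.0, 5.0, 3.0], [2.0, 6.0, 2.5], [7.0, 3.0, 4.0], [9.0, 1.0, 5.0]]
labels = [0, 0, 1, 1]

model = tsp_train(samples, labels)
predictions = [tsp_predict(model, s) for s in samples]

# A monotonic per-sample transformation (here log2) preserves the ordering
# within each sample, so the predictions do not change.
log_samples = [[math.log2(v) for v in s] for s in samples]
log_predictions = [tsp_predict(model, s) for s in log_samples]
```

Because the decision depends only on which of two measurements is larger within a sample, any transformation that preserves within-sample order leaves the prediction unchanged, which is the normalization invariance the description mentions. Extending this sketch from pairs to orderings of N features, as TSN does, would mean scoring permutations rather than a single comparison; that extension is not shown here.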
