Tissue-based Alzheimer gene expression markers–comparison of multiple machine learning approaches and investigation of redundancy in small biomarker sets

BACKGROUND: Alzheimer’s disease has been known for more than 100 years and the underlying molecular mechanisms are not yet completely understood. The identification of genes involved in the processes in Alzheimer affected brain is an important step towards such an understanding. Genes differentially...

Bibliographic Details
Main Authors: Scheubert, Lena, Luštrek, Mitja, Schmidt, Rainer, Repsilber, Dirk, Fuellen, Georg
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3574043/
https://www.ncbi.nlm.nih.gov/pubmed/23066814
http://dx.doi.org/10.1186/1471-2105-13-266
_version_ 1782259554980986880
author Scheubert, Lena
Luštrek, Mitja
Schmidt, Rainer
Repsilber, Dirk
Fuellen, Georg
author_sort Scheubert, Lena
collection PubMed
description BACKGROUND: Alzheimer’s disease has been known for more than 100 years and the underlying molecular mechanisms are not yet completely understood. The identification of genes involved in the processes in Alzheimer affected brain is an important step towards such an understanding. Genes differentially expressed in diseased and healthy brains are promising candidates. RESULTS: Based on microarray data we identify potential biomarkers as well as biomarker combinations using three feature selection methods: information gain, mean decrease accuracy of random forest and a wrapper of genetic algorithm and support vector machine (GA/SVM). Information gain and random forest are two commonly used methods. We compare their output to the results obtained from GA/SVM. GA/SVM is rarely used for the analysis of microarray data, but it is able to identify genes capable of classifying tissues into different classes at least as well as the two reference methods. CONCLUSION: Compared to the other methods, GA/SVM has the advantage of finding small, less redundant sets of genes that, in combination, show superior classification characteristics. The biological significance of the genes and gene pairs is discussed.
format Online
Article
Text
id pubmed-3574043
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-35740432013-02-20 Tissue-based Alzheimer gene expression markers–comparison of multiple machine learning approaches and investigation of redundancy in small biomarker sets Scheubert, Lena Luštrek, Mitja Schmidt, Rainer Repsilber, Dirk Fuellen, Georg BMC Bioinformatics Research Article BACKGROUND: Alzheimer’s disease has been known for more than 100 years and the underlying molecular mechanisms are not yet completely understood. The identification of genes involved in the processes in Alzheimer affected brain is an important step towards such an understanding. Genes differentially expressed in diseased and healthy brains are promising candidates. RESULTS: Based on microarray data we identify potential biomarkers as well as biomarker combinations using three feature selection methods: information gain, mean decrease accuracy of random forest and a wrapper of genetic algorithm and support vector machine (GA/SVM). Information gain and random forest are two commonly used methods. We compare their output to the results obtained from GA/SVM. GA/SVM is rarely used for the analysis of microarray data, but it is able to identify genes capable of classifying tissues into different classes at least as well as the two reference methods. CONCLUSION: Compared to the other methods, GA/SVM has the advantage of finding small, less redundant sets of genes that, in combination, show superior classification characteristics. The biological significance of the genes and gene pairs is discussed. BioMed Central 2012-10-15 /pmc/articles/PMC3574043/ /pubmed/23066814 http://dx.doi.org/10.1186/1471-2105-13-266 Text en Copyright ©2012 Scheubert et al.; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Scheubert, Lena
Luštrek, Mitja
Schmidt, Rainer
Repsilber, Dirk
Fuellen, Georg
Tissue-based Alzheimer gene expression markers–comparison of multiple machine learning approaches and investigation of redundancy in small biomarker sets
title Tissue-based Alzheimer gene expression markers–comparison of multiple machine learning approaches and investigation of redundancy in small biomarker sets
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3574043/
https://www.ncbi.nlm.nih.gov/pubmed/23066814
http://dx.doi.org/10.1186/1471-2105-13-266