The application of network label propagation to rank biomarkers in genome-wide Alzheimer’s data

BACKGROUND: Ranking and identifying biomarkers that are associated with disease from genome-wide measurements holds significant promise for understanding the genetic basis of common diseases. The large number of single nucleotide polymorphisms (SNPs) in genome-wide studies (GWAS), however, makes thi...

Bibliographic Details
Main Authors:	Stokes, Matthew E, Barmada, M Michael, Kamboh, M Ilyas, Visweswaran, Shyam
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2014
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4234455/ https://www.ncbi.nlm.nih.gov/pubmed/24731236 http://dx.doi.org/10.1186/1471-2164-15-282

_version_	1782344863270830080
author	Stokes, Matthew E Barmada, M Michael Kamboh, M Ilyas Visweswaran, Shyam
author_facet	Stokes, Matthew E Barmada, M Michael Kamboh, M Ilyas Visweswaran, Shyam
author_sort	Stokes, Matthew E
collection	PubMed
description	BACKGROUND: Ranking and identifying biomarkers that are associated with disease from genome-wide measurements holds significant promise for understanding the genetic basis of common diseases. The large number of single nucleotide polymorphisms (SNPs) in genome-wide studies (GWAS), however, makes this task computationally challenging when the ranking is to be done in a multivariate fashion. This paper evaluates the performance of a multivariate graph-based method called label propagation (LP) that efficiently ranks SNPs in genome-wide data. RESULTS: The performance of LP was evaluated on a synthetic dataset and two late onset Alzheimer’s disease (LOAD) genome-wide datasets, and the performance was compared to that of three control methods. The control methods included chi squared, which is a commonly used univariate method, as well as a Relief method called SWRF and a sparse logistic regression (SLR) method, which are both multivariate ranking methods. Performance was measured by evaluating the top-ranked SNPs in terms of classification performance, reproducibility between the two datasets, and prior evidence of being associated with LOAD. On the synthetic data LP performed comparably to the control methods. On GWAS data, LP performed significantly better than chi squared and SWRF in classification performance in the range from 10 to 1000 top-ranked SNPs for both datasets, and not significantly different from SLR. LP also had greater ranking reproducibility than chi squared, SWRF, and SLR. Among the 25 top-ranked SNPs that were identified by LP, there were 14 SNPs in one dataset that had evidence in the literature of being associated with LOAD, and 10 SNPs in the other, which was higher than for the other methods. CONCLUSION: LP performed considerably better in ranking SNPs in two high-dimensional genome-wide datasets when compared to three control methods. It had better performance in the evaluation measures we used, and is computationally efficient to be applied practically to data from genome-wide studies. These results provide support for including LP in the methods that are used to rank SNPs in genome-wide datasets.
format	Online Article Text
id	pubmed-4234455
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-42344552014-11-19 The application of network label propagation to rank biomarkers in genome-wide Alzheimer’s data Stokes, Matthew E Barmada, M Michael Kamboh, M Ilyas Visweswaran, Shyam BMC Genomics Methodology Article BACKGROUND: Ranking and identifying biomarkers that are associated with disease from genome-wide measurements holds significant promise for understanding the genetic basis of common diseases. The large number of single nucleotide polymorphisms (SNPs) in genome-wide studies (GWAS), however, makes this task computationally challenging when the ranking is to be done in a multivariate fashion. This paper evaluates the performance of a multivariate graph-based method called label propagation (LP) that efficiently ranks SNPs in genome-wide data. RESULTS: The performance of LP was evaluated on a synthetic dataset and two late onset Alzheimer’s disease (LOAD) genome-wide datasets, and the performance was compared to that of three control methods. The control methods included chi squared, which is a commonly used univariate method, as well as a Relief method called SWRF and a sparse logistic regression (SLR) method, which are both multivariate ranking methods. Performance was measured by evaluating the top-ranked SNPs in terms of classification performance, reproducibility between the two datasets, and prior evidence of being associated with LOAD. On the synthetic data LP performed comparably to the control methods. On GWAS data, LP performed significantly better than chi squared and SWRF in classification performance in the range from 10 to 1000 top-ranked SNPs for both datasets, and not significantly different from SLR. LP also had greater ranking reproducibility than chi squared, SWRF, and SLR. Among the 25 top-ranked SNPs that were identified by LP, there were 14 SNPs in one dataset that had evidence in the literature of being associated with LOAD, and 10 SNPs in the other, which was higher than for the other methods. CONCLUSION: LP performed considerably better in ranking SNPs in two high-dimensional genome-wide datasets when compared to three control methods. It had better performance in the evaluation measures we used, and is computationally efficient to be applied practically to data from genome-wide studies. These results provide support for including LP in the methods that are used to rank SNPs in genome-wide datasets. BioMed Central 2014-04-14 /pmc/articles/PMC4234455/ /pubmed/24731236 http://dx.doi.org/10.1186/1471-2164-15-282 Text en Copyright © 2014 Stokes et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
spellingShingle	Methodology Article Stokes, Matthew E Barmada, M Michael Kamboh, M Ilyas Visweswaran, Shyam The application of network label propagation to rank biomarkers in genome-wide Alzheimer’s data
title	The application of network label propagation to rank biomarkers in genome-wide Alzheimer’s data
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4234455/ https://www.ncbi.nlm.nih.gov/pubmed/24731236 http://dx.doi.org/10.1186/1471-2164-15-282
