
Genome wide association studies in presence of misclassified binary responses

BACKGROUND: Misclassification has been shown to have a high prevalence in binary responses in both livestock and human populations. Leaving these errors uncorrected before analyses will have a negative impact on the overall goal of genome-wide association studies (GWAS) including reducing predictive...


Bibliographic Details
Main Authors:	Smith, Shannon; Hay, El Hamidi; Farhat, Nourhene; Rekaya, Romdhane
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2013
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3879434/ https://www.ncbi.nlm.nih.gov/pubmed/24369108 http://dx.doi.org/10.1186/1471-2156-14-124

author	Smith, Shannon; Hay, El Hamidi; Farhat, Nourhene; Rekaya, Romdhane
collection	PubMed
description	BACKGROUND: Misclassification has been shown to have a high prevalence in binary responses in both livestock and human populations. Leaving these errors uncorrected before analysis undermines the overall goal of genome-wide association studies (GWAS), including by reducing predictive power. A liability threshold model that accounts for misclassification was developed to assess the effects of diagnostic errors on GWAS. Four simulated scenarios of case–control datasets were generated. Each dataset consisted of 2000 individuals and was analyzed with varying odds ratios of the influential SNPs and misclassification rates of 5% and 10%.
RESULTS: Analyses of binary responses subject to misclassification underestimated the effects of influential SNPs and failed to estimate the true magnitude and direction of those effects. Once the misclassification algorithm was applied, there was a 12% to 29% increase in accuracy and a substantial reduction in bias. The proposed method was able to capture the majority of the most significant SNPs that were not identified in the analysis of the misclassified data. In one of the simulation scenarios, 33% of the influential SNPs were not identified using the misclassified data, compared with the analysis using the data without misclassification. With the proposed method, only 13% were not identified. Furthermore, the proposed method was able to identify with high probability a large portion of the truly misclassified observations.
CONCLUSIONS: The proposed model provides a statistical tool to correct, or at least attenuate, the negative effects of misclassified binary responses in GWAS. The model proved to be robust across different levels of misclassification probability and different odds ratios of the significant SNPs. SNP effects and the misclassification probability were accurately estimated, and the truly misclassified observations were identified with high probabilities compared to non-misclassified responses. This study was limited to situations where the misclassification probability was assumed to be the same in cases and controls, which is not always the case in real human disease data. It is therefore of interest to evaluate the performance of the proposed model in that situation, which is the current focus of our research.
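The attenuation described in the abstract can be illustrated with a minimal sketch. It is not the authors' liability threshold model; it only assumes, as the abstract does, that labels are flipped with the same probability in cases and controls, and it treats the data as a simple random sample rather than a case–control design. The function names, the baseline risk of 0.3, and the odds ratio of 2.0 are hypothetical values chosen for illustration.

```python
import random


def flip_labels(labels, rate, rng):
    """Flip each binary label independently with probability `rate`
    (same rate for cases and controls, an assumption taken from the abstract)."""
    return [1 - y if rng.random() < rate else y for y in labels]


def observed_risk(true_risk, rate):
    """Probability of observing a 'case' label after symmetric misclassification."""
    return true_risk * (1 - rate) + (1 - true_risk) * rate


def odds(p):
    return p / (1 - p)


def attenuated_odds_ratio(true_or, base_risk, rate):
    """Odds ratio expected in the misclassified labels, given the true odds
    ratio of a risk allele and the true risk among non-carriers."""
    carrier_odds = odds(base_risk) * true_or
    carrier_risk = carrier_odds / (1 + carrier_odds)
    return odds(observed_risk(carrier_risk, rate)) / odds(observed_risk(base_risk, rate))


if __name__ == "__main__":
    rng = random.Random(0)
    # 2000 individuals, matching the dataset size quoted in the abstract.
    labels = [int(rng.random() < 0.3) for _ in range(2000)]
    noisy = flip_labels(labels, 0.10, rng)
    print(sum(a != b for a, b in zip(labels, noisy)), "labels flipped")
    # Misclassification rates of 5% and 10% are the ones quoted in the abstract.
    for rate in (0.0, 0.05, 0.10):
        print(rate, round(attenuated_odds_ratio(2.0, 0.3, rate), 3))
```

Under these assumptions the expected odds ratio moves toward 1 as the misclassification rate grows, which is one way to see why uncorrected labels lead to underestimated SNP effects.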
format	Online Article Text
id	pubmed-3879434
institution	National Center for Biotechnology Information
language	English
publishDate	2013
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3879434 2014-01-09 Genome wide association studies in presence of misclassified binary responses Smith, Shannon; Hay, El Hamidi; Farhat, Nourhene; Rekaya, Romdhane BMC Genet Research Article BioMed Central 2013-12-26 /pmc/articles/PMC3879434/ /pubmed/24369108 http://dx.doi.org/10.1186/1471-2156-14-124 Text en Copyright © 2013 Smith et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Genome wide association studies in presence of misclassified binary responses
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3879434/ https://www.ncbi.nlm.nih.gov/pubmed/24369108 http://dx.doi.org/10.1186/1471-2156-14-124
