Performance of model-based multifactor dimensionality reduction methods for epistasis detection by controlling population structure

BACKGROUND: In genome-wide association studies the extent and impact of confounding due to population structure have been well recognized. Inadequate handling of such confounding is likely to lead to spurious associations, hampering replication, and the identification of causal variants. Several str...

Bibliographic Details
Main Authors: Abegaz, Fentaw; Van Lishout, François; Mahachie John, Jestinah M.; Chiachoompu, Kridsadakorn; Bhardwaj, Archana; Duroux, Diane; Gusareva, Elena S.; Wei, Zhi; Hakonarson, Hakon; Van Steen, Kristel
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7893746/
https://www.ncbi.nlm.nih.gov/pubmed/33608043
http://dx.doi.org/10.1186/s13040-021-00247-w
_version_ 1783653108358840320
author Abegaz, Fentaw
Van Lishout, François
Mahachie John, Jestinah M.
Chiachoompu, Kridsadakorn
Bhardwaj, Archana
Duroux, Diane
Gusareva, Elena S.
Wei, Zhi
Hakonarson, Hakon
Van Steen, Kristel
author_facet Abegaz, Fentaw
Van Lishout, François
Mahachie John, Jestinah M.
Chiachoompu, Kridsadakorn
Bhardwaj, Archana
Duroux, Diane
Gusareva, Elena S.
Wei, Zhi
Hakonarson, Hakon
Van Steen, Kristel
author_sort Abegaz, Fentaw
collection PubMed
description BACKGROUND: In genome-wide association studies, the extent and impact of confounding due to population structure have been well recognized. Inadequate handling of such confounding is likely to lead to spurious associations, hampering replication and the identification of causal variants. Several strategies have been developed for protecting associations against confounding; the most popular is based on Principal Component Analysis. In contrast, the extent and impact of confounding due to population structure in gene-gene interaction (epistasis) association studies are much less investigated and understood. In particular, the role of nonlinear genetic population substructure in epistasis detection is largely under-investigated, especially outside a regression framework. METHODS: To identify causal variants in synergy and to improve the interpretability and replicability of epistasis results, we introduce three strategies based on a model-based multifactor dimensionality reduction approach for structured populations, namely MBMDR-PC, MBMDR-PG, and MBMDR-GC. RESULTS: Simulation results comparing the performance of various approaches show that, in the presence of population structure, MBMDR-PC and MBMDR-PG consistently control the type I error rate at the nominal level better than MBMDR-GC. Moreover, all three proposed methods of population structure correction outperform MDR-SP in terms of statistical power. CONCLUSION: We demonstrate through extensive simulation studies the effect of various degrees of genetic population structure and relatedness on epistasis detection and propose appropriate remedial measures based on linear and nonlinear sample genetic similarity. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13040-021-00247-w.
format Online
Article
Text
id pubmed-7893746
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-78937462021-02-22 Performance of model-based multifactor dimensionality reduction methods for epistasis detection by controlling population structure Abegaz, Fentaw Van Lishout, François Mahachie John, Jestinah M. Chiachoompu, Kridsadakorn Bhardwaj, Archana Duroux, Diane Gusareva, Elena S. Wei, Zhi Hakonarson, Hakon Van Steen, Kristel BioData Min Methodology BACKGROUND: In genome-wide association studies the extent and impact of confounding due to population structure have been well recognized. Inadequate handling of such confounding is likely to lead to spurious associations, hampering replication, and the identification of causal variants. Several strategies have been developed for protecting associations against confounding, the most popular one is based on Principal Component Analysis. In contrast, the extent and impact of confounding due to population structure in gene-gene interaction association epistasis studies are much less investigated and understood. In particular, the role of nonlinear genetic population substructure in epistasis detection is largely under-investigated, especially outside a regression framework. METHODS: To identify causal variants in synergy, to improve interpretability and replicability of epistasis results, we introduce three strategies based on a model-based multifactor dimensionality reduction approach for structured populations, namely MBMDR-PC, MBMDR-PG, and MBMDR-GC. RESULTS: Simulation results comparing the performance of various approaches show that in the presence of population structure MBMDR-PC and MBMDR-PG consistently better control type I error rate at the nominal level than MBMDR-GC. Moreover, our proposed three methods of population structure correction outperform MDR-SP in terms of statistical power. 
CONCLUSION: We demonstrate through extensive simulation studies the effect of various degrees of genetic population structure and relatedness on epistasis detection and propose appropriate remedial measures based on linear and nonlinear sample genetic similarity. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13040-021-00247-w. BioMed Central 2021-02-19 /pmc/articles/PMC7893746/ /pubmed/33608043 http://dx.doi.org/10.1186/s13040-021-00247-w Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Methodology
Abegaz, Fentaw
Van Lishout, François
Mahachie John, Jestinah M.
Chiachoompu, Kridsadakorn
Bhardwaj, Archana
Duroux, Diane
Gusareva, Elena S.
Wei, Zhi
Hakonarson, Hakon
Van Steen, Kristel
Performance of model-based multifactor dimensionality reduction methods for epistasis detection by controlling population structure
title Performance of model-based multifactor dimensionality reduction methods for epistasis detection by controlling population structure
title_full Performance of model-based multifactor dimensionality reduction methods for epistasis detection by controlling population structure
title_fullStr Performance of model-based multifactor dimensionality reduction methods for epistasis detection by controlling population structure
title_full_unstemmed Performance of model-based multifactor dimensionality reduction methods for epistasis detection by controlling population structure
title_short Performance of model-based multifactor dimensionality reduction methods for epistasis detection by controlling population structure
title_sort performance of model-based multifactor dimensionality reduction methods for epistasis detection by controlling population structure
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7893746/
https://www.ncbi.nlm.nih.gov/pubmed/33608043
http://dx.doi.org/10.1186/s13040-021-00247-w
work_keys_str_mv AT abegazfentaw performanceofmodelbasedmultifactordimensionalityreductionmethodsforepistasisdetectionbycontrollingpopulationstructure
AT vanlishoutfrancois performanceofmodelbasedmultifactordimensionalityreductionmethodsforepistasisdetectionbycontrollingpopulationstructure
AT mahachiejohnjestinahm performanceofmodelbasedmultifactordimensionalityreductionmethodsforepistasisdetectionbycontrollingpopulationstructure
AT chiachoompukridsadakorn performanceofmodelbasedmultifactordimensionalityreductionmethodsforepistasisdetectionbycontrollingpopulationstructure
AT bhardwajarchana performanceofmodelbasedmultifactordimensionalityreductionmethodsforepistasisdetectionbycontrollingpopulationstructure
AT durouxdiane performanceofmodelbasedmultifactordimensionalityreductionmethodsforepistasisdetectionbycontrollingpopulationstructure
AT gusarevaelenas performanceofmodelbasedmultifactordimensionalityreductionmethodsforepistasisdetectionbycontrollingpopulationstructure
AT weizhi performanceofmodelbasedmultifactordimensionalityreductionmethodsforepistasisdetectionbycontrollingpopulationstructure
AT hakonarsonhakon performanceofmodelbasedmultifactordimensionalityreductionmethodsforepistasisdetectionbycontrollingpopulationstructure
AT vansteenkristel performanceofmodelbasedmultifactordimensionalityreductionmethodsforepistasisdetectionbycontrollingpopulationstructure