
Penalized partial least squares for pleiotropy

BACKGROUND: The increasing number of genome-wide association studies (GWAS) has revealed several loci that are associated with multiple distinct phenotypes, suggesting the existence of pleiotropic effects. Highlighting these cross-phenotype genetic associations could help to identify and understand co...


Bibliographic Details
Main authors: Broc, Camilo, Truong, Therese, Liquet, Benoit
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7905667/
https://www.ncbi.nlm.nih.gov/pubmed/33627076
http://dx.doi.org/10.1186/s12859-021-03968-1
_version_ 1783655152507420672
author Broc, Camilo
Truong, Therese
Liquet, Benoit
author_facet Broc, Camilo
Truong, Therese
Liquet, Benoit
author_sort Broc, Camilo
collection PubMed
description BACKGROUND: The increasing number of genome-wide association studies (GWAS) has revealed several loci that are associated with multiple distinct phenotypes, suggesting the existence of pleiotropic effects. Highlighting these cross-phenotype genetic associations could help to identify and understand common biological mechanisms underlying some diseases. Common approaches test the association between genetic variants and multiple traits at the SNP level. In this paper, we propose a novel gene- and pathway-level approach for the case where several independent GWAS on independent traits are available. The method is based on a generalization of the sparse group Partial Least Squares (sgPLS) that takes groups of variables into account, together with a Lasso penalization that links all independent data sets. This method, called joint-sgPLS, is able to convincingly detect signal at both the variable level and the group level. RESULTS: Our method has the advantage of providing a global, readable model while coping with the architecture of the data. It can outperform traditional methods and provides wider insight in terms of a priori information. We compared the performance of the proposed method to other benchmark methods on simulated data and gave an example application on real data, with the aim of highlighting susceptibility variants common to breast and thyroid cancers. CONCLUSION: The joint-sgPLS shows interesting properties for detecting a signal. As an extension of PLS, the method is suited to data with a large number of variables. The choice of Lasso penalization copes with architectures of groups of variables and observation sets. Furthermore, although the method has been applied to a genetic study, its formulation is suited to any data with a high number of variables and an explicit a priori architecture in other application fields.
format Online
Article
Text
id pubmed-7905667
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7905667 2021-02-25 Penalized partial least squares for pleiotropy Broc, Camilo Truong, Therese Liquet, Benoit BMC Bioinformatics Methodology Article BioMed Central 2021-02-24 /pmc/articles/PMC7905667/ /pubmed/33627076 http://dx.doi.org/10.1186/s12859-021-03968-1 Text en © The Author(s) 2021 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Methodology Article
Broc, Camilo
Truong, Therese
Liquet, Benoit
Penalized partial least squares for pleiotropy
title Penalized partial least squares for pleiotropy
title_full Penalized partial least squares for pleiotropy
title_fullStr Penalized partial least squares for pleiotropy
title_full_unstemmed Penalized partial least squares for pleiotropy
title_short Penalized partial least squares for pleiotropy
title_sort penalized partial least squares for pleiotropy
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7905667/
https://www.ncbi.nlm.nih.gov/pubmed/33627076
http://dx.doi.org/10.1186/s12859-021-03968-1
work_keys_str_mv AT broccamilo penalizedpartialleastsquaresforpleiotropy
AT truongtherese penalizedpartialleastsquaresforpleiotropy
AT liquetbenoit penalizedpartialleastsquaresforpleiotropy