CLIN_SKAT: an R package to conduct association analysis using functionally relevant variants

BACKGROUND: The availability of next-generation sequencing data allows low-frequency and rare variants to be studied through strategies other than the commonly used genome-wide association studies (GWAS). Rare variants are important keys towards explaining the heritability for complex diseases that rem...

Bibliographic Details
Main Authors: Chattopadhyay, Amrita; Shih, Ching-Yu; Hsu, Yu-Chen; Juang, Jyh-Ming Jimmy; Chuang, Eric Y.; Lu, Tzu-Pin
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9590128/
https://www.ncbi.nlm.nih.gov/pubmed/36274122
http://dx.doi.org/10.1186/s12859-022-04987-2
_version_ 1784814448331980800
author Chattopadhyay, Amrita
Shih, Ching-Yu
Hsu, Yu-Chen
Juang, Jyh-Ming Jimmy
Chuang, Eric Y.
Lu, Tzu-Pin
author_sort Chattopadhyay, Amrita
collection PubMed
description BACKGROUND: The availability of next-generation sequencing data allows low-frequency and rare variants to be studied through strategies other than the commonly used genome-wide association studies (GWAS). Rare variants are key to explaining the portion of the heritability of complex diseases that common variants, with their low effect sizes, leave unexplained. However, analysis strategies struggle to keep up with the volume of available data, which creates a bottleneck. This study describes CLIN_SKAT, an R package that provides users with an easily implemented analysis pipeline that (i) extracts clinically relevant variants (both rare and common) and then (ii) performs gene-based association analysis by grouping the selected variants.
RESULTS: CLIN_SKAT offers four simple functions that can be used to obtain clinically relevant variants, map them to genes or gene sets, calculate weights from global healthy populations, and conduct weighted case–control analysis. CLIN_SKAT adds pre-analysis steps and customizable features to make the SKAT results more clinically meaningful. It also offers several plot functions for visualizing and interpreting the analysis results. CLIN_SKAT is available on Windows, Linux, and macOS and runs on R version 4.0.4 or later. It can be freely downloaded from https://github.com/ShihChingYu/CLIN_SKAT, installed with devtools::install_github("ShihChingYu/CLIN_SKAT", force=T), and loaded into R with library(CLIN_SKAT). All outputs (tabular and graphical) can be downloaded in simple, publishable formats.
CONCLUSIONS: Statistical association analysis is often underpowered because of small sample sizes and the large number of variants to be tested, which limits the detection of causal variants. Retaining a subset of biologically meaningful variants therefore seems to be a more effective strategy for identifying explainable associations while reducing the degrees of freedom. CLIN_SKAT offers users a one-stop R package that identifies disease risk variants with improved power through a series of tailor-made procedures that allow dimension reduction, by retaining functionally relevant variants, and incorporate ethnicity-based priors. It also eliminates the need for high computational resources and bioinformatics expertise.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04987-2.
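The weighted gene-based test summarized in the abstract can be illustrated with a minimal sketch. This is not CLIN_SKAT's code or API (the package itself is written in R); it is a generic SKAT-style score statistic in plain Python, under several assumptions: the function names (`beta_weight`, `skat_q`, `permutation_p`), the toy genotypes, and the reference allele frequencies are all hypothetical; the Beta(1, 25) weighting is the default commonly used with SKAT and stands in for whatever population-derived weights the package computes; and a permutation p-value replaces the analytical p-value that SKAT normally reports.

```python
import random

def beta_weight(maf, a=1.0, b=25.0):
    """Unnormalised Beta(a, b) density at maf; rarer variants get larger weights."""
    return maf ** (a - 1.0) * (1.0 - maf) ** (b - 1.0)

def skat_q(genotypes, phenotype, weights):
    """SKAT-style statistic: Q = sum_j (w_j * sum_i g_ij * (y_i - mean(y)))^2."""
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    resid = [y - mean_y for y in phenotype]
    q = 0.0
    for j, w in enumerate(weights):
        # Per-variant score: covariance-like sum of genotype and residual.
        score = sum(genotypes[i][j] * resid[i] for i in range(n))
        q += (w * score) ** 2
    return q

def permutation_p(genotypes, phenotype, weights, n_perm=2000, seed=0):
    """Empirical p-value obtained by shuffling case/control labels."""
    rng = random.Random(seed)
    observed = skat_q(genotypes, phenotype, weights)
    shuffled = list(phenotype)  # copy so the caller's labels are untouched
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if skat_q(genotypes, shuffled, weights) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical reference-population allele frequencies for three variants in one gene.
ref_maf = [0.005, 0.02, 0.30]
weights = [beta_weight(m) for m in ref_maf]

# Toy data: rows are subjects, columns are minor-allele counts (0, 1, or 2).
genotypes = [
    [1, 0, 1], [0, 1, 0], [1, 1, 2], [0, 0, 1],   # cases
    [0, 0, 1], [0, 0, 0], [0, 0, 2], [0, 0, 1],   # controls
]
phenotype = [1, 1, 1, 1, 0, 0, 0, 0]

q_obs, p_value = permutation_p(genotypes, phenotype, weights, n_perm=500, seed=1)
print(f"Q = {q_obs:.3f}, permutation p = {p_value:.3f}")
```

Because only the ranking of permuted statistics matters, the weights are left unnormalised here. An analysis on real data would also adjust for covariates, which this sketch omits.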
format Online
Article
Text
id pubmed-9590128
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9590128 2022-10-25 CLIN_SKAT: an R package to conduct association analysis using functionally relevant variants Chattopadhyay, Amrita Shih, Ching-Yu Hsu, Yu-Chen Juang, Jyh-Ming Jimmy Chuang, Eric Y. Lu, Tzu-Pin BMC Bioinformatics Software BioMed Central 2022-10-23 /pmc/articles/PMC9590128/ /pubmed/36274122 http://dx.doi.org/10.1186/s12859-022-04987-2 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title CLIN_SKAT: an R package to conduct association analysis using functionally relevant variants
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9590128/
https://www.ncbi.nlm.nih.gov/pubmed/36274122
http://dx.doi.org/10.1186/s12859-022-04987-2