Stability SCAD: a powerful approach to detect interactions in large-scale genomic study
BACKGROUND: Evidence suggests that common complex diseases may be partially due to SNP-SNP interactions, but such detection is yet to be fully established in a high-dimensional small-sample (small-n-large-p) study. A number of penalized regression techniques are gaining popularity within the statist...
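Main authors: | Gou, Jianwei; Zhao, Yang; Wei, Yongyue; Wu, Chen; Zhang, Ruyang; Qiu, Yongyong; Zeng, Ping; Tan, Wen; Yu, Dianke; Wu, Tangchun; Hu, Zhibin; Lin, Dongxin; Shen, Hongbing; Chen, Feng |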
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3984751/ https://www.ncbi.nlm.nih.gov/pubmed/24580776 http://dx.doi.org/10.1186/1471-2105-15-62 |
_version_ | 1782311482650787840 |
---|---|
author | Gou, Jianwei Zhao, Yang Wei, Yongyue Wu, Chen Zhang, Ruyang Qiu, Yongyong Zeng, Ping Tan, Wen Yu, Dianke Wu, Tangchun Hu, Zhibin Lin, Dongxin Shen, Hongbing Chen, Feng |
author_sort | Gou, Jianwei |
collection | PubMed |
description | BACKGROUND: Evidence suggests that common complex diseases may be partially due to SNP-SNP interactions, but such detection is yet to be fully established in a high-dimensional small-sample (small-n-large-p) study. A number of penalized regression techniques are gaining popularity within the statistical community, and are now being applied to detect interactions. These techniques tend to over-fit and are prone to false positives. The recently developed stability least absolute shrinkage and selection operator ((S)LASSO) has been used to control the family-wise error rate, but often at the expense of power (and thus false negative results). RESULTS: Here, we propose an alternative stability selection procedure known as stability smoothly clipped absolute deviation ((S)SCAD). Briefly, this method applies a smoothly clipped absolute deviation (SCAD) algorithm to multiple sub-samples, and then identifies a cluster ensemble of interactions across the sub-samples. The proposed method was compared with (S)LASSO and two kinds of traditional penalized methods by intensive simulation. The simulation revealed higher power and lower false discovery rate (FDR) with (S)SCAD. An analysis using the new method on the previously published GWAS of lung cancer confirmed all significant interactions identified with (S)LASSO, and identified two additional interactions not reported with (S)LASSO analysis. CONCLUSIONS: Based on the results obtained in this study, (S)SCAD appears to be a powerful procedure for the detection of SNP-SNP interactions in large-scale genomic data. |
format | Online Article Text |
id | pubmed-3984751 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3984751 2014-04-25 Stability SCAD: a powerful approach to detect interactions in large-scale genomic study Gou, Jianwei Zhao, Yang Wei, Yongyue Wu, Chen Zhang, Ruyang Qiu, Yongyong Zeng, Ping Tan, Wen Yu, Dianke Wu, Tangchun Hu, Zhibin Lin, Dongxin Shen, Hongbing Chen, Feng BMC Bioinformatics Research Article
BioMed Central 2014-03-01 /pmc/articles/PMC3984751/ /pubmed/24580776 http://dx.doi.org/10.1186/1471-2105-15-62 Text en Copyright © 2014 Gou et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. |
title | Stability SCAD: a powerful approach to detect interactions in large-scale genomic study |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3984751/ https://www.ncbi.nlm.nih.gov/pubmed/24580776 http://dx.doi.org/10.1186/1471-2105-15-62 |
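
The description above outlines the procedure only briefly: fit SCAD on many sub-samples, then keep the interactions that recur across them. The following is a minimal sketch of that general idea, not the authors' implementation. It makes several assumptions that are not in the record: a continuous outcome with squared-error loss (the lung cancer GWAS would more plausibly call for a case-control model), interaction terms only with no main effects, a coordinate-descent solver on standardized columns, and a plain selection-frequency cutoff standing in for the cluster-ensemble step. All function names, the simulated data, and the values of `lam`, `a`, `cutoff`, and `n_subsamples` are hypothetical and chosen for illustration.

```python
import itertools
import random
import statistics


def scad_threshold(z, lam, a=3.7):
    """Closed-form SCAD update for one coefficient on a unit-variance column."""
    sign = 1.0 if z >= 0 else -1.0
    if abs(z) <= 2 * lam:            # soft-thresholding zone (LASSO-like)
        return sign * max(abs(z) - lam, 0.0)
    if abs(z) <= a * lam:            # transition zone, shrinkage tapers off
        return ((a - 1) * z - sign * a * lam) / (a - 2)
    return z                         # large effects are left unshrunk


def standardize(column):
    """Center a column and scale it to unit (population) variance."""
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column) or 1.0   # guard against constant columns
    return [(v - mean) / sd for v in column]


def scad_fit(columns, y, lam, sweeps=30):
    """Coordinate descent for SCAD-penalized least squares (assumed solver)."""
    n = len(y)
    y_mean = statistics.fmean(y)
    resid = [v - y_mean for v in y]
    beta = [0.0] * len(columns)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            z = sum(c * r for c, r in zip(col, resid)) / n + beta[j]
            new = scad_threshold(z, lam)
            if new != beta[j]:
                delta = new - beta[j]
                resid = [r - delta * c for r, c in zip(resid, col)]
                beta[j] = new
    return beta


def stability_select(genotypes, y, lam=0.3, n_subsamples=30, cutoff=0.6, seed=0):
    """Refit SCAD on random half-samples and keep frequently selected pairs."""
    rng = random.Random(seed)
    n, p = len(y), len(genotypes[0])
    pairs = list(itertools.combinations(range(p), 2))
    counts = dict.fromkeys(pairs, 0)
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)      # sub-sample without replacement
        cols = [standardize([genotypes[i][j] * genotypes[i][k] for i in idx])
                for j, k in pairs]
        beta = scad_fit(cols, [y[i] for i in idx], lam)
        for pair, b in zip(pairs, beta):
            counts[pair] += b != 0.0
    freq = {pair: c / n_subsamples for pair, c in counts.items()}
    return freq, [pair for pair in pairs if freq[pair] >= cutoff]


def simulate(n=120, p=6, maf=0.3, effect=2.0, noise_sd=0.5, seed=1):
    """Toy data: additive 0/1/2 genotypes; only SNPs 0 and 1 interact."""
    rng = random.Random(seed)
    genotypes = [[(rng.random() < maf) + (rng.random() < maf) for _ in range(p)]
                 for _ in range(n)]
    y = [effect * g[0] * g[1] + rng.gauss(0.0, noise_sd) for g in genotypes]
    return genotypes, y


genotypes, y = simulate()
freq, stable = stability_select(genotypes, y)
print("stable SNP pairs:", stable)
```

The half-sample size and frequency cutoff follow common stability-selection practice rather than anything stated in this record; in a real analysis both would need tuning, and `lam` is on the scale of the outcome here because `y` is only centered, not standardized.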