
Stability SCAD: a powerful approach to detect interactions in large-scale genomic study

BACKGROUND: Evidence suggests that common complex diseases may be partially due to SNP-SNP interactions, but such detection is yet to be fully established in a high-dimensional small-sample (small-n-large-p) study. A number of penalized regression techniques are gaining popularity within the statist...


Bibliographic Details
Main Authors: Gou, Jianwei, Zhao, Yang, Wei, Yongyue, Wu, Chen, Zhang, Ruyang, Qiu, Yongyong, Zeng, Ping, Tan, Wen, Yu, Dianke, Wu, Tangchun, Hu, Zhibin, Lin, Dongxin, Shen, Hongbing, Chen, Feng
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3984751/
https://www.ncbi.nlm.nih.gov/pubmed/24580776
http://dx.doi.org/10.1186/1471-2105-15-62
author Gou, Jianwei
Zhao, Yang
Wei, Yongyue
Wu, Chen
Zhang, Ruyang
Qiu, Yongyong
Zeng, Ping
Tan, Wen
Yu, Dianke
Wu, Tangchun
Hu, Zhibin
Lin, Dongxin
Shen, Hongbing
Chen, Feng
author_sort Gou, Jianwei
collection PubMed
description BACKGROUND: Evidence suggests that common complex diseases may be partially due to SNP-SNP interactions, but detecting such interactions is not yet fully established in high-dimensional, small-sample (small-n-large-p) studies. A number of penalized regression techniques are gaining popularity within the statistical community, and are now being applied to detect interactions. These techniques tend to overfit and are prone to false positives. The recently developed stability least absolute shrinkage and selection operator ((S)LASSO) has been used to control the family-wise error rate, but often at the expense of power (and thus with more false-negative results). RESULTS: Here, we propose an alternative stability selection procedure known as stability smoothly clipped absolute deviation ((S)SCAD). Briefly, this method applies a smoothly clipped absolute deviation (SCAD) algorithm to multiple sub-samples, and then identifies a cluster ensemble of interactions across the sub-samples. The proposed method was compared with (S)LASSO and two kinds of traditional penalized methods by intensive simulation. The simulation revealed higher power and a lower false discovery rate (FDR) with (S)SCAD. An analysis using the new method on the previously published GWAS of lung cancer confirmed all significant interactions identified with (S)LASSO, and identified two additional interactions not reported in the (S)LASSO analysis. CONCLUSIONS: Based on the results obtained in this study, (S)SCAD appears to be a powerful procedure for the detection of SNP-SNP interactions in large-scale genomic data.
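The sub-sampling idea in the description can be illustrated with a small Python sketch. It is not the authors' implementation, and it rests on several simplifying assumptions: the phenotype is continuous and fitted by least squares (the lung cancer GWAS mentioned above would call for a case-control model), only pairwise product terms are fitted with no main effects, and the "cluster ensemble" step is replaced by a plain selection-frequency cutoff. All function names, the cutoff of 0.8, the tuning value `lam=0.3` and the simulated data are hypothetical. The thresholding rule is the closed-form SCAD solution of Fan and Li for a standardized predictor with `a = 3.7`.

```python
import itertools
import random


def scad_threshold(z, lam, a=3.7):
    """Closed-form SCAD solution for one standardized predictor."""
    sign = 1.0 if z >= 0 else -1.0
    if abs(z) <= 2 * lam:
        return sign * max(abs(z) - lam, 0.0)              # soft-thresholding zone
    if abs(z) <= a * lam:
        return ((a - 1) * z - sign * a * lam) / (a - 2)   # transition zone
    return z                                              # large effects are not shrunk


def interaction_columns(G):
    """Return SNP index pairs and the matching product columns (column-major)."""
    pairs = list(itertools.combinations(range(len(G[0])), 2))
    return pairs, [[row[i] * row[j] for row in G] for i, j in pairs]


def standardize(col):
    """Center a column and scale it to unit mean square."""
    n = len(col)
    mean = sum(col) / n
    centered = [v - mean for v in col]
    scale = (sum(v * v for v in centered) / n) ** 0.5 or 1.0
    return [v / scale for v in centered]


def scad_fit(cols, y, lam, a=3.7, sweeps=100, tol=1e-8):
    """Coordinate descent for SCAD-penalized least squares (assumed simplification)."""
    n = len(y)
    cols = [standardize(c) for c in cols]
    ybar = sum(y) / n
    resid = [v - ybar for v in y]
    beta = [0.0] * len(cols)
    for _ in range(sweeps):
        biggest = 0.0
        for j, col in enumerate(cols):
            z = sum(c * r for c, r in zip(col, resid)) / n + beta[j]
            new = scad_threshold(z, lam, a)
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, col)]
                beta[j] = new
                biggest = max(biggest, abs(delta))
        if biggest < tol:   # stop once a full sweep changes nothing noticeable
            break
    return beta


def stability_scad(G, y, lam, n_subsamples=30, cutoff=0.8, seed=0):
    """Fit SCAD on random half-samples and keep pairs selected often enough."""
    rng = random.Random(seed)
    n = len(y)
    pairs, cols = interaction_columns(G)
    counts = [0] * len(pairs)
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)   # half-sample without replacement
        beta = scad_fit([[c[i] for i in idx] for c in cols], [y[i] for i in idx], lam)
        for j, b in enumerate(beta):
            if b != 0.0:
                counts[j] += 1
    freq = {pair: counts[j] / n_subsamples for j, pair in enumerate(pairs)}
    stable = [pair for pair in pairs if freq[pair] >= cutoff]
    return stable, freq


def simulate(n=200, n_snps=4, seed=1):
    """Toy genotypes (0/1/2) with one true interaction between SNP 0 and SNP 1."""
    rng = random.Random(seed)
    G = [[rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_snps)] for _ in range(n)]
    y = [2.0 * row[0] * row[1] + rng.gauss(0.0, 0.5) for row in G]
    return G, y


if __name__ == "__main__":
    G, y = simulate()
    stable, freq = stability_scad(G, y, lam=0.3)
    print("stable pairs:", stable)
    print("selection frequencies:", freq)
```

In this sketch a pair counts as stable when its coefficient is nonzero in at least 80% of the half-sample fits. Raising `cutoff` makes the selection stricter, which trades power for fewer false positives.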
format Online
Article
Text
id pubmed-3984751
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3984751 2014-04-25 BMC Bioinformatics Research Article
BioMed Central 2014-03-01 /pmc/articles/PMC3984751/ /pubmed/24580776 http://dx.doi.org/10.1186/1471-2105-15-62 Text en Copyright © 2014 Gou et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
title Stability SCAD: a powerful approach to detect interactions in large-scale genomic study
topic Research Article