A scan statistic to extract causal gene clusters from case-control genome-wide rare CNV data


Bibliographic Details
Main authors: Nishiyama, Takeshi; Takahashi, Kunihiko; Tango, Toshiro; Pinto, Dalila; Scherer, Stephen W; Takami, Satoshi; Kishino, Hirohisa
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects: Methodology Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3130692/
https://www.ncbi.nlm.nih.gov/pubmed/21612662
http://dx.doi.org/10.1186/1471-2105-12-205
collection PubMed
description BACKGROUND: Several statistical tests have been developed for analyzing genome-wide association data by incorporating gene pathway information in terms of gene sets. Using these methods, hundreds of gene sets are typically tested, and the tested gene sets often overlap. This overlapping greatly increases the probability of generating false positives, and the results obtained are difficult to interpret, particularly when many gene sets show statistical significance.
RESULTS: We propose a flexible statistical framework to circumvent these problems. Inspired by spatial scan statistics for detecting clustering of disease occurrence in the field of epidemiology, we developed a scan statistic to extract disease-associated gene clusters from a whole gene pathway. Extracting one or a few significant gene clusters from a global pathway limits the overall false positive probability, which results in increased statistical power, and facilitates the interpretation of test results. In the present study, we applied our method to genome-wide association data for rare copy-number variations, which have been strongly implicated in common diseases. Application of our method to a simulated dataset demonstrated the high accuracy of this method in detecting disease-associated gene clusters in a whole gene pathway.
CONCLUSIONS: The scan statistic approach proposed here shows a high level of accuracy in detecting gene clusters in a whole gene pathway. This study has provided a sound statistical framework for analyzing genome-wide rare CNV data by incorporating topological information on the gene pathway.
format Online
Article
Text
id pubmed-3130692
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3130692 2011-07-07 BMC Bioinformatics Methodology Article BioMed Central 2011-05-26 /pmc/articles/PMC3130692/ /pubmed/21612662 http://dx.doi.org/10.1186/1471-2105-12-205 Text en
Copyright ©2011 Nishiyama et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A scan statistic to extract causal gene clusters from case-control genome-wide rare CNV data
topic Methodology Article