A rare variant analysis framework using public genotype summary counts to prioritize disease-predisposition genes

Bibliographic Details
Main authors: Chen, Wenan; Wang, Shuoguo; Tithi, Saima Sultana; Ellison, David W.; Schaid, Daniel J.; Wu, Gang
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9095601/
https://www.ncbi.nlm.nih.gov/pubmed/35545612
http://dx.doi.org/10.1038/s41467-022-30248-0
collection PubMed
description Sequencing cases without matched healthy controls hinders prioritization of germline disease-predisposition genes. To circumvent this problem, genotype summary counts from public data sets can serve as controls. However, systematic inflation and false positives can arise if confounding factors are not controlled. We propose a framework, consistent summary counts based rare variant burden test (CoCoRV), to address these challenges. CoCoRV implements consistent variant quality control and filtering, ethnicity-stratified rare variant association test, accurate estimation of inflation factors, powerful FDR control, and detection of rare variant pairs in high linkage disequilibrium. When we applied CoCoRV to pediatric cancer cohorts, the top genes identified were cancer-predisposition genes. We also applied CoCoRV to identify disease-predisposition genes in adult brain tumors and amyotrophic lateral sclerosis. Given that potential confounding factors were well controlled after applying the framework, CoCoRV provides a cost-effective solution to prioritizing disease-risk genes enriched with rare pathogenic variants.
id pubmed-9095601
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-9095601 2022-05-13 Nat Commun Article Nature Publishing Group UK 2022-05-11 Text en © The Author(s) 2022
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/.