A rare variant analysis framework using public genotype summary counts to prioritize disease-predisposition genes
Sequencing cases without matched healthy controls hinders prioritization of germline disease-predisposition genes. To circumvent this problem, genotype summary counts from public data sets can serve as controls. However, systematic inflation and false positives can arise if confounding factors are n...
Main authors: | Chen, Wenan; Wang, Shuoguo; Tithi, Saima Sultana; Ellison, David W.; Schaid, Daniel J.; Wu, Gang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9095601/ https://www.ncbi.nlm.nih.gov/pubmed/35545612 http://dx.doi.org/10.1038/s41467-022-30248-0 |
_version_ | 1784705791468503040 |
---|---|
author | Chen, Wenan Wang, Shuoguo Tithi, Saima Sultana Ellison, David W. Schaid, Daniel J. Wu, Gang |
author_facet | Chen, Wenan Wang, Shuoguo Tithi, Saima Sultana Ellison, David W. Schaid, Daniel J. Wu, Gang |
author_sort | Chen, Wenan |
collection | PubMed |
description | Sequencing cases without matched healthy controls hinders prioritization of germline disease-predisposition genes. To circumvent this problem, genotype summary counts from public data sets can serve as controls. However, systematic inflation and false positives can arise if confounding factors are not controlled. We propose a framework, consistent summary counts based rare variant burden test (CoCoRV), to address these challenges. CoCoRV implements consistent variant quality control and filtering, ethnicity-stratified rare variant association test, accurate estimation of inflation factors, powerful FDR control, and detection of rare variant pairs in high linkage disequilibrium. When we applied CoCoRV to pediatric cancer cohorts, the top genes identified were cancer-predisposition genes. We also applied CoCoRV to identify disease-predisposition genes in adult brain tumors and amyotrophic lateral sclerosis. Given that potential confounding factors were well controlled after applying the framework, CoCoRV provides a cost-effective solution to prioritizing disease-risk genes enriched with rare pathogenic variants. |
format | Online Article Text |
id | pubmed-9095601 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9095601 2022-05-13 A rare variant analysis framework using public genotype summary counts to prioritize disease-predisposition genes Chen, Wenan Wang, Shuoguo Tithi, Saima Sultana Ellison, David W. Schaid, Daniel J. Wu, Gang Nat Commun Article Sequencing cases without matched healthy controls hinders prioritization of germline disease-predisposition genes. To circumvent this problem, genotype summary counts from public data sets can serve as controls. However, systematic inflation and false positives can arise if confounding factors are not controlled. We propose a framework, consistent summary counts based rare variant burden test (CoCoRV), to address these challenges. CoCoRV implements consistent variant quality control and filtering, ethnicity-stratified rare variant association test, accurate estimation of inflation factors, powerful FDR control, and detection of rare variant pairs in high linkage disequilibrium. When we applied CoCoRV to pediatric cancer cohorts, the top genes identified were cancer-predisposition genes. We also applied CoCoRV to identify disease-predisposition genes in adult brain tumors and amyotrophic lateral sclerosis. Given that potential confounding factors were well controlled after applying the framework, CoCoRV provides a cost-effective solution to prioritizing disease-risk genes enriched with rare pathogenic variants. Nature Publishing Group UK 2022-05-11 /pmc/articles/PMC9095601/ /pubmed/35545612 http://dx.doi.org/10.1038/s41467-022-30248-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Chen, Wenan Wang, Shuoguo Tithi, Saima Sultana Ellison, David W. Schaid, Daniel J. Wu, Gang A rare variant analysis framework using public genotype summary counts to prioritize disease-predisposition genes |
title | A rare variant analysis framework using public genotype summary counts to prioritize disease-predisposition genes |
title_full | A rare variant analysis framework using public genotype summary counts to prioritize disease-predisposition genes |
title_fullStr | A rare variant analysis framework using public genotype summary counts to prioritize disease-predisposition genes |
title_full_unstemmed | A rare variant analysis framework using public genotype summary counts to prioritize disease-predisposition genes |
title_short | A rare variant analysis framework using public genotype summary counts to prioritize disease-predisposition genes |
title_sort | rare variant analysis framework using public genotype summary counts to prioritize disease-predisposition genes |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9095601/ https://www.ncbi.nlm.nih.gov/pubmed/35545612 http://dx.doi.org/10.1038/s41467-022-30248-0 |