Omics-informed CNV calls reduce false-positive rates and improve power for CNV-trait associations

Copy-number variations (CNV) are believed to play an important role in a wide range of complex traits, but discovering such associations remains challenging. While whole-genome sequencing (WGS) is the gold-standard approach for CNV detection, there are several orders of magnitude more samples with a...

Bibliographic Details
Main authors: Lepamets, Maarja, Auwerx, Chiara, Nõukas, Margit, Claringbould, Annique, Porcu, Eleonora, Kals, Mart, Jürgenson, Tuuli, Morris, Andrew Paul, Võsa, Urmo, Bochud, Murielle, Stringhini, Silvia, Wijmenga, Cisca, Franke, Lude, Peterson, Hedi, Vilo, Jaak, Lepik, Kaido, Mägi, Reedik, Kutalik, Zoltán
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9399386/
https://www.ncbi.nlm.nih.gov/pubmed/36035246
http://dx.doi.org/10.1016/j.xhgg.2022.100133
_version_ 1784772508722921472
author Lepamets, Maarja
Auwerx, Chiara
Nõukas, Margit
Claringbould, Annique
Porcu, Eleonora
Kals, Mart
Jürgenson, Tuuli
Morris, Andrew Paul
Võsa, Urmo
Bochud, Murielle
Stringhini, Silvia
Wijmenga, Cisca
Franke, Lude
Peterson, Hedi
Vilo, Jaak
Lepik, Kaido
Mägi, Reedik
Kutalik, Zoltán
author_facet Lepamets, Maarja
Auwerx, Chiara
Nõukas, Margit
Claringbould, Annique
Porcu, Eleonora
Kals, Mart
Jürgenson, Tuuli
Morris, Andrew Paul
Võsa, Urmo
Bochud, Murielle
Stringhini, Silvia
Wijmenga, Cisca
Franke, Lude
Peterson, Hedi
Vilo, Jaak
Lepik, Kaido
Mägi, Reedik
Kutalik, Zoltán
author_sort Lepamets, Maarja
collection PubMed
description Copy-number variations (CNV) are believed to play an important role in a wide range of complex traits, but discovering such associations remains challenging. While whole-genome sequencing (WGS) is the gold-standard approach for CNV detection, there are several orders of magnitude more samples with available genotyping microarray data. Such array data can be exploited for CNV detection using dedicated software (e.g., PennCNV); however, these calls suffer from elevated false-positive and -negative rates. In this study, we developed a CNV quality score that weights PennCNV calls (pCNVs) based on their likelihood of being true positive. First, we established a measure of pCNV reliability by leveraging evidence from multiple omics data (WGS, transcriptomics, and methylomics) obtained from the same samples. Next, we built a predictor of omics-confirmed pCNVs, termed omics-informed quality score (OQS), using only PennCNV software output parameters. Promisingly, OQS assigned to pCNVs detected in close family members was up to 35% higher than the OQS of pCNVs not carried by other relatives (p < 3.0 × 10^−90), outperforming other scores. Finally, in an association study of four anthropometric traits in 89,516 Estonian Biobank samples, the use of OQS led to a relative increase in the trait variance explained by CNVs of up to 56% compared with published quality filtering methods or scores. Overall, we put forward a flexible framework to improve any CNV detection method leveraging multi-omics evidence, applied it to improve PennCNV calls, and demonstrated its utility by improving the statistical power for downstream association analyses.
format Online
Article
Text
id pubmed-9399386
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-93993862022-08-25 Omics-informed CNV calls reduce false-positive rates and improve power for CNV-trait associations Lepamets, Maarja Auwerx, Chiara Nõukas, Margit Claringbould, Annique Porcu, Eleonora Kals, Mart Jürgenson, Tuuli Morris, Andrew Paul Võsa, Urmo Bochud, Murielle Stringhini, Silvia Wijmenga, Cisca Franke, Lude Peterson, Hedi Vilo, Jaak Lepik, Kaido Mägi, Reedik Kutalik, Zoltán HGG Adv Article Copy-number variations (CNV) are believed to play an important role in a wide range of complex traits, but discovering such associations remains challenging. While whole-genome sequencing (WGS) is the gold-standard approach for CNV detection, there are several orders of magnitude more samples with available genotyping microarray data. Such array data can be exploited for CNV detection using dedicated software (e.g., PennCNV); however, these calls suffer from elevated false-positive and -negative rates. In this study, we developed a CNV quality score that weights PennCNV calls (pCNVs) based on their likelihood of being true positive. First, we established a measure of pCNV reliability by leveraging evidence from multiple omics data (WGS, transcriptomics, and methylomics) obtained from the same samples. Next, we built a predictor of omics-confirmed pCNVs, termed omics-informed quality score (OQS), using only PennCNV software output parameters. Promisingly, OQS assigned to pCNVs detected in close family members was up to 35% higher than the OQS of pCNVs not carried by other relatives (p < 3.0 × 10^−90), outperforming other scores. Finally, in an association study of four anthropometric traits in 89,516 Estonian Biobank samples, the use of OQS led to a relative increase in the trait variance explained by CNVs of up to 56% compared with published quality filtering methods or scores. 
Overall, we put forward a flexible framework to improve any CNV detection method leveraging multi-omics evidence, applied it to improve PennCNV calls, and demonstrated its utility by improving the statistical power for downstream association analyses. Elsevier 2022-08-01 /pmc/articles/PMC9399386/ /pubmed/36035246 http://dx.doi.org/10.1016/j.xhgg.2022.100133 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Article
Lepamets, Maarja
Auwerx, Chiara
Nõukas, Margit
Claringbould, Annique
Porcu, Eleonora
Kals, Mart
Jürgenson, Tuuli
Morris, Andrew Paul
Võsa, Urmo
Bochud, Murielle
Stringhini, Silvia
Wijmenga, Cisca
Franke, Lude
Peterson, Hedi
Vilo, Jaak
Lepik, Kaido
Mägi, Reedik
Kutalik, Zoltán
Omics-informed CNV calls reduce false-positive rates and improve power for CNV-trait associations
title Omics-informed CNV calls reduce false-positive rates and improve power for CNV-trait associations
title_full Omics-informed CNV calls reduce false-positive rates and improve power for CNV-trait associations
title_fullStr Omics-informed CNV calls reduce false-positive rates and improve power for CNV-trait associations
title_full_unstemmed Omics-informed CNV calls reduce false-positive rates and improve power for CNV-trait associations
title_short Omics-informed CNV calls reduce false-positive rates and improve power for CNV-trait associations
title_sort omics-informed cnv calls reduce false-positive rates and improve power for cnv-trait associations
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9399386/
https://www.ncbi.nlm.nih.gov/pubmed/36035246
http://dx.doi.org/10.1016/j.xhgg.2022.100133
work_keys_str_mv AT lepametsmaarja omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT auwerxchiara omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT noukasmargit omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT claringbouldannique omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT porcueleonora omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT kalsmart omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT jurgensontuuli omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT morrisandrewpaul omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT vosaurmo omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT bochudmurielle omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT stringhinisilvia omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT wijmengacisca omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT frankelude omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT petersonhedi omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT vilojaak omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT lepikkaido omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT magireedik omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations
AT kutalikzoltan omicsinformedcnvcallsreducefalsepositiveratesandimprovepowerforcnvtraitassociations