
cn.FARMS: a latent variable model to detect copy number variations in microarray data with a low false discovery rate

Cost-effective oligonucleotide genotyping arrays like the Affymetrix SNP 6.0 are still the predominant technique to measure DNA copy number variations (CNVs). However, CNV detection methods for microarrays overestimate both the number and the size of CNV regions and, consequently, suffer from a high...


Bibliographic Details
Main authors: Clevert, Djork-Arné; Mitterecker, Andreas; Mayr, Andreas; Klambauer, Günter; Tuefferd, Marianne; Bondt, An De; Talloen, Willem; Göhlmann, Hinrich; Hochreiter, Sepp
Format: Online Article Text
Language: English
Published: Oxford University Press 2011
Subjects: Methods Online
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3130288/
https://www.ncbi.nlm.nih.gov/pubmed/21486749
http://dx.doi.org/10.1093/nar/gkr197
collection PubMed
description Cost-effective oligonucleotide genotyping arrays like the Affymetrix SNP 6.0 are still the predominant technique to measure DNA copy number variations (CNVs). However, CNV detection methods for microarrays overestimate both the number and the size of CNV regions and, consequently, suffer from a high false discovery rate (FDR). A high FDR means that many CNVs are wrongly detected and therefore not associated with a disease in a clinical study, though correction for multiple testing takes them into account and thereby decreases the study's discovery power. For controlling the FDR, we propose a probabilistic latent variable model, ‘cn.FARMS’, which is optimized by a Bayesian maximum a posteriori approach. cn.FARMS controls the FDR through the information gain of the posterior over the prior. The prior represents the null hypothesis of copy number 2 for all samples from which the posterior can only deviate by strong and consistent signals in the data. On HapMap data, cn.FARMS clearly outperformed the two most prevalent methods with respect to sensitivity and FDR. The software cn.FARMS is publicly available as an R package at http://www.bioinf.jku.at/software/cnfarms/cnfarms.html.
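The idea of measuring "information gain of the posterior over the prior" can be illustrated with a minimal sketch. It assumes a one-factor Gaussian model (x = lam * z + noise, with a standard normal prior on the latent variable z) and uses the Kullback-Leibler divergence as the information gain. These modeling choices, the function names, and the fixed loadings are assumptions made for illustration only; this is not the cn.FARMS R package API nor the authors' exact formulation, in which the loadings are estimated from the data by maximum a posteriori optimization.

```python
import math


def posterior_of_latent(x, lam, psi):
    """Posterior mean and variance of z given probe signals x.

    Assumed model (illustrative): x_j = lam_j * z + eps_j,
    z ~ N(0, 1), eps_j ~ N(0, psi_j), independent across probes j.
    """
    # Posterior precision = prior precision (1) + evidence from each probe.
    precision = 1.0 + sum(l * l / p for l, p in zip(lam, psi))
    variance = 1.0 / precision
    mean = variance * sum(l * xi / p for l, xi, p in zip(lam, x, psi))
    return mean, variance


def information_gain(mean, variance):
    """KL divergence KL( N(mean, variance) || N(0, 1) ), in nats."""
    return 0.5 * (variance + mean * mean - 1.0 - math.log(variance))


if __name__ == "__main__":
    signals = [1.0, 1.0, 1.0]   # hypothetical centered probe intensities
    noise = [0.5, 0.5, 0.5]     # hypothetical per-probe noise variances

    # Loadings of zero: the data say nothing about z, posterior equals prior.
    m0, v0 = posterior_of_latent(signals, [0.0, 0.0, 0.0], noise)
    print("no signal:", information_gain(m0, v0))

    # Strong, consistent loadings: the posterior moves away from the prior.
    m1, v1 = posterior_of_latent(signals, [1.0, 1.0, 1.0], noise)
    print("consistent signal:", information_gain(m1, v1))
```

In this sketch the gain is exactly zero when all loadings are zero, because the posterior then coincides with the prior, and it grows as the probes load strongly and consistently on the latent variable. That mirrors the abstract's statement that the posterior can deviate from the null hypothesis only through strong and consistent signals.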
id pubmed-3130288
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-3130288 2011-07-06 Nucleic Acids Res Methods Online Oxford University Press 2011-07 2011-04-12 /pmc/articles/PMC3130288/ /pubmed/21486749 http://dx.doi.org/10.1093/nar/gkr197 Text en © The Author(s) 2011. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Methods Online