
Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data

BACKGROUND: The detection of genomic copy number alterations (CNA) in cancer based on SNP arrays requires methods that take into account tumour-specific factors such as normal cell contamination and tumour heterogeneity. A number of tools have recently been developed but their performance needs yet...


Bibliographic Details
Main Authors:	Mosén-Ansorena, David; Aransay, Ana María; Rodríguez-Ezpeleta, Naiara
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2012
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3472297/ https://www.ncbi.nlm.nih.gov/pubmed/22870940 http://dx.doi.org/10.1186/1471-2105-13-192

author	Mosén-Ansorena, David Aransay, Ana María Rodríguez-Ezpeleta, Naiara
collection	PubMed
description	BACKGROUND: The detection of genomic copy number alterations (CNA) in cancer based on SNP arrays requires methods that take into account tumour-specific factors such as normal cell contamination and tumour heterogeneity. A number of tools have recently been developed, but their performance has yet to be thoroughly assessed. To this end, a comprehensive model that integrates normal cell contamination and intra-tumour heterogeneity, and that can be translated into synthetic data on which to run benchmarks, is indispensable. RESULTS: We propose such a model and implement it in an R package called CnaGen to synthetically generate a wide range of alterations under different normal cell contamination levels. Six recently published methods for CNA and loss of heterozygosity (LOH) detection on tumour samples were assessed on these synthetic data and on a dilution series of a breast cancer cell line: ASCAT, GAP, GenoCNA, GPHMM, MixHMM and OncoSNP. We report the recall rates in terms of normal cell contamination levels and alteration characteristics (length, copy number and LOH state), as well as the false discovery rate distribution for each copy number under different normal cell contamination levels. The assessed methods are in general better at detecting alterations with low copy number and under low levels of normal cell contamination. All methods except GPHMM, which failed to recognize the alteration pattern in the cell-line samples, provided similar results for the synthetic and cell-line sample sets. MixHMM and GenoCNA are the poorest-performing methods, while GAP generally performed better. This supports the viability of approaches other than the common hidden Markov model (HMM)-based ones. CONCLUSIONS: We devised and implemented a comprehensive model to generate data that simulate tumoural samples genotyped using SNP arrays. The validity of the model is supported by the similarity of the results obtained with synthetic and real data. Based on these results and on the software implementation of the methods, we recommend GAP for advanced users and GPHMM for a fully driven analysis.
format	Online Article Text
id	pubmed-3472297
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3472297 2012-10-23 Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data Mosén-Ansorena, David Aransay, Ana María Rodríguez-Ezpeleta, Naiara BMC Bioinformatics Research Article BioMed Central 2012-08-07 /pmc/articles/PMC3472297/ /pubmed/22870940 http://dx.doi.org/10.1186/1471-2105-13-192 Text en Copyright ©2012 Mosén-Ansorena et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3472297/ https://www.ncbi.nlm.nih.gov/pubmed/22870940 http://dx.doi.org/10.1186/1471-2105-13-192
