
Comparative analysis of copy number variation detection methods and database construction

BACKGROUND: Array-based detection of copy number variations (CNVs) is widely used for identifying disease-specific genetic variations. However, the accuracy of CNV detection is not sufficient and results differ depending on the detection programs used and their parameters. In this study, we evaluate...


Bibliographic Details
Main authors:	Koike, Asako; Nishida, Nao; Yamashita, Daiki; Tokunaga, Katsushi
Format:	Text
Language:	English
Published:	BioMed Central 2011
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3058066/ https://www.ncbi.nlm.nih.gov/pubmed/21385384 http://dx.doi.org/10.1186/1471-2156-12-29

author	Koike, Asako Nishida, Nao Yamashita, Daiki Tokunaga, Katsushi
collection	PubMed
description	BACKGROUND: Array-based detection of copy number variations (CNVs) is widely used for identifying disease-specific genetic variations. However, the accuracy of CNV detection is not sufficient and results differ depending on the detection programs used and their parameters. In this study, we evaluated five widely used CNV detection programs, Birdsuite (mainly consisting of the Birdseye and Canary modules), Birdseye (part of Birdsuite), PennCNV, CGHseg, and DNAcopy, from the viewpoint of performance on the Affymetrix platform using HapMap data and other experimental data. Furthermore, we identified CNVs of 180 healthy Japanese individuals using parameters that showed the best performance in the HapMap data and investigated their characteristics. RESULTS: The results indicate that the Hidden Markov model-based programs PennCNV and Birdseye (part of Birdsuite), or Birdsuite, show better detection performance than other programs when the high reproducibility rates of the same individuals and the low Mendelian inconsistencies are considered. Furthermore, when rates of overlap with other experimental results were taken into account, Birdsuite showed the best performance from the viewpoint of sensitivity but was expected to include many false negatives and some false positives. The results of 180 healthy Japanese individuals demonstrate that the ratio containing repeat sequences, not only segmental repeats but also long interspersed nuclear element (LINE) sequences both in the start and end regions of the CNVs, is higher in CNVs that are commonly detected among multiple individuals than that in randomly selected regions, and the conservation score based on primates is lower in these regions than in randomly selected regions. Similar tendencies were observed in HapMap data and other experimental data. CONCLUSIONS: Our results suggest that not only segmental repeats but also interspersed repeats, especially LINE sequences, are deeply involved in CNVs, particularly in common CNV formations. The detected CNVs are stored in the CNV repository database newly constructed by the "Japanese integrated database project" for sharing data among researchers. http://gwas.lifesciencedb.jp/cgi-bin/cnvdb/cnv_top.cgi
format	Text
id	pubmed-3058066
institution	National Center for Biotechnology Information
language	English
publishDate	2011
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3058066 2011-03-16 Comparative analysis of copy number variation detection methods and database construction Koike, Asako Nishida, Nao Yamashita, Daiki Tokunaga, Katsushi BMC Genet Research Article BioMed Central 2011-03-07 /pmc/articles/PMC3058066/ /pubmed/21385384 http://dx.doi.org/10.1186/1471-2156-12-29 Text en Copyright ©2011 Koike et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Comparative analysis of copy number variation detection methods and database construction
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3058066/ https://www.ncbi.nlm.nih.gov/pubmed/21385384 http://dx.doi.org/10.1186/1471-2156-12-29
