Comparative analysis of copy number variation detection methods and database construction

BACKGROUND: Array-based detection of copy number variations (CNVs) is widely used for identifying disease-specific genetic variations. However, the accuracy of CNV detection is not sufficient and results differ depending on the detection programs used and their parameters. In this study, we evaluate...

Bibliographic Details
Main Authors: Koike, Asako; Nishida, Nao; Yamashita, Daiki; Tokunaga, Katsushi
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3058066/
https://www.ncbi.nlm.nih.gov/pubmed/21385384
http://dx.doi.org/10.1186/1471-2156-12-29
author Koike, Asako
Nishida, Nao
Yamashita, Daiki
Tokunaga, Katsushi
collection PubMed
description BACKGROUND: Array-based detection of copy number variations (CNVs) is widely used for identifying disease-specific genetic variations. However, the accuracy of CNV detection is not sufficient, and results differ depending on the detection programs used and their parameters. In this study, we evaluated five widely used CNV detection programs, Birdsuite (mainly consisting of the Birdseye and Canary modules), Birdseye (part of Birdsuite), PennCNV, CGHseg, and DNAcopy, from the viewpoint of performance on the Affymetrix platform using HapMap data and other experimental data. Furthermore, we identified CNVs of 180 healthy Japanese individuals using parameters that showed the best performance in the HapMap data and investigated their characteristics.
RESULTS: The results indicate that the hidden Markov model-based programs PennCNV and Birdseye (part of Birdsuite), or Birdsuite, show better detection performance than other programs when the high reproducibility rates of the same individuals and the low Mendelian inconsistencies are considered. Furthermore, when rates of overlap with other experimental results were taken into account, Birdsuite showed the best performance from the viewpoint of sensitivity but was expected to include many false negatives and some false positives. The results for the 180 healthy Japanese individuals demonstrate that the ratio containing repeat sequences, not only segmental repeats but also long interspersed nuclear element (LINE) sequences, in both the start and end regions of the CNVs is higher in CNVs that are commonly detected among multiple individuals than in randomly selected regions, and that the conservation score based on primates is lower in these regions than in randomly selected regions. Similar tendencies were observed in HapMap data and other experimental data.
CONCLUSIONS: Our results suggest that not only segmental repeats but also interspersed repeats, especially LINE sequences, are deeply involved in CNVs, particularly in common CNV formations. The detected CNVs are stored in the CNV repository database newly constructed by the "Japanese integrated database project" for sharing data among researchers. http://gwas.lifesciencedb.jp/cgi-bin/cnvdb/cnv_top.cgi
format Text
id pubmed-3058066
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3058066 2011-03-16 BMC Genet Research Article BioMed Central 2011-03-07 /pmc/articles/PMC3058066/ /pubmed/21385384 http://dx.doi.org/10.1186/1471-2156-12-29 Text en Copyright ©2011 Koike et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Comparative analysis of copy number variation detection methods and database construction
topic Research Article