Comparative analysis of copy number variation detection methods and database construction
BACKGROUND: Array-based detection of copy number variations (CNVs) is widely used for identifying disease-specific genetic variations. However, the accuracy of CNV detection is not sufficient and results differ depending on the detection programs used and their parameters. In this study, we evaluate...
Main Authors: | Koike, Asako, Nishida, Nao, Yamashita, Daiki, Tokunaga, Katsushi |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3058066/ https://www.ncbi.nlm.nih.gov/pubmed/21385384 http://dx.doi.org/10.1186/1471-2156-12-29 |
_version_ | 1782200336488857600 |
---|---|
author | Koike, Asako Nishida, Nao Yamashita, Daiki Tokunaga, Katsushi |
author_facet | Koike, Asako Nishida, Nao Yamashita, Daiki Tokunaga, Katsushi |
author_sort | Koike, Asako |
collection | PubMed |
description | BACKGROUND: Array-based detection of copy number variations (CNVs) is widely used for identifying disease-specific genetic variations. However, the accuracy of CNV detection is not sufficient and results differ depending on the detection programs used and their parameters. In this study, we evaluated five widely used CNV detection programs, Birdsuite (mainly consisting of the Birdseye and Canary modules), Birdseye (part of Birdsuite), PennCNV, CGHseg, and DNAcopy from the viewpoint of performance on the Affymetrix platform using HapMap data and other experimental data. Furthermore, we identified CNVs of 180 healthy Japanese individuals using parameters that showed the best performance in the HapMap data and investigated their characteristics. RESULTS: The results indicate that Hidden Markov model-based programs PennCNV and Birdseye (part of Birdsuite), or Birdsuite show better detection performance than other programs when the high reproducibility rates of the same individuals and the low Mendelian inconsistencies are considered. Furthermore, when rates of overlap with other experimental results were taken into account, Birdsuite showed the best performance from the viewpoint of sensitivity but was expected to include many false negatives and some false positives. The results of 180 healthy Japanese demonstrate that the ratio containing repeat sequences, not only segmental repeats but also long interspersed nuclear element (LINE) sequences both in the start and end regions of the CNVs, is higher in CNVs that are commonly detected among multiple individuals than that in randomly selected regions, and the conservation score based on primates is lower in these regions than in randomly selected regions. Similar tendencies were observed in HapMap data and other experimental data. CONCLUSIONS: Our results suggest that not only segmental repeats but also interspersed repeats, especially LINE sequences, are deeply involved in CNVs, particularly in common CNV formations. The detected CNVs are stored in the CNV repository database newly constructed by the "Japanese integrated database project" for sharing data among researchers. http://gwas.lifesciencedb.jp/cgi-bin/cnvdb/cnv_top.cgi |
format | Text |
id | pubmed-3058066 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-30580662011-03-16 Comparative analysis of copy number variation detection methods and database construction Koike, Asako Nishida, Nao Yamashita, Daiki Tokunaga, Katsushi BMC Genet Research Article BACKGROUND: Array-based detection of copy number variations (CNVs) is widely used for identifying disease-specific genetic variations. However, the accuracy of CNV detection is not sufficient and results differ depending on the detection programs used and their parameters. In this study, we evaluated five widely used CNV detection programs, Birdsuite (mainly consisting of the Birdseye and Canary modules), Birdseye (part of Birdsuite), PennCNV, CGHseg, and DNAcopy from the viewpoint of performance on the Affymetrix platform using HapMap data and other experimental data. Furthermore, we identified CNVs of 180 healthy Japanese individuals using parameters that showed the best performance in the HapMap data and investigated their characteristics. RESULTS: The results indicate that Hidden Markov model-based programs PennCNV and Birdseye (part of Birdsuite), or Birdsuite show better detection performance than other programs when the high reproducibility rates of the same individuals and the low Mendelian inconsistencies are considered. Furthermore, when rates of overlap with other experimental results were taken into account, Birdsuite showed the best performance from the viewpoint of sensitivity but was expected to include many false negatives and some false positives. The results of 180 healthy Japanese demonstrate that the ratio containing repeat sequences, not only segmental repeats but also long interspersed nuclear element (LINE) sequences both in the start and end regions of the CNVs, is higher in CNVs that are commonly detected among multiple individuals than that in randomly selected regions, and the conservation score based on primates is lower in these regions than in randomly selected regions. Similar tendencies were observed in HapMap data and other experimental data. CONCLUSIONS: Our results suggest that not only segmental repeats but also interspersed repeats, especially LINE sequences, are deeply involved in CNVs, particularly in common CNV formations. The detected CNVs are stored in the CNV repository database newly constructed by the "Japanese integrated database project" for sharing data among researchers. http://gwas.lifesciencedb.jp/cgi-bin/cnvdb/cnv_top.cgi BioMed Central 2011-03-07 /pmc/articles/PMC3058066/ /pubmed/21385384 http://dx.doi.org/10.1186/1471-2156-12-29 Text en Copyright ©2011 Koike et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Koike, Asako Nishida, Nao Yamashita, Daiki Tokunaga, Katsushi Comparative analysis of copy number variation detection methods and database construction |
title | Comparative analysis of copy number variation detection methods and database construction |
title_full | Comparative analysis of copy number variation detection methods and database construction |
title_fullStr | Comparative analysis of copy number variation detection methods and database construction |
title_full_unstemmed | Comparative analysis of copy number variation detection methods and database construction |
title_short | Comparative analysis of copy number variation detection methods and database construction |
title_sort | comparative analysis of copy number variation detection methods and database construction |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3058066/ https://www.ncbi.nlm.nih.gov/pubmed/21385384 http://dx.doi.org/10.1186/1471-2156-12-29 |
work_keys_str_mv | AT koikeasako comparativeanalysisofcopynumbervariationdetectionmethodsanddatabaseconstruction AT nishidanao comparativeanalysisofcopynumbervariationdetectionmethodsanddatabaseconstruction AT yamashitadaiki comparativeanalysisofcopynumbervariationdetectionmethodsanddatabaseconstruction AT tokunagakatsushi comparativeanalysisofcopynumbervariationdetectionmethodsanddatabaseconstruction |