Comparison of genotype clustering tools with rare variants
BACKGROUND: Along with the improvement of high throughput sequencing technologies, the genetics community is showing marked interest for the rare variants/common diseases hypothesis. While sequencing can still be prohibitive for large studies, commercially available genotyping arrays targeting rare...
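Main authors: | Perreault, Louis-Philippe Lemieux; Legault, Marc-André; Barhdadi, Amina; Provost, Sylvie; Normand, Valérie; Tardif, Jean-Claude; Dubé, Marie-Pierre |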
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3941951/ https://www.ncbi.nlm.nih.gov/pubmed/24559245 http://dx.doi.org/10.1186/1471-2105-15-52 |
_version_ | 1782306004469284864 |
---|---|
author | Perreault, Louis-Philippe Lemieux Legault, Marc-André Barhdadi, Amina Provost, Sylvie Normand, Valérie Tardif, Jean-Claude Dubé, Marie-Pierre |
author_facet | Perreault, Louis-Philippe Lemieux Legault, Marc-André Barhdadi, Amina Provost, Sylvie Normand, Valérie Tardif, Jean-Claude Dubé, Marie-Pierre |
author_sort | Perreault, Louis-Philippe Lemieux |
collection | PubMed |
description | BACKGROUND: Along with the improvement of high throughput sequencing technologies, the genetics community is showing marked interest for the rare variants/common diseases hypothesis. While sequencing can still be prohibitive for large studies, commercially available genotyping arrays targeting rare variants prove to be a reasonable alternative. A technical challenge of array based methods is the task of deriving genotype classes (homozygous or heterozygous) by clustering intensity data points. The performance of clustering tools for common polymorphisms is well established, while their performance when conducted with a large proportion of rare variants (where data points are sparse for genotypes containing the rare allele) is less known. We have compared the performance of four clustering tools (GenCall, GenoSNP, optiCall and zCall) for the genotyping of over 10,000 samples using the Illumina’s HumanExome BeadChip, which includes 247,870 variants, 90% of which have a minor allele frequency below 5% in a population of European ancestry. Different reference parameters for GenCall and different initial parameters for GenoSNP were tested. Genotyping accuracy was assessed using data from the 1000 Genomes Project as a gold standard, and agreement between tools was measured. RESULTS: Concordance of GenoSNP’s calls with the gold standard was below expectations and was increased by changing the tool’s initial parameters. While the four tools provided concordance with the gold standard above 99% for common alleles, some of them performed poorly for rare alleles. The reproducibility of genotype calls for each tool was assessed using experimental duplicates which provided concordance rates above 99%. The inter-tool agreement of genotype calls was high for approximately 95% of variants. Most tools yielded similar error rates (approximately 0.02), except for zCall which performed better with a 0.00164 mean error rate. CONCLUSIONS: The GenoSNP clustering tool could not be run straight “out of the box” with the HumanExome BeadChip, as modification of hard coded parameters was necessary to achieve optimal performance. Overall, GenCall marginally outperformed the other tools for the HumanExome BeadChip. The use of experimental replicates provided a valuable quality control tool for genotyping projects with rare variants. |
format | Online Article Text |
id | pubmed-3941951 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-39419512014-03-14 Comparison of genotype clustering tools with rare variants Perreault, Louis-Philippe Lemieux Legault, Marc-André Barhdadi, Amina Provost, Sylvie Normand, Valérie Tardif, Jean-Claude Dubé, Marie-Pierre BMC Bioinformatics Methodology Article BACKGROUND: Along with the improvement of high throughput sequencing technologies, the genetics community is showing marked interest for the rare variants/common diseases hypothesis. While sequencing can still be prohibitive for large studies, commercially available genotyping arrays targeting rare variants prove to be a reasonable alternative. A technical challenge of array based methods is the task of deriving genotype classes (homozygous or heterozygous) by clustering intensity data points. The performance of clustering tools for common polymorphisms is well established, while their performance when conducted with a large proportion of rare variants (where data points are sparse for genotypes containing the rare allele) is less known. We have compared the performance of four clustering tools (GenCall, GenoSNP, optiCall and zCall) for the genotyping of over 10,000 samples using the Illumina’s HumanExome BeadChip, which includes 247,870 variants, 90% of which have a minor allele frequency below 5% in a population of European ancestry. Different reference parameters for GenCall and different initial parameters for GenoSNP were tested. Genotyping accuracy was assessed using data from the 1000 Genomes Project as a gold standard, and agreement between tools was measured. RESULTS: Concordance of GenoSNP’s calls with the gold standard was below expectations and was increased by changing the tool’s initial parameters. While the four tools provided concordance with the gold standard above 99% for common alleles, some of them performed poorly for rare alleles. The reproducibility of genotype calls for each tool was assessed using experimental duplicates which provided concordance rates above 99%. The inter-tool agreement of genotype calls was high for approximately 95% of variants. Most tools yielded similar error rates (approximately 0.02), except for zCall which performed better with a 0.00164 mean error rate. CONCLUSIONS: The GenoSNP clustering tool could not be run straight “out of the box” with the HumanExome BeadChip, as modification of hard coded parameters was necessary to achieve optimal performance. Overall, GenCall marginally outperformed the other tools for the HumanExome BeadChip. The use of experimental replicates provided a valuable quality control tool for genotyping projects with rare variants. BioMed Central 2014-02-21 /pmc/articles/PMC3941951/ /pubmed/24559245 http://dx.doi.org/10.1186/1471-2105-15-52 Text en Copyright © 2014 Lemieux Perreault et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Perreault, Louis-Philippe Lemieux Legault, Marc-André Barhdadi, Amina Provost, Sylvie Normand, Valérie Tardif, Jean-Claude Dubé, Marie-Pierre Comparison of genotype clustering tools with rare variants |
title | Comparison of genotype clustering tools with rare variants |
title_full | Comparison of genotype clustering tools with rare variants |
title_fullStr | Comparison of genotype clustering tools with rare variants |
title_full_unstemmed | Comparison of genotype clustering tools with rare variants |
title_short | Comparison of genotype clustering tools with rare variants |
title_sort | comparison of genotype clustering tools with rare variants |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3941951/ https://www.ncbi.nlm.nih.gov/pubmed/24559245 http://dx.doi.org/10.1186/1471-2105-15-52 |
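
Note on the concordance measure mentioned in the description: the abstract reports agreement between each tool's genotype calls and a gold standard, and observes that performance differs between common and rare alleles. The following is a minimal sketch of how such a concordance rate could be computed, not the authors' actual procedure. The genotype coding (minor-allele count 0/1/2, `None` for a missing call), the `concordance` function, the `carriers_only` option, and the sample and variant identifiers are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical layout: (sample_id, variant_id) -> minor-allele count.
# 0, 1 or 2 copies of the minor allele; None means "no call".
Key = Tuple[str, str]
Calls = Dict[Key, Optional[int]]


def concordance(calls: Calls, gold: Calls,
                carriers_only: bool = False) -> Optional[float]:
    """Fraction of compared genotypes on which `calls` matches `gold`.

    Pairs where either side is missing are skipped. With carriers_only=True,
    only genotypes where the gold standard carries at least one minor allele
    are compared, so the rate is not dominated by homozygous-major calls.
    Returns None when nothing could be compared.
    """
    compared = 0
    matched = 0
    for key, truth in gold.items():
        called = calls.get(key)
        if truth is None or called is None:
            continue  # missing in the gold standard or in the tool output
        if carriers_only and truth == 0:
            continue  # skip homozygous-major genotypes
        compared += 1
        matched += int(called == truth)
    return matched / compared if compared else None


# Toy data (invented): one rare heterozygote (s3, v1) is miscalled.
gold = {("s1", "v1"): 0, ("s2", "v1"): 0, ("s3", "v1"): 1,
        ("s4", "v1"): 0, ("s1", "v2"): 2, ("s2", "v2"): None}
calls = {("s1", "v1"): 0, ("s2", "v1"): 0, ("s3", "v1"): 0,
         ("s4", "v1"): 0, ("s1", "v2"): 2, ("s2", "v2"): 1}

overall = concordance(calls, gold)
carriers = concordance(calls, gold, carriers_only=True)
print(overall, carriers)
```

On this toy data the overall rate is higher than the carrier-restricted rate, which illustrates why an overall figure alone can hide errors at rare variants: most genotypes are homozygous for the major allele and are easy to call.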