Comparing nominal and real quality scores on next-generation sequencing genotype calls

Bibliographic Details
Main author:	Stram, Alexander H
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2011
Subjects:	Proceedings
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287848/ https://www.ncbi.nlm.nih.gov/pubmed/22373481 http://dx.doi.org/10.1186/1753-6561-5-S9-S14

_version_	1782224757410758656
author	Stram, Alexander H
collection	PubMed
description	I seek to comprehensively evaluate the quality of the Genetic Analysis Workshop 17 (GAW17) data set by examining the accuracy of its genotype calls, which were based on the pilot3 data of the 1000 Genomes Project. Taking advantage of the 1000 Genomes Project/HapMap sample intersect, I compared GAW17 genotype calls to HapMap III, release 2, genotype calls for an individual. These genotype calls should be concordant almost everywhere. Instead I found an astonishingly low 65.4% concordance. Regarding HapMap as the gold standard, I assume that this is a GAW17 data problem and seek to explain this discordance accordingly. I found that a large proportion of this discordance occurred outside targeted regions and that concordance could be improved to at least 94.6% by simply staying within targeted regions, which were sequenced across more samples. Furthermore, I found that in certain individuals, high sample counts did little to improve concordance and concluded that quality scores for a certain sample’s sequence reads were simply incorrect.
format	Online Article Text
id	pubmed-3287848
institution	National Center for Biotechnology Information
language	English
publishDate	2011
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3287848 2012-02-28 Comparing nominal and real quality scores on next-generation sequencing genotype calls Stram, Alexander H BMC Proc Proceedings BioMed Central 2011-11-29 /pmc/articles/PMC3287848/ /pubmed/22373481 http://dx.doi.org/10.1186/1753-6561-5-S9-S14 Text en Copyright ©2011 Stram; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Comparing nominal and real quality scores on next-generation sequencing genotype calls
topic	Proceedings
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287848/ https://www.ncbi.nlm.nih.gov/pubmed/22373481 http://dx.doi.org/10.1186/1753-6561-5-S9-S14
work_keys_str_mv	AT stramalexanderh comparingnominalandrealqualityscoresonnextgenerationsequencinggenotypecalls
