
Analysis of concordance of different haplotype block partitioning algorithms

BACKGROUND: Different classes of haplotype block algorithms exist and the ideal dataset to assess their performance would be to comprehensively re-sequence a large genomic region in a large population. Such data sets are expensive to collect. Alternatively, we performed coalescent simulations to gen...

Bibliographic Details
Main authors:	Indap, Amit R; Marth, Gabor T; Struble, Craig A; Tonellato, Peter; Olivier, Michael
Format:	Text
Language:	English
Published:	BioMed Central 2005
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1343594/ https://www.ncbi.nlm.nih.gov/pubmed/16356172 http://dx.doi.org/10.1186/1471-2105-6-303

author	Indap, Amit R; Marth, Gabor T; Struble, Craig A; Tonellato, Peter; Olivier, Michael
author_sort	Indap, Amit R
collection	PubMed
description	BACKGROUND: Different classes of haplotype block algorithms exist and the ideal dataset to assess their performance would be to comprehensively re-sequence a large genomic region in a large population. Such data sets are expensive to collect. Alternatively, we performed coalescent simulations to generate haplotypes with a high marker density and compared block partitioning results from diversity based, LD based, and information theoretic algorithms under different values of SNP density and allele frequency. RESULTS: We simulated 1000 haplotypes using the standard coalescent for three world populations – European, African American, and East Asian – and applied three classes of block partitioning algorithms – diversity based, LD based, and information theoretic. We assessed algorithm differences in number, size, and coverage of blocks inferred under different conditions of SNP density, allele frequency, and sample size. Each algorithm inferred blocks differing in number, size, and coverage under different density and allele frequency conditions. Different partitions had few if any matching block boundaries. However they still overlapped and a high percentage of total chromosomal region was common to all methods. This percentage was generally higher with a higher density of SNPs and when rarer markers were included. CONCLUSION: A gold standard definition of a haplotype block is difficult to achieve, but collecting haplotypes covered with a high density of SNPs, partitioning them with a variety of block algorithms, and identifying regions common to all methods may be the best way to identify genomic regions that harbor SNP variants that cause disease.
format	Text
id	pubmed-1343594
institution	National Center for Biotechnology Information
language	English
publishDate	2005
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1343594 2006-01-22 Analysis of concordance of different haplotype block partitioning algorithms. Indap, Amit R; Marth, Gabor T; Struble, Craig A; Tonellato, Peter; Olivier, Michael. BMC Bioinformatics. Research Article. BioMed Central 2005-12-15 /pmc/articles/PMC1343594/ /pubmed/16356172 http://dx.doi.org/10.1186/1471-2105-6-303 Text en Copyright © 2005 Indap et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Analysis of concordance of different haplotype block partitioning algorithms
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1343594/ https://www.ncbi.nlm.nih.gov/pubmed/16356172 http://dx.doi.org/10.1186/1471-2105-6-303
