
Identifying mutation regions for closely related individuals without a known pedigree

BACKGROUND: Linkage analysis is the first step in the search for a disease gene. Linkage studies have facilitated the identification of several hundred human genes that can harbor mutations leading to a disease phenotype. In this paper, we study a very important case, where the sampled individuals a...


Bibliographic Details
Main authors:	Cui, Wenjuan; Wang, Lusheng
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2012
Subjects:	Software
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3507658/ https://www.ncbi.nlm.nih.gov/pubmed/22731852 http://dx.doi.org/10.1186/1471-2105-13-146

_version_	1782251102797824000
author	Cui, Wenjuan Wang, Lusheng
author_facet	Cui, Wenjuan Wang, Lusheng
author_sort	Cui, Wenjuan
collection	PubMed
description	BACKGROUND: Linkage analysis is the first step in the search for a disease gene. Linkage studies have facilitated the identification of several hundred human genes that can harbor mutations leading to a disease phenotype. In this paper, we study a very important case, where the sampled individuals are closely related, but the pedigree is not given. This situation happens very often when the individuals share a common ancestor 6 or more generations ago. To our knowledge, no algorithm can give good results for this case. RESULTS: To solve this problem, we first developed some heuristic algorithms for haplotype inference without any given pedigree. We propose a model using the parsimony principle that can be viewed as an extension of the model first proposed by Dan Gusfield. Our heuristic algorithm uses Clark’s inference rule to infer haplotype segments. CONCLUSIONS: We ran our program both on the simulated data and a set of real data from the phase II HapMap database. Experiments show that our program performs well. The recall value is from 90% to 99% in various cases. This implies that the program can report more than 90% of the true mutation regions. The value of precision varies from 29% to 90%. When the precision is 29%, the size of the reported regions is three times that of the true mutation region. This is still very useful for narrowing down the range of the disease gene location. Our program can complete the computation for all the tested cases, where there are about 110,000 SNPs on a chromosome, within 20 seconds.
format	Online Article Text
id	pubmed-3507658
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-35076582012-12-03 Identifying mutation regions for closely related individuals without a known pedigree Cui, Wenjuan Wang, Lusheng BMC Bioinformatics Software BACKGROUND: Linkage analysis is the first step in the search for a disease gene. Linkage studies have facilitated the identification of several hundred human genes that can harbor mutations leading to a disease phenotype. In this paper, we study a very important case, where the sampled individuals are closely related, but the pedigree is not given. This situation happens very often when the individuals share a common ancestor 6 or more generations ago. To our knowledge, no algorithm can give good results for this case. RESULTS: To solve this problem, we first developed some heuristic algorithms for haplotype inference without any given pedigree. We propose a model using the parsimony principle that can be viewed as an extension of the model first proposed by Dan Gusfield. Our heuristic algorithm uses Clark’s inference rule to infer haplotype segments. CONCLUSIONS: We ran our program both on the simulated data and a set of real data from the phase II HapMap database. Experiments show that our program performs well. The recall value is from 90% to 99% in various cases. This implies that the program can report more than 90% of the true mutation regions. The value of precision varies from 29% to 90%. When the precision is 29%, the size of the reported regions is three times that of the true mutation region. This is still very useful for narrowing down the range of the disease gene location. Our program can complete the computation for all the tested cases, where there are about 110,000 SNPs on a chromosome, within 20 seconds. BioMed Central 2012-06-25 /pmc/articles/PMC3507658/ /pubmed/22731852 http://dx.doi.org/10.1186/1471-2105-13-146 Text en Copyright ©2012 Cui and Wang; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Software Cui, Wenjuan Wang, Lusheng Identifying mutation regions for closely related individuals without a known pedigree
title	Identifying mutation regions for closely related individuals without a known pedigree
title_sort	identifying mutation regions for closely related individuals without a known pedigree
topic	Software
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3507658/ https://www.ncbi.nlm.nih.gov/pubmed/22731852 http://dx.doi.org/10.1186/1471-2105-13-146
work_keys_str_mv	AT cuiwenjuan identifyingmutationregionsforcloselyrelatedindividualswithoutaknownpedigree AT wanglusheng identifyingmutationregionsforcloselyrelatedindividualswithoutaknownpedigree
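
The description above reports recall and precision for the predicted mutation regions and notes that a precision of 29% corresponds to reported regions about three times the size of the true region. The following is a minimal sketch of that arithmetic, not the authors' evaluation code. It assumes regions are half-open intervals of SNP positions, that regions within each list do not overlap one another, and that precision and recall are measured by shared length; the function names and the example coordinates are hypothetical.

```python
def overlap(a, b):
    """Length of the overlap between two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def precision_recall(true_regions, reported_regions):
    """Length-based precision and recall (an assumed definition).

    precision = shared length / total reported length
    recall    = shared length / total true length
    """
    shared = sum(overlap(t, r) for t in true_regions for r in reported_regions)
    true_len = sum(end - start for start, end in true_regions)
    reported_len = sum(end - start for start, end in reported_regions)
    return shared / reported_len, shared / true_len


# Hypothetical example: one true region of 100 SNPs that is fully covered
# by a reported region of 340 SNPs.
true_regions = [(1000, 1100)]
reported_regions = [(900, 1240)]

precision, recall = precision_recall(true_regions, reported_regions)
size_ratio = recall / precision  # reported length divided by true length

print(f"precision={precision:.2f} recall={recall:.2f} ratio={size_ratio:.1f}")
```

Under these definitions the ratio of reported to true length equals recall divided by precision, so with recall close to 1 a precision of about 0.29 means the reported regions are a little over three times as long as the true ones.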

