
Analysis of protein sequence and interaction data for candidate disease gene prediction

Linkage analysis is a successful procedure to associate diseases with specific genomic regions. These regions are often large, containing hundreds of genes, which make experimental methods employed to identify the disease gene arduous and expensive. We present two methods to prioritize candidates fo...


Bibliographic Details
Main authors:	George, Richard A.; Liu, Jason Y.; Feng, Lina L.; Bryson-Richardson, Robert J.; Fatkin, Diane; Wouters, Merridee A.
Format:	Text
Language:	English
Published:	Oxford University Press 2006
Subjects:	Methods Online
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1636487/ https://www.ncbi.nlm.nih.gov/pubmed/17020920 http://dx.doi.org/10.1093/nar/gkl707

_version_	1782130758702333952
author	George, Richard A.; Liu, Jason Y.; Feng, Lina L.; Bryson-Richardson, Robert J.; Fatkin, Diane; Wouters, Merridee A.
author_facet	George, Richard A.; Liu, Jason Y.; Feng, Lina L.; Bryson-Richardson, Robert J.; Fatkin, Diane; Wouters, Merridee A.
author_sort	George, Richard A.
collection	PubMed
description	Linkage analysis is a successful procedure to associate diseases with specific genomic regions. These regions are often large, containing hundreds of genes, which make experimental methods employed to identify the disease gene arduous and expensive. We present two methods to prioritize candidates for further experimental study: Common Pathway Scanning (CPS) and Common Module Profiling (CMP). CPS is based on the assumption that common phenotypes are associated with dysfunction in proteins that participate in the same complex or pathway. CPS applies network data derived from protein–protein interaction (PPI) and pathway databases to identify relationships between genes. CMP identifies likely candidates using a domain-dependent sequence similarity approach, based on the hypothesis that disruption of genes of similar function will lead to the same phenotype. Both algorithms use two forms of input data: known disease genes or multiple disease loci. When using known disease genes as input, our combined methods have a sensitivity of 0.52 and a specificity of 0.97 and reduce the candidate list by 13-fold. Using multiple loci, our methods successfully identify disease genes for all benchmark diseases with a sensitivity of 0.84 and a specificity of 0.63. Our combined approach prioritizes good candidates and will accelerate the disease gene discovery process.
format	Text
id	pubmed-1636487
institution	National Center for Biotechnology Information
language	English
publishDate	2006
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-1636487 2006-11-29 Analysis of protein sequence and interaction data for candidate disease gene prediction George, Richard A. Liu, Jason Y. Feng, Lina L. Bryson-Richardson, Robert J. Fatkin, Diane Wouters, Merridee A. Nucleic Acids Res Methods Online Linkage analysis is a successful procedure to associate diseases with specific genomic regions. These regions are often large, containing hundreds of genes, which make experimental methods employed to identify the disease gene arduous and expensive. We present two methods to prioritize candidates for further experimental study: Common Pathway Scanning (CPS) and Common Module Profiling (CMP). CPS is based on the assumption that common phenotypes are associated with dysfunction in proteins that participate in the same complex or pathway. CPS applies network data derived from protein–protein interaction (PPI) and pathway databases to identify relationships between genes. CMP identifies likely candidates using a domain-dependent sequence similarity approach, based on the hypothesis that disruption of genes of similar function will lead to the same phenotype. Both algorithms use two forms of input data: known disease genes or multiple disease loci. When using known disease genes as input, our combined methods have a sensitivity of 0.52 and a specificity of 0.97 and reduce the candidate list by 13-fold. Using multiple loci, our methods successfully identify disease genes for all benchmark diseases with a sensitivity of 0.84 and a specificity of 0.63. Our combined approach prioritizes good candidates and will accelerate the disease gene discovery process. Oxford University Press 2006-11 2006-10-04 /pmc/articles/PMC1636487/ /pubmed/17020920 http://dx.doi.org/10.1093/nar/gkl707 Text en © 2006 The Author(s)
spellingShingle	Methods Online George, Richard A. Liu, Jason Y. Feng, Lina L. Bryson-Richardson, Robert J. Fatkin, Diane Wouters, Merridee A. Analysis of protein sequence and interaction data for candidate disease gene prediction
title	Analysis of protein sequence and interaction data for candidate disease gene prediction
title_full	Analysis of protein sequence and interaction data for candidate disease gene prediction
title_fullStr	Analysis of protein sequence and interaction data for candidate disease gene prediction
title_full_unstemmed	Analysis of protein sequence and interaction data for candidate disease gene prediction
title_short	Analysis of protein sequence and interaction data for candidate disease gene prediction
title_sort	analysis of protein sequence and interaction data for candidate disease gene prediction
topic	Methods Online
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1636487/ https://www.ncbi.nlm.nih.gov/pubmed/17020920 http://dx.doi.org/10.1093/nar/gkl707
work_keys_str_mv	AT georgericharda analysisofproteinsequenceandinteractiondataforcandidatediseasegeneprediction AT liujasony analysisofproteinsequenceandinteractiondataforcandidatediseasegeneprediction AT fenglinal analysisofproteinsequenceandinteractiondataforcandidatediseasegeneprediction AT brysonrichardsonrobertj analysisofproteinsequenceandinteractiondataforcandidatediseasegeneprediction AT fatkindiane analysisofproteinsequenceandinteractiondataforcandidatediseasegeneprediction AT woutersmerrideea analysisofproteinsequenceandinteractiondataforcandidatediseasegeneprediction
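
The description field above says that CPS uses protein–protein interaction and pathway data to relate candidate genes to known disease genes. The record gives no algorithmic detail, so the following is only a minimal sketch of that general idea, assuming the simplest possible rule: a gene in a linkage region is shortlisted if it interacts directly with a known disease gene. The function name, the gene identifiers, and the interaction pairs are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: NOT the published CPS implementation.
# All gene names and interaction pairs below are invented for the example.

def shortlist_candidates(interactions, known_disease_genes, locus_genes):
    """Return locus genes that directly interact with a known disease gene.

    interactions: iterable of (gene, gene) pairs, treated as undirected.
    Returns a list of (candidate, sorted list of known-gene partners).
    """
    # Build an undirected adjacency map from the interaction pairs.
    neighbours = {}
    for a, b in interactions:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    known = set(known_disease_genes)
    shortlisted = []
    for gene in locus_genes:
        # Partners of this candidate that are already known disease genes.
        partners = neighbours.get(gene, set()) & known
        if partners and gene not in known:
            shortlisted.append((gene, sorted(partners)))
    return shortlisted


# Hypothetical toy data.
interactions = [
    ("GENE_A", "GENE_X"),
    ("GENE_B", "GENE_Y"),
    ("GENE_C", "GENE_X"),
    ("GENE_C", "GENE_Z"),
]
known_disease_genes = ["GENE_X", "GENE_Z"]
locus_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]

result = shortlist_candidates(interactions, known_disease_genes, locus_genes)
for gene, partners in result:
    print(gene, "interacts with", ", ".join(partners))
```

In this toy run only GENE_A and GENE_C are shortlisted, which shows how a network filter of this kind can shrink a candidate list. The sketch is not the authors' method, which also draws on pathway databases and is combined with the sequence-based CMP approach.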
