
Exhaustive prediction of disease susceptibility to coding base changes in the human genome

BACKGROUND: Single Nucleotide Polymorphisms (SNPs) are the most abundant form of genomic variation and can cause phenotypic differences between individuals, including diseases. Bases are subject to various levels of selection pressure, reflected in their inter-species conservation. RESULTS: We propo...


Bibliographic Details
Main authors: Kulkarni, Vinayak; Errami, Mounir; Barber, Robert; Garner, Harold R
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2537574/
https://www.ncbi.nlm.nih.gov/pubmed/18793467
http://dx.doi.org/10.1186/1471-2105-9-S9-S3
_version_ 1782159110179913728
author Kulkarni, Vinayak
Errami, Mounir
Barber, Robert
Garner, Harold R
author_facet Kulkarni, Vinayak
Errami, Mounir
Barber, Robert
Garner, Harold R
author_sort Kulkarni, Vinayak
collection PubMed
description BACKGROUND: Single Nucleotide Polymorphisms (SNPs) are the most abundant form of genomic variation and can cause phenotypic differences between individuals, including diseases. Bases are subject to various levels of selection pressure, reflected in their inter-species conservation. RESULTS: We propose a method that is not dependent on transcription information to score each coding base in the human genome, reflecting the disease probability associated with its mutation. Twelve factors likely to be associated with disease alleles were chosen as the input for a support vector machine prediction algorithm. The analysis yielded 83% sensitivity and 84% specificity in segregating disease-like alleles, as found in the Human Gene Mutation Database, from non-disease-like alleles, as found in the Database of Single Nucleotide Polymorphisms. This algorithm was subsequently applied to each base within all known human genes, exhaustively confirming that interspecies conservation is the strongest factor for disease association. For each gene, the length-normalized average disease potential score was calculated. Out of the 30 genes with the highest scores, 21 are directly associated with a disease. In contrast, out of the 30 genes with the lowest scores, only one is associated with a disease as found in published literature. The results strongly suggest that the highest-scoring genes are enriched for those that might contribute to disease if mutated. CONCLUSION: This method provides valuable information to researchers to identify sensitive positions in genes that have a high disease probability, enabling them to optimize experimental designs and interpret data emerging from genetic and epidemiological studies.
format Text
id pubmed-2537574
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2537574 2008-09-17 Exhaustive prediction of disease susceptibility to coding base changes in the human genome Kulkarni, Vinayak Errami, Mounir Barber, Robert Garner, Harold R BMC Bioinformatics Proceedings BACKGROUND: Single Nucleotide Polymorphisms (SNPs) are the most abundant form of genomic variation and can cause phenotypic differences between individuals, including diseases. Bases are subject to various levels of selection pressure, reflected in their inter-species conservation. RESULTS: We propose a method that is not dependent on transcription information to score each coding base in the human genome, reflecting the disease probability associated with its mutation. Twelve factors likely to be associated with disease alleles were chosen as the input for a support vector machine prediction algorithm. The analysis yielded 83% sensitivity and 84% specificity in segregating disease-like alleles, as found in the Human Gene Mutation Database, from non-disease-like alleles, as found in the Database of Single Nucleotide Polymorphisms. This algorithm was subsequently applied to each base within all known human genes, exhaustively confirming that interspecies conservation is the strongest factor for disease association. For each gene, the length-normalized average disease potential score was calculated. Out of the 30 genes with the highest scores, 21 are directly associated with a disease. In contrast, out of the 30 genes with the lowest scores, only one is associated with a disease as found in published literature. The results strongly suggest that the highest-scoring genes are enriched for those that might contribute to disease if mutated. CONCLUSION: This method provides valuable information to researchers to identify sensitive positions in genes that have a high disease probability, enabling them to optimize experimental designs and interpret data emerging from genetic and epidemiological studies.
BioMed Central 2008-08-12 /pmc/articles/PMC2537574/ /pubmed/18793467 http://dx.doi.org/10.1186/1471-2105-9-S9-S3 Text en Copyright © 2008 Kulkarni et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Kulkarni, Vinayak
Errami, Mounir
Barber, Robert
Garner, Harold R
Exhaustive prediction of disease susceptibility to coding base changes in the human genome
title Exhaustive prediction of disease susceptibility to coding base changes in the human genome
title_full Exhaustive prediction of disease susceptibility to coding base changes in the human genome
title_fullStr Exhaustive prediction of disease susceptibility to coding base changes in the human genome
title_full_unstemmed Exhaustive prediction of disease susceptibility to coding base changes in the human genome
title_short Exhaustive prediction of disease susceptibility to coding base changes in the human genome
title_sort exhaustive prediction of disease susceptibility to coding base changes in the human genome
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2537574/
https://www.ncbi.nlm.nih.gov/pubmed/18793467
http://dx.doi.org/10.1186/1471-2105-9-S9-S3
work_keys_str_mv AT kulkarnivinayak exhaustivepredictionofdiseasesusceptibilitytocodingbasechangesinthehumangenome
AT erramimounir exhaustivepredictionofdiseasesusceptibilitytocodingbasechangesinthehumangenome
AT barberrobert exhaustivepredictionofdiseasesusceptibilitytocodingbasechangesinthehumangenome
AT garnerharoldr exhaustivepredictionofdiseasesusceptibilitytocodingbasechangesinthehumangenome