Exhaustive prediction of disease susceptibility to coding base changes in the human genome
BACKGROUND: Single Nucleotide Polymorphisms (SNPs) are the most abundant form of genomic variation and can cause phenotypic differences between individuals, including diseases. Bases are subject to various levels of selection pressure, reflected in their inter-species conservation. RESULTS: We propo...
Main authors: | Kulkarni, Vinayak; Errami, Mounir; Barber, Robert; Garner, Harold R |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2537574/ https://www.ncbi.nlm.nih.gov/pubmed/18793467 http://dx.doi.org/10.1186/1471-2105-9-S9-S3 |
_version_ | 1782159110179913728 |
---|---|
author | Kulkarni, Vinayak; Errami, Mounir; Barber, Robert; Garner, Harold R |
author_sort | Kulkarni, Vinayak |
collection | PubMed |
description | BACKGROUND: Single Nucleotide Polymorphisms (SNPs) are the most abundant form of genomic variation and can cause phenotypic differences between individuals, including diseases. Bases are subject to various levels of selection pressure, reflected in their interspecies conservation. RESULTS: We propose a method that is not dependent on transcription information to score each coding base in the human genome, reflecting the disease probability associated with its mutation. Twelve factors likely to be associated with disease alleles were chosen as the input for a support vector machine prediction algorithm. The analysis yielded 83% sensitivity and 84% specificity in segregating disease-like alleles, as found in the Human Gene Mutation Database, from non-disease-like alleles, as found in the Database of Single Nucleotide Polymorphisms. This algorithm was subsequently applied to each base within all known human genes, exhaustively confirming that interspecies conservation is the strongest factor for disease association. For each gene, the length-normalized average disease potential score was calculated. Out of the 30 genes with the highest scores, 21 are directly associated with a disease. In contrast, out of the 30 genes with the lowest scores, only one is associated with a disease as found in published literature. The results strongly suggest that the highest-scoring genes are enriched for those that might contribute to disease if mutated. CONCLUSION: This method provides valuable information to researchers to identify sensitive positions in genes that have a high disease probability, enabling them to optimize experimental designs and interpret data emerging from genetic and epidemiological studies. |
format | Text |
id | pubmed-2537574 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2537574 2008-09-17 Exhaustive prediction of disease susceptibility to coding base changes in the human genome Kulkarni, Vinayak Errami, Mounir Barber, Robert Garner, Harold R BMC Bioinformatics Proceedings BioMed Central 2008-08-12 /pmc/articles/PMC2537574/ /pubmed/18793467 http://dx.doi.org/10.1186/1471-2105-9-S9-S3 Text en Copyright © 2008 Kulkarni et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Exhaustive prediction of disease susceptibility to coding base changes in the human genome |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2537574/ https://www.ncbi.nlm.nih.gov/pubmed/18793467 http://dx.doi.org/10.1186/1471-2105-9-S9-S3 |
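
The description above reports two quantities that are easy to misread without their arithmetic: sensitivity and specificity of a binary classifier, and a length-normalized average score per gene. The following is a minimal Python sketch of both, not the authors' implementation. The function names, the 0/1 label encoding (1 = disease-like), and the reading of "length-normalized average" as the mean of per-base scores over the number of coding bases are all assumptions made for illustration; the abstract does not define them.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding (not from the source): 1 = disease-like allele,
    0 = non-disease-like allele.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


def length_normalized_score(base_scores: Sequence[float]) -> float:
    """Mean per-base score for one gene.

    Assumption: "length-normalized average" means the sum of per-base
    scores divided by the number of coding bases scored.
    """
    if not base_scores:
        raise ValueError("gene has no scored bases")
    return sum(base_scores) / len(base_scores)


if __name__ == "__main__":
    # Toy labels, invented purely to exercise the functions.
    truth = [1, 1, 1, 0, 0, 0, 0]
    preds = [1, 1, 0, 0, 0, 0, 1]
    sens, spec = sensitivity_specificity(truth, preds)
    print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
    print(f"gene score={length_normalized_score([0.2, 0.4, 0.9]):.3f}")
```

Dividing by gene length keeps long genes from ranking highly only because they contain more bases, which is presumably why the record describes the per-gene score as normalized before genes are ranked.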