Target SNP selection in complex disease association studies

BACKGROUND: The massive amount of SNP data stored at public internet sites provides unprecedented access to human genetic variation. Selecting target SNPs for disease-gene association studies is currently done more or less randomly, as decision rules for the selection of functionally relevant SNPs are n...

Bibliographic Details
Main author: Wjst, Matthias
Format: Text
Language: English
Published: BioMed Central 2004
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC487897/
https://www.ncbi.nlm.nih.gov/pubmed/15248903
http://dx.doi.org/10.1186/1471-2105-5-92
author Wjst, Matthias
collection PubMed
description BACKGROUND: The massive amount of SNP data stored at public internet sites provides unprecedented access to human genetic variation. Selecting target SNPs for disease-gene association studies is currently done more or less randomly, as decision rules for the selection of functionally relevant SNPs are not available. RESULTS: We implemented a computational pipeline that retrieves the genomic sequence of target genes, collects information about sequence variation and selects functional motifs containing SNPs. The motifs considered are gene promoter, exon-intron structure, AU-rich mRNA elements, transcription factor binding motifs, and cryptic and enhancer splice sites, together with expression in target tissue. As a case study, 396 genes on chromosome 6p21 in the extended HLA region were selected, contributing nearly 20,000 SNPs. Computer annotation identified ~2,500 SNPs in functional motifs. Most of these SNPs disrupt transcription factor binding sites, but only those introducing new sites had a significant depressing effect on SNP allele frequency. Other decision rules concern position within motifs, the validity of SNP database entries, unique occurrence in the genome and conserved sequence context in other mammalian genomes. CONCLUSION: Only 10% of all gene-based SNPs have sequence-predicted functional relevance, making them a primary target for genotyping in association studies.
format Text
id pubmed-487897
institution National Center for Biotechnology Information
language English
publishDate 2004
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-487897 2004-07-25 Target SNP selection in complex disease association studies Wjst, Matthias BMC Bioinformatics Methodology Article BioMed Central 2004-07-12 /pmc/articles/PMC487897/ /pubmed/15248903 http://dx.doi.org/10.1186/1471-2105-5-92 Text en Copyright © 2004 Wjst; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
title Target SNP selection in complex disease association studies
topic Methodology Article