Shrunken methodology to genome-wide SNPs selection and construction of SNPs networks

BACKGROUND: Recent development of high-resolution single nucleotide polymorphism (SNP) arrays allows detailed assessment of genome-wide human genome variations. There is increasing recognition of the importance of SNPs for medicine and developmental biology. However, an SNP data set typically has a lar...

Bibliographic Details
Main authors: Liu, Yang; Ng, Michael
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2982692/
https://www.ncbi.nlm.nih.gov/pubmed/20840732
http://dx.doi.org/10.1186/1752-0509-4-S2-S5
_version_ 1782191758186119168
author Liu, Yang
Ng, Michael
author_facet Liu, Yang
Ng, Michael
author_sort Liu, Yang
collection PubMed
description BACKGROUND: Recent development of high-resolution single nucleotide polymorphism (SNP) arrays allows detailed assessment of genome-wide human genome variations. There is increasing recognition of the importance of SNPs for medicine and developmental biology. However, an SNP data set typically has a large number of SNPs (e.g., 400 thousand SNPs in a genome-wide Parkinson disease data set) and only a few hundred samples. Conventional classification methods may not be effective when applied to such genome-wide SNP data. RESULTS: In this paper, we use a shrunken dissimilarity measure to analyze and select relevant SNPs for classification problems. Examples of HapMap data and Parkinson disease (PD) data are given to demonstrate the effectiveness of the proposed method and to illustrate that it has the potential to become a useful analysis tool for SNP data sets. We use the Parkinson disease data as an example and perform a whole-genome analysis. From the 367440 SNPs with less than 1% missing values from all 22 chromosomes, we select 357 SNPs. For the unique genes in which those SNPs are located, a gene-gene similarity value is computed using GOSemSim, and gene pairs with a similarity value greater than a threshold are selected to construct several groups of genes. For the SNPs involved in these groups of genes, the statistical software PLINK is employed to compute the pair-wise SNP-SNP interactions, and SNPs with significance of P < 0.01 are chosen to identify SNP networks based on their P values. Here the SNP networks are constructed based on Gene Ontology knowledge, and therefore each SNP network plays a role in the biological process. An analysis shows that such networks are related directly or indirectly to Parkinson disease. CONCLUSIONS: Experimental results show that our approach is suitable for handling genetic variations and provides useful knowledge in a genome-wide SNP study.
format Text
id pubmed-2982692
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-29826922010-11-17 Shrunken methodology to genome-wide SNPs selection and construction of SNPs networks Liu, Yang Ng, Michael BMC Syst Biol Proceedings BACKGROUND: Recent development of high-resolution single nucleotide polymorphism (SNP) arrays allows detailed assessment of genome-wide human genome variations. There is increasing recognition of the importance of SNPs for medicine and developmental biology. However, an SNP data set typically has a large number of SNPs (e.g., 400 thousand SNPs in a genome-wide Parkinson disease data set) and only a few hundred samples. Conventional classification methods may not be effective when applied to such genome-wide SNP data. RESULTS: In this paper, we use a shrunken dissimilarity measure to analyze and select relevant SNPs for classification problems. Examples of HapMap data and Parkinson disease (PD) data are given to demonstrate the effectiveness of the proposed method and to illustrate that it has the potential to become a useful analysis tool for SNP data sets. We use the Parkinson disease data as an example and perform a whole-genome analysis. From the 367440 SNPs with less than 1% missing values from all 22 chromosomes, we select 357 SNPs. For the unique genes in which those SNPs are located, a gene-gene similarity value is computed using GOSemSim, and gene pairs with a similarity value greater than a threshold are selected to construct several groups of genes. For the SNPs involved in these groups of genes, the statistical software PLINK is employed to compute the pair-wise SNP-SNP interactions, and SNPs with significance of P < 0.01 are chosen to identify SNP networks based on their P values. Here the SNP networks are constructed based on Gene Ontology knowledge, and therefore each SNP network plays a role in the biological process. An analysis shows that such networks are related directly or indirectly to Parkinson disease.
CONCLUSIONS: Experimental results show that our approach is suitable for handling genetic variations and provides useful knowledge in a genome-wide SNP study. BioMed Central 2010-09-13 /pmc/articles/PMC2982692/ /pubmed/20840732 http://dx.doi.org/10.1186/1752-0509-4-S2-S5 Text en Copyright ©2010 Ng and Liu; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Liu, Yang
Ng, Michael
Shrunken methodology to genome-wide SNPs selection and construction of SNPs networks
title Shrunken methodology to genome-wide SNPs selection and construction of SNPs networks
title_full Shrunken methodology to genome-wide SNPs selection and construction of SNPs networks
title_fullStr Shrunken methodology to genome-wide SNPs selection and construction of SNPs networks
title_full_unstemmed Shrunken methodology to genome-wide SNPs selection and construction of SNPs networks
title_short Shrunken methodology to genome-wide SNPs selection and construction of SNPs networks
title_sort shrunken methodology to genome-wide snps selection and construction of snps networks
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2982692/
https://www.ncbi.nlm.nih.gov/pubmed/20840732
http://dx.doi.org/10.1186/1752-0509-4-S2-S5
work_keys_str_mv AT liuyang shrunkenmethodologytogenomewidesnpsselectionandconstructionofsnpsnetworks
AT ngmichael shrunkenmethodologytogenomewidesnpsselectionandconstructionofsnpsnetworks
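
Illustrative note: the abstract in this record describes three thresholding steps (a missing-rate filter below 1%, gene pairs kept above a similarity threshold, and SNP pairs kept at P < 0.01). The Python sketch below shows one way those steps might look, and it is not the authors' implementation. It assumes the gene-gene similarities (which the paper obtains from GOSemSim) and the pairwise P values (which the paper obtains from PLINK) have already been computed and loaded into plain dictionaries. All function names, the data layout, and the use of connected components to form gene groups are assumptions made for illustration, and the shrunken dissimilarity measure itself is not reproduced.

```python
from collections import defaultdict


def filter_by_missing_rate(genotypes, max_missing=0.01):
    """Keep SNPs whose fraction of missing calls (None) is below max_missing."""
    kept = {}
    for snp, calls in genotypes.items():
        missing = sum(1 for c in calls if c is None) / len(calls)
        if missing < max_missing:
            kept[snp] = calls
    return kept


def group_genes(similarity, threshold):
    """Group genes joined by a pairwise similarity greater than threshold.

    Assumption: groups are taken to be connected components of the
    thresholded similarity graph (union-find); the record does not say
    how the authors formed their groups.
    """
    parent = {}

    def find(gene):
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    for (a, b), score in similarity.items():
        if score > threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for gene in parent:
        groups[find(gene)].add(gene)
    return sorted(sorted(members) for members in groups.values())


def snp_network_edges(pair_p_values, alpha=0.01):
    """Return (snp_a, snp_b, p) edges for SNP pairs with p below alpha."""
    return sorted((a, b, p) for (a, b), p in pair_p_values.items() if p < alpha)
```

With toy inputs, `group_genes({("A", "B"): 0.9, ("C", "D"): 0.2}, 0.7)` keeps only the A-B pair, since genes that never exceed the threshold with any partner are left out of every group.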