Identification of Genes for Complex Diseases Using Integrated Analysis of Multiple Types of Genomic Data

Various types of genomic data (e.g., SNPs and mRNA transcripts) have been employed to identify risk genes for complex diseases. However, the analysis of these data has largely been performed in isolation. Combining these multiple data for integrative analysis can take advantage of complementary info...

Bibliographic Details
Main Authors: Cao, Hongbao, Lei, Shufeng, Deng, Hong-Wen, Wang, Yu-Ping
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3434191/
https://www.ncbi.nlm.nih.gov/pubmed/22957024
http://dx.doi.org/10.1371/journal.pone.0042755
_version_ 1782242413468712960
author Cao, Hongbao
Lei, Shufeng
Deng, Hong-Wen
Wang, Yu-Ping
author_facet Cao, Hongbao
Lei, Shufeng
Deng, Hong-Wen
Wang, Yu-Ping
author_sort Cao, Hongbao
collection PubMed
description Various types of genomic data (e.g., SNPs and mRNA transcripts) have been employed to identify risk genes for complex diseases. However, the analysis of these data has largely been performed in isolation. Combining these multiple data types for integrative analysis can take advantage of complementary information and thus can have higher power to identify genes (and/or their functions) that would otherwise be impossible to identify with individual data analysis. Due to the different nature, structure, and format of diverse sets of genomic data, multiple genomic data integration is challenging. Here we address the problem by developing a sparse representation-based clustering (SRC) method for integrative data analysis. As an example, we applied the SRC method to the integrative analysis of 376821 SNPs in 200 subjects (100 cases and 100 controls) and expression data for 22283 genes in 80 subjects (40 cases and 40 controls) to identify significant genes for osteoporosis (OP). Comparing our results with previous studies, we identified some genes known to be related to OP risk (e.g., ‘THSD4’, ‘CRHR1’, ‘HSD11B1’, ‘THSD7A’, ‘BMPR1B’, ‘ADCY10’, ‘PRL’, ‘CA8’, ‘ESRRA’, ‘CALM1’, ‘SPARC’, and ‘LRP1’). Moreover, we uncovered novel osteoporosis susceptibility genes (‘DICER1’, ‘PTMA’, etc.) that were not identified previously but, according to existing studies, play functionally important roles in osteoporosis etiology. In addition, the genes identified by the SRC method can lead to higher accuracy for the diagnosis/classification of osteoporosis subjects when compared with the traditional t-test and Fisher's exact test, which further validates the proposed SRC approach for integrative analysis.
format Online
Article
Text
id pubmed-3434191
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3434191 2012-09-06 Identification of Genes for Complex Diseases Using Integrated Analysis of Multiple Types of Genomic Data Cao, Hongbao Lei, Shufeng Deng, Hong-Wen Wang, Yu-Ping PLoS One Research Article Various types of genomic data (e.g., SNPs and mRNA transcripts) have been employed to identify risk genes for complex diseases. However, the analysis of these data has largely been performed in isolation. Combining these multiple data types for integrative analysis can take advantage of complementary information and thus can have higher power to identify genes (and/or their functions) that would otherwise be impossible to identify with individual data analysis. Due to the different nature, structure, and format of diverse sets of genomic data, multiple genomic data integration is challenging. Here we address the problem by developing a sparse representation-based clustering (SRC) method for integrative data analysis. As an example, we applied the SRC method to the integrative analysis of 376821 SNPs in 200 subjects (100 cases and 100 controls) and expression data for 22283 genes in 80 subjects (40 cases and 40 controls) to identify significant genes for osteoporosis (OP). Comparing our results with previous studies, we identified some genes known to be related to OP risk (e.g., ‘THSD4’, ‘CRHR1’, ‘HSD11B1’, ‘THSD7A’, ‘BMPR1B’, ‘ADCY10’, ‘PRL’, ‘CA8’, ‘ESRRA’, ‘CALM1’, ‘SPARC’, and ‘LRP1’). Moreover, we uncovered novel osteoporosis susceptibility genes (‘DICER1’, ‘PTMA’, etc.) that were not identified previously but, according to existing studies, play functionally important roles in osteoporosis etiology. In addition, the genes identified by the SRC method can lead to higher accuracy for the diagnosis/classification of osteoporosis subjects when compared with the traditional t-test and Fisher's exact test, which further validates the proposed SRC approach for integrative analysis.
Public Library of Science 2012-09-05 /pmc/articles/PMC3434191/ /pubmed/22957024 http://dx.doi.org/10.1371/journal.pone.0042755 Text en © 2012 Cao et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Cao, Hongbao
Lei, Shufeng
Deng, Hong-Wen
Wang, Yu-Ping
Identification of Genes for Complex Diseases Using Integrated Analysis of Multiple Types of Genomic Data
title Identification of Genes for Complex Diseases Using Integrated Analysis of Multiple Types of Genomic Data
title_sort identification of genes for complex diseases using integrated analysis of multiple types of genomic data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3434191/
https://www.ncbi.nlm.nih.gov/pubmed/22957024
http://dx.doi.org/10.1371/journal.pone.0042755