Data Integration in Genetics and Genomics: Methods and Challenges

Due to rapid technological advances, various types of genomic and proteomic data with different sizes, formats, and structures have become available. Among them are gene expression, single nucleotide polymorphism, copy number variation, and protein-protein/gene-gene interactions. Each of these disti...

Bibliographic Details
Main authors: Hamid, Jemila S., Hu, Pingzhao, Roslin, Nicole M., Ling, Vicki, Greenwood, Celia M. T., Beyene, Joseph
Format: Text
Language: English
Published: SAGE-Hindawi Access to Research 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2950414/
https://www.ncbi.nlm.nih.gov/pubmed/20948564
http://dx.doi.org/10.4061/2009/869093
_version_ 1782187655503544320
author Hamid, Jemila S.
Hu, Pingzhao
Roslin, Nicole M.
Ling, Vicki
Greenwood, Celia M. T.
Beyene, Joseph
collection PubMed
description Due to rapid technological advances, various types of genomic and proteomic data with different sizes, formats, and structures have become available. Among them are gene expression, single nucleotide polymorphism, copy number variation, and protein-protein/gene-gene interactions. Each of these distinct data types provides a different, partly independent and complementary, view of the whole genome. However, understanding functions of genes, proteins, and other aspects of the genome requires more information than provided by each of the datasets. Integrating data from different sources is, therefore, an important part of current research in genomics and proteomics. Data integration also plays important roles in combining clinical, environmental, and demographic data with high-throughput genomic data. Nevertheless, the concept of data integration is not well defined in the literature and it may mean different things to different researchers. In this paper, we first propose a conceptual framework for integrating genetic, genomic, and proteomic data. The framework captures fundamental aspects of data integration and is developed taking the key steps in genetic, genomic, and proteomic data fusion. Secondly, we provide a review of some of the most commonly used current methods and approaches for combining genomic data with focus on the statistical aspects.
format Text
id pubmed-2950414
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher SAGE-Hindawi Access to Research
record_format MEDLINE/PubMed
spelling pubmed-2950414 2010-10-14 Hum Genomics Proteomics Review Article SAGE-Hindawi Access to Research 2009-01-12 /pmc/articles/PMC2950414/ /pubmed/20948564 http://dx.doi.org/10.4061/2009/869093 Text en Copyright © 2009 Jemila S. Hamid et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Data Integration in Genetics and Genomics: Methods and Challenges
topic Review Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2950414/
https://www.ncbi.nlm.nih.gov/pubmed/20948564
http://dx.doi.org/10.4061/2009/869093