Combining Partially Overlapping Multi-Omics Data in Databases Using Relationship Matrices

Private and public breeding programs, as well as companies and universities, have developed different genomics technologies that have resulted in the generation of unprecedented amounts of sequence data, which bring new challenges in terms of data management, query, and analysis. The magnitude and c...

Bibliographic Details
Main Authors:	Akdemir, Deniz; Knox, Ron; Isidro y Sánchez, Julio
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2020
Subjects:	Plant Science
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7381228/ https://www.ncbi.nlm.nih.gov/pubmed/32765543 http://dx.doi.org/10.3389/fpls.2020.00947

author	Akdemir, Deniz Knox, Ron Isidro y Sánchez, Julio
author_sort	Akdemir, Deniz
collection	PubMed
description	Private and public breeding programs, as well as companies and universities, have developed different genomics technologies that have generated unprecedented amounts of sequence data, which bring new challenges in terms of data management, query, and analysis. The magnitude and complexity of these datasets pose difficulties but also offer an opportunity to use the available data as a whole. Detailed phenotype data, combined with increasing amounts of genomic data, have enormous potential to accelerate the identification of key traits to improve our understanding of quantitative genetics. Data harmonization enables cross-national and international comparative research, facilitating the extraction of new scientific knowledge. In this paper, we address the complex issue of combining high-dimensional and unbalanced omics data. More specifically, we propose a covariance-based method for combining partial datasets in the genotype-to-phenotype spectrum. This method can be used to combine partially overlapping relationship/covariance matrices. Here, we show with applications that our approach might be advantageous compared with feature-imputation-based approaches; we demonstrate how this method can be used in genomic prediction using heterogeneous marker data and also how to combine the data from multiple phenotypic experiments to make inferences about previously unobserved trait relationships. Our results demonstrate that it is possible to harmonize datasets to improve available information across gene banks, data repositories, or other data resources.
format	Online Article Text
id	pubmed-7381228
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-7381228 2020-08-05 Combining Partially Overlapping Multi-Omics Data in Databases Using Relationship Matrices Akdemir, Deniz Knox, Ron Isidro y Sánchez, Julio Front Plant Sci Plant Science Frontiers Media S.A. 2020-07-14 /pmc/articles/PMC7381228/ /pubmed/32765543 http://dx.doi.org/10.3389/fpls.2020.00947 Text en Copyright © 2020 Akdemir, Knox and Isidro y Sánchez http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Combining Partially Overlapping Multi-Omics Data in Databases Using Relationship Matrices
topic	Plant Science
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7381228/ https://www.ncbi.nlm.nih.gov/pubmed/32765543 http://dx.doi.org/10.3389/fpls.2020.00947

Similar Items