Cargando…

BioGraph: Data Model for Linking and Querying Diverse Biological Metadata

Studying the association of gene function, diseases, and regulatory gene network reconstruction demands data compatibility. Data from different databases follow distinct schemas and are accessible in heterogenic ways. Although the experiments differ, data may still be related to the same biological...

Descripción completa

Detalles Bibliográficos
Autores principales: Veljković, Aleksandar N., Orlov, Yuriy L., Mitić, Nenad S.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10138499/
https://www.ncbi.nlm.nih.gov/pubmed/37108117
http://dx.doi.org/10.3390/ijms24086954
Descripción
Sumario:Studying the association of gene function, diseases, and regulatory gene network reconstruction demands data compatibility. Data from different databases follow distinct schemas and are accessible in heterogenic ways. Although the experiments differ, data may still be related to the same biological entities. Some entities may not be strictly biological, such as geolocations of habitats or paper references, but they provide a broader context for other entities. The same entities from different datasets can share similar properties, which may or may not be found within other datasets. Joint, simultaneous data fetching from multiple data sources is complicated for the end-user or, in many cases, unsupported and inefficient due to differences in data structures and ways of accessing the data. We propose BioGraph—a new model that enables connecting and retrieving information from the linked biological data that originated from diverse datasets. We have tested the model on metadata collected from five diverse public datasets and successfully constructed a knowledge graph containing more than 17 million model objects, of which 2.5 million are individual biological entity objects. The model enables the selection of complex patterns and retrieval of matched results that can be discovered only by joining the data from multiple sources.