A new phylogenetic data standard for computable clade definitions: the Phyloreference Exchange Format (Phyx)
To be computationally reproducible and efficient, integration of disparate data depends on shared entities whose matching meaning (semantics) can be computationally assessed. For biodiversity data one of the most prevalent shared entities for linking data records is the associated taxon concept. Unl...
Main authors: | Vaidya, Gaurav; Cellinese, Nico; Lapp, Hilmar |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2022 |
Subjects: | Bioinformatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8855714/ https://www.ncbi.nlm.nih.gov/pubmed/35186448 http://dx.doi.org/10.7717/peerj.12618 |
_version_ | 1784653705914613760 |
---|---|
author | Vaidya, Gaurav Cellinese, Nico Lapp, Hilmar |
author_facet | Vaidya, Gaurav Cellinese, Nico Lapp, Hilmar |
author_sort | Vaidya, Gaurav |
collection | PubMed |
description | To be computationally reproducible and efficient, integration of disparate data depends on shared entities whose matching meaning (semantics) can be computationally assessed. For biodiversity data one of the most prevalent shared entities for linking data records is the associated taxon concept. Unlike Linnaean taxon names, the traditional way in which taxon concepts are provided, phylogenetic definitions are native to phylogenetic trees and offer well-defined semantics that can be transformed into formal, computationally evaluable logic expressions. These attributes make them highly suitable for phylogeny-driven comparative biology by allowing computationally verifiable and reproducible integration of taxon-linked data against Tree of Life-scale phylogenies. To achieve this, the first step is transforming phylogenetic definitions from the natural language text in which they are published to a structured interoperable data format that maintains strong ties to semantics and lends itself well to sharing, reuse, and long-term archival. To this end, we developed the Phyloreference Exchange Format (Phyx), a JSON-LD-based text format encompassing rich metadata for all elements of a phylogenetic definition, and we created a supporting software library, phyx.js, to streamline computational management of such files. Together they form a foundation layer for digitizing and computing with phylogenetic definitions of clades. |
format | Online Article Text |
id | pubmed-8855714 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8855714 2022-02-19 A new phylogenetic data standard for computable clade definitions: the Phyloreference Exchange Format (Phyx) Vaidya, Gaurav Cellinese, Nico Lapp, Hilmar PeerJ Bioinformatics To be computationally reproducible and efficient, integration of disparate data depends on shared entities whose matching meaning (semantics) can be computationally assessed. For biodiversity data one of the most prevalent shared entities for linking data records is the associated taxon concept. Unlike Linnaean taxon names, the traditional way in which taxon concepts are provided, phylogenetic definitions are native to phylogenetic trees and offer well-defined semantics that can be transformed into formal, computationally evaluable logic expressions. These attributes make them highly suitable for phylogeny-driven comparative biology by allowing computationally verifiable and reproducible integration of taxon-linked data against Tree of Life-scale phylogenies. To achieve this, the first step is transforming phylogenetic definitions from the natural language text in which they are published to a structured interoperable data format that maintains strong ties to semantics and lends itself well to sharing, reuse, and long-term archival. To this end, we developed the Phyloreference Exchange Format (Phyx), a JSON-LD-based text format encompassing rich metadata for all elements of a phylogenetic definition, and we created a supporting software library, phyx.js, to streamline computational management of such files. Together they form a foundation layer for digitizing and computing with phylogenetic definitions of clades. PeerJ Inc. 2022-02-15 /pmc/articles/PMC8855714/ /pubmed/35186448 http://dx.doi.org/10.7717/peerj.12618 Text en ©2022 Vaidya et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
spellingShingle | Bioinformatics Vaidya, Gaurav Cellinese, Nico Lapp, Hilmar A new phylogenetic data standard for computable clade definitions: the Phyloreference Exchange Format (Phyx) |
title | A new phylogenetic data standard for computable clade definitions: the Phyloreference Exchange Format (Phyx) |
title_full | A new phylogenetic data standard for computable clade definitions: the Phyloreference Exchange Format (Phyx) |
title_fullStr | A new phylogenetic data standard for computable clade definitions: the Phyloreference Exchange Format (Phyx) |
title_full_unstemmed | A new phylogenetic data standard for computable clade definitions: the Phyloreference Exchange Format (Phyx) |
title_short | A new phylogenetic data standard for computable clade definitions: the Phyloreference Exchange Format (Phyx) |
title_sort | new phylogenetic data standard for computable clade definitions: the phyloreference exchange format (phyx) |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8855714/ https://www.ncbi.nlm.nih.gov/pubmed/35186448 http://dx.doi.org/10.7717/peerj.12618 |
work_keys_str_mv | AT vaidyagaurav anewphylogeneticdatastandardforcomputablecladedefinitionsthephyloreferenceexchangeformatphyx AT cellinesenico anewphylogeneticdatastandardforcomputablecladedefinitionsthephyloreferenceexchangeformatphyx AT lapphilmar anewphylogeneticdatastandardforcomputablecladedefinitionsthephyloreferenceexchangeformatphyx AT vaidyagaurav newphylogeneticdatastandardforcomputablecladedefinitionsthephyloreferenceexchangeformatphyx AT cellinesenico newphylogeneticdatastandardforcomputablecladedefinitionsthephyloreferenceexchangeformatphyx AT lapphilmar newphylogeneticdatastandardforcomputablecladedefinitionsthephyloreferenceexchangeformatphyx |
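
The description above says that a phylogenetic definition can be turned into a structured, computationally evaluable record. The following is a minimal sketch of that idea in Python. It is not the Phyx schema and not the phyx.js API: the field names (`label`, `definition`, `internalSpecifiers`, `externalSpecifiers`), the `@context` URL, the toy tree, and the helper functions are all hypothetical and chosen only for illustration. It assumes a "smallest clade containing the internal specifiers" style of definition.

```python
import json

# Hypothetical, simplified record. Field names and the @context URL are
# illustrative assumptions, not the actual Phyx schema.
phyloref = {
    "@context": "https://example.org/hypothetical-context.json",
    "label": "ExampleClade",
    "definition": "The smallest clade containing taxon A and taxon B.",
    "internalSpecifiers": ["A", "B"],
    "externalSpecifiers": [],
}

# Toy phylogeny as nested tuples: ((A, B), (C, D)). Leaves are strings.
tree = (("A", "B"), ("C", "D"))


def leaves(node):
    """Return the set of leaf labels below a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def smallest_clade(node, internal):
    """Return the leaf set of the smallest clade containing every
    internal specifier, or None if this subtree lacks one of them."""
    if not internal <= leaves(node):
        return None
    if not isinstance(node, str):
        for child in node:
            found = smallest_clade(child, internal)
            if found is not None:
                return found  # a child already contains all specifiers
    return leaves(node)


def resolve(record, phylogeny):
    """Resolve a record on a phylogeny; None means it does not resolve."""
    clade = smallest_clade(phylogeny, set(record["internalSpecifiers"]))
    if clade is None:
        return None
    if clade & set(record["externalSpecifiers"]):
        return None  # the clade would include an excluded taxon
    return clade


# Round-trip through JSON text to show the record is plain, shareable data.
serialized = json.dumps(phyloref, indent=2)
resolved = resolve(json.loads(serialized), tree)
print(sorted(resolved))  # prints ['A', 'B']
```

Keeping the definition as plain JSON data and the resolution logic as a separate function mirrors the split the description draws between a text format and a supporting library, though the real format carries far richer metadata than this sketch.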