Evolutionary Characters, Phenotypes and Ontologies: Curating Data from the Systematic Biology Literature

BACKGROUND: The wealth of phenotypic descriptions documented in the published articles, monographs, and dissertations of phylogenetic systematics is traditionally reported in a free-text format, and it is therefore largely inaccessible for linkage to biological databases for genetics, development, a...

Bibliographic Details
Main authors: Dahdul, Wasila M., Balhoff, James P., Engeman, Jeffrey, Grande, Terry, Hilton, Eric J., Kothari, Cartik, Lapp, Hilmar, Lundberg, John G., Midford, Peter E., Vision, Todd J., Westerfield, Monte, Mabee, Paula M.
Format: Text
Language: English
Published: Public Library of Science 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2873956/
https://www.ncbi.nlm.nih.gov/pubmed/20505755
http://dx.doi.org/10.1371/journal.pone.0010708
_version_ 1782181421730758656
author Dahdul, Wasila M.
Balhoff, James P.
Engeman, Jeffrey
Grande, Terry
Hilton, Eric J.
Kothari, Cartik
Lapp, Hilmar
Lundberg, John G.
Midford, Peter E.
Vision, Todd J.
Westerfield, Monte
Mabee, Paula M.
author_facet Dahdul, Wasila M.
Balhoff, James P.
Engeman, Jeffrey
Grande, Terry
Hilton, Eric J.
Kothari, Cartik
Lapp, Hilmar
Lundberg, John G.
Midford, Peter E.
Vision, Todd J.
Westerfield, Monte
Mabee, Paula M.
author_sort Dahdul, Wasila M.
collection PubMed
description BACKGROUND: The wealth of phenotypic descriptions documented in the published articles, monographs, and dissertations of phylogenetic systematics is traditionally reported in a free-text format, and it is therefore largely inaccessible for linkage to biological databases for genetics, development, and phenotypes, and difficult to manage for large-scale integrative work. The Phenoscape project aims to represent these complex and detailed descriptions with rich and formal semantics that are amenable to computation and integration with phenotype data from other fields of biology. This entails reconceptualizing the traditional free-text characters into the computable Entity-Quality (EQ) formalism using ontologies. METHODOLOGY/PRINCIPAL FINDINGS: We used ontologies and the EQ formalism to curate a collection of 47 phylogenetic studies on ostariophysan fishes (including catfishes, characins, minnows, knifefishes) and their relatives with the goal of integrating these complex phenotype descriptions with information from an existing model organism database (zebrafish, http://zfin.org). We developed a curation workflow for the collection of character, taxonomic and specimen data from these publications. A total of 4,617 phenotypic characters (10,512 states) for 3,449 taxa, primarily species, were curated into EQ formalism (for a total of 12,861 EQ statements) using anatomical and taxonomic terms from teleost-specific ontologies (Teleost Anatomy Ontology and Teleost Taxonomy Ontology) in combination with terms from a quality ontology (Phenotype and Trait Ontology). Standards and guidelines for consistently and accurately representing phenotypes were developed in response to the challenges that were evident from two annotation experiments and from feedback from curators. 
CONCLUSIONS/SIGNIFICANCE: The challenges we encountered and many of the curation standards and methods for improving consistency that we developed are generally applicable to any effort to represent phenotypes using ontologies. This is because an ontological representation of the detailed variations in phenotype, whether between mutant or wildtype, among individual humans, or across the diversity of species, requires a process by which a precise combination of terms from domain ontologies are selected and organized according to logical relations. The efficiencies that we have developed in this process will be useful for any attempt to annotate complex phenotypic descriptions using ontologies. We also discuss some ramifications of EQ representation for the domain of systematics.
format Text
id pubmed-2873956
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-28739562010-05-26 Evolutionary Characters, Phenotypes and Ontologies: Curating Data from the Systematic Biology Literature Dahdul, Wasila M. Balhoff, James P. Engeman, Jeffrey Grande, Terry Hilton, Eric J. Kothari, Cartik Lapp, Hilmar Lundberg, John G. Midford, Peter E. Vision, Todd J. Westerfield, Monte Mabee, Paula M. PLoS One Research Article BACKGROUND: The wealth of phenotypic descriptions documented in the published articles, monographs, and dissertations of phylogenetic systematics is traditionally reported in a free-text format, and it is therefore largely inaccessible for linkage to biological databases for genetics, development, and phenotypes, and difficult to manage for large-scale integrative work. The Phenoscape project aims to represent these complex and detailed descriptions with rich and formal semantics that are amenable to computation and integration with phenotype data from other fields of biology. This entails reconceptualizing the traditional free-text characters into the computable Entity-Quality (EQ) formalism using ontologies. METHODOLOGY/PRINCIPAL FINDINGS: We used ontologies and the EQ formalism to curate a collection of 47 phylogenetic studies on ostariophysan fishes (including catfishes, characins, minnows, knifefishes) and their relatives with the goal of integrating these complex phenotype descriptions with information from an existing model organism database (zebrafish, http://zfin.org). We developed a curation workflow for the collection of character, taxonomic and specimen data from these publications. A total of 4,617 phenotypic characters (10,512 states) for 3,449 taxa, primarily species, were curated into EQ formalism (for a total of 12,861 EQ statements) using anatomical and taxonomic terms from teleost-specific ontologies (Teleost Anatomy Ontology and Teleost Taxonomy Ontology) in combination with terms from a quality ontology (Phenotype and Trait Ontology). 
Standards and guidelines for consistently and accurately representing phenotypes were developed in response to the challenges that were evident from two annotation experiments and from feedback from curators. CONCLUSIONS/SIGNIFICANCE: The challenges we encountered and many of the curation standards and methods for improving consistency that we developed are generally applicable to any effort to represent phenotypes using ontologies. This is because an ontological representation of the detailed variations in phenotype, whether between mutant or wildtype, among individual humans, or across the diversity of species, requires a process by which a precise combination of terms from domain ontologies are selected and organized according to logical relations. The efficiencies that we have developed in this process will be useful for any attempt to annotate complex phenotypic descriptions using ontologies. We also discuss some ramifications of EQ representation for the domain of systematics. Public Library of Science 2010-05-20 /pmc/articles/PMC2873956/ /pubmed/20505755 http://dx.doi.org/10.1371/journal.pone.0010708 Text en Dahdul et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Dahdul, Wasila M.
Balhoff, James P.
Engeman, Jeffrey
Grande, Terry
Hilton, Eric J.
Kothari, Cartik
Lapp, Hilmar
Lundberg, John G.
Midford, Peter E.
Vision, Todd J.
Westerfield, Monte
Mabee, Paula M.
Evolutionary Characters, Phenotypes and Ontologies: Curating Data from the Systematic Biology Literature
title Evolutionary Characters, Phenotypes and Ontologies: Curating Data from the Systematic Biology Literature
title_full Evolutionary Characters, Phenotypes and Ontologies: Curating Data from the Systematic Biology Literature
title_fullStr Evolutionary Characters, Phenotypes and Ontologies: Curating Data from the Systematic Biology Literature
title_full_unstemmed Evolutionary Characters, Phenotypes and Ontologies: Curating Data from the Systematic Biology Literature
title_short Evolutionary Characters, Phenotypes and Ontologies: Curating Data from the Systematic Biology Literature
title_sort evolutionary characters, phenotypes and ontologies: curating data from the systematic biology literature
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2873956/
https://www.ncbi.nlm.nih.gov/pubmed/20505755
http://dx.doi.org/10.1371/journal.pone.0010708
work_keys_str_mv AT dahdulwasilam evolutionarycharactersphenotypesandontologiescuratingdatafromthesystematicbiologyliterature
AT balhoffjamesp evolutionarycharactersphenotypesandontologiescuratingdatafromthesystematicbiologyliterature
AT engemanjeffrey evolutionarycharactersphenotypesandontologiescuratingdatafromthesystematicbiologyliterature
AT grandeterry evolutionarycharactersphenotypesandontologiescuratingdatafromthesystematicbiologyliterature
AT hiltonericj evolutionarycharactersphenotypesandontologiescuratingdatafromthesystematicbiologyliterature
AT kotharicartik evolutionarycharactersphenotypesandontologiescuratingdatafromthesystematicbiologyliterature
AT lapphilmar evolutionarycharactersphenotypesandontologiescuratingdatafromthesystematicbiologyliterature
AT lundbergjohng evolutionarycharactersphenotypesandontologiescuratingdatafromthesystematicbiologyliterature
AT midfordpetere evolutionarycharactersphenotypesandontologiescuratingdatafromthesystematicbiologyliterature
AT visiontoddj evolutionarycharactersphenotypesandontologiescuratingdatafromthesystematicbiologyliterature
AT westerfieldmonte evolutionarycharactersphenotypesandontologiescuratingdatafromthesystematicbiologyliterature
AT mabeepaulam evolutionarycharactersphenotypesandontologiescuratingdatafromthesystematicbiologyliterature