
The curation paradigm and application tool used for manual curation of the scientific literature at the Comparative Toxicogenomics Database

The Comparative Toxicogenomics Database (CTD) is a public resource that promotes understanding about the effects of environmental chemicals on human health. CTD biocurators read the scientific literature and convert free-text information into a structured format using official nomenclature, integrat...


Bibliographic Details
Main Authors: Davis, Allan Peter, Wiegers, Thomas C., Murphy, Cynthia G., Mattingly, Carolyn J.
Format: Online Article Text
Language: English
Published: Oxford University Press 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3176677/
https://www.ncbi.nlm.nih.gov/pubmed/21933848
http://dx.doi.org/10.1093/database/bar034
_version_ 1782212240090333184
author Davis, Allan Peter
Wiegers, Thomas C.
Murphy, Cynthia G.
Mattingly, Carolyn J.
author_facet Davis, Allan Peter
Wiegers, Thomas C.
Murphy, Cynthia G.
Mattingly, Carolyn J.
author_sort Davis, Allan Peter
collection PubMed
description The Comparative Toxicogenomics Database (CTD) is a public resource that promotes understanding about the effects of environmental chemicals on human health. CTD biocurators read the scientific literature and convert free-text information into a structured format using official nomenclature, integrating third party controlled vocabularies for chemicals, genes, diseases and organisms, and a novel controlled vocabulary for molecular interactions. Manual curation produces a robust, richly annotated dataset of highly accurate and detailed information. Currently, CTD describes over 349 000 molecular interactions between 6800 chemicals, 20 900 genes (for 330 organisms) and 4300 diseases that have been manually curated from over 25 400 peer-reviewed articles. These manually curated data are further integrated with other third party data (e.g. Gene Ontology, KEGG and Reactome annotations) to generate a wealth of toxicogenomic relationships. Here, we describe our approach to manual curation that uses a powerful and efficient paradigm involving mnemonic codes. This strategy allows biocurators to quickly capture detailed information from articles by generating simple statements using codes to represent the relationships between data types. The paradigm is versatile, expandable, and able to accommodate new data challenges that arise. We have incorporated this strategy into a web-based curation tool to further increase efficiency and productivity, implement quality control in real time and accommodate biocurators working remotely. Database URL: http://ctd.mdibl.org
format Online
Article
Text
id pubmed-3176677
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-31766772011-09-20 The curation paradigm and application tool used for manual curation of the scientific literature at the Comparative Toxicogenomics Database Davis, Allan Peter Wiegers, Thomas C. Murphy, Cynthia G. Mattingly, Carolyn J. Database (Oxford) Database Tool The Comparative Toxicogenomics Database (CTD) is a public resource that promotes understanding about the effects of environmental chemicals on human health. CTD biocurators read the scientific literature and convert free-text information into a structured format using official nomenclature, integrating third party controlled vocabularies for chemicals, genes, diseases and organisms, and a novel controlled vocabulary for molecular interactions. Manual curation produces a robust, richly annotated dataset of highly accurate and detailed information. Currently, CTD describes over 349 000 molecular interactions between 6800 chemicals, 20 900 genes (for 330 organisms) and 4300 diseases that have been manually curated from over 25 400 peer-reviewed articles. This manually curated data are further integrated with other third party data (e.g. Gene Ontology, KEGG and Reactome annotations) to generate a wealth of toxicogenomic relationships. Here, we describe our approach to manual curation that uses a powerful and efficient paradigm involving mnemonic codes. This strategy allows biocurators to quickly capture detailed information from articles by generating simple statements using codes to represent the relationships between data types. The paradigm is versatile, expandable, and able to accommodate new data challenges that arise. We have incorporated this strategy into a web-based curation tool to further increase efficiency and productivity, implement quality control in real-time and accommodate biocurators working remotely. 
Database URL: http://ctd.mdibl.org Oxford University Press 2011-09-20 /pmc/articles/PMC3176677/ /pubmed/21933848 http://dx.doi.org/10.1093/database/bar034 Text en © The Author(s) 2011. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.5 This is Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Database Tool
Davis, Allan Peter
Wiegers, Thomas C.
Murphy, Cynthia G.
Mattingly, Carolyn J.
The curation paradigm and application tool used for manual curation of the scientific literature at the Comparative Toxicogenomics Database
title The curation paradigm and application tool used for manual curation of the scientific literature at the Comparative Toxicogenomics Database
title_full The curation paradigm and application tool used for manual curation of the scientific literature at the Comparative Toxicogenomics Database
title_fullStr The curation paradigm and application tool used for manual curation of the scientific literature at the Comparative Toxicogenomics Database
title_full_unstemmed The curation paradigm and application tool used for manual curation of the scientific literature at the Comparative Toxicogenomics Database
title_short The curation paradigm and application tool used for manual curation of the scientific literature at the Comparative Toxicogenomics Database
title_sort curation paradigm and application tool used for manual curation of the scientific literature at the comparative toxicogenomics database
topic Database Tool
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3176677/
https://www.ncbi.nlm.nih.gov/pubmed/21933848
http://dx.doi.org/10.1093/database/bar034