
Annotation and query of tissue microarray data using the NCI Thesaurus

BACKGROUND: The Stanford Tissue Microarray Database (TMAD) is a repository of data serving a consortium of pathologists and biomedical researchers. The tissue samples in TMAD are annotated with multiple free-text fields, specifying the pathological diagnoses for each sample. These text annotations a...


Bibliographic Details
Main authors: Shah, Nigam H; Rubin, Daniel L; Espinosa, Inigo; Montgomery, Kelli; Musen, Mark A
Format: Text
Language: English
Published: BioMed Central 2007
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1988837/
https://www.ncbi.nlm.nih.gov/pubmed/17686183
http://dx.doi.org/10.1186/1471-2105-8-296
author Shah, Nigam H
Rubin, Daniel L
Espinosa, Inigo
Montgomery, Kelli
Musen, Mark A
author_sort Shah, Nigam H
collection PubMed
description BACKGROUND: The Stanford Tissue Microarray Database (TMAD) is a repository of data serving a consortium of pathologists and biomedical researchers. The tissue samples in TMAD are annotated with multiple free-text fields that specify the pathological diagnoses for each sample. These text annotations are not structured according to any ontology, which makes future integration of this resource with other biological and clinical data difficult. RESULTS: We developed methods to map these annotations to the NCI Thesaurus (NCI-T). Using the NCI-T, we can effectively represent annotations for about 86% of the samples. We demonstrate how this mapping enables ontology-driven integration and querying of tissue microarray data. We have deployed the mapping and ontology-driven querying tools at the TMAD site for general use. CONCLUSION: We have demonstrated that we can effectively map the diagnosis-related terms describing a sample in TMAD to the NCI-T. The NCI Thesaurus has wide coverage and provides terms for about 86% of the samples. In our opinion, the NCI Thesaurus can facilitate integration of this resource with other biological data.
format Text
id pubmed-1988837
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1988837 2007-09-21 Annotation and query of tissue microarray data using the NCI Thesaurus Shah, Nigam H Rubin, Daniel L Espinosa, Inigo Montgomery, Kelli Musen, Mark A BMC Bioinformatics Software BioMed Central 2007-08-08 /pmc/articles/PMC1988837/ /pubmed/17686183 http://dx.doi.org/10.1186/1471-2105-8-296 Text en Copyright © 2007 Shah et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Annotation and query of tissue microarray data using the NCI Thesaurus
topic Software