Extraction of Transcript Diversity from Scientific Literature

Transcript diversity generated by alternative splicing and associated mechanisms contributes heavily to the functional complexity of biological systems. The numerous examples of the mechanisms and functional implications of these events are scattered throughout the scientific literature. Thus, it is...

Bibliographic Details
Main Authors: Shah, Parantu K, Jensen, Lars J, Boué, Stéphanie, Bork, Peer
Format: Text
Language: English
Published: Public Library of Science 2005
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1183516/
https://www.ncbi.nlm.nih.gov/pubmed/16103899
http://dx.doi.org/10.1371/journal.pcbi.0010010
author Shah, Parantu K
Jensen, Lars J
Boué, Stéphanie
Bork, Peer
author_sort Shah, Parantu K
collection PubMed
description Transcript diversity generated by alternative splicing and associated mechanisms contributes heavily to the functional complexity of biological systems. The numerous examples of the mechanisms and functional implications of these events are scattered throughout the scientific literature. Thus, it is crucial to have a tool that can automatically extract the relevant facts and collect them in a knowledge base that can aid the interpretation of data from high-throughput methods. We have developed and applied a composite text-mining method for extracting information on transcript diversity from the entire MEDLINE database in order to create a database of genes with alternative transcripts. It contains information on tissue specificity, number of isoforms, causative mechanisms, functional implications, and experimental methods used for detection. We have mined this resource to identify 959 instances of tissue-specific splicing. Our results in combination with those from EST-based methods suggest that alternative splicing is the preferred mechanism for generating transcript diversity in the nervous system. We provide new annotations for 1,860 genes with the potential for generating transcript diversity. We assign the MeSH term “alternative splicing” to 1,536 additional abstracts in the MEDLINE database and suggest new MeSH terms for other events. We have successfully extracted information about transcript diversity and semiautomatically generated a database, LSAT, that can provide a quantitative understanding of the mechanisms behind tissue-specific gene expression. LSAT (Literature Support for Alternative Transcripts) is publicly available at http://www.bork.embl.de/LSAT/.
format Text
id pubmed-1183516
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-1183516 2005-08-12 Extraction of Transcript Diversity from Scientific Literature Shah, Parantu K Jensen, Lars J Boué, Stéphanie Bork, Peer PLoS Comput Biol Research Article
Public Library of Science 2005-06 2005-06-24 /pmc/articles/PMC1183516/ /pubmed/16103899 http://dx.doi.org/10.1371/journal.pcbi.0010010 Text en Copyright: © 2005 Shah et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Shah, Parantu K
Jensen, Lars J
Boué, Stéphanie
Bork, Peer
Extraction of Transcript Diversity from Scientific Literature
title Extraction of Transcript Diversity from Scientific Literature
title_sort extraction of transcript diversity from scientific literature
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1183516/
https://www.ncbi.nlm.nih.gov/pubmed/16103899
http://dx.doi.org/10.1371/journal.pcbi.0010010