Extraction of candidate terms from a corpus of non-specialized, general language

Linguistic phenomena associated with the analysis of document content and employed for the purpose of organization and retrieval are well-visited objects of study in the field of library and information science. Language often acts as a gatekeeper, admitting or excluding people from gaining access t...


Bibliographic Details
Main authors: Anguiano Peña, Gilberto, Naumis Peña, Catalina
Format: Online Article
Language: spa
Published: Instituto de Investigaciones Bibliotecológicas y de la Información 2016
Subjects:
Online access: http://rev-ib.unam.mx/ib/index.php/ib/article/view/54471
https://dx.doi.org/10.1016/j.ibbai.2016.02.035
_version_ 1780761202118361088
author Anguiano Peña, Gilberto
Naumis Peña, Catalina
author_facet Anguiano Peña, Gilberto
Naumis Peña, Catalina
author_sort Anguiano Peña, Gilberto
collection Investigación Bibliotecológica: archivonomía, bibliotecología e información
description Linguistic phenomena associated with the analysis of document content and employed for the purpose of organization and retrieval are well-visited objects of study in the field of library and information science. Language often acts as a gatekeeper, admitting or excluding people from gaining access to knowledge. As such, the terms used in the scientific and technical language of research need to be kept up and their behavior within the domain examined. Documental content analysis of scientific texts provides knowledge of specialized lexicons and their specific applications, while differentiating them from common use in order to establish indexing languages. Thus, as proposed herein, the application of lexicographic techniques to documental content analysis of non-specialized language yields the components needed to describe and extract lexical units of the specialized language. 
format Online
Article
id oai_unam-bibliotecologica-article-54471
institution Universidad Nacional Autónoma de México
language spa
publishDate 2016
publisher Instituto de Investigaciones Bibliotecológicas y de la Información
record_format ojs
spelling oai_unam-bibliotecologica-article-54471 2017-05-19T12:18:01Z Extraction of candidate terms from a corpus of non-specialized, general language Extracción de candidatos a términos de un corpus de la lengua general Anguiano Peña, Gilberto Naumis Peña, Catalina Content Analysis Term Extraction Scientific Language Corpus of General Language Análisis de contenido Extracción Linguistic phenomena associated with the analysis of document content and employed for the purpose of organization and retrieval are well-visited objects of study in the field of library and information science. Language often acts as a gatekeeper, admitting or excluding people from gaining access to knowledge. As such, the terms used in the scientific and technical language of research need to be kept up and their behavior within the domain examined. Documental content analysis of scientific texts provides knowledge of specialized lexicons and their specific applications, while differentiating them from common use in order to establish indexing languages. Thus, as proposed herein, the application of lexicographic techniques to documental content analysis of non-specialized language yields the components needed to describe and extract lexical units of the specialized language. The objects of study of library and information science include the linguistic phenomena associated with documental content analysis, both for organizing information and for retrieving it. To that end, the terms used in scientific and technical language must be recovered and their domain and behavior studied. Language controls and restricts the knowledge that a population can obtain.
Documental content analysis, in this case of science popularization texts, yields knowledge of lexical units and their meaningful applications, and separates terms from the general language in order to create indexing languages. Thus, documental content analysis of a general-language corpus marked up with lexicographic methods yields and characterizes the components that allow lexical units of the specialized language to be extracted by means of the techniques proposed in this work. Instituto de Investigaciones Bibliotecológicas y de la Información 2016-02-08 info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion application/pdf text/html http://rev-ib.unam.mx/ib/index.php/ib/article/view/54471 10.1016/j.ibbai.2016.02.035 Investigación Bibliotecológica. Archivonomía, bibliotecología e información; Vol. 29 No. 67 (2015); 19-45 Investigación Bibliotecológica: archivonomía, bibliotecología e información; Vol. 29 Núm. 67 (2015); 19-45 Investigación Bibliotecológica: archivonomía, bibliotecología e información; v. 29 n. 67 (2015); 19-45 2448-8321 0187-358X spa http://rev-ib.unam.mx/ib/index.php/ib/article/view/54471/48448 http://rev-ib.unam.mx/ib/index.php/ib/article/view/54471/51661 Copyright 2015 Investigación Bibliotecológica: archivonomía, bibliotecología e información
spellingShingle Content Analysis
Term Extraction
Scientific Language
Corpus of General Language
Análisis de contenido
Extracción
Anguiano Peña, Gilberto
Naumis Peña, Catalina
Extraction of candidate terms from a corpus of non-specialized, general language
title Extraction of candidate terms from a corpus of non-specialized, general language
title_alt Extracción de candidatos a términos de un corpus de la lengua general
title_full Extraction of candidate terms from a corpus of non-specialized, general language
title_fullStr Extraction of candidate terms from a corpus of non-specialized, general language
title_full_unstemmed Extraction of candidate terms from a corpus of non-specialized, general language
title_short Extraction of candidate terms from a corpus of non-specialized, general language
title_sort extraction of candidate terms from a corpus of non-specialized, general language
topic Content Analysis
Term Extraction
Scientific Language
Corpus of General Language
Análisis de contenido
Extracción
topic_facet Content Analysis
Term Extraction
Scientific Language
Corpus of General Language
Análisis de contenido
Extracción
url http://rev-ib.unam.mx/ib/index.php/ib/article/view/54471
https://dx.doi.org/10.1016/j.ibbai.2016.02.035
work_keys_str_mv AT anguianopenagilberto extractionofcandidatetermsfromacorpusofnonspecializedgenerallanguage
AT naumispenacatalina extractionofcandidatetermsfromacorpusofnonspecializedgenerallanguage
AT anguianopenagilberto extracciondecandidatosaterminosdeuncorpusdelalenguageneral
AT naumispenacatalina extracciondecandidatosaterminosdeuncorpusdelalenguageneral