Extraction of candidate terms from a corpus of non-specialized, general language
Linguistic phenomena associated with the analysis of document content and employed for the purpose of organization and retrieval are well-visited objects of study in the field of library and information science. Language often acts as a gatekeeper, admitting or excluding people from gaining access t...
Main Authors: | Anguiano Peña, Gilberto; Naumis Peña, Catalina |
---|---|
Format: | Online Article |
Language: | spa |
Published: | Instituto de Investigaciones Bibliotecológicas y de la Información, 2016 |
Subjects: | |
Online Access: | http://rev-ib.unam.mx/ib/index.php/ib/article/view/54471 https://dx.doi.org/10.1016/j.ibbai.2016.02.035 |
_version_ | 1780761202118361088 |
---|---|
author | Anguiano Peña, Gilberto; Naumis Peña, Catalina |
author_facet | Anguiano Peña, Gilberto; Naumis Peña, Catalina |
author_sort | Anguiano Peña, Gilberto |
collection | Investigación Bibliotecológica: archivonomía, bibliotecología e información |
description | Linguistic phenomena associated with the analysis of document content and employed for the purpose of organization and retrieval are well-visited objects of study in the field of library and information science. Language often acts as a gatekeeper, admitting or excluding people from gaining access to knowledge. As such, the terms used in the scientific and technical language of research need to be kept up and their behavior within the domain examined. Documental content analysis of scientific texts provides knowledge of specialized lexicons and their specific applications, while differentiating them from common use in order to establish indexing languages. Thus, as proposed herein, the application of lexicographic techniques to documental content analysis of non-specialized language yields the components needed to describe and extract lexical units of the specialized language. |
format | Online Article |
id | oai_unam-bibliotecologica-article-54471 |
institution | Universidad Nacional Autónoma de México |
language | spa |
publishDate | 2016 |
publisher | Instituto de Investigaciones Bibliotecológicas y de la Información |
record_format | ojs |
spelling | oai_unam-bibliotecologica-article-54471 2017-05-19T12:18:01Z Extraction of candidate terms from a corpus of non-specialized, general language Extracción de candidatos a términos de un corpus de la lengua general Anguiano Peña, Gilberto Naumis Peña, Catalina Content Analysis Term Extraction Scientific Language Corpus of General Language Análisis de contenido Extracción Linguistic phenomena associated with the analysis of document content and employed for the purpose of organization and retrieval are well-visited objects of study in the field of library and information science. Language often acts as a gatekeeper, admitting or excluding people from gaining access to knowledge. As such, the terms used in the scientific and technical language of research need to be kept up and their behavior within the domain examined. Documental content analysis of scientific texts provides knowledge of specialized lexicons and their specific applications, while differentiating them from common use in order to establish indexing languages. Thus, as proposed herein, the application of lexicographic techniques to documental content analysis of non-specialized language yields the components needed to describe and extract lexical units of the specialized language. The objects of study of Library and Information Science include the linguistic phenomena associated with documental content analysis, both for organizing information and for retrieving it. To that end, the terms used in scientific and technical language must be collected, and their domain and behavior studied. Language controls and limits the knowledge a population can obtain. Documental content analysis, in this case of popular-science texts, yields knowledge of lexical units and their meaningful applications, and separates terms from general language in order to create indexing languages. Thus, through documental content analysis of a general-language corpus annotated with lexicographic methods, the components that allow lexical units of the specialized language to be extracted are obtained and characterized by means of the techniques proposed in this work. Instituto de Investigaciones Bibliotecológicas y de la Información 2016-02-08 info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion application/pdf text/html http://rev-ib.unam.mx/ib/index.php/ib/article/view/54471 10.1016/j.ibbai.2016.02.035 Investigación Bibliotecológica. Archivonomía, bibliotecología e información; Vol. 29 No. 67 (2015); 19-45 Investigación Bibliotecológica: archivonomía, bibliotecología e información; Vol. 29 Núm. 67 (2015); 19-45 Investigación Bibliotecológica: archivonomía, bibliotecología e información; v. 29 n. 67 (2015); 19-45 2448-8321 0187-358X spa http://rev-ib.unam.mx/ib/index.php/ib/article/view/54471/48448 http://rev-ib.unam.mx/ib/index.php/ib/article/view/54471/51661 Copyright 2015 Investigación Bibliotecológica: archivonomía, bibliotecología e información |
spellingShingle | Content Analysis Term Extraction Scientific Language Corpus of General Language Análisis de contenido Extracción Anguiano Peña, Gilberto Naumis Peña, Catalina Extraction of candidate terms from a corpus of non-specialized, general language |
title | Extraction of candidate terms from a corpus of non-specialized, general language |
title_alt | Extracción de candidatos a términos de un corpus de la lengua general |
title_full | Extraction of candidate terms from a corpus of non-specialized, general language |
title_fullStr | Extraction of candidate terms from a corpus of non-specialized, general language |
title_full_unstemmed | Extraction of candidate terms from a corpus of non-specialized, general language |
title_short | Extraction of candidate terms from a corpus of non-specialized, general language |
title_sort | extraction of candidate terms from a corpus of non-specialized, general language |
topic | Content Analysis; Term Extraction; Scientific Language; Corpus of General Language; Análisis de contenido; Extracción |
topic_facet | Content Analysis; Term Extraction; Scientific Language; Corpus of General Language; Análisis de contenido; Extracción |
url | http://rev-ib.unam.mx/ib/index.php/ib/article/view/54471 https://dx.doi.org/10.1016/j.ibbai.2016.02.035 |
work_keys_str_mv | AT anguianopenagilberto extractionofcandidatetermsfromacorpusofnonspecializedgenerallanguage AT naumispenacatalina extractionofcandidatetermsfromacorpusofnonspecializedgenerallanguage AT anguianopenagilberto extracciondecandidatosaterminosdeuncorpusdelalenguageneral AT naumispenacatalina extracciondecandidatosaterminosdeuncorpusdelalenguageneral |