The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research

In this article, we present the Brazilian Portuguese Lexicon, a new word-based corpus for psycholinguistic and computational linguistic research in Brazilian Portuguese. We describe the corpus development and the specific characteristics of the website and database for user access. We also perfor...

Bibliographic Details
Main authors: Estivalet, Gustavo L.; Meunier, Fanny
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4668042/
https://www.ncbi.nlm.nih.gov/pubmed/26630138
http://dx.doi.org/10.1371/journal.pone.0144016
author Estivalet, Gustavo L.
Meunier, Fanny
collection PubMed
description In this article, we present the Brazilian Portuguese Lexicon, a new word-based corpus for psycholinguistic and computational linguistic research in Brazilian Portuguese. We describe the corpus development and the specific characteristics of the website and database for user access. We also perform distributional analyses of the corpus and comparisons to other current databases. Our main objective was to provide a large, reliable, and useful word-based corpus with a dynamic, easy-to-use, and intuitive interface with free internet access for word and word-criteria searches. We used the Núcleo Interinstitucional de Linguística Computacional’s corpus as the basic data source and developed the Brazilian Portuguese Lexicon by deriving and adding metalinguistic and psycholinguistic information about Brazilian Portuguese words. We obtained a final corpus with more than 30 million word tokens, 215 thousand word types, and 25 categories of information about each word. This corpus was made available on the internet via a free-access site with two search engines: a simple search and a complex search. The simple engine searches for a list of words, while the complex engine accepts all types of criteria in the corpus categories. The output presents all entries found in the corpus that match the criteria specified in the input search and can be downloaded as a .csv file. We created a module in the results that delivers basic statistics about each search. The Brazilian Portuguese Lexicon also provides a pseudoword engine and specific tools for linguistic and statistical analysis. Therefore, the Brazilian Portuguese Lexicon is a convenient instrument for stimulus search, selection, control, and manipulation in psycholinguistic experiments, and it is also a powerful database for computational linguistics research and language modeling related to lexicon distribution, functioning, and behavior.
format Online
Article
Text
id pubmed-4668042
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4668042 2015-12-10 The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research Estivalet, Gustavo L. Meunier, Fanny PLoS One Research Article Public Library of Science 2015-12-02 /pmc/articles/PMC4668042/ /pubmed/26630138 http://dx.doi.org/10.1371/journal.pone.0144016 Text en © 2015 Estivalet, Meunier http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research
topic Research Article