The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research
In this article, we present the Brazilian Portuguese Lexicon, a new word-based corpus for psycholinguistic and computational linguistic research in Brazilian Portuguese. We describe the corpus development, the specific characteristics on the internet site and database for user access. We also perfor...
Main Authors: | Estivalet, Gustavo L.; Meunier, Fanny |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2015 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4668042/ https://www.ncbi.nlm.nih.gov/pubmed/26630138 http://dx.doi.org/10.1371/journal.pone.0144016 |
_version_ | 1782403921314054144 |
---|---|
author | Estivalet, Gustavo L. Meunier, Fanny |
author_facet | Estivalet, Gustavo L. Meunier, Fanny |
author_sort | Estivalet, Gustavo L. |
collection | PubMed |
description | In this article, we present the Brazilian Portuguese Lexicon, a new word-based corpus for psycholinguistic and computational linguistic research in Brazilian Portuguese. We describe the corpus development, the specific characteristics on the internet site and database for user access. We also perform distributional analyses of the corpus and comparisons to other current databases. Our main objective was to provide a large, reliable, and useful word-based corpus with a dynamic, easy-to-use, and intuitive interface with free internet access for word and word-criteria searches. We used the Núcleo Interinstitucional de Linguística Computacional’s corpus as the basic data source and developed the Brazilian Portuguese Lexicon by deriving and adding metalinguistic and psycholinguistic information about Brazilian Portuguese words. We obtained a final corpus with more than 30 million word tokens, 215 thousand word types and 25 categories of information about each word. This corpus was made available on the internet via a free-access site with two search engines: a simple search and a complex search. The simple engine basically searches for a list of words, while the complex engine accepts all types of criteria in the corpus categories. The output result presents all entries found in the corpus with the criteria specified in the input search and can be downloaded as a .csv file. We created a module in the results that delivers basic statistics about each search. The Brazilian Portuguese Lexicon also provides a pseudoword engine and specific tools for linguistic and statistical analysis. Therefore, the Brazilian Portuguese Lexicon is a convenient instrument for stimulus search, selection, control, and manipulation in psycholinguistic experiments, and it is also a powerful database for computational linguistics research and language modeling related to lexicon distribution, functioning, and behavior. |
format | Online Article Text |
id | pubmed-4668042 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4668042 2015-12-10 The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research Estivalet, Gustavo L. Meunier, Fanny PLoS One Research Article In this article, we present the Brazilian Portuguese Lexicon, a new word-based corpus for psycholinguistic and computational linguistic research in Brazilian Portuguese. We describe the corpus development, the specific characteristics on the internet site and database for user access. We also perform distributional analyses of the corpus and comparisons to other current databases. Our main objective was to provide a large, reliable, and useful word-based corpus with a dynamic, easy-to-use, and intuitive interface with free internet access for word and word-criteria searches. We used the Núcleo Interinstitucional de Linguística Computacional’s corpus as the basic data source and developed the Brazilian Portuguese Lexicon by deriving and adding metalinguistic and psycholinguistic information about Brazilian Portuguese words. We obtained a final corpus with more than 30 million word tokens, 215 thousand word types and 25 categories of information about each word. This corpus was made available on the internet via a free-access site with two search engines: a simple search and a complex search. The simple engine basically searches for a list of words, while the complex engine accepts all types of criteria in the corpus categories. The output result presents all entries found in the corpus with the criteria specified in the input search and can be downloaded as a .csv file. We created a module in the results that delivers basic statistics about each search. The Brazilian Portuguese Lexicon also provides a pseudoword engine and specific tools for linguistic and statistical analysis. 
Therefore, the Brazilian Portuguese Lexicon is a convenient instrument for stimulus search, selection, control, and manipulation in psycholinguistic experiments, and it is also a powerful database for computational linguistics research and language modeling related to lexicon distribution, functioning, and behavior. Public Library of Science 2015-12-02 /pmc/articles/PMC4668042/ /pubmed/26630138 http://dx.doi.org/10.1371/journal.pone.0144016 Text en © 2015 Estivalet, Meunier http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Estivalet, Gustavo L. Meunier, Fanny The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research |
title | The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research |
title_full | The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research |
title_fullStr | The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research |
title_full_unstemmed | The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research |
title_short | The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research |
title_sort | brazilian portuguese lexicon: an instrument for psycholinguistic research |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4668042/ https://www.ncbi.nlm.nih.gov/pubmed/26630138 http://dx.doi.org/10.1371/journal.pone.0144016 |
work_keys_str_mv | AT estivaletgustavol thebrazilianportugueselexiconaninstrumentforpsycholinguisticresearch AT meunierfanny thebrazilianportugueselexiconaninstrumentforpsycholinguisticresearch AT estivaletgustavol brazilianportugueselexiconaninstrumentforpsycholinguisticresearch AT meunierfanny brazilianportugueselexiconaninstrumentforpsycholinguisticresearch |