
The ParlaMint corpora of parliamentary proceedings



Bibliographic Details
Main authors: Erjavec, Tomaž, Ogrodniczuk, Maciej, Osenova, Petya, Ljubešić, Nikola, Simov, Kiril, Pančur, Andrej, Rudolf, Michał, Kopp, Matyáš, Barkarson, Starkaður, Steingrímsson, Steinþór, Çöltekin, Çağrı, de Does, Jesse, Depuydt, Katrien, Agnoloni, Tommaso, Venturi, Giulia, Pérez, María Calzada, de Macedo, Luciana D., Navarretta, Costanza, Luxardo, Giancarlo, Coole, Matthew, Rayson, Paul, Morkevičius, Vaidas, Krilavičius, Tomas, Darǵis, Roberts, Ring, Orsolya, van Heusden, Ruben, Marx, Maarten, Fišer, Darja
Format: Online Article Text
Language: English
Published: Springer Netherlands 2022
Subjects: Project Notes
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8807381/
https://www.ncbi.nlm.nih.gov/pubmed/35125984
http://dx.doi.org/10.1007/s10579-021-09574-0
author Erjavec, Tomaž
Ogrodniczuk, Maciej
Osenova, Petya
Ljubešić, Nikola
Simov, Kiril
Pančur, Andrej
Rudolf, Michał
Kopp, Matyáš
Barkarson, Starkaður
Steingrímsson, Steinþór
Çöltekin, Çağrı
de Does, Jesse
Depuydt, Katrien
Agnoloni, Tommaso
Venturi, Giulia
Pérez, María Calzada
de Macedo, Luciana D.
Navarretta, Costanza
Luxardo, Giancarlo
Coole, Matthew
Rayson, Paul
Morkevičius, Vaidas
Krilavičius, Tomas
Darǵis, Roberts
Ring, Orsolya
van Heusden, Ruben
Marx, Maarten
Fišer, Darja
author_sort Erjavec, Tomaž
collection PubMed
description This paper presents the ParlaMint corpora containing transcriptions of the sessions of the 17 European national parliaments with half a billion words. The corpora are uniformly encoded, contain rich meta-data about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project’s GitHub repository, and the complete corpora are openly available via the CLARIN.SI repository for download, as well as through the NoSketch Engine and KonText concordancers and the Parlameter interface for on-line exploration and analysis.
format Online
Article
Text
id pubmed-8807381
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer Netherlands
record_format MEDLINE/PubMed
spelling pubmed-8807381 2022-02-02 The ParlaMint corpora of parliamentary proceedings Erjavec, Tomaž Ogrodniczuk, Maciej Osenova, Petya Ljubešić, Nikola Simov, Kiril Pančur, Andrej Rudolf, Michał Kopp, Matyáš Barkarson, Starkaður Steingrímsson, Steinþór Çöltekin, Çağrı de Does, Jesse Depuydt, Katrien Agnoloni, Tommaso Venturi, Giulia Pérez, María Calzada de Macedo, Luciana D. Navarretta, Costanza Luxardo, Giancarlo Coole, Matthew Rayson, Paul Morkevičius, Vaidas Krilavičius, Tomas Darǵis, Roberts Ring, Orsolya van Heusden, Ruben Marx, Maarten Fišer, Darja Lang Resour Eval Project Notes This paper presents the ParlaMint corpora containing transcriptions of the sessions of the 17 European national parliaments with half a billion words. The corpora are uniformly encoded, contain rich meta-data about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project’s GitHub repository, and the complete corpora are openly available via the CLARIN.SI repository for download, as well as through the NoSketch Engine and KonText concordancers and the Parlameter interface for on-line exploration and analysis. Springer Netherlands 2022-02-02 2023 /pmc/articles/PMC8807381/ /pubmed/35125984 http://dx.doi.org/10.1007/s10579-021-09574-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title The ParlaMint corpora of parliamentary proceedings
title_sort parlamint corpora of parliamentary proceedings
topic Project Notes
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8807381/
https://www.ncbi.nlm.nih.gov/pubmed/35125984
http://dx.doi.org/10.1007/s10579-021-09574-0