Human Rights Texts: Converting Human Rights Primary Source Documents into Data

We introduce and make publicly available a large corpus of digitized primary source human rights documents which are published annually by monitoring agencies that include Amnesty International, Human Rights Watch, the Lawyers Committee for Human Rights, and the United States Department of State. In...

Bibliographic Details
Main Authors: Fariss, Christopher J., Linder, Fridolin J., Jones, Zachary M., Crabtree, Charles D., Biek, Megan A., Ross, Ana-Sophia M., Kaur, Taranamol, Tsai, Michael
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4587949/
https://www.ncbi.nlm.nih.gov/pubmed/26418817
http://dx.doi.org/10.1371/journal.pone.0138935
_version_ 1782392547436396544
author Fariss, Christopher J.
Linder, Fridolin J.
Jones, Zachary M.
Crabtree, Charles D.
Biek, Megan A.
Ross, Ana-Sophia M.
Kaur, Taranamol
Tsai, Michael
author_sort Fariss, Christopher J.
collection PubMed
description We introduce and make publicly available a large corpus of digitized primary source human rights documents which are published annually by monitoring agencies that include Amnesty International, Human Rights Watch, the Lawyers Committee for Human Rights, and the United States Department of State. In addition to the digitized text, we also make available and describe document-term matrices, which are datasets that systematically organize the word counts from each unique document by each unique term within the corpus of human rights documents. To contextualize the importance of this corpus, we describe the development of coding procedures in the human rights community and several existing categorical indicators that have been created by human coding of the human rights documents contained in the corpus. We then discuss how the new human rights corpus and the existing human rights datasets can be used with a variety of statistical analyses and machine learning algorithms to help scholars understand how human rights practices and reporting have evolved over time. We close with a discussion of our plans for dataset maintenance, updating, and availability.
format Online
Article
Text
id pubmed-4587949
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4587949 2015-10-02 Human Rights Texts: Converting Human Rights Primary Source Documents into Data Fariss, Christopher J. Linder, Fridolin J. Jones, Zachary M. Crabtree, Charles D. Biek, Megan A. Ross, Ana-Sophia M. Kaur, Taranamol Tsai, Michael PLoS One Research Article Public Library of Science 2015-09-29 /pmc/articles/PMC4587949/ /pubmed/26418817 http://dx.doi.org/10.1371/journal.pone.0138935 Text en © 2015 Fariss et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Human Rights Texts: Converting Human Rights Primary Source Documents into Data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4587949/
https://www.ncbi.nlm.nih.gov/pubmed/26418817
http://dx.doi.org/10.1371/journal.pone.0138935