Pantheon 1.0, a manually verified dataset of globally famous biographies
We present the Pantheon 1.0 dataset: a manually verified dataset of individuals that have transcended linguistic, temporal, and geographic boundaries. The Pantheon 1.0 dataset includes the 11,341 biographies present in more than 25 languages in Wikipedia and is enriched with: (i) manually verified d...
Main authors: | Yu, Amy Zhao; Ronen, Shahar; Hu, Kevin; Lu, Tiffany; Hidalgo, César A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group 2016 |
Subjects: | Data Descriptor |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4700860/ https://www.ncbi.nlm.nih.gov/pubmed/26731133 http://dx.doi.org/10.1038/sdata.2015.75 |
_version_ | 1782408391755300864 |
---|---|
author | Yu, Amy Zhao; Ronen, Shahar; Hu, Kevin; Lu, Tiffany; Hidalgo, César A. |
author_facet | Yu, Amy Zhao; Ronen, Shahar; Hu, Kevin; Lu, Tiffany; Hidalgo, César A. |
author_sort | Yu, Amy Zhao |
collection | PubMed |
description | We present the Pantheon 1.0 dataset: a manually verified dataset of individuals who have transcended linguistic, temporal, and geographic boundaries. The Pantheon 1.0 dataset includes the 11,341 biographies present in more than 25 languages in Wikipedia and is enriched with: (i) manually verified demographic information (place and date of birth, gender); (ii) a taxonomy of occupations classifying each biography at three levels of aggregation; and (iii) two measures of global popularity: the number of languages in which a biography is present in Wikipedia (L), and the Historical Popularity Index (HPI), a metric that combines information on L, time since birth, and page-views (2008–2013). We compare the Pantheon 1.0 dataset to data from the 2003 book Human Accomplishment, and also to external measures of accomplishment in individual games and sports: Tennis, Swimming, Car Racing, and Chess. In all of these cases we find that measures of popularity (L and HPI) correlate highly with individual accomplishment, suggesting that measures of global popularity proxy the historical impact of individuals. |
format | Online Article Text |
id | pubmed-4700860 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-4700860 2016-01-08 Pantheon 1.0, a manually verified dataset of globally famous biographies Yu, Amy Zhao; Ronen, Shahar; Hu, Kevin; Lu, Tiffany; Hidalgo, César A. Sci Data Data Descriptor Nature Publishing Group 2016-01-05 /pmc/articles/PMC4700860/ /pubmed/26731133 http://dx.doi.org/10.1038/sdata.2015.75 Text en Copyright © 2016, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0 This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0 Metadata associated with this Data Descriptor is available at http://www.nature.com/sdata/ and is released under the CC0 waiver to maximize reuse. |
spellingShingle | Data Descriptor; Yu, Amy Zhao; Ronen, Shahar; Hu, Kevin; Lu, Tiffany; Hidalgo, César A.; Pantheon 1.0, a manually verified dataset of globally famous biographies |
title | Pantheon 1.0, a manually verified dataset of globally famous biographies |
title_sort | pantheon 1.0, a manually verified dataset of globally famous biographies |
topic | Data Descriptor |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4700860/ https://www.ncbi.nlm.nih.gov/pubmed/26731133 http://dx.doi.org/10.1038/sdata.2015.75 |