Automatically assembling a full census of an academic field
The composition of the scientific workforce shapes the direction of scientific research, directly through the selection of questions to investigate, and indirectly through its influence on the training of future scientists. In most fields, however, complete census information is difficult to obtain,...
Main Authors: | Morgan, Allison C.; Way, Samuel F.; Clauset, Aaron |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6114776/ https://www.ncbi.nlm.nih.gov/pubmed/30157278 http://dx.doi.org/10.1371/journal.pone.0202223 |
_version_ | 1783351255611998208 |
---|---|
author | Morgan, Allison C.; Way, Samuel F.; Clauset, Aaron |
author_facet | Morgan, Allison C.; Way, Samuel F.; Clauset, Aaron |
author_sort | Morgan, Allison C. |
collection | PubMed |
description | The composition of the scientific workforce shapes the direction of scientific research, directly through the selection of questions to investigate, and indirectly through its influence on the training of future scientists. In most fields, however, complete census information is difficult to obtain, complicating efforts to study workforce dynamics and the effects of policy. This is particularly true in computer science, which lacks a single, all-encompassing directory or professional organization. A full census of computer science would serve many purposes, not the least of which is a better understanding of the trends and causes of unequal representation in computing. Previous academic census efforts have relied on narrow or biased samples, or on professional society membership rolls. A full census can be constructed directly from online departmental faculty directories, but doing so by hand is expensive and time-consuming. Here, we introduce a topical web crawler for automating the collection of faculty information from web-based department rosters, and demonstrate the resulting system on the 205 PhD-granting computer science departments in the U.S. and Canada. This method can quickly construct a complete census of the field, and achieve over 99% precision and recall. We conclude by comparing the resulting 2017 census to a hand-curated 2011 census to quantify turnover and retention in computer science, in general and for female faculty in particular, demonstrating the types of analysis made possible by automated census construction. |
format | Online Article Text |
id | pubmed-6114776 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-6114776 2018-09-17 Automatically assembling a full census of an academic field Morgan, Allison C. Way, Samuel F. Clauset, Aaron PLoS One Research Article
Public Library of Science 2018-08-29 /pmc/articles/PMC6114776/ /pubmed/30157278 http://dx.doi.org/10.1371/journal.pone.0202223 Text en © 2018 Morgan et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Morgan, Allison C. Way, Samuel F. Clauset, Aaron Automatically assembling a full census of an academic field |
title | Automatically assembling a full census of an academic field |
title_full | Automatically assembling a full census of an academic field |
title_fullStr | Automatically assembling a full census of an academic field |
title_full_unstemmed | Automatically assembling a full census of an academic field |
title_short | Automatically assembling a full census of an academic field |
title_sort | automatically assembling a full census of an academic field |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6114776/ https://www.ncbi.nlm.nih.gov/pubmed/30157278 http://dx.doi.org/10.1371/journal.pone.0202223 |
work_keys_str_mv | AT morganallisonc automaticallyassemblingafullcensusofanacademicfield AT waysamuelf automaticallyassemblingafullcensusofanacademicfield AT clausetaaron automaticallyassemblingafullcensusofanacademicfield |