
Automatically assembling a full census of an academic field


Bibliographic Details
Main Authors: Morgan, Allison C., Way, Samuel F., Clauset, Aaron
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6114776/
https://www.ncbi.nlm.nih.gov/pubmed/30157278
http://dx.doi.org/10.1371/journal.pone.0202223
_version_ 1783351255611998208
author Morgan, Allison C.
Way, Samuel F.
Clauset, Aaron
collection PubMed
description The composition of the scientific workforce shapes the direction of scientific research, directly through the selection of questions to investigate, and indirectly through its influence on the training of future scientists. In most fields, however, complete census information is difficult to obtain, complicating efforts to study workforce dynamics and the effects of policy. This is particularly true in computer science, which lacks a single, all-encompassing directory or professional organization. A full census of computer science would serve many purposes, not the least of which is a better understanding of the trends and causes of unequal representation in computing. Previous academic census efforts have relied on narrow or biased samples, or on professional society membership rolls. A full census can be constructed directly from online departmental faculty directories, but doing so by hand is expensive and time-consuming. Here, we introduce a topical web crawler for automating the collection of faculty information from web-based department rosters, and demonstrate the resulting system on the 205 PhD-granting computer science departments in the U.S. and Canada. This method can quickly construct a complete census of the field, and achieve over 99% precision and recall. We conclude by comparing the resulting 2017 census to a hand-curated 2011 census to quantify turnover and retention in computer science, in general and for female faculty in particular, demonstrating the types of analysis made possible by automated census construction.
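The precision, recall, and retention figures mentioned in the description can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy rosters below are hypothetical, and the sketch assumes that faculty records can be matched by exact name, which a real census comparison would need to handle more carefully.

```python
def precision_recall(extracted, reference):
    """Precision and recall of an extracted roster against a hand-checked one."""
    extracted, reference = set(extracted), set(reference)
    true_positives = len(extracted & reference)
    # Precision: share of extracted names that are really on the roster.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of the real roster that the extraction recovered.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


def retention_rate(earlier, later):
    """Fraction of the earlier census that still appears in the later one."""
    earlier, later = set(earlier), set(later)
    return len(earlier & later) / len(earlier) if earlier else 0.0


# Toy data, invented purely for illustration (not from the paper).
crawled = {"A. Ada", "B. Babbage", "C. Church", "D. Dijkstra"}
hand_checked = {"A. Ada", "B. Babbage", "C. Church", "E. Engelbart"}
census_2011 = {"A. Ada", "B. Babbage", "F. Floyd", "G. Gray"}
census_2017 = crawled

precision, recall = precision_recall(crawled, hand_checked)
retention = retention_rate(census_2011, census_2017)
turnover = 1.0 - retention  # share of the 2011 roster no longer present

print(f"precision={precision:.2f} recall={recall:.2f}")
print(f"retention={retention:.2f} turnover={turnover:.2f}")
```

Using sets keeps the comparison independent of roster order and of duplicate entries; restricting both censuses to a subgroup (for example, female faculty) before calling `retention_rate` would give the subgroup figure.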
format Online
Article
Text
id pubmed-6114776
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6114776 2018-09-17 Automatically assembling a full census of an academic field Morgan, Allison C. Way, Samuel F. Clauset, Aaron PLoS One Research Article
Public Library of Science 2018-08-29 /pmc/articles/PMC6114776/ /pubmed/30157278 http://dx.doi.org/10.1371/journal.pone.0202223 Text en © 2018 Morgan et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Automatically assembling a full census of an academic field
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6114776/
https://www.ncbi.nlm.nih.gov/pubmed/30157278
http://dx.doi.org/10.1371/journal.pone.0202223