Humans in the Loop: Incorporating Expert and Crowd-Sourced Knowledge for Predictions Using Survey Data
Survey data sets are often wider than they are long. This high ratio of variables to observations raises concerns about overfitting during prediction, making informed variable selection important. Recent applications in computer science have sought to incorporate human knowledge into machine-learnin...
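Main authors: | Filippova, Anna; Gilroy, Connor; Kashyap, Ridhi; Kirchner, Antje; Morgan, Allison C.; Polimis, Kivan; Usmani, Adaner; Wang, Tong |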
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8112737/ https://www.ncbi.nlm.nih.gov/pubmed/33981842 http://dx.doi.org/10.1177/2378023118820157 |
_version_ | 1783690727025278976 |
---|---|
author | Filippova, Anna; Gilroy, Connor; Kashyap, Ridhi; Kirchner, Antje; Morgan, Allison C.; Polimis, Kivan; Usmani, Adaner; Wang, Tong |
author_sort | Filippova, Anna |
collection | PubMed |
description | Survey data sets are often wider than they are long. This high ratio of variables to observations raises concerns about overfitting during prediction, making informed variable selection important. Recent applications in computer science have sought to incorporate human knowledge into machine-learning methods to address these problems. The authors implement such a “human-in-the-loop” approach in the Fragile Families Challenge. The authors use surveys to elicit knowledge from experts and laypeople about the importance of different variables to different outcomes. This strategy offers the option to subset the data before prediction or to incorporate human knowledge as scores in prediction models, or both together. The authors find that human intervention is not obviously helpful. Human-informed subsetting reduces predictive performance, and considered alone, approaches incorporating scores perform marginally worse than approaches that do not. However, incorporating human knowledge may still improve predictive performance, and future research should consider new ways of doing so. |
format | Online Article Text |
id | pubmed-8112737 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
record_format | MEDLINE/PubMed |
spelling | pubmed-8112737 2021-05-11 Humans in the Loop: Incorporating Expert and Crowd-Sourced Knowledge for Predictions Using Survey Data Filippova, Anna; Gilroy, Connor; Kashyap, Ridhi; Kirchner, Antje; Morgan, Allison C.; Polimis, Kivan; Usmani, Adaner; Wang, Tong Socius Article Survey data sets are often wider than they are long. This high ratio of variables to observations raises concerns about overfitting during prediction, making informed variable selection important. Recent applications in computer science have sought to incorporate human knowledge into machine-learning methods to address these problems. The authors implement such a “human-in-the-loop” approach in the Fragile Families Challenge. The authors use surveys to elicit knowledge from experts and laypeople about the importance of different variables to different outcomes. This strategy offers the option to subset the data before prediction or to incorporate human knowledge as scores in prediction models, or both together. The authors find that human intervention is not obviously helpful. Human-informed subsetting reduces predictive performance, and considered alone, approaches incorporating scores perform marginally worse than approaches that do not. However, incorporating human knowledge may still improve predictive performance, and future research should consider new ways of doing so. 2019-09-10 2019 /pmc/articles/PMC8112737/ /pubmed/33981842 http://dx.doi.org/10.1177/2378023118820157 Text en Article reuse guidelines: https://sagepub.com/journals-permissions Creative Commons Non Commercial CC BY-NC: This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage). |
title | Humans in the Loop: Incorporating Expert and Crowd-Sourced Knowledge for Predictions Using Survey Data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8112737/ https://www.ncbi.nlm.nih.gov/pubmed/33981842 http://dx.doi.org/10.1177/2378023118820157 |