Humans in the Loop: Incorporating Expert and Crowd-Sourced Knowledge for Predictions Using Survey Data

Survey data sets are often wider than they are long. This high ratio of variables to observations raises concerns about overfitting during prediction, making informed variable selection important. Recent applications in computer science have sought to incorporate human knowledge into machine-learnin...

Bibliographic Details
Main authors: Filippova, Anna; Gilroy, Connor; Kashyap, Ridhi; Kirchner, Antje; Morgan, Allison C.; Polimis, Kivan; Usmani, Adaner; Wang, Tong
Format: Online Article Text
Language: English
Published: 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8112737/
https://www.ncbi.nlm.nih.gov/pubmed/33981842
http://dx.doi.org/10.1177/2378023118820157
author Filippova, Anna
Gilroy, Connor
Kashyap, Ridhi
Kirchner, Antje
Morgan, Allison C.
Polimis, Kivan
Usmani, Adaner
Wang, Tong
collection PubMed
description Survey data sets are often wider than they are long. This high ratio of variables to observations raises concerns about overfitting during prediction, making informed variable selection important. Recent applications in computer science have sought to incorporate human knowledge into machine-learning methods to address these problems. The authors implement such a “human-in-the-loop” approach in the Fragile Families Challenge. The authors use surveys to elicit knowledge from experts and laypeople about the importance of different variables to different outcomes. This strategy offers the option to subset the data before prediction or to incorporate human knowledge as scores in prediction models, or both together. The authors find that human intervention is not obviously helpful. Human-informed subsetting reduces predictive performance, and considered alone, approaches incorporating scores perform marginally worse than approaches that do not. However, incorporating human knowledge may still improve predictive performance, and future research should consider new ways of doing so.
format Online
Article
Text
id pubmed-8112737
institution National Center for Biotechnology Information
language English
publishDate 2019
record_format MEDLINE/PubMed
spelling pubmed-8112737 2021-05-11 Socius Article 2019-09-10 2019 /pmc/articles/PMC8112737/ /pubmed/33981842 http://dx.doi.org/10.1177/2378023118820157 Text en
Article reuse guidelines: sagepub.com/journals-permissions (https://sagepub.com/journals-permissions). Creative Commons Non Commercial CC BY-NC: This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title Humans in the Loop: Incorporating Expert and Crowd-Sourced Knowledge for Predictions Using Survey Data
topic Article