
A megastudy on the predictability of personal information from facial images: Disentangling demographic and non-demographic signals

While prior research has shown that facial images signal personal information, publications in this field tend to assess the predictability of a single variable or a small set of variables at a time, which is problematic. Reported prediction quality is hard to compare and generalize across studies d...


Bibliographic Details
Main authors: Tkachenko, Yegor; Jedidi, Kamel
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10687237/
https://www.ncbi.nlm.nih.gov/pubmed/38030632
http://dx.doi.org/10.1038/s41598-023-42054-9
author Tkachenko, Yegor
Jedidi, Kamel
collection PubMed
description While prior research has shown that facial images signal personal information, publications in this field tend to assess the predictability of a single variable or a small set of variables at a time, which is problematic. Reported prediction quality is hard to compare and generalize across studies due to different study conditions. Another issue is selection bias: researchers may choose to study variables intuitively expected to be predictable and underreport unpredictable variables (the ‘file drawer’ problem). Policy makers thus have an incomplete picture for a risk-benefit analysis of facial analysis technology. To address these limitations, we perform a megastudy, a survey-based study that reports the predictability of numerous personal attributes (349 binary variables) from 2646 distinct facial images of 969 individuals. Using deep learning, we find that 82/349 personal attributes (23%) are predictable better than random from facial image pixels. Adding facial images substantially boosts prediction quality versus a demographics-only benchmark model. Our unexpected finding of strong predictability of the iPhone versus Galaxy preference variable shows how testing many hypotheses simultaneously can facilitate knowledge discovery. Our proposed L1-regularized image decomposition method and other techniques point to smartphone camera artifacts, BMI, skin properties, and facial hair as top candidate non-demographic signals in facial images.
format Online
Article
Text
id pubmed-10687237
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10687237 2023-11-30 Sci Rep Article
Nature Publishing Group UK 2023-11-29 /pmc/articles/PMC10687237/ /pubmed/38030632 http://dx.doi.org/10.1038/s41598-023-42054-9 Text en © The Author(s) 2023. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title A megastudy on the predictability of personal information from facial images: Disentangling demographic and non-demographic signals
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10687237/
https://www.ncbi.nlm.nih.gov/pubmed/38030632
http://dx.doi.org/10.1038/s41598-023-42054-9