A megastudy on the predictability of personal information from facial images: Disentangling demographic and non-demographic signals
While prior research has shown that facial images signal personal information, publications in this field tend to assess the predictability of a single variable or a small set of variables at a time, which is problematic. Reported prediction quality is hard to compare and generalize across studies d...
Main Authors: | Tkachenko, Yegor; Jedidi, Kamel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10687237/ https://www.ncbi.nlm.nih.gov/pubmed/38030632 http://dx.doi.org/10.1038/s41598-023-42054-9 |
_version_ | 1785151940571693056 |
---|---|
author | Tkachenko, Yegor Jedidi, Kamel |
author_facet | Tkachenko, Yegor Jedidi, Kamel |
author_sort | Tkachenko, Yegor |
collection | PubMed |
description | While prior research has shown that facial images signal personal information, publications in this field tend to assess the predictability of a single variable or a small set of variables at a time, which is problematic. Reported prediction quality is hard to compare and generalize across studies due to different study conditions. Another issue is selection bias: researchers may choose to study variables intuitively expected to be predictable and underreport unpredictable variables (the ‘file drawer’ problem). Policy makers thus have an incomplete picture for a risk-benefit analysis of facial analysis technology. To address these limitations, we perform a megastudy, a survey-based study that reports the predictability of numerous personal attributes (349 binary variables) from 2646 distinct facial images of 969 individuals. Using deep learning, we find 82/349 personal attributes (23%) are predictable better than random from facial image pixels. Adding facial images substantially boosts prediction quality versus a demographics-only benchmark model. Our unexpected finding of strong predictability of the iPhone versus Galaxy preference variable shows how testing many hypotheses simultaneously can facilitate knowledge discovery. Our proposed L1-regularized image decomposition method and other techniques point to smartphone camera artifacts, BMI, skin properties, and facial hair as top candidate non-demographic signals in facial images. |
format | Online Article Text |
id | pubmed-10687237 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-106872372023-11-30 A megastudy on the predictability of personal information from facial images: Disentangling demographic and non-demographic signals Tkachenko, Yegor Jedidi, Kamel Sci Rep Article While prior research has shown that facial images signal personal information, publications in this field tend to assess the predictability of a single variable or a small set of variables at a time, which is problematic. Reported prediction quality is hard to compare and generalize across studies due to different study conditions. Another issue is selection bias: researchers may choose to study variables intuitively expected to be predictable and underreport unpredictable variables (the ‘file drawer’ problem). Policy makers thus have an incomplete picture for a risk-benefit analysis of facial analysis technology. To address these limitations, we perform a megastudy—a survey-based study that reports the predictability of numerous personal attributes (349 binary variables) from 2646 distinct facial images of 969 individuals. Using deep learning, we find 82/349 personal attributes (23%) are predictable better than random from facial image pixels. Adding facial images substantially boosts prediction quality versus demographics-only benchmark model. Our unexpected finding of strong predictability of iPhone versus Galaxy preference variable shows how testing many hypotheses simultaneously can facilitate knowledge discovery. Our proposed L1-regularized image decomposition method and other techniques point to smartphone camera artifacts, BMI, skin properties, and facial hair as top candidate non-demographic signals in facial images. 
Nature Publishing Group UK 2023-11-29 /pmc/articles/PMC10687237/ /pubmed/38030632 http://dx.doi.org/10.1038/s41598-023-42054-9 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Tkachenko, Yegor Jedidi, Kamel A megastudy on the predictability of personal information from facial images: Disentangling demographic and non-demographic signals |
title | A megastudy on the predictability of personal information from facial images: Disentangling demographic and non-demographic signals |
title_full | A megastudy on the predictability of personal information from facial images: Disentangling demographic and non-demographic signals |
title_fullStr | A megastudy on the predictability of personal information from facial images: Disentangling demographic and non-demographic signals |
title_full_unstemmed | A megastudy on the predictability of personal information from facial images: Disentangling demographic and non-demographic signals |
title_short | A megastudy on the predictability of personal information from facial images: Disentangling demographic and non-demographic signals |
title_sort | megastudy on the predictability of personal information from facial images: disentangling demographic and non-demographic signals |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10687237/ https://www.ncbi.nlm.nih.gov/pubmed/38030632 http://dx.doi.org/10.1038/s41598-023-42054-9 |
work_keys_str_mv | AT tkachenkoyegor amegastudyonthepredictabilityofpersonalinformationfromfacialimagesdisentanglingdemographicandnondemographicsignals AT jedidikamel amegastudyonthepredictabilityofpersonalinformationfromfacialimagesdisentanglingdemographicandnondemographicsignals AT tkachenkoyegor megastudyonthepredictabilityofpersonalinformationfromfacialimagesdisentanglingdemographicandnondemographicsignals AT jedidikamel megastudyonthepredictabilityofpersonalinformationfromfacialimagesdisentanglingdemographicandnondemographicsignals |