
Questionnaire data analysis using information geometry

The analysis of questionnaires often involves representing the high-dimensional responses in a low-dimensional space (e.g., PCA, MCA, or t-SNE). However, questionnaire data often contains categorical variables and common statistical model assumptions rarely hold. Here we present a non-parametric appr...


Bibliographic Details
Main authors: Har-Shemesh, Omri; Quax, Rick; Lansing, J. Stephen; Sloot, Peter M. A.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7248094/
https://www.ncbi.nlm.nih.gov/pubmed/32451420
http://dx.doi.org/10.1038/s41598-020-63760-8
_version_ 1783538294756212736
author Har-Shemesh, Omri
Quax, Rick
Lansing, J. Stephen
Sloot, Peter M. A.
author_facet Har-Shemesh, Omri
Quax, Rick
Lansing, J. Stephen
Sloot, Peter M. A.
author_sort Har-Shemesh, Omri
collection PubMed
description The analysis of questionnaires often involves representing the high-dimensional responses in a low-dimensional space (e.g., PCA, MCA, or t-SNE). However, questionnaire data often contains categorical variables and common statistical model assumptions rarely hold. Here we present a non-parametric approach based on Fisher Information which obtains a low-dimensional embedding of a statistical manifold (SM). The SM has deep connections with parametric statistical models and the theory of phase transitions in statistical physics. Firstly, we simulate questionnaire responses based on a non-linear SM and validate our method compared to other methods. Secondly, we apply our method to two empirical datasets containing largely categorical variables: an anthropological survey of rice farmers in Bali and a cohort study on health inequality in Amsterdam. Compared to previous analyses and known anthropological knowledge, we conclude that our method best discriminates between different behaviours, paving the way to dimension reduction as effective as for continuous data.
format Online
Article
Text
id pubmed-7248094
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7248094 2020-06-04 Questionnaire data analysis using information geometry Har-Shemesh, Omri Quax, Rick Lansing, J. Stephen Sloot, Peter M. A. Sci Rep Article The analysis of questionnaires often involves representing the high-dimensional responses in a low-dimensional space (e.g., PCA, MCA, or t-SNE). However, questionnaire data often contains categorical variables and common statistical model assumptions rarely hold. Here we present a non-parametric approach based on Fisher Information which obtains a low-dimensional embedding of a statistical manifold (SM). The SM has deep connections with parametric statistical models and the theory of phase transitions in statistical physics. Firstly, we simulate questionnaire responses based on a non-linear SM and validate our method compared to other methods. Secondly, we apply our method to two empirical datasets containing largely categorical variables: an anthropological survey of rice farmers in Bali and a cohort study on health inequality in Amsterdam. Compared to previous analyses and known anthropological knowledge, we conclude that our method best discriminates between different behaviours, paving the way to dimension reduction as effective as for continuous data. Nature Publishing Group UK 2020-05-25 /pmc/articles/PMC7248094/ /pubmed/32451420 http://dx.doi.org/10.1038/s41598-020-63760-8 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material.
If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Har-Shemesh, Omri
Quax, Rick
Lansing, J. Stephen
Sloot, Peter M. A.
Questionnaire data analysis using information geometry
title Questionnaire data analysis using information geometry
title_full Questionnaire data analysis using information geometry
title_fullStr Questionnaire data analysis using information geometry
title_full_unstemmed Questionnaire data analysis using information geometry
title_short Questionnaire data analysis using information geometry
title_sort questionnaire data analysis using information geometry
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7248094/
https://www.ncbi.nlm.nih.gov/pubmed/32451420
http://dx.doi.org/10.1038/s41598-020-63760-8
work_keys_str_mv AT harshemeshomri questionnairedataanalysisusinginformationgeometry
AT quaxrick questionnairedataanalysisusinginformationgeometry
AT lansingjstephen questionnairedataanalysisusinginformationgeometry
AT slootpeterma questionnairedataanalysisusinginformationgeometry