Questionnaire data analysis using information geometry
The analysis of questionnaires often involves representing the high-dimensional responses in a low-dimensional space (e.g., PCA, MCA, or t-SNE). However, questionnaire data often contains categorical variables, and common statistical model assumptions rarely hold. Here we present a non-parametric appr...
Main authors: | Har-Shemesh, Omri; Quax, Rick; Lansing, J. Stephen; Sloot, Peter M. A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7248094/ https://www.ncbi.nlm.nih.gov/pubmed/32451420 http://dx.doi.org/10.1038/s41598-020-63760-8 |
_version_ | 1783538294756212736 |
---|---|
author | Har-Shemesh, Omri Quax, Rick Lansing, J. Stephen Sloot, Peter M. A. |
author_facet | Har-Shemesh, Omri Quax, Rick Lansing, J. Stephen Sloot, Peter M. A. |
author_sort | Har-Shemesh, Omri |
collection | PubMed |
description | The analysis of questionnaires often involves representing the high-dimensional responses in a low-dimensional space (e.g., PCA, MCA, or t-SNE). However, questionnaire data often contains categorical variables, and common statistical model assumptions rarely hold. Here we present a non-parametric approach based on Fisher Information which obtains a low-dimensional embedding of a statistical manifold (SM). The SM has deep connections with parametric statistical models and the theory of phase transitions in statistical physics. First, we simulate questionnaire responses based on a non-linear SM and validate our method against other methods. Second, we apply our method to two empirical datasets containing largely categorical variables: an anthropological survey of rice farmers in Bali and a cohort study on health inequality in Amsterdam. Compared with previous analyses and known anthropological knowledge, we conclude that our method best discriminates between different behaviours, paving the way to dimension reduction as effective as for continuous data. |
format | Online Article Text |
id | pubmed-7248094 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-7248094 2020-06-04 Questionnaire data analysis using information geometry Har-Shemesh, Omri Quax, Rick Lansing, J. Stephen Sloot, Peter M. A. Sci Rep Article The analysis of questionnaires often involves representing the high-dimensional responses in a low-dimensional space (e.g., PCA, MCA, or t-SNE). However, questionnaire data often contains categorical variables, and common statistical model assumptions rarely hold. Here we present a non-parametric approach based on Fisher Information which obtains a low-dimensional embedding of a statistical manifold (SM). The SM has deep connections with parametric statistical models and the theory of phase transitions in statistical physics. First, we simulate questionnaire responses based on a non-linear SM and validate our method against other methods. Second, we apply our method to two empirical datasets containing largely categorical variables: an anthropological survey of rice farmers in Bali and a cohort study on health inequality in Amsterdam. Compared with previous analyses and known anthropological knowledge, we conclude that our method best discriminates between different behaviours, paving the way to dimension reduction as effective as for continuous data. Nature Publishing Group UK 2020-05-25 /pmc/articles/PMC7248094/ /pubmed/32451420 http://dx.doi.org/10.1038/s41598-020-63760-8 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. 
If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Har-Shemesh, Omri Quax, Rick Lansing, J. Stephen Sloot, Peter M. A. Questionnaire data analysis using information geometry |
title | Questionnaire data analysis using information geometry |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7248094/ https://www.ncbi.nlm.nih.gov/pubmed/32451420 http://dx.doi.org/10.1038/s41598-020-63760-8 |