
Robustness of autoencoders for establishing psychometric properties based on small sample sizes: results from a Monte Carlo simulation study and a sports fan curiosity study

BACKGROUND: The principal component analysis (PCA) is known as a multivariate statistical model for reducing dimensions into a representation of principal components. Thus, the PCA is commonly adopted for establishing psychometric properties, i.e., the construct validity. Autoencoder is a neural net...

Bibliographic Details
Main authors: Lin, Yen-Kuang, Lee, Chen-Yin, Chen, Chen-Yueh
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2022
Subjects: Algorithms and Analysis of Algorithms
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9044230/
https://www.ncbi.nlm.nih.gov/pubmed/35494838
http://dx.doi.org/10.7717/peerj-cs.782
author Lin, Yen-Kuang
Lee, Chen-Yin
Chen, Chen-Yueh
collection PubMed
description BACKGROUND: The principal component analysis (PCA) is known as a multivariate statistical model for reducing dimensions into a representation of principal components. Thus, the PCA is commonly adopted for establishing psychometric properties, i.e., the construct validity. Autoencoder is a neural network model, which has also been shown to perform well in dimensionality reduction. Although there are several ways the PCA and autoencoders could be compared for their differences, most of the recent literature focused on differences in image reconstruction, which are often sufficient for training data. In the current study, we looked at details of each autoencoder classifier and how they may provide neural network superiority that can better generalize non-normally distributed small datasets. METHODOLOGY: A Monte Carlo simulation was conducted, varying the levels of non-normality, sample sizes, and levels of communality. The performances of autoencoders and a PCA were compared using the mean square error, mean absolute value, and Euclidean distance. The feasibility of autoencoders with small sample sizes was examined. CONCLUSIONS: With extreme flexibility in decoding representation using linear and non-linear mapping, this study demonstrated that the autoencoder can robustly reduce dimensions, and hence was effective in building the construct validity with a sample size as small as 100. The autoencoders could obtain a smaller mean square error and a smaller Euclidean distance between the original dataset and predictions for a small non-normal dataset. Hence, when behavioral scientists attempt to explore the construct validity of a newly designed questionnaire, an autoencoder could also be considered an alternative to a PCA.
format Online
Article
Text
id pubmed-9044230
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-9044230 2022-04-28 PeerJ Comput Sci Algorithms and Analysis of Algorithms PeerJ Inc. 2022-02-09 /pmc/articles/PMC9044230/ /pubmed/35494838 http://dx.doi.org/10.7717/peerj-cs.782 Text en ©2022 Lin et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title Robustness of autoencoders for establishing psychometric properties based on small sample sizes: results from a Monte Carlo simulation study and a sports fan curiosity study
topic Algorithms and Analysis of Algorithms
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9044230/
https://www.ncbi.nlm.nih.gov/pubmed/35494838
http://dx.doi.org/10.7717/peerj-cs.782