
Data Mining Techniques in Analyzing Process Data: A Didactic

Due to increasing use of technology-enhanced educational assessment, data mining methods have been explored to analyse process data in log files from such assessment. However, most studies were limited to one data mining technique under one specific scenario. The current study demonstrates the usage...


Bibliographic Details
Main Authors: Qiao, Xin, Jiao, Hong
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6265513/
https://www.ncbi.nlm.nih.gov/pubmed/30532716
http://dx.doi.org/10.3389/fpsyg.2018.02231
_version_ 1783375651831545856
author Qiao, Xin
Jiao, Hong
author_facet Qiao, Xin
Jiao, Hong
author_sort Qiao, Xin
collection PubMed
description Due to increasing use of technology-enhanced educational assessment, data mining methods have been explored to analyse process data in log files from such assessment. However, most studies were limited to one data mining technique under one specific scenario. The current study demonstrates the usage of four frequently used supervised techniques, Classification and Regression Trees (CART), gradient boosting, random forest, and support vector machine (SVM), and two unsupervised methods, Self-organizing Map (SOM) and k-means, fitted to a single assessment dataset. The USA sample (N = 426) from the 2012 Program for International Student Assessment (PISA) responding to problem-solving items is extracted to demonstrate the methods. After concrete feature generation and feature selection, classifier development procedures are implemented using the illustrated techniques. Results show satisfactory classification accuracy for all the techniques. Suggestions for the selection of classifiers are presented based on the research questions, the interpretability and the simplicity of the classifiers. Interpretations for the results from both supervised and unsupervised learning methods are provided.
format Online
Article
Text
id pubmed-6265513
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-62655132018-12-07 Data Mining Techniques in Analyzing Process Data: A Didactic Qiao, Xin Jiao, Hong Front Psychol Psychology Due to increasing use of technology-enhanced educational assessment, data mining methods have been explored to analyse process data in log files from such assessment. However, most studies were limited to one data mining technique under one specific scenario. The current study demonstrates the usage of four frequently used supervised techniques, including Classification and Regression Trees (CART), gradient boosting, random forest, support vector machine (SVM), and two unsupervised methods, Self-organizing Map (SOM) and k-means, fitted to one assessment data. The USA sample (N = 426) from the 2012 Program for International Student Assessment (PISA) responding to problem-solving items is extracted to demonstrate the methods. After concrete feature generation and feature selection, classifier development procedures are implemented using the illustrated techniques. Results show satisfactory classification accuracy for all the techniques. Suggestions for the selection of classifiers are presented based on the research questions, the interpretability and the simplicity of the classifiers. Interpretations for the results from both supervised and unsupervised learning methods are provided. Frontiers Media S.A. 2018-11-23 /pmc/articles/PMC6265513/ /pubmed/30532716 http://dx.doi.org/10.3389/fpsyg.2018.02231 Text en Copyright © 2018 Qiao and Jiao. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Psychology
Qiao, Xin
Jiao, Hong
Data Mining Techniques in Analyzing Process Data: A Didactic
title Data Mining Techniques in Analyzing Process Data: A Didactic
title_full Data Mining Techniques in Analyzing Process Data: A Didactic
title_fullStr Data Mining Techniques in Analyzing Process Data: A Didactic
title_full_unstemmed Data Mining Techniques in Analyzing Process Data: A Didactic
title_short Data Mining Techniques in Analyzing Process Data: A Didactic
title_sort data mining techniques in analyzing process data: a didactic
topic Psychology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6265513/
https://www.ncbi.nlm.nih.gov/pubmed/30532716
http://dx.doi.org/10.3389/fpsyg.2018.02231
work_keys_str_mv AT qiaoxin dataminingtechniquesinanalyzingprocessdataadidactic
AT jiaohong dataminingtechniquesinanalyzingprocessdataadidactic