Assessment of Model Accuracy in Eyes Open and Closed EEG Data: Effect of Data Pre-Processing and Validation Methods

Bibliographic Details
Main authors: Mattiev, Jamolbek; Sajovic, Jakob; Drevenšek, Gorazd; Rogelj, Peter
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9854523/
https://www.ncbi.nlm.nih.gov/pubmed/36671614
http://dx.doi.org/10.3390/bioengineering10010042
Description
Summary: Eyes-open and eyes-closed data are often used to validate novel methods for classifying human brain activity. Models trained on minimally preprocessed data are frequently evaluated by cross-validation, even though electroencephalography (EEG) data contain components arising from muscle activity and environmental noise, which affect classification accuracy. Moreover, because large datasets are scarce, the EEG data of a single subject are often divided into smaller parts. Cross-validation is the most frequently used method for model validation, even though its results may be affected by overfitting to the specifics of the brain activity of a limited number of subjects. To test the effects of preprocessing and classifier validation on classification accuracy, we evaluated fourteen classification algorithms implemented in WEKA and MATLAB on comprehensively and simply preprocessed EEG data. Hold-out validation and cross-validation were used to compare the classification accuracy of eyes-open and eyes-closed data. Data from 50 subjects were used, each with four minutes of eyes-closed and four minutes of eyes-open recordings. In cross-validation testing, the algorithms trained on simply preprocessed data outperformed those trained on comprehensively preprocessed data. The reverse was true for hold-out accuracy. Hold-out accuracy increased significantly when the data of different subjects were not strictly separated between the training and test datasets, indicating the presence of overfitting. The results show that comprehensive preprocessing can be advantageous for subject-invariant classification, whereas higher subject-specific accuracy can be attained with simple preprocessing. Researchers should therefore state the final intended use of their classifier.
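
The distinction the summary draws between strictly separating subjects and mixing them across the training and test sets can be illustrated with a minimal Python sketch. It is not the authors' WEKA or MATLAB pipeline. The function names, the 20% test fraction, and the layout of 8 segments per subject are assumptions made for illustration only; just the figure of 50 subjects comes from the summary.

```python
import random


def subject_wise_holdout(subject_ids, test_fraction=0.2, seed=0):
    """Split segment indices so that no subject appears in both sets."""
    subjects = sorted(set(subject_ids))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train_idx = [i for i, s in enumerate(subject_ids) if s not in test_subjects]
    test_idx = [i for i, s in enumerate(subject_ids) if s in test_subjects]
    return train_idx, test_idx


def segment_wise_holdout(subject_ids, test_fraction=0.2, seed=0):
    """Split segment indices at random, ignoring which subject they came from."""
    indices = list(range(len(subject_ids)))
    rng = random.Random(seed)
    rng.shuffle(indices)
    n_test = max(1, round(len(indices) * test_fraction))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def shared_subjects(subject_ids, train_idx, test_idx):
    """Return the subjects that contribute segments to both sets."""
    train_subjects = {subject_ids[i] for i in train_idx}
    test_subjects = {subject_ids[i] for i in test_idx}
    return train_subjects & test_subjects


# Hypothetical layout: 50 subjects, 8 EEG segments per subject (assumed).
subject_ids = [subject for subject in range(50) for _ in range(8)]

strict_train, strict_test = subject_wise_holdout(subject_ids)
mixed_train, mixed_test = segment_wise_holdout(subject_ids)

print("subjects shared, subject-wise split:",
      len(shared_subjects(subject_ids, strict_train, strict_test)))
print("subjects shared, segment-wise split:",
      len(shared_subjects(subject_ids, mixed_train, mixed_test)))
```

With the subject-wise split, the shared-subject count is zero by construction, so test accuracy reflects performance on unseen people. With the segment-wise split, most test subjects also contribute training segments, which is the situation in which the summary reports inflated hold-out accuracy.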