Cargando…

pyDRMetrics - A Python toolkit for dimensionality reduction quality assessment

High-dimensional data are pervasive in this bigdata era. To avoid the curse of the dimensionality problem, various dimensionality reduction (DR) algorithms have been proposed. To facilitate systematic DR quality comparison and assessment, this paper reviews related metrics and develops an open-sourc...

Descripción completa

Detalles Bibliográficos
Autores principales: Zhang, Yinsheng, Shang, Qian, Zhang, Guoming
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Elsevier 2021
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7887408/
https://www.ncbi.nlm.nih.gov/pubmed/33644472
http://dx.doi.org/10.1016/j.heliyon.2021.e06199
Descripción
Sumario:High-dimensional data are pervasive in this bigdata era. To avoid the curse of the dimensionality problem, various dimensionality reduction (DR) algorithms have been proposed. To facilitate systematic DR quality comparison and assessment, this paper reviews related metrics and develops an open-source Python package pyDRMetrics. Supported metrics include reconstruction error, distance matrix, residual variance, ranking matrix, co-ranking matrix, trustworthiness, continuity, co-k-nearest neighbor size, LCMC (local continuity meta criterion), and rank-based local/global properties. pyDRMetrics provides a native Python class and a web-oriented API. A case study of mass spectra is conducted to demonstrate the package functions. A web GUI wrapper is also published to support user-friendly B/S applications.