
Unsupervised random forest for affinity estimation

Bibliographic Details
Main authors: Yi, Yunai; Sun, Diya; Li, Peixin; Kim, Tae-Kyun; Xu, Tianmin; Pei, Yuru
Format: Online Article Text
Language: English
Published: Tsinghua University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8645415/
https://www.ncbi.nlm.nih.gov/pubmed/34900375
http://dx.doi.org/10.1007/s41095-021-0241-9
Description
Summary: This paper presents an unsupervised clustering random-forest-based metric for affinity estimation in large and high-dimensional data. The criterion used for node splitting during forest construction can handle rank-deficiency when measuring cluster compactness. The binary forest-based metric is extended to continuous metrics by exploiting both the common traversal path and the smallest shared parent node. The proposed forest-based metric estimates affinity efficiently by passing data pairs down the forest, using a limited number of decision trees. A pseudo-leaf-splitting (PLS) algorithm is introduced to account for spatial relationships; it regularizes the affinity measures and overcomes inconsistent leaf assignments. The random-forest-based metric with PLS facilitates the establishment of consistent, point-wise correspondences. The proposed method has been applied to automatic phrase recognition using color and depth videos and to point-wise correspondence. Extensive experiments demonstrate the effectiveness of the proposed method in affinity estimation compared with state-of-the-art methods.
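
The difference between a binary and a continuous forest-based affinity, as described in the summary, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the summary does not give the split criterion or the exact metric formulas, so the sketch assumes random-feature median splits in place of the paper's compactness criterion, and it uses the shared root-to-leaf prefix (the depth of the deepest common ancestor) as one plausible continuous score. All function names are hypothetical, and the PLS step is not shown.

```python
import random


def build_tree(points, indices, depth, max_depth, rng):
    """Build one unsupervised tree as nested dicts; None marks a leaf.

    ASSUMPTION: each node splits a randomly chosen feature at its median.
    This stands in for the paper's split criterion, which the summary
    does not specify.
    """
    if depth >= max_depth or len(indices) <= 1:
        return None
    dim = rng.randrange(len(points[0]))
    values = sorted(points[i][dim] for i in indices)
    threshold = values[len(values) // 2]
    left = [i for i in indices if points[i][dim] < threshold]
    right = [i for i in indices if points[i][dim] >= threshold]
    if not left or not right:
        return None  # degenerate split: stop here
    return {
        "dim": dim,
        "thr": threshold,
        "left": build_tree(points, left, depth + 1, max_depth, rng),
        "right": build_tree(points, right, depth + 1, max_depth, rng),
    }


def build_forest(points, n_trees, max_depth, seed=0):
    """Build a small forest of independently randomized trees."""
    rng = random.Random(seed)
    all_indices = list(range(len(points)))
    return [build_tree(points, all_indices, 0, max_depth, rng)
            for _ in range(n_trees)]


def path(tree, x):
    """Return the branch decisions ('L'/'R') taken by x from root to leaf."""
    steps = []
    node = tree
    while node is not None:
        go_left = x[node["dim"]] < node["thr"]
        steps.append("L" if go_left else "R")
        node = node["left"] if go_left else node["right"]
    return steps


def binary_affinity(forest, a, b):
    """Fraction of trees in which a and b reach the same leaf."""
    return sum(path(t, a) == path(t, b) for t in forest) / len(forest)


def path_affinity(forest, a, b):
    """Continuous score: shared path prefix relative to the longer path.

    ASSUMPTION: an illustrative formula, not the one used in the paper.
    """
    total = 0.0
    for tree in forest:
        pa, pb = path(tree, a), path(tree, b)
        shared = 0
        for step_a, step_b in zip(pa, pb):
            if step_a != step_b:
                break
            shared += 1
        longest = max(len(pa), len(pb))
        total += shared / longest if longest else 1.0
    return total / len(forest)
```

In this sketch, two points that separate only near the leaves still receive a high continuous score even though their binary affinity for that tree is zero, which is the motivation for extending the binary metric.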