ISSN 2079-3537

Scientific Visualization
Issue Year: 2016
Quarter: 3
Volume: 8
Number: 3
Pages: 1 - 24
Article Name: VISUAL ANALYSIS OF CLUSTERS FOR A MULTIDIMENSIONAL TEXTUAL DATASET
Authors: A.E. Bondarev (Russian Federation), A.V. Bondarenko (Russian Federation), V.A. Galaktionov (Russian Federation), E.S. Klyshinsky (Russian Federation)
Address: A.E. Bondarev
bond@keldysh.ru
Keldysh Institute of Applied Mathematics RAS, Moscow, Russian Federation

A.V. Bondarenko
GOSNIIAS, Moscow, Russian Federation

V.A. Galaktionov
vlgal@gin.keldysh.ru
Keldysh Institute of Applied Mathematics RAS, Moscow, Russian Federation

E.S. Klyshinsky
klyshinsky@mail.ru
Keldysh Institute of Applied Mathematics RAS, Moscow, Russian Federation
National Research University Higher School of Economics, Moscow, Russian Federation
Abstract: The paper considers the visual analysis of clusters in multidimensional textual datasets. To analyze clusters in the original data volume, elastic maps are used to map the original data points onto embedded manifolds of lower dimensionality. By decreasing the elasticity parameters, one can construct a map surface that approximates the multidimensional textual dataset more closely. The elastic map approach requires no a priori information about the data and does not depend on the nature or origin of the data. The probabilistic algorithm t-SNE (t-distributed stochastic neighbor embedding) has similar properties and is conceptually close to elastic maps. The paper describes the results of applying both approaches (elastic maps and t-SNE) to the visual analysis of clusters in multidimensional textual datasets. For elastic maps, a technique called «Quasi-Zoom» is proposed, which improves the results of cluster analysis in regions where data points are concentrated. The presented results illustrate the efficiency and applicability of both approaches to cluster analysis of natural language terms.
Language: Russian

