ISSN 2079-3537

Scientific Visualization, 2024, volume 16, number 5, pages 120 - 150, DOI: 10.26583/sv.16.5.09

Stochastic Semantics of Big Data (Parallel Computing and Visualization)

Authors: D. V. Manakov1,A, P. A. Vasev2,A

Krasovsky Institute of Mathematics and Mechanics, Ural Branch of the Russian Academy of Sciences

1 ORCID: 0000-0001-6852-8096, manakov@imm.uran.ru

2 ORCID: 0000-0003-3854-0670, vasev@imm.uran.ru

 

Abstract

The paper primarily considers the problem of verifying, or formalizing, an online visualization and parallel computing system from the point of view of dynamic systems, as a development of computational complexity theory for random processes. Problems involving truly big data inevitably lead to a block approach, which is also used in both information theory and stochastic differential equations. Graph signals were chosen as a natural metaphor: a graph signal is a graph on whose nodes a spectral function is defined; in the examples considered, this is a function of color (RGB), height, or amount of data. In parallel computing, a block can be associated with a computing unit (processor), which leads to the problem of entropy (performance) maximization. In the developed online visualization and concurrent computing system, geometric parallelization can be implemented and compared both as a stationary random process (equiprobable messages, implemented using broadcasting and mixins) and as a steady-state random process (point-to-point messages); the two have different analytical solutions. Taken together, this suggests that the proposed implementation of a stationary process has a certain novelty; in addition, it was intended to be more convenient for automated parallelization. The problems of automatic load balancing (an interpolation problem) and optimal scalability of parallel computing (an extrapolation problem) are also considered. Not much has yet been done in the field of visualization verification; for example, it has been proposed to treat a mesh visualization as a parameterized model of a white-noise random process. This work cannot be considered complete, but the direction that the authors call stochastic semantics appears promising.

The authors intend to examine more closely the established perturbed processes in the field of visualization, including those that take the human factor into account (sketches of the formalization are given in the form of a discussion).

 

Keywords: signal graphs (graph signals), dynamic systems, load balancing, entropy, visualization of a digital surface model.