Agglomerative Fast Super-Paramagnetic Clustering. (arXiv:1908.00951v1 [q-fin.CP])

Sun, 04 Aug 2019 23:01:39 GMT

We consider the problem of fast time-series data clustering. Building on
previous work modeling the correlation-based Hamiltonian of spin variables, we
present a fast, inexpensive agglomerative algorithm. The method is tested on
synthetic correlated time-series and noisy synthetic data-sets with built-in
cluster structure to demonstrate that the algorithm produces meaningful,
non-trivial results. We argue that ASPC can reduce compute time and
resource usage for large-scale clustering while running serially, so it
has no obvious parallelization requirement. The algorithm can be an effective
choice for state detection for online learning in a fast, non-linear data
environment because the algorithm requires no prior information about the
number of clusters.
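
The idea of agglomerating correlated time-series without fixing the number of
clusters in advance can be illustrated with a minimal sketch. This is not the
authors' ASPC implementation. As a stand-in objective it uses the
Giada-Marsili maximum-likelihood cluster score; whether that score matches the
Hamiltonian referred to above is an assumption made here. The function names
(make_clustered_series, log_likelihood_term, greedy_agglomerate), the
one-factor data model, and all parameter values are hypothetical and chosen
only for illustration.

```python
import numpy as np


def make_clustered_series(n_clusters=3, per_cluster=5, n_obs=500,
                          strength=0.9, seed=0):
    """Synthetic series with built-in clusters (assumed one-factor model)."""
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(n_clusters), per_cluster)
    factors = rng.standard_normal((n_clusters, n_obs))
    noise = rng.standard_normal((labels.size, n_obs))
    # Each series mixes its cluster's common factor with idiosyncratic noise.
    series = np.sqrt(strength) * factors[labels] + np.sqrt(1 - strength) * noise
    return series, labels


def log_likelihood_term(n_s, c_s):
    """Per-cluster score: n_s members, c_s = sum of intra-cluster correlations."""
    if n_s <= 1 or c_s <= n_s:
        return 0.0  # singletons and uncorrelated groups contribute nothing
    c_s = min(c_s, n_s ** 2 - 1e-12)  # guard against log of zero
    return 0.5 * (np.log(n_s / c_s)
                  + (n_s - 1) * np.log((n_s ** 2 - n_s) / (n_s ** 2 - c_s)))


def greedy_agglomerate(corr):
    """Merge the best-scoring pair until no merge improves the total score."""
    clusters = [[i] for i in range(corr.shape[0])]

    def score(members):
        return log_likelihood_term(len(members),
                                   corr[np.ix_(members, members)].sum())

    while len(clusters) > 1:
        best_gain, best_pair = 0.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                gain = (score(clusters[a] + clusters[b])
                        - score(clusters[a]) - score(clusters[b]))
                if gain > best_gain:
                    best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            break  # stopping rule: the number of clusters is never specified
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


series, labels = make_clustered_series()
clusters = greedy_agglomerate(np.corrcoef(series))
print(len(clusters), [sorted(c) for c in clusters])
```

The loop stops on its own once no merge raises the score, which is the sense
in which no prior cluster count is needed. The naive pairwise search here
rescans all pairs at every step and is meant only for small examples; it makes
no claim about the speed of the method described above.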