# Sync: Oscillatory based Clustering Algorithm

Kuramoto model is able to ensure global synchronization in oscillatory networks. Global synchronization is a state of oscillatory network where each oscillator has the same phase coordinate and this state is not interested from practical point of view, because nothing is encoded by oscillators (see previous note ‘Oscillatory networks based on Kuramoto model #1 – Introduction‘). Thus, Kuramoto model should be modified to ensure partial synchronization when there are two or more groups of oscillators where each oscillator within group has the same phase coordinate. In other words each group encodes objects that are similar to each other that and such set of objects is considered as a cluster.

There are different clustering algorithms that are based on modified Kuramoto model: Sync, hSync, Sync-SOM and so on. Let’s start from the first one and consider Sync algorithm in this note.

### Algorithm Sync

Sync clustering algorithm is based on oscillatory network that uses modified Kuramoto model. Each oscillator corresponds to only one object from input data and thus amount of oscillators is equal to amount of objects. Two oscillators are connected if Euclidean distance between two corresponding objects is less than threshold value. Output dynamic of each oscillator is described by following differential equation:

$\dot{\theta}_{i}= \dfrac{K}{N_{i}} \displaystyle\sum_{j \in N_{i}} sin({\theta}_{j} - {\theta}_{i})$

This formula is simpler than original Kuramoto model: intrinsic frequency is omitted because it affects phase coordinate as a permanent bias. ${N_{i}}$ is a amount of neighbors of oscillator i. And sum is for neighbors of oscillator i as well. K is a constant value that denotes coupling strength between oscillators. Phase coordinate is main variable of each oscillator.

When should simulation process be stopped? Or in other words, when can we obtain results of cluster analysis? Estimation of local synchronization is used for that purpose, when it close to 1.0 then clustering is close to the end and output dynamic of the oscillatory network can be used for extracting clusters. The estimation is defined by following formula:

$r_{c} = \left| \displaystyle\sum_{i = 0}^{N} \dfrac{1}{N_{i}} \displaystyle\sum_{j \in N_{i}} e^{{\theta}_{j} - {\theta}_{i}} \right |$

Ability to estimate end of the synchronization process is distinctive feature of oscillatory networks based on Kuramoto model. Such estimation helps to simulate network only required number of steps until desire state of system is not reached.

Implementation of the Sync algorithm can be found in pyclustering library. Using pyclustering let’s perform cluster analysis of sample LSUN:

from pyclustering.cluster.syncnet import syncnet, syncnet_visualizer;

from pyclustering.samples.definitions import FCPS_SAMPLES;
from pyclustering.utils import draw_clusters;

# Read sample 'Lsun' from file.

# Create oscillatory network based algorithm Sync, using connectivity
# radius 0.45 and ccore calculatation to make process faster.
network = syncnet(sample, 0.45, ccore = True);

# Start simulation process until rc = 0.998 (default value, can be changed).
analyser = network.process();

# Show output dynamic of the network.
syncnet_visualizer.show_output_dynamic(analyser);

# Allocate clusters and display them
clusters = analyser.allocate_clusters();
draw_clusters(sample, clusters);


Here, connectivity radius is 0.45 – as it has been mentioned this parameter is similar to connectivity radius in DSCAN and OPTICS. Two oscillators are connected if Euclidean distance between their objects is less than 0.45. Estimation of local synchronization is considered as a stop condition for simulation – time point when clustering process is finished. In our example local synchronization is 0.998 by default.

Let’s consider output dynamic of the oscillatory network – figure 1.

Oscillators that have the same phase coordinate encode cluster and therefore it is easy to extract clusters. Using pyclustering library and method draw_clusters() from package pyclustering.utils:

clusters = analyser.allocate_clusters();
draw_clusters(sample, clusters);


Or using instance of the class cluster_visualizer from package pyclustering.cluster:

clusters = analyser.allocate_clusters();
visualizer = cluster_visualizer();
visualizer.append_clusters(clusters, sample);
visualizer.show();


Points that have the same color belong to the one cluster and as it can be observed on figure 2 – three clusters are allocated by the Sync algorithm. Class cluster_visualizer or function draw_clusters can visualize only 1-, 2-, 3-dimensional data, but anyway clusters can be printed to file or to console, clustering result is represented by list of clusters, where each cluster is represented by list of object indexes from input data.

Animation of clustering process of sample ‘Hepta’ by Sync algorithm is shown in following movie:

1. Clustering results of Sync are similar to DBSCAN algorithm when amount of neighbors is equal to 1. For example, this algorithm is able to process data with elongated clusters with homogenous density.

2. Output dynamic can be analysed to extract true number of clusters in some cases (this idea has been extended in hSync algorithm that will be considered in the next note ‘hSync: Oscillatory based Clustering Algorithm‘) when connectivity radius is slightly bigger. Illustration of the cluster allocation by output dynamic analysis is shown on figure 3.

3. Differential equation that used to calculate phase coordinate of each oscillator can be simplified and as a result increase performance (pyclustering uses simplified equation by default):

${\theta}_{i}[k + 1] = {\theta}_{i}[k] +\dfrac{K}{N_{i}} \displaystyle\sum_{j \in N_{i}} sin({\theta}_{j}[k] - {\theta}_{i}[k])$

2. Complexity tends to $O(n^3)$ in case of elongated clusters such as LSUN. Execution time of processing of each sample from FCPS collection is much slower in compare to traditional algorithms (DBSCAN, OPTICS).