Variable Clustering
This tool clusters the network's variables into groups of semantically close variables. It is available from the Analysis menu or through its shortcut.
The clusters are built according to the proximity of the nodes in the graph, based on the force of the arcs. A color is automatically assigned to each cluster to highlight the clustering.
- The number of clusters is automatically computed by using the Arc Force metric.
- The associated toolbar contains a slider that lets you choose the desired number of clusters, e.g., 4 in the previous example.
- A dedicated button displays a hierarchical representation of the current clustering as a dendrogram. You can change the number of clusters at any time and see the result immediately in the dendrogram.
- A contextual menu lets you display the comment associated with each node instead of its name.
- You can also copy the graph as an image.
- The length of the links joining the clusters is inversely proportional to the strength of the relationships between the two sets of variables: the shorter the link, the stronger the relationship.
If the cursor is moved over the junction point of the links in the Dendrogram, a tooltip displays the Arc Force.
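The relationship between the number of clusters, the dendrogram, and the Arc Force shown at each junction can be illustrated with a small sketch. It is not the tool's actual algorithm: it assumes a simple agglomerative procedure in which the two clusters joined by the strongest arc are merged first. The variable names, the arc force values, and the functions `link_strength` and `agglomerate` are hypothetical.

```python
from itertools import combinations

# Hypothetical arc forces between pairs of variables (illustrative values only).
ARC_FORCE = {
    frozenset({"Age", "Income"}): 0.42,
    frozenset({"Income", "Spending"}): 0.35,
    frozenset({"Smoker", "Cancer"}): 0.51,
    frozenset({"Cancer", "Dyspnea"}): 0.28,
    frozenset({"Age", "Smoker"}): 0.05,
}


def link_strength(a, b, arc_force):
    """Strongest arc force between any node of cluster a and any node of cluster b."""
    return max(
        (arc_force.get(frozenset({x, y}), 0.0) for x in a for y in b),
        default=0.0,
    )


def agglomerate(nodes, arc_force, n_clusters):
    """Merge the two most strongly linked clusters until n_clusters remain.

    Returns the final clusters and the merge history; each merge plays the
    role of a junction point in the dendrogram (assumed procedure).
    """
    clusters = [frozenset({n}) for n in nodes]
    merges = []
    while len(clusters) > n_clusters:
        a, b = max(
            combinations(clusters, 2),
            key=lambda pair: link_strength(pair[0], pair[1], arc_force),
        )
        merges.append((sorted(a), sorted(b), link_strength(a, b, arc_force)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters, merges


nodes = sorted({n for pair in ARC_FORCE for n in pair})
clusters, merges = agglomerate(nodes, ARC_FORCE, n_clusters=2)
for left, right, force in merges:
    print(f"merge {left} + {right} at arc force {force:.2f}")
print(sorted(sorted(c) for c in clusters))
```

In this sketch the strongest relationships are merged first, so the earliest junctions carry the highest arc force, which mirrors the rule that a shorter link means a stronger relationship. Changing `n_clusters` corresponds to moving the slider: the merge history stays the same, and only the point at which merging stops changes.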
Validating the current clustering associates each set of variables with a class named [Factor_i]. When the validation button is pressed, an HTML report of the current clustering is displayed.
A separate button ends the variable clustering.
Once the classes are created and associated with the clusters, you can perform Multiple Clustering, which generates, for each class named [Factor_i], a synthetic variable from the nodes belonging to this class.
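As a rough illustration of the class assignment, the sketch below maps each variable of a validated clustering to a class name of the form [Factor_i]. The function `assign_factor_classes`, the example variables, and the choice to start numbering at 0 are assumptions made for illustration, not the tool's documented behavior.

```python
def assign_factor_classes(clusters):
    """Map each variable to a class name of the form [Factor_i].

    The index i is the position of the cluster in the list; starting
    at 0 is an assumption of this sketch.
    """
    classes = {}
    for i, cluster in enumerate(clusters):
        for variable in sorted(cluster):
            classes[variable] = f"[Factor_{i}]"
    return classes


# Hypothetical validated clustering with two clusters.
example_clusters = [{"Age", "Income", "Spending"}, {"Cancer", "Dyspnea", "Smoker"}]
classes = assign_factor_classes(example_clusters)
for variable, class_name in sorted(classes.items()):
    print(f"{variable}: {class_name}")
```

Grouping the variables by class name then gives, for each [Factor_i], the set of nodes from which Multiple Clustering would build one synthetic variable.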
The number of clusters automatically determined by the algorithm can be limited by an option in the settings. You can also modify the stop threshold, which corresponds to the maximum KL weight a cluster can reach.