Searching for clusters of entities
A cluster is a group of entities that have more connections to each other than to entities outside the group. Finding clusters is useful in charts that contain high volumes of data with many interlinked entities, where it can reveal potential groupings such as the key players in a criminal investigation.
About this task
As part of the search process, Analyst's Notebook measures how interconnected each group of entities is. This measure of interconnectivity is known as the binding strength.
The method that Analyst's Notebook uses to calculate binding strength is based on link connectivity. In its simplest form, this method considers the binding strength of a group of entities to be the number of links that must be deleted to split the group into two distinct chart fragments. For example, the binding strength of the chart in the following diagram is 1. To split it into two, you need to delete only the single link between Entity C and Entity E.
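The link-connectivity idea can be illustrated with a short sketch. It is not the product's implementation: the function name `binding_strength` and the chart layout are assumptions made for illustration (two fully interlinked groups joined by a single link between C and E; the links in the actual diagram may differ). The function tries every way of dividing a group into two parts and counts the links that cross the divide, which is only practical for small groups.

```python
from itertools import combinations


def binding_strength(entities, links):
    """Smallest number of links that must be deleted to split the group in two.

    Brute-force illustration: checks every two-way split of the group.
    """
    entities = list(entities)
    members = set(entities)
    # Only links with both ends inside the group count towards its strength.
    inside = [(a, b) for a, b in links if a in members and b in members]
    first, rest = entities[0], entities[1:]
    best = None
    # Keep `first` on side A so each split is considered once, and never
    # move every entity to side A (side B must not be empty).
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            side_a = {first, *extra}
            crossing = sum(1 for a, b in inside if (a in side_a) != (b in side_a))
            if best is None or crossing < best:
                best = crossing
    return best


# Hypothetical chart: every pair inside each group is linked once,
# and the two groups are joined only by the link between C and E.
group_1 = "ABCD"
group_2 = "EFGHI"
links = (
    list(combinations(group_1, 2))
    + list(combinations(group_2, 2))
    + [("C", "E")]
)
entities = list(group_1 + group_2)

print(binding_strength(entities, links))  # 1: deleting the C-E link splits the chart
```

In this assumed layout, applying the same function to each group on its own gives 3 for entities A to D and 4 for entities E to I.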
Splitting the chart identifies two chart fragments. To identify clusters, Analyst's Notebook continues to split the groups of entities until it can no longer divide the chart any further. It assigns the highest binding strength that it can to each item during the process. Any connected group of entities that has a binding strength greater than its immediate neighbors is considered to be a cluster. In this chart, Analyst's Notebook finds two clusters, one with a binding strength of three (entities A, B, C, and D) and the other with a binding strength of four (entities E, F, G, H, and I).
You can filter clusters by specifying a binding strength threshold. Any cluster with a binding strength below the threshold is not reported. In the example, if the binding strength threshold is set to three, both clusters are reported. If the threshold is raised to four, Analyst's Notebook reports only the cluster with a binding strength of four.
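The effect of the threshold can be sketched in a few lines. The data structure and the function name `report_clusters` are assumptions for illustration only. The two clusters and their binding strengths are the ones from the example above.

```python
def report_clusters(clusters, threshold):
    """Keep only clusters whose binding strength is not below the threshold."""
    return [c for c in clusters if c["binding_strength"] >= threshold]


# The two clusters from the example, in an assumed dictionary layout.
clusters = [
    {"members": ["A", "B", "C", "D"], "binding_strength": 3},
    {"members": ["E", "F", "G", "H", "I"], "binding_strength": 4},
]

print(len(report_clusters(clusters, 3)))  # 2: both clusters are reported
print(len(report_clusters(clusters, 4)))  # 1: only the cluster with strength 4
```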
Every group of connected entities has a binding strength; the higher the binding strength, the more interconnected the entities. Link connectivity clusters cannot overlap, so entities cannot belong to more than one cluster.
- Click the Analyze tab.
- To modify or review the search criteria, in the Find Networks group, click . The Setup Clusters window opens.
Enter a binding strength threshold.
The binding strength threshold determines the minimum binding strength a group of entities must have before Analyst's Notebook considers it to be a cluster.
When you enter a binding strength threshold, you can choose how Analyst's Notebook determines the binding strength when there are multiple links between entities or values on the links. This setting is known as the link weight.
Link Weight Setting: Description
- Connections Only: Treats each connection as a single link regardless of how many links there are between two entities. This setting is the most common option.
- Connection Links: Counts all links between entities. For example, by using the Connection Links setting, this group of entities has a binding strength of six because six links must be deleted to split the chart in two.
- Link Attribute: Adds the values of a specified Number attribute on each link in a connection. Select the attribute in the Link Weight Attribute list.
- Connection Sum Links: Calculates the total value of all the numeric parts of the link labels between two entities.
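To make the four link weight settings concrete, the following sketch computes the weight of one connection (all the links between one pair of entities) under each setting. It is an illustration under stated assumptions, not the product's implementation: the function name `connection_weight`, the dictionary layout of a link, the attribute name `Amount`, and the way numbers are extracted from labels (unsigned decimal numbers matched by a regular expression) are all hypothetical.

```python
import re

# Assumption: a "numeric part" of a label is an unsigned decimal number.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def connection_weight(links, setting, attribute=None):
    """Weight contributed by all the links between one pair of entities."""
    if not links:
        return 0  # no links means there is no connection to weigh
    if setting == "Connections Only":
        return 1  # one connection, however many links it contains
    if setting == "Connection Links":
        return len(links)  # every link counts
    if setting == "Link Attribute":
        # Add up the chosen Number attribute; links without it contribute 0.
        return sum(link["attributes"].get(attribute, 0) for link in links)
    if setting == "Connection Sum Links":
        # Add up every number found in the link labels.
        return sum(float(n) for link in links for n in NUMBER.findall(link["label"]))
    raise ValueError(f"Unknown link weight setting: {setting}")


# Two hypothetical links between the same pair of entities.
links = [
    {"label": "3 calls", "attributes": {"Amount": 250}},
    {"label": "2 calls", "attributes": {"Amount": 100}},
]

print(connection_weight(links, "Connections Only"))          # 1
print(connection_weight(links, "Connection Links"))          # 2
print(connection_weight(links, "Link Attribute", "Amount"))  # 350
print(connection_weight(links, "Connection Sum Links"))      # 5.0
```

With a setting other than Connections Only, a pair of entities joined by several links or by high values is harder to separate, so the groups that contain it receive a higher binding strength.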
In the Cluster Members area, specify how Analyst's Notebook handles any clusters that it finds.
Setting: Description
- Select Cluster Members: Selects entities and the links between them that are found in a cluster. The selection of all other chart items is cleared.
- Hide Others: Hides all entities and links that do not belong to a cluster. Note: If Reveal Hidden mode is selected when you search for clusters, then the hidden items remain visible.
- Cluster Attribute: Assigns an attribute to every cluster member. You can select an existing Text attribute from the list or enter a new attribute name, in which case Analyst's Notebook creates a new Text attribute with that name. Analyst's Notebook assigns attribute values to the cluster members:
  - C1 for members of the cluster that has the highest binding strength.
  - C2 for members of the cluster that has the next highest binding strength, and so on.
  If two clusters have the same binding strength, the cluster with the most entities takes priority when the cluster attribute is assigned.
- Binding Strength Attribute: Assigns an attribute to every chart item to indicate the binding strength that it would be assigned if it were part of a cluster. You can select an existing Number attribute from the list or enter a new attribute name, in which case Analyst's Notebook creates a new Number attribute with that name.
- To search for clusters, in the Find Networks group, click .
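The numbering rule for the Cluster Attribute setting in the procedure (C1 for the highest binding strength, ties resolved in favor of the cluster with the most entities) can be sketched as follows. The function name `assign_cluster_labels` and the dictionary layout are assumptions for illustration. The sketch labels entities only, and it does not define what happens when two clusters tie on both binding strength and size, because the rule above does not cover that case.

```python
def assign_cluster_labels(clusters):
    """Map each cluster member to C1, C2, ... in priority order."""
    # Highest binding strength first; for equal strength, most entities first.
    ranked = sorted(
        clusters,
        key=lambda c: (-c["binding_strength"], -len(c["members"])),
    )
    labels = {}
    for rank, cluster in enumerate(ranked, start=1):
        for entity in cluster["members"]:
            labels[entity] = f"C{rank}"
    return labels


# Hypothetical search result: two clusters share a binding strength of 3.
found = [
    {"members": ["A", "B", "C", "D"], "binding_strength": 3},
    {"members": ["E", "F", "G", "H", "I"], "binding_strength": 4},
    {"members": ["J", "K", "L", "M", "N", "O"], "binding_strength": 3},
]

labels = assign_cluster_labels(found)
print(labels["E"])  # C1: highest binding strength
print(labels["J"])  # C2: same strength as A-D, but more entities
print(labels["A"])  # C3
```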
What to do next
To emphasize any clusters that are found, you can change the color or line width of the links in each cluster. To format the links in a cluster, select all the links in the cluster, right-click one of the links, then click Combined Properties. In the Edit Chart Items window, you can format the links collectively.