The transform toolbar includes two operators for finding clusters: Clusters and Clusters (Zahn). Both find clusters of points, using slightly different approaches, and both employ manifold.net graph-theoretic algorithms.

· The Clusters form uses relative neighborhood networks to distinguish clusters based on a combination of network and geometric relationships. Applying this operator with a parameter of 0 results in a relative neighborhood network. Relative neighborhood networks take the overall arrangement of the point set into account, which makes them more informative than ordinary nearest-neighbor methods that consider only local distances.

· The Clusters (Zahn) form uses minimum spanning tree networks to distinguish clusters. It is named in honor of Charles T. Zahn for his work describing the general use of minimum spanning tree graphs in cluster detection. The links created using this operator are a subset of the links created when a minimum spanning tree is formed. Minimum spanning trees incorporate links based on the minimum traversal of a branched tree that reaches all points, so this form may better reveal clusters arising from spatial propagation.
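The two kinds of network named above can be sketched in a few lines of Python. This is only an illustration based on the standard graph-theory definitions, not the code the operators actually run: the function names and the sample points are hypothetical, and the brute-force loops are chosen for readability rather than speed. Two points are linked in a relative neighborhood network unless some third point is closer to both of them than they are to each other; a minimum spanning tree is the shortest set of links that reaches all points.

```python
import math
from itertools import combinations


def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def relative_neighborhood_links(points):
    """Brute-force relative neighborhood network (hypothetical helper).

    Points i and j are linked unless a third point k is strictly
    closer to both i and j than i and j are to each other.
    """
    n = len(points)
    links = set()
    for i, j in combinations(range(n), 2):
        d = dist(points[i], points[j])
        blocked = any(
            max(dist(points[i], points[k]), dist(points[j], points[k])) < d
            for k in range(n)
            if k != i and k != j
        )
        if not blocked:
            links.add((i, j))
    return links


def minimum_spanning_tree_links(points):
    """Prim's algorithm on the complete Euclidean graph (hypothetical helper)."""
    n = len(points)
    if n == 0:
        return set()
    in_tree = {0}
    links = set()
    while len(in_tree) < n:
        # Pick the shortest link that joins a tree point to a non-tree point.
        i, j = min(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda e: dist(points[e[0]], points[e[1]]),
        )
        links.add((min(i, j), max(i, j)))
        in_tree.add(j)
    return links


# Illustrative sample points only; not the data used in this topic.
points = [(0, 0), (2, 0), (0.5, 3), (10, 1), (12, 2), (11, 5)]
rng_links = relative_neighborhood_links(points)
mst_links = minimum_spanning_tree_links(points)
```

With these definitions every minimum spanning tree link is also a relative neighborhood link, so the spanning tree is the sparser of the two networks.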

Clusters are simply collections of points that appear to form groups or to otherwise be related to each other. We may use various mathematical techniques to identify them; however, what counts in most GIS analysis and data mining is the use of cluster-finding methods to identify groups of points for further examination. The essential thing is whether the software reveals patterns that make sense to us once we see them but which otherwise could not be found by eye.

The Clusters operators work by creating links between points that are part of the same cluster. The value given in the source / argument box guides the operation. Smaller parameter values result in larger clusters.
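As a rough illustration of how links define clusters, the sketch below groups points that are joined by any chain of links. This topic does not say how the operators interpret the parameter, so the pruning rule here (drop the longest given percentage of links) is purely an assumption made for the example, and `prune_longest` and `clusters_from_links` are hypothetical names. The rule was chosen only because it reproduces the behavior described above: a parameter of 0 keeps the whole network, and smaller values give larger clusters.

```python
import math


def prune_longest(links, points, percent):
    """ASSUMED rule, not the documented one: drop the longest `percent`% of links."""
    ordered = sorted(links, key=lambda e: math.dist(points[e[0]], points[e[1]]))
    keep = len(ordered) - round(len(ordered) * percent / 100)
    return ordered[:keep]


def clusters_from_links(n_points, links):
    """Group point indices joined by any chain of links (union-find)."""
    parent = list(range(n_points))

    def find(i):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in links:
        parent[find(i)] = find(j)

    groups = {}
    for i in range(n_points):
        groups.setdefault(find(i), []).append(i)
    # Points with no surviving links are left out: they belong to no cluster.
    return [g for g in groups.values() if len(g) > 1]


# Illustrative points on a line, linked in a chain (hypothetical data).
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (30, 0)]
links = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]

one_cluster = clusters_from_links(len(points), prune_longest(links, points, 0))
two_clusters = clusters_from_links(len(points), prune_longest(links, points, 40))
```

Under this assumed rule, a parameter of 0 leaves all six points in a single cluster, while a parameter of 40 removes the two longest links and leaves two smaller clusters plus one unconnected point.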

Suppose we begin with the set of points shown above. Applying the Clusters operator with a parameter of 50 creates lines between the points of each cluster found, as shown below:

Increasing the parameter assigns fewer of the unconnected points to a cluster; decreasing it assigns more.

The (Zahn) form of the operator results in the above clusters when run with a parameter of 50.

See Also

Relative Neighborhood Network - The transform operator that creates relative neighborhood networks. This topic includes a note on characteristics of relative neighborhood networks.

Spanning Tree - The transform operator that creates minimum spanning trees.

Notes

In graph theory, networks are called graphs. When searching the Internet for information on these topics, try terms such as "relative neighborhood", "graph", "spanning tree" and "cluster". These ideas are applied in an astonishing range of disciplines, from fungal spore distribution patterns to the characterization of finds in archeological sites.

The points for the examples above were created by making centroids for provinces in Mexico using the Centroids (Weight) transform operator. The points and the results of the transform operators are shown in a map with a drawing of Mexico as a backdrop. We used the centroids as a source set of points because they are dispersed in a geographically interesting way. The map of Mexico itself carries no meaning here; however, if we were to place one antenna at the geographic center of each province and then wished to link the antennas in a network, we would likely create and study maps such as these.