## Jarvis-Patrick Clustering

Brian T. Luke, Ph.D.

Jarvis-Patrick Clustering [1] groups objects according to their shared nearest neighbors. To run this type of clustering, you need a measure of the distance between two objects and two integers, J and K. J is the size of the neighbor list, and K is the required number of common neighbors. The method is as follows.

- Determine the J nearest neighbors of each object in the set to be clustered.
- Two objects are placed in the same cluster if
  - they are contained in each other's neighbor list, and
  - they have at least K nearest neighbors in common.
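The two steps above can be sketched directly in code. The following Python is an illustrative implementation, not the original author's: the function name, the Euclidean distance measure, and the union-find bookkeeping used to merge pairs into clusters are all choices made here.

```python
import math


def jarvis_patrick(points, J, K):
    """Sketch of Jarvis-Patrick clustering.

    points : list of coordinate tuples
    J      : size of each object's nearest-neighbor list
    K      : required number of shared neighbors
    Returns a list assigning a cluster id to each point.
    """
    n = len(points)

    # Step 1: the J nearest neighbors of each object (self excluded).
    neighbors = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        neighbors.append(set(others[:J]))

    # Union-find structure; merging makes cluster membership transitive.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Step 2: a single pass over all pairs of objects.
    for i in range(n):
        for j in range(i + 1, n):
            mutual = j in neighbors[i] and i in neighbors[j]
            shared = len(neighbors[i] & neighbors[j])
            if mutual and shared >= K:
                parent[find(i)] = find(j)

    # Relabel the union-find roots as consecutive cluster ids.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]
```

For example, two well-separated groups of three points each come out as two clusters with `J=3, K=1`.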
Several points need to be made about this procedure. The first is that
it is not hierarchical: only a single pass through all pairs of objects
needs to be made. This produces clusters similar to those
produced by a Simple Linkage Agglomerative
Clustering, only it is much faster. In other words, if objects
A and B satisfy both criteria, and objects B and C satisfy both
criteria, then A, B, and C are all placed in the same cluster, even if
A and C do not satisfy the criteria themselves.
The number of clusters produced is dependent upon the input values of
J and K. Therefore, if a large number of objects needs to be clustered,
several values may have to be tried before a reasonable number of
clusters is formed and the number of singletons (or outliers) is
acceptable.

Daylight Systems suggests several methods to reduce the number of
singletons. They also found that removing the first criterion "usually
results in improved clustering," though with a significant increase in
CPU cost. This method has the advantage of identifying tight clusters
within normal clusters by using two K-values, K and K' (K < K'). Two
objects lie in the same cluster if the number of common neighbors is at
least K, while they lie in the same tight cluster if it is at least K'.
References:

1. R. A. Jarvis and E. A. Patrick, "Clustering Using a Similarity Measure Based on Shared Near Neighbors," IEEE Transactions on Computers C-22 (1973) 1025-1034.