Balanced clustering

Balanced clustering

Balanced clustering is a special case of clustering, where in the strictest.sense, the cluster sizes are constrained to \lfloor {n\over k}\rfloor or \lceil{n \over k}\rceil, where n is the number of points and k is the number of clusters.[1] This type of balanced clustering is called balance-constrained clustering. Typical algorithm is Balanced k-Means, which minimizes mean square error (MSE). There is also another type of balanced clustering, it is called balance-driven clustering. In it the cost function is two-objective that minimizes both inbalance and MSE. Typical cost functions are Ratio cut[2] and Ncut.[3]

Software

There exists implementations for Balanced k-Means[4] and Ncut[5]

References

  1. M. I. Malinen and P. Fränti (August 2014). "Balanced k-Means for Clustering". Joint Int. Workshop on Structural, Syntactic, and Statistical Pattern Recognition (S+SSPR 2014), LNCS 8621.
  2. L. Hagen and A. B. Kahng (1992). "New spectral methods for ratio cut partitioning and clustering". IEEE Transactions on Computer-Aided Design.
  3. J. Shi and J. Malik (2000). "Normalized cuts and image segmentation". IEEE Transactions on Pattern Analysis and Machine Intelligence.
  4. M. I. Malinen and P. Fränti. "Balanced k-Means implementation". University of Eastern Finland.
  5. T. Cour, S. Yu and J. Shi. "Ncut implementation". University of Pennsylvania.
This article is issued from Wikipedia - version of the Wednesday, March 23, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.