Itakura–Saito distance

The Itakura–Saito distance (or Itakura–Saito divergence) is a measure of the difference between an original spectrum P(\omega) and an approximation \hat{P}(\omega) of that spectrum. Although it is not a perceptual measure, it is intended to reflect perceptual (dis)similarity. It was proposed by Fumitada Itakura and Shuzo Saito in the 1960s while they were with NTT.[1]

The distance is defined as:[2]

D_{IS}(P(\omega),\hat{P}(\omega))=\frac{1}{2\pi}\int_{-\pi}^{\pi} \left[ \frac{P(\omega)}{\hat{P}(\omega)}-\log \frac{P(\omega)}{\hat{P}(\omega)} - 1 \right] \, d\omega
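In practice the spectra are usually available only as samples, so the integral is approximated by a sum. The following is a minimal sketch, assuming both power spectra are strictly positive and sampled on the same uniform frequency grid over [-\pi, \pi); the function name `itakura_saito` and the use of NumPy are illustrative choices, not part of the original definition.

```python
import numpy as np

def itakura_saito(p, p_hat):
    """Approximate D_IS(P, P_hat) from sampled power spectra.

    Assumes p and p_hat are strictly positive and sampled on the same
    uniform grid over [-pi, pi), so that the mean over bins approximates
    (1 / 2pi) times the integral over omega.
    """
    p = np.asarray(p, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    if p.shape != p_hat.shape:
        raise ValueError("spectra must have the same shape")
    if np.any(p <= 0) or np.any(p_hat <= 0):
        raise ValueError("spectra must be strictly positive")
    ratio = p / p_hat
    # Integrand of the definition: P/P_hat - log(P/P_hat) - 1
    return float(np.mean(ratio - np.log(ratio) - 1.0))

# Example: a flat spectrum compared with a flat spectrum of twice the power.
p = np.array([1.0, 1.0, 1.0, 1.0])
p_hat = np.array([2.0, 2.0, 2.0, 2.0])
print(itakura_saito(p, p_hat))  # distance from P to its approximation
print(itakura_saito(p_hat, p))  # arguments swapped: a different value
```

Swapping the arguments changes the result, which is one quick way to see that the measure is not symmetric.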

The Itakura–Saito distance is a Bregman divergence, but it is not a true metric, since it is not symmetric[3] and does not satisfy the triangle inequality.
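The Bregman form can be checked pointwise. The sketch below assumes the generating function \varphi(x) = -\log x on the positive reals (a choice not spelled out in the text above) and uses p and q as shorthand for P(\omega) and \hat{P}(\omega) at a fixed frequency.

```latex
\begin{aligned}
\varphi(x) &= -\log x, \qquad \varphi'(x) = -\frac{1}{x} \\
d_\varphi(p, q) &= \varphi(p) - \varphi(q) - \varphi'(q)\,(p - q) \\
                &= -\log p + \log q + \frac{p - q}{q} \\
                &= \frac{p}{q} - \log\frac{p}{q} - 1
\end{aligned}
```

This is the integrand of the definition, so averaging d_\varphi over \omega recovers D_{IS}. The asymmetry is visible already for scalars: with p = 1 and q = 2 the expression gives 0.5 - \log 0.5 - 1 \approx 0.193, whereas swapping the arguments gives 2 - \log 2 - 1 \approx 0.307.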

In non-negative matrix factorization, the Itakura–Saito divergence can be used as a measure of the quality of the factorization: this choice corresponds to a meaningful statistical model of the components, and the resulting minimization problem can be solved with an iterative method.[4]
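As an illustration of such an iterative method, the sketch below uses the commonly cited multiplicative update rules for the Itakura–Saito cost. It is a minimal example under stated assumptions, not the reference algorithm of [4]: the names `is_nmf` and `is_divergence`, the random initialization, the fixed iteration count, and the small constant `eps` are all illustrative choices, and V is assumed to be a strictly positive matrix such as a power spectrogram.

```python
import numpy as np

def is_divergence(V, V_hat):
    """Itakura-Saito divergence summed over all matrix entries."""
    ratio = V / V_hat
    return float(np.sum(ratio - np.log(ratio) - 1.0))

def is_nmf(V, rank, n_iter=100, seed=0, eps=1e-12):
    """Factor V ~= W @ H with multiplicative updates for the IS cost.

    V is assumed strictly positive. eps is an assumed safeguard that
    keeps the reconstruction away from zero; it is not part of the model.
    """
    V = np.asarray(V, dtype=float)
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    history = []
    for _ in range(n_iter):
        # Update H with W held fixed.
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat))
        # Update W with H held fixed, using the refreshed reconstruction.
        V_hat = W @ H + eps
        W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T)
        history.append(is_divergence(V, W @ H + eps))
    return W, H, history

# Example on synthetic positive data (a stand-in for a power spectrogram).
rng = np.random.default_rng(1)
V = rng.random((20, 30)) + 0.1
W, H, history = is_nmf(V, rank=3, n_iter=50)
print(history[0], history[-1])  # cost after the first and last iteration
```

The updates only rescale W and H by non-negative factors, so both stay non-negative throughout. Tracking `history` is a simple way to check that the cost is going down on a given data set.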

References

  1. Itakura, F., & Saito, S. (1968). Analysis synthesis telephony based on the maximum likelihood method. In Proceedings of the 6th International Congress on Acoustics (pp. C-17–C-20). Los Alamitos, CA: IEEE.
  2. Alan H. S. Chan, Sio-Iong Ao (2008). Advances in industrial engineering and operations research. Springer. p. 51. ISBN 978-0-387-74903-7.
  3. A. Banerjee; et al. (2004). "Clustering with Bregman Divergences". In Michael W. Berry, Umeshwar Dayal, Chandrika Kamath, and David Skillicorn. Proceedings of the Fourth SIAM International Conference on Data Mining. SIAM. pp. 234–245. ISBN 978-0-89871-568-2.
  4. Cédric Févotte, Nancy Bertin, and Jean-Louis Durrieu (2009). "Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis". Neural Computation 21 (3): 793–830. doi:10.1162/neco.2008.04-08-771. PMID 18785855.
This article is issued from Wikipedia - version of the Monday, May 02, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.