Structural similarity

The structural similarity (SSIM) index is a method for predicting the perceived quality of digital television and cinematic pictures, as well as other kinds of digital images and videos. It was first developed in the Laboratory for Image and Video Engineering (LIVE) at The University of Texas at Austin and, subsequently, in collaboration with New York University.

SSIM measures the similarity between two images. It is a full-reference metric: image quality is measured or predicted relative to an initial uncompressed or distortion-free image used as the reference. SSIM is designed to improve on traditional methods such as peak signal-to-noise ratio (PSNR) and mean squared error (MSE), which have proven to be inconsistent with human visual perception.

History

The first version of SSIM, called the Universal Quality Index (UQI) or Wang-Bovik index, was developed by Zhou Wang and Al Bovik in 2001. Together with Hamid Sheikh and Eero Simoncelli, they modified it into the current version of SSIM (many variations now exist), which was described in the paper "Image quality assessment: From error visibility to structural similarity", published in the IEEE Transactions on Image Processing in April 2004.[1]

The 2004 SSIM paper has been cited more than 10,000 times according to Google Scholar, making it one of the most highly cited papers in the image processing and video engineering fields. It received the IEEE Signal Processing Society Best Paper Award for 2009.[2] The inventors of SSIM each received an individual Primetime Engineering Emmy Award in 2015.

Structural similarity

Techniques such as MSE and PSNR estimate absolute errors. SSIM, by contrast, is a perception-based model that treats image degradation as a perceived change in structural information, while also incorporating important perceptual phenomena, including luminance masking and contrast masking terms. Structural information is the idea that pixels have strong inter-dependencies, especially when they are spatially close. These dependencies carry important information about the structure of the objects in the visual scene. Luminance masking is a phenomenon whereby image distortions (in this context) tend to be less visible in bright regions, while contrast masking is a phenomenon whereby distortions become less visible where there is significant activity or "texture" in the image.

Algorithm

The SSIM index is calculated on various windows of an image. The measure between two windows x and y of common size N×N is:

\hbox{SSIM}(x,y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}

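with:

\mu_x the average of x;
\mu_y the average of y;
\sigma_x^2 the variance of x;
\sigma_y^2 the variance of y;
\sigma_{xy} the covariance of x and y;
c_1 = (k_1 L)^2 and c_2 = (k_2 L)^2, two variables that stabilize the division when the denominator is weak;
L the dynamic range of the pixel values (typically 2^{\#bits per pixel} - 1);
k_1 = 0.01 and k_2 = 0.03 by default.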

To evaluate image quality, this formula is usually applied only to luma, although it may also be applied to color (e.g., RGB) values or chromatic (e.g., YCbCr) values. The resulting SSIM index is a decimal value between -1 and 1; the value 1 is reachable only when the two sets of data are identical. It is typically calculated on windows of size 8×8. The window can be displaced pixel by pixel across the image, but the authors propose using only a subset of the possible windows to reduce the complexity of the calculation.
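
The formula above can be sketched in plain Python as follows. This is a minimal illustration rather than a reference implementation. It assumes 8-bit pixel values (dynamic range 255), the commonly used default constants k1 = 0.01 and k2 = 0.03, and unweighted sample statistics inside each window. The function names `ssim_window` and `mean_ssim` and the `step` parameter are hypothetical, chosen only for this example.

```python
def ssim_window(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """SSIM between two equally sized windows given as flat sequences of pixel values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("windows must have the same size and at least two pixels")
    # Stabilizing constants (assumed defaults, see lead-in).
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Unweighted sample variance and covariance (an assumption of this sketch).
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def mean_ssim(img_a, img_b, window=8, step=8):
    """Average ssim_window over window x window blocks taken every `step` pixels.

    Images are 2-D lists (rows of pixel values) of identical shape.
    """
    height, width = len(img_a), len(img_a[0])
    scores = []
    for top in range(0, height - window + 1, step):
        for left in range(0, width - window + 1, step):
            rows = range(top, top + window)
            cols = range(left, left + window)
            win_a = [img_a[r][c] for r in rows for c in cols]
            win_b = [img_b[r][c] for r in rows for c in cols]
            scores.append(ssim_window(win_a, win_b))
    if not scores:
        raise ValueError("image is smaller than the window")
    return sum(scores) / len(scores)
```

With `step=1` the window is displaced pixel by pixel; a larger `step` evaluates only a subset of the possible windows, trading coverage for less computation.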

Variants

Multi-Scale SSIM

A more advanced form of SSIM, called Multiscale SSIM,[3] is computed over multiple scales through multiple stages of sub-sampling, reminiscent of multiscale processing in the early vision system. Both SSIM and Multiscale SSIM correlate very highly with human judgments, as measured on widely used public image quality databases, including the LIVE Image Quality Database and the TID Database. Most competitive image quality models are some form or variation of the SSIM concept.
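
The idea of repeated sub-sampling can be illustrated with the simplified sketch below. It is not the published Multiscale SSIM: it only reports one SSIM score per scale and omits the per-scale weighting that the published method uses to combine them. It also assumes whole-image statistics instead of local windows, 2×2 block averaging as the sub-sampling step, and 8-bit pixel values. All function names are hypothetical.

```python
def global_ssim(a, b, dynamic_range=255.0, k1=0.01, k2=0.03):
    """SSIM computed from whole-image statistics of two equally sized 2-D lists."""
    x = [v for row in a for v in row]
    y = [v for row in b for v in row]
    n = len(x)
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((v - mu_x) ** 2 for v in x) / n
    var_y = sum((v - mu_y) ** 2 for v in y) / n
    cov_xy = sum((p - mu_x) * (q - mu_y) for p, q in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def downsample2(img):
    """Halve both dimensions by averaging non-overlapping 2x2 blocks."""
    if len(img) < 2 or len(img[0]) < 2:
        raise ValueError("image is too small to sub-sample further")
    height = len(img) // 2 * 2
    width = len(img[0]) // 2 * 2
    return [
        [(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
         for c in range(0, width, 2)]
        for r in range(0, height, 2)
    ]


def ssim_per_scale(a, b, scales=3):
    """Return one SSIM score per scale, from the finest scale to the coarsest."""
    scores = []
    for level in range(scales):
        scores.append(global_ssim(a, b))
        if level < scales - 1:
            a, b = downsample2(a), downsample2(b)
    return scores
```

A distortion confined to fine detail, such as a one-pixel checkerboard pattern, lowers the score at the finest scale but averages out after sub-sampling, which is why scores at several scales carry more information than a single-scale score.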

Structural Dissimilarity

Structural dissimilarity (DSSIM) is a distance metric derived from SSIM (though the triangle inequality is not necessarily satisfied).

\hbox{DSSIM}(x,y) = \frac{1 - \hbox{SSIM}(x, y)}{2}
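
As a small worked illustration of this mapping, the sketch below converts an SSIM score to DSSIM. The function name `dssim` is hypothetical, and the input is assumed to be an SSIM value already computed elsewhere.

```python
def dssim(ssim_value):
    """Map an SSIM score in [-1, 1] to a dissimilarity in [0, 1]."""
    return (1.0 - ssim_value) / 2.0
```

Identical images (SSIM of 1) map to a DSSIM of 0, and the lowest possible SSIM of -1 maps to a DSSIM of 1.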

Video quality metrics

The original version of SSIM was designed to measure the quality of still images. It does not contain any parameters directly related to temporal effects of human perception and human judgment. However, several temporal variants of SSIM have been developed.

A simple application of SSIM to estimate video quality would be to calculate the average SSIM value over all frames in the video sequence.
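
The frame-averaging approach can be sketched as below. The per-frame scorer is passed in as a callable, so any SSIM implementation can be plugged in. The names `mean_video_ssim` and `frame_ssim` are hypothetical, and the sketch assumes the two sequences are already aligned frame by frame.

```python
def mean_video_ssim(reference_frames, distorted_frames, frame_ssim):
    """Average a per-frame SSIM score over two aligned frame sequences.

    `frame_ssim` is any callable taking (reference_frame, distorted_frame)
    and returning that frame pair's SSIM value.
    """
    if len(reference_frames) != len(distorted_frames):
        raise ValueError("sequences must contain the same number of frames")
    if not reference_frames:
        raise ValueError("at least one frame is required")
    scores = [
        frame_ssim(ref, dist)
        for ref, dist in zip(reference_frames, distorted_frames)
    ]
    return sum(scores) / len(scores)
```

Because every frame contributes equally, this average carries no temporal information: a brief but severe drop in quality is weighted the same as a mild, sustained one.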

Application

Owing to its excellent performance and extremely low computation cost, SSIM has become very widely used in the broadcast, cable and satellite television industries, where it is a dominant method of measuring video quality in broadcast and post-production houses.

SSIM is the basis for a number of video quality measurement tools used globally, including those marketed by Video Clarity, National Instruments, Rohde & Schwarz, and SSIMWave. Overall, SSIM and its variants, such as Multiscale SSIM, are among the most widely used full-reference perceptual image and video quality models in the world.

Discussions over performance

A paper by Dosselmann and Yang suggests that SSIM is not as precise as is claimed.[4]

They dispute the perceptual basis of SSIM, suggesting that its formula does not contain any elaborate visual perception modelling and that SSIM possibly relies on non-perceptual computations. For example, the human visual system does not compute a product between the mean values of the two images.

However, as shown in the original 2004 paper, the SSIM model and algorithm include accurate models of key elements of visual distortion perception, including luminance masking and contrast masking mechanisms. Many variations of SSIM exist that incorporate temporal mechanisms, although the original SSIM model did not. Because of its high performance, SSIM is widely used throughout the cinematic and global broadcast, cable, and satellite television industries. SSIM is undoubtedly the most widely tested and validated picture quality model in existence.

References

  1. Wang, Zhou; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. (2004-04-01). "Image quality assessment: from error visibility to structural similarity". IEEE Transactions on Image Processing 13 (4): 600–612. doi:10.1109/TIP.2003.819861. ISSN 1057-7149.
  2. "IEEE Signal Processing Society, Best Paper Award" (PDF).
  3. Wang, Z.; Simoncelli, E.P.; Bovik, A.C. (2003-11-01). "Multiscale structural similarity for image quality assessment". Conference Record of the Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, 2004 2: 1398–1402. doi:10.1109/ACSSC.2003.1292216.
  4. Dosselmann, Richard; Yang, Xue Dong (2009-11-06). "A comprehensive assessment of the structural similarity index". Signal, Image and Video Processing 5 (1): 81–91. doi:10.1007/s11760-009-0144-1. ISSN 1863-1703.

This article is issued from Wikipedia (version of Friday, May 06, 2016). The text is available under the Creative Commons Attribution/Share Alike license, but additional terms may apply for the media files.