Image scaling

In computer graphics, image scaling is the process of resizing a digital image. Scaling is a non-trivial process that involves a trade-off between efficiency, smoothness and sharpness. With bitmap graphics, as the size of an image is reduced or enlarged, the pixels that form the image become increasingly visible, making the image appear "soft" if pixels are averaged, or jagged if not. With vector graphics the trade-off may be in processing power for re-rendering the image, which may be noticeable as slow re-rendering with still graphics, or slower frame rate and frame skipping in computer animation.

Apart from fitting a smaller display area, image size is most commonly decreased (or subsampled or downsampled) in order to produce thumbnails. Enlarging an image (upsampling or interpolating) is generally done to make smaller imagery fit a bigger screen. In “zooming” a bitmap image, it is not possible to discover any more information in the image than already exists, and image quality inevitably suffers. However, there are several methods of increasing the number of pixels that an image contains, which evens out the appearance of the original pixels.

Scaling methods

The size of an image can be changed in several ways. Consider quadrupling the size of the following photographic thumbnail image (40×40 pixels), and doubling the size of the following text-based image:

Reference 160x160 pixel thumbnail

Nearest-neighbor interpolation

One of the simpler ways of increasing the size is nearest-neighbor interpolation, replacing every pixel with a number of pixels of the same color:

The resulting image is larger than the original, and preserves all the original detail, but has (possibly undesirable) jaggedness. The diagonal lines of the W, for example, now show the "stairway" shape characteristic of nearest-neighbor interpolation.
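
The pixel replication described above can be sketched in a few lines. This is a minimal illustration, assuming the image is a 2D list of pixel values and the scale factor is an integer; the function name `scale_nearest` is hypothetical.

```python
def scale_nearest(image, factor):
    """Enlarge a 2D list of pixel values by an integer factor.

    Every source pixel becomes a factor x factor block of the same value,
    so no new colors are introduced.
    """
    out = []
    for row in image:
        # Repeat each pixel horizontally...
        wide = [pixel for pixel in row for _ in range(factor)]
        # ...then repeat the widened row vertically.
        out.extend(list(wide) for _ in range(factor))
    return out


enlarged = scale_nearest([[1, 2], [3, 4]], 2)
```

Because each output pixel is a copy of exactly one input pixel, edges stay sharp, which is also why diagonal lines show the staircase pattern.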

Other scaling methods below are better at preserving smooth contours in the image:

Bilinear interpolation

For example, bilinear interpolation produces the following results:

Linear (or bilinear, in two dimensions) interpolation is typically good for changing the size of an image, but causes some undesirable softening of details and can still be somewhat jagged.
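
As a rough sketch of how bilinear interpolation computes each output pixel, the following assumes a grayscale image stored as a 2D list and a coordinate mapping that aligns the corner pixels of input and output (other mappings, such as aligning pixel centers, are equally common). The name `scale_bilinear` is hypothetical.

```python
def scale_bilinear(image, out_h, out_w):
    """Resize a 2D grayscale list to out_h x out_w with bilinear weights."""
    in_h, in_w = len(image), len(image[0])
    out = []
    for y in range(out_h):
        # Fractional source row for this output row (corners aligned).
        sy = y * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, in_h - 1)
        fy = sy - y0
        row = []
        for x in range(out_w):
            sx = x * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, in_w - 1)
            fx = sx - x0
            # Interpolate horizontally on the two source rows, then vertically.
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


smooth = scale_bilinear([[0, 10], [20, 30]], 3, 3)
```

Each output value is a weighted average of up to four source pixels, which explains both the smooth gradients and the softening of sharp details.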

Bicubic interpolation

Better scaling methods include bicubic interpolation and Lanczos resampling.

Fourier-based interpolation

Simple Fourier-based interpolation pads the frequency domain with zero components (a smooth window-based approach would reduce the ringing). Besides the good conservation (even recovery) of details, the method shows notable ringing and a circular bleeding of content from the left border to the right border (and vice versa).
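
Zero-padding of the frequency domain can be sketched with NumPy's FFT routines. This is an illustrative sketch, assuming a 2D grayscale array and an integer scale factor; the function name `fourier_upscale` is hypothetical and no window is applied, so ringing is to be expected on real images.

```python
import numpy as np


def fourier_upscale(image, factor):
    """Enlarge a 2D grayscale array by zero-padding its centered spectrum."""
    h, w = image.shape
    big_h, big_w = h * factor, w * factor
    # Move the zero frequency to the center of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    padded = np.zeros((big_h, big_w), dtype=complex)
    # Place the original spectrum so that its zero frequency stays centered.
    top, left = big_h // 2 - h // 2, big_w // 2 - w // 2
    padded[top:top + h, left:left + w] = spectrum
    # Undo the shift, transform back, and rescale to keep the intensity range.
    return np.fft.ifft2(np.fft.ifftshift(padded)).real * factor ** 2


image = np.arange(16, dtype=float).reshape(4, 4)
enlarged = fourier_upscale(image, 2)
```

Because the discrete Fourier transform treats the image as periodic, content near one border influences the opposite border, which is the circular bleeding mentioned above.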

Edge-directed interpolation algorithms

Edge-directed interpolation algorithms aim to preserve edges in the image after scaling, unlike other algorithms which can produce staircase artifacts around diagonal lines or curves.

Examples of algorithms for this task include New Edge-Directed Interpolation (NEDI),[1][2] Edge-Guided Image Interpolation (EGII),[3] Iterative Curvature-Based Interpolation (ICBI),[4] and Directional Cubic Convolution Interpolation (DCCI).[5] An article from 2013 compared the four algorithms above, and found that DCCI had the best scores in PSNR and SSIM on a series of test images.[6]

hqx

For magnifying computer graphics with low resolution and/or few colors (usually from 2 to 256 colors), better results will be achieved by hqx or other pixel art scaling algorithms. These produce sharp edges and maintain a high level of detail.

Supersampling

For scaling photos (and raster images with many colors), see also anti-aliasing algorithms called supersampling.

Vectorization

An entirely different approach is vector extraction or vectorization. Vectorization first creates a resolution-independent vector representation of the graphic to be scaled. Then the resolution-independent version is rendered as a raster image at the desired resolution. This technique is used by Adobe Illustrator Live Trace and Inkscape, and is described in several recent papers.[7] Scalable Vector Graphics are well suited to simple geometric images, while photographs do not fare well with vectorization due to their complexity.

SFG conversion

Another approach is scalable function graphic conversion. As with vectorization, a conversion process creates a resolution independent representation of the graphic to be scaled. The conversion requires a large amount of processing time, but the resulting function is capable of scaling complex images such as photographs.[8]

Mipmap

A unique problem occurs with downscaling. A scaling algorithm that relies on sampling a specific number of pixels would sample non-adjacent pixels when downscaling below a certain threshold, which can break the sampling and produce an unsmooth result. This can be avoided by using box sampling or a mipmap which contains many already geometrically downscaled copies. Any simple scaling algorithm can then be used on one of the prescaled copies and give an accurate result.
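
Box sampling and a mipmap chain can be sketched as follows. This is a minimal illustration, assuming a grayscale image stored as a 2D list whose dimensions are divisible by the factor; the names `downscale_box` and `build_mipmaps` are hypothetical.

```python
def downscale_box(image, factor):
    """Shrink a 2D list by an integer factor, averaging each factor x factor block."""
    out_h = len(image) // factor
    out_w = len(image[0]) // factor
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            # Every source pixel in the block contributes, so none is skipped.
            block = [image[y * factor + dy][x * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def build_mipmaps(image):
    """Return the image followed by successively halved copies."""
    levels = [image]
    while len(levels[-1]) > 1 and len(levels[-1][0]) > 1:
        levels.append(downscale_box(levels[-1], 2))
    return levels


levels = build_mipmaps([[0, 0, 4, 4], [0, 0, 4, 4], [8, 8, 12, 12], [8, 8, 12, 12]])
```

A renderer would pick the level closest to the target size and apply a simple interpolation to it, so that the final step never has to skip over source pixels.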

Algorithms

Two standard scaling algorithms are bilinear and bicubic interpolation. Filters like these work by interpolating pixel color values, introducing a continuous transition into the output even where the original material has discrete transitions. Although this is desirable for continuous-tone images, some algorithms reduce contrast (sharp edges) in a way that may be undesirable for line art.

Nearest-neighbor interpolation preserves these sharp edges, but it increases aliasing (or jaggies, where diagonal lines and curves appear pixelated). Several approaches have been developed that attempt to optimize for bitmap art by interpolating areas of continuous tone, preserving the sharpness of horizontal and vertical lines, and smoothing all other curves.

Pixel art scaling algorithms

As pixel art graphics are usually in very low resolutions, they rely on careful placing of individual pixels, often with a limited palette of colors. This results in graphics that rely on a high amount of stylized visual cues to define complex shapes with very little resolution, down to individual pixels.

A number of specialized algorithms[9] have been developed to handle pixel art graphics, as the traditional scaling algorithms do not take such perceptual cues into account.

Efficiency

Since a typical application of this technology is improving the appearance of fourth-generation and earlier video games on arcade and console emulators, many are designed to run in real time for sufficiently small input images at 60 frames per second.

Many work only on specific scale factors: 2× is the most common, with 3×, 4×, 5× and 6× also present.

EPX/Scale2×/AdvMAME2×

Eric's Pixel Expansion (EPX) is an algorithm developed by Eric Johnston at LucasArts around 1992,[10] when porting the SCUMM engine games from the IBM PC (which ran at 320×200×256 colors) to the early color Macintosh computers, which ran at more or less double that resolution.[11] The algorithm works as follows:

  A    --\ 1 2
C P B  --/ 3 4
  D
 1=P; 2=P; 3=P; 4=P;
 IF C==A => 1=A
 IF A==B => 2=B
 IF B==D => 4=D
 IF D==C => 3=C
 IF of A, B, C, D, three or more are identical: 1=2=3=4=P

Later implementations of this same algorithm (as AdvMAME2× and Scale2×, developed around 2001) have a slightly more efficient but functionally identical implementation:

  A    --\ 1 2
C P B  --/ 3 4
  D
 1=P; 2=P; 3=P; 4=P;
 IF C==A AND C!=D AND A!=B => 1=A
 IF A==B AND A!=C AND B!=D => 2=B
 IF B==D AND B!=A AND D!=C => 4=D
 IF D==C AND D!=B AND C!=A => 3=C
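
The Scale2× rules above translate fairly directly into code. The sketch below assumes the image is a 2D list of comparable color values and that pixels outside the image are treated as copies of the nearest edge pixel (the border handling is an assumption; the rules themselves do not specify it). The function name `scale2x` is hypothetical.

```python
def scale2x(image):
    """Double a 2D list of color values using the Scale2x/AdvMAME2x rules."""
    rows, cols = len(image), len(image[0])

    def px(y, x):
        # Assumption: clamp coordinates so borders reuse the nearest edge pixel.
        return image[min(max(y, 0), rows - 1)][min(max(x, 0), cols - 1)]

    out = [[None] * (2 * cols) for _ in range(2 * rows)]
    for y in range(rows):
        for x in range(cols):
            p = image[y][x]
            a, b = px(y - 1, x), px(y, x + 1)   # above, right
            c, d = px(y, x - 1), px(y + 1, x)   # left, below
            out[2 * y][2 * x] = a if c == a and c != d and a != b else p
            out[2 * y][2 * x + 1] = b if a == b and a != c and b != d else p
            out[2 * y + 1][2 * x] = c if d == c and d != b and c != a else p
            out[2 * y + 1][2 * x + 1] = d if b == d and b != a and d != c else p
    return out


doubled = scale2x([[0, 1, 1],
                   [0, 0, 1],
                   [0, 0, 0]])
```

In this example the center pixel lies on a diagonal edge, so only its top-right output pixel takes the neighboring color, which rounds off the staircase.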

The AdvMAME4×/Scale4× algorithm is just EPX applied twice to get 4× resolution.

Scale3×/AdvMAME3× and ScaleFX

The AdvMAME3×/Scale3× algorithm can be thought of as a generalization of EPX to the 3× case. The corner pixels are calculated identically to EPX.

A B C --\  1 2 3
D E F    > 4 5 6
G H I --/  7 8 9
 1=E; 2=E; 3=E; 4=E; 5=E; 6=E; 7=E; 8=E; 9=E;
 IF D==B AND D!=H AND B!=F => 1=D
 IF (D==B AND D!=H AND B!=F AND E!=C) OR (B==F AND B!=D AND F!=H AND E!=A) => 2=B
 IF B==F AND B!=D AND F!=H => 3=F
 IF (H==D AND H!=F AND D!=B AND E!=A) OR (D==B AND D!=H AND B!=F AND E!=G) => 4=D
 5=E
 IF (B==F AND B!=D AND F!=H AND E!=I) OR (F==H AND F!=B AND H!=D AND E!=C) => 6=F
 IF H==D AND H!=F AND D!=B => 7=D
 IF (F==H AND F!=B AND H!=D AND E!=G) OR (H==D AND H!=F AND D!=B AND E!=I) => 8=H
 IF F==H AND F!=B AND H!=D => 9=F
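
The Scale3× rules share four corner conditions, which the following sketch factors out. It assumes a 2D list of comparable color values and clamps coordinates at the image border (an assumption not stated in the rules); the function name `scale3x` is hypothetical.

```python
def scale3x(image):
    """Triple a 2D list of color values using the Scale3x/AdvMAME3x rules."""
    rows, cols = len(image), len(image[0])

    def px(y, x):
        # Assumption: clamp coordinates so borders reuse the nearest edge pixel.
        return image[min(max(y, 0), rows - 1)][min(max(x, 0), cols - 1)]

    out = [[None] * (3 * cols) for _ in range(3 * rows)]
    for y in range(rows):
        for x in range(cols):
            a, b, c = px(y - 1, x - 1), px(y - 1, x), px(y - 1, x + 1)
            d, e, f = px(y, x - 1), px(y, x), px(y, x + 1)
            g, h, i = px(y + 1, x - 1), px(y + 1, x), px(y + 1, x + 1)
            # The four corner conditions, identical to those of Scale2x.
            tl = d == b and d != h and b != f
            tr = b == f and b != d and f != h
            bl = h == d and h != f and d != b
            br = f == h and f != b and h != d
            block = [
                [d if tl else e,
                 b if (tl and e != c) or (tr and e != a) else e,
                 f if tr else e],
                [d if (bl and e != a) or (tl and e != g) else e,
                 e,
                 f if (tr and e != i) or (br and e != c) else e],
                [d if bl else e,
                 h if (br and e != g) or (bl and e != i) else e,
                 f if br else e],
            ]
            for dy in range(3):
                out[3 * y + dy][3 * x:3 * x + 3] = block[dy]
    return out


tripled = scale3x([[0, 1, 1],
                   [0, 0, 1],
                   [0, 0, 0]])
```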

There is also a variant improved over Scale3× called ScaleFX, developed by Sp00kyFox, and a version combined with Reverse-AA called ScaleFX-Hybrid.[12][13]

Eagle

Eagle works as follows: for every input pixel, four output pixels are generated. First, all four are set to the color of the input pixel currently being scaled (as in nearest-neighbor). Next, the three pixels above, to the left, and diagonally above-left of the input pixel are examined; if all three are the same color as each other, the top-left output pixel is set to that color. The same is done for the other three output pixels, and then the algorithm moves to the next input pixel.[14]

Assume an input matrix of 3×3 pixels where the centermost pixel is the pixel to be scaled, and an output matrix of 2×2 pixels (i.e., the scaled pixel).

first:        |Then
. . . --\ CC  |S T U  --\ 1 2
. C . --/ CC  |V C W  --/ 3 4
. . .         |X Y Z
              | IF V==S==T => 1=S
              | IF T==U==W => 2=U
              | IF V==X==Y => 3=X
              | IF W==Z==Y => 4=Z
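
A compact sketch of the Eagle rules follows, assuming a 2D list of comparable color values and clamped coordinates at the border (the border handling is an assumption). The function name `eagle2x` is hypothetical. The example input is a single dark pixel on a light background, the case discussed in the next paragraph.

```python
def eagle2x(image):
    """Double a 2D list of color values using the Eagle rules."""
    rows, cols = len(image), len(image[0])

    def px(y, x):
        # Assumption: clamp coordinates so borders reuse the nearest edge pixel.
        return image[min(max(y, 0), rows - 1)][min(max(x, 0), cols - 1)]

    out = [[None] * (2 * cols) for _ in range(2 * rows)]
    for y in range(rows):
        for x in range(cols):
            s, t, u = px(y - 1, x - 1), px(y - 1, x), px(y - 1, x + 1)
            v, c, w = px(y, x - 1), px(y, x), px(y, x + 1)
            x_, y_, z = px(y + 1, x - 1), px(y + 1, x), px(y + 1, x + 1)
            # Each output pixel takes a corner color only if the three
            # neighbors around that corner all agree; otherwise it keeps c.
            out[2 * y][2 * x] = s if v == s == t else c
            out[2 * y][2 * x + 1] = u if t == u == w else c
            out[2 * y + 1][2 * x] = x_ if v == x_ == y_ else c
            out[2 * y + 1][2 * x + 1] = z if w == z == y_ else c
    return out


# 1 = dark pixel, 0 = light background.
result = eagle2x([[0, 0, 0],
                  [0, 1, 0],
                  [0, 0, 0]])
```

With this input every corner condition of the center pixel holds for the background color, so no output pixel keeps the value 1.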

Thus a single black pixel on a white background will vanish. This is a bug in the Eagle algorithm, but it is solved by its successors such as 2×SaI and hq3x.

2×SaI

2×SaI, short for 2× Scale and Interpolation engine, was inspired by Eagle. It was designed by Derek Liauw Kie Fa, also known as Kreed, primarily for use in console and computer emulators, and it has remained fairly popular in this niche. Many of the most popular emulators, including ZSNES and VisualBoyAdvance, offer this scaling algorithm as a feature.

Since Kreed released[15] the source code under the GNU General Public License, it is freely available to anyone wishing to utilize it in a project released under that license. Developers wishing to use it in a non-GPL project would be required to rewrite the algorithm without using any of Kreed's existing code.

Super 2×SaI and Super Eagle

Several slightly different versions of the scaling algorithm are available, and these are often referred to as Super 2×SaI and Super Eagle. Both were also written by Kreed. Super Eagle is similar to the 2×SaI engine, but does more blending. Super 2×SaI is a filter that smooths the graphics, and it blends more than the Super Eagle engine.

hqnx family

Main article: hqx

Maxim Stepin's hq2x, hq3x, and hq4x are for scale factors of 2:1, 3:1, and 4:1 respectively. Each works by comparing the color value of each pixel to those of its eight immediate neighbours, marking the neighbours as close or distant, and using a pregenerated lookup table to find the proper proportion of input pixels' values for each of the 4, 9 or 16 corresponding output pixels. The hq3x family will perfectly smooth any diagonal line whose slope is ±0.5, ±1, or ±2 and which is not anti-aliased in the input; one with any other slope will alternate between two slopes in the output. It will also smooth very tight curves. Unlike 2×SaI, it anti-aliases the output.[16] A shader long thought to be hqx was in fact another shader of lower quality called HqFilter/ScaleHQ.[17] True hqx[18] has comparable quality to early versions of xBR.

Image enlarged 3× with the nearest-neighbor interpolation
Image enlarged in size by 3× with hq3x algorithm

hqnx was initially created for the Super Nintendo emulator ZSNES. The author of bsnes has released a space-efficient implementation of hq2x (ScaleHQ) to the public domain.[19]

xBR family

There are six filters in this family: xBR, xBRZ, xBR-Hybrid, Super xBR, xBR+3D and Super xBR+3D.

xBR,[20] created by Hyllian, works much the same way as HQx (based on pattern recognition), and would generate the same result as HQx when given the above pattern. However, it goes further than HQx by using a two-stage set of interpolation rules, which better handle more complex patterns such as anti-aliased lines and curves. Scaled background textures keep the sharp characteristics of the original image, rather than becoming blurred as HQx (in reality ScaleHQ) tends to do. The newest xBR versions are multi-pass and can preserve small details better. There is also a version of xBR combined with the Reverse-AA shader called xBR-Hybrid.[21] xBR+3D is a version with a 3D mask that only filters 2D elements.

xBRZ[22] is a modified version of xBR, created by Zenju and implemented from scratch as a CPU-based filter in C++. It uses the same basic idea as xBR's pattern recognition and interpolation, but with a different rule set designed to preserve fine image details as small as a few pixels. This makes it useful for scaling the details in faces, and in particular eyes. xBRZ is optimized for multi-core CPUs and 64-bit architectures and shows 40–60% better performance than HQx even when running on a single CPU core only. It supports scaling images with an alpha channel, and scaling by factors from 2× up to 6×.

Super xBR[23][24] is an algorithm developed by Hyllian in 2015. It uses some combinations of known linear filters along with xBR edge detection rules in a non-linear way. It works in two passes and can only scale an image by a factor of two (or by multiples of two when it is reapplied). It also has an anti-ringing filter. Super xBR+3D is a version with a 3D mask that only filters 2D elements. There is also a Super xBR version rewritten in C/C++.[25]

RotSprite

RotSprite is a scaling and rotation algorithm for sprites developed by Xenowhirl. It produces far fewer artifacts than nearest-neighbor rotation algorithms, and like EPX, it does not introduce new colors into the image (unlike most interpolation systems).[26]

The algorithm first scales the image to 8 times its original size with a modified Scale2× algorithm which treats similar (rather than identical) pixels as matches. It then calculates what rotation offset to use by favoring sampled points which are not boundary pixels. Next, the rotated image is created with a nearest-neighbor scaling and rotation algorithm that simultaneously shrinks the big image back to its original size and rotates the image. Finally, overlooked single-pixel details are restored if the corresponding pixel in the source image is different and the destination pixel has three identical neighbors.[27]

Kopf–Lischinski

The Kopf–Lischinski algorithm is a novel way to extract resolution-independent vectors from pixel art described in the 2011 paper "Depixelizing Pixel Art".[7]

SuperRes

The SuperRes[28] shaders use a different scaling method which can be used in combination with NEDI (or any other scaling algorithm). The method is explained in detail in a Doom9 forum thread.[29] It seems to give better results than using NEDI alone, and to rival those of NNEDI3. The shaders are also available as an MPDN renderscript.

EDIUpsizer

EDIUpsizer[30] is a resampling filter that upsizes an image by a factor of two both horizontally and vertically using NEDI (new edge-directed interpolation).[31] EDIUpsizer also uses a few modifications to basic NEDI in order to prevent many of the artifacts that NEDI creates in detailed areas. These include condition number testing and adaptive window size,[32] as well as capping constraints. All modifications and constraints to NEDI are optional (they can be turned on and off) and are user configurable. The filter is rather slow.

FastEDIUpsizer

FastEDIUpsizer is a slimmed-down version of EDIUpsizer that is slightly more tuned for speed. It uses a constant 8×8 window size, only performs NEDI on the luma plane, and only uses either bicubic or bilinear interpolation as the fallback interpolation method.

eedi3

eedi3 is another edge-directed interpolation filter. It works by minimizing a cost functional involving every pixel in a scan line. It is slow.

EEDI2

EEDI2 resizes an image by 2× in the vertical direction by copying the existing image to 2*y(n) and interpolating the missing field. It is intended for edge-directed interpolation for deinterlacing (i.e. it is not really made for resizing a normal image, but can do that as well). EEDI2 can be used with both TDeint and TIVTC; see the linked discussion for details.[33]

NEDI

The idea behind NEDI[34] and edge-directed interpolation (EDI) in general is to use statistical sampling to preserve the quality of an image when it is scaled up. Scaling here means interpolating the actual image data to a new, higher-resolution result, as opposed to zooming in on an image, where the individual pixels are meant to remain visible. Several earlier methods detected edges to generate blending weights for linear interpolation, or classified pixels according to their neighbour conditions and applied different, otherwise isotropic, interpolation schemes based on the classification. Any given interpolation approach boils down to weighted averages of neighbouring pixels, and the goal is to find the optimal weights. In bilinear interpolation the weights depend only on the sample position (at the midpoint between pixels they are all equal). Higher-order interpolation methods such as bicubic or sinc interpolation consider a larger number of neighbours than just the adjacent ones. NEDI (New Edge-Directed Interpolation) computes local covariances in the original image and uses them to adapt the interpolation at high resolution.

NNEDI

NNEDI[35] is an intra-field only deinterlacer. It takes in a frame, throws away one field, and then interpolates the missing pixels using only information from the kept field. It has same rate and double rate modes, and works with YUY2 and YV12 input. NNEDI can also be used to enlarge images by powers of two.

NNEDI2

NNEDI2 is an intra-field only deinterlacer. It takes in a frame, throws away one field, and then interpolates the missing pixels using only information from the kept field. It has same rate and double rate modes, and works with YV12, YUY2, and RGB24 input. nnedi2 is also very good for enlarging images by powers of 2, and includes a function 'nnedi2_rpow2' for that purpose.

ChromaNEDI

ChromaNEDI[36] is a way of using NEDI to upscale chroma using information from the luma channels.

NNEDI3

NNEDI3 is nnedi2 with an improved predictor neural network architecture and local neighborhood pre-processing. It also has multiple local neighborhood size options to better handle image enlargement versus deinterlacing and to give more quality versus speed options. The predictor neural network consists of neurons; the possible NNEDI3 settings in madVR are 16, 32, 64, 128, and 256 neurons. 16 is the fastest and 256 the slowest, but 256 should give the best quality. The quality differences between neuron counts are usually small for a given resize factor, while the performance difference between them grows when the image size is quadrupled. When only doubling the resolution, the difference between the lowest and highest options is noticeable, but not orders of magnitude.

SuperChromaRes

With techniques similar to those of SuperRes, it is also possible to do chroma scaling. One major advantage is that this makes it possible to do chroma scaling in linear light, which would normally be impossible. This can greatly improve image quality for images consisting of saturated colours (especially red) on a white background. The method is available as an MPDN renderscript, and its author has also released the original experimental shaders so that they can be tried with other renderers. Support for these experimental shaders is minimal: improvements made in the renderscript are not backported, and not all options are documented. The shaders also have some of the same issues as ChromaNEDI, but generally work well for HD sources.

Waifu2x

Waifu2x[37] is an image super-resolution method for anime-style art that uses deep convolutional neural networks. It also supports photographs. A demo application can be found at http://waifu2x.udp.jp/

Comparison

The following table shows a comparison of the above pixel scaling algorithms, generated with the tool 2dimagefilter.

Algorithm Image
(Original image)
Super-xBR 4x
Eagle 3x
hq3x
Scale 3x
XBR 3x
SuperEagle
SuperSaI
SaI 2x
Scale 2x

Sampling theorem background

Image scaling or interpolation can also be interpreted as image resampling or image reconstruction in the light of the Nyquist sampling theorem. According to the theorem, downsampling the 160×160 original image to the 40×40 thumbnail should be carried out only after applying a suitable 2D low-pass filter to prevent aliasing artifacts. The filter reduces the image to the information that can be carried by the smaller image, 1/16 in this case (6.25%). As natural images have their information clustered at the lower spatial frequencies (they are sparse in the Fourier domain), the effective information loss is significantly smaller, a property that JPEG compression also relies on. In this case 32.99% of the relevant image information is retained. If the filter is applied, the downsampling step by a factor of four is lossless (the information loss happened in the filtering), and the image can be reconstructed (interpolated or scaled) losslessly, apart from rounding errors, back to the 160×160 filtered image or to any other resolution. The last image shows the aliasing artifacts that occur if a low-pass filter is not applied in this process.
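
The filter, downsample and reconstruct sequence can be checked numerically. The sketch below is an illustration under stated assumptions: it uses a small random 16×16 array instead of the 160×160 image, an ideal (brick-wall) low-pass filter that keeps only frequencies strictly below the Nyquist limit of the smaller grid, and zero-padding in the Fourier domain for reconstruction. The function names `ideal_lowpass` and `fourier_upscale` are hypothetical.

```python
import numpy as np


def ideal_lowpass(image, factor):
    """Zero all spatial frequencies that a grid `factor` times coarser cannot carry."""
    h, w = image.shape
    ky = np.rint(np.fft.fftfreq(h) * h)  # integer frequency indices per row
    kx = np.rint(np.fft.fftfreq(w) * w)
    keep = ((np.abs(ky)[:, None] < h / (2 * factor))
            & (np.abs(kx)[None, :] < w / (2 * factor)))
    return np.fft.ifft2(np.fft.fft2(image) * keep).real


def fourier_upscale(image, factor):
    """Enlarge a 2D array by zero-padding its centered spectrum."""
    h, w = image.shape
    big_h, big_w = h * factor, w * factor
    padded = np.zeros((big_h, big_w), dtype=complex)
    top, left = big_h // 2 - h // 2, big_w // 2 - w // 2
    padded[top:top + h, left:left + w] = np.fft.fftshift(np.fft.fft2(image))
    return np.fft.ifft2(np.fft.ifftshift(padded)).real * factor ** 2


rng = np.random.default_rng(0)
original = rng.random((16, 16))

filtered = ideal_lowpass(original, 4)
small = filtered[::4, ::4]                        # downsample after filtering
restored = fourier_upscale(small, 4)              # matches `filtered`
aliased = fourier_upscale(original[::4, ::4], 4)  # no filter: aliasing
```

Under these assumptions `restored` agrees with `filtered` up to rounding errors, while `aliased` does not reproduce `original`, because the unfiltered high frequencies fold back onto the low ones during downsampling.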

Figure panels: original image; original image in the spatial Fourier domain; 2D low-pass filtered image; filtered image in the spatial Fourier domain; 4× downsampled image; 4× Fourier upsampling (correct reconstruction); 4× Fourier upsampling (with aliasing).

Applications

Applications to arcade and console emulators

On sufficiently fast hardware, these algorithms are suitable for gaming and other real-time image processing software. These highly optimized algorithms provide sharp, crisp graphics while minimizing blur. Pixel art scaling algorithms have been implemented in a wide range of emulators, 2D game engines and game engine recreations such as HqMAME, DOSBox, and ScummVM. They have gained wide recognition among gamers, for whom these technologies have encouraged a revival of '80s and '90s gaming experiences.

Such filters are currently used in commercial emulators on Xbox Live, Virtual Console, and PSN to allow classic low resolution games to be more visually appealing on modern HD displays. Recently released games that incorporate these filters include Sonic's Ultimate Genesis Collection, Castlevania: The Dracula X Chronicles, Castlevania: Symphony of the Night, and Akumajō Dracula X Chi no Rondo.

References

  1. "Edge-Directed Interpolation". Retrieved 19 February 2016.
  2. Xin Li; Michael T. Orchard. "NEW EDGE DIRECTED INTERPOLATION" (PDF). 2000 IEEE International Conference on Image Processing: 311.
  3. Zhang, D.; Xiaolin Wu. "An Edge-Guided Image Interpolation Algorithm via Directional Filtering and Data Fusion" (PDF).
  4. K.Sreedhar Reddy; Dr.K.Rama Linga Reddy (December 2013). "Enlargement of Image Based Upon Interpolation Techniques" (PDF). International Journal of Advanced Research in Computer and Communication Engineering 2 (12): 4631.
  5. Dengwen Zhou; Xiaoliu Shen. "Image Zooming Using Directional Cubic Convolution Interpolation". Retrieved 13 September 2015.
  6. Shaode Yu; Rongmao Li; Rui Zhang; Mou An; Shibin Wu; Yaoqin Xie. "Performance evaluation of edge-directed interpolation methods for noise-free images". Retrieved 13 September 2015.
  7. Johannes Kopf and Dani Lischinski (2011). "Depixelizing Pixel Art". ACM Transactions on Graphics (Proceedings of SIGGRAPH 2011) 30 (4): 99:1–99:8. doi:10.1145/2010324.1964994. Archived from the original on 2015-09-01. Retrieved 24 October 2012.
  8. "Gain Superpowered Vision With Scalable Function Graphics: H+ Magazine". 2014.
  9. "Pixel Scalers". Retrieved 19 February 2016.
  10. "Indiana Jones and the Fate of Atlantis" (PNG screenshot).
  11. Thomas, Kas (1999). "Fast Blit Strategies: A Mac Programmer's Guide". MacTech.
  12. libretro. "common-shaders/scalenx at master · libretro/common-shaders · GitHub". GitHub. Retrieved 19 February 2016.
  13. "ScaleNx - Artifact Removal and Algorithm Improvement [Archive] - Libretro Forums". Retrieved 19 February 2016.
  14. "Eagle (idea)". Everything2. 2007-01-18.
  15. "Gmane Loom". Retrieved 19 February 2016.
  16. Stepin, Maxim. "hq3x Magnification Filter". Retrieved 2007-07-03.
  17. Hunter K. "Filthy Pants: A Computer Blog". Retrieved 19 February 2016.
  18. libretro. "common-shaders/hqx at master · libretro/common-shaders · GitHub". GitHub. Retrieved 19 February 2016.
  19. Byuu. Release announcement Accessed 2011-08-14.
  20. "xBR algorithm tutorial". Retrieved 19 February 2016.
  21. libretro. "common-shaders/xbr at master · libretro/common-shaders · GitHub". GitHub. Retrieved 19 February 2016.
  22. zenju. "xBRZ". SourceForge. Retrieved 19 February 2016.
  23. "Super-xBR.pdf". Google Docs. Retrieved 19 February 2016.
  24. libretro. "common-shaders/xbr/shaders/super-xbr at master · libretro/common-shaders · GitHub". GitHub. Retrieved 19 February 2016.
  25. http://pastebin.com/cbH8ZQQT
  26. "RotSprite". Sonic Retro. Retrieved 19 February 2016.
  27. "Sprite Rotation Utility". Sonic and Sega Retro Message Board. Retrieved 19 February 2016.
  28. "nnedi3 vs NeuronDoubler - Doom9's Forum". Retrieved 19 February 2016.
  29. "Shader implementation of the NEDI algorithm - Page 6 - Doom9's Forum". Retrieved 19 February 2016.
  30. http://web.missouri.edu/~kes25c/
  31. http://web.archive.org/web/20101126091759/http://neuron2.net/library/nedi.pdf
  32. http://web.archive.org/web/20041221052401/http://www.cs.ucdavis.edu:80/~bai/ECS231/finaltzeng.pdf
  33. "TDeint and TIVTC - Page 21 - Doom9's Forum". Retrieved 19 February 2016.
  34. https://www.doom9.org/showthread.php?s=7fb2fb184cfe82b7d76b63bb26df481a&t=170727
  35. "NNEDI - intra-field deinterlacing filter - Doom9's Forum". Retrieved 19 February 2016.
  36. "Shader implementation of the NEDI algorithm - Doom9's Forum". Retrieved 19 February 2016.
  37. nagadomi. "GitHub - nagadomi/waifu2x: Image Super-Resolution for Anime-Style Art". GitHub. Retrieved 19 February 2016.

External links

This article is issued from Wikipedia - version of the Monday, April 18, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.