JBIG2

JBIG2
Developed by	Joint Bi-level Image Experts Group
Latest release	2
Contained by	Portable Document Format, FAX
Standard	ITU-T T.88 & ISO/IEC 14492
Website	jbig2.com

JBIG2 is an image compression standard for bi-level images, developed by the Joint Bi-level Image Experts Group. It is suitable for both lossless and lossy compression. According to a press release^[1] from the Group, in its lossless mode JBIG2 typically generates files one third to one fifth the size of Fax Group 4 and one half to one quarter the size of JBIG, the previous bi-level compression standard released by the Group. JBIG2 was published in 2000 as the international standard ITU-T T.88,^[2] and in 2001 as ISO/IEC 14492.^[3]

Functionality

Ideally, a JBIG2 encoder segments the input page into regions of text, regions of halftone images, and regions of other data. Regions that are neither text nor halftones are typically compressed using a context-dependent arithmetic coding algorithm called the MQ coder. Textual regions are compressed as follows: the foreground pixels in the regions are grouped into symbols. A dictionary of symbols is then created and encoded, typically also using context-dependent arithmetic coding, and the regions are encoded by describing which symbols appear where. Typically, a symbol corresponds to a character of text, but this is not required by the compression method. For lossy compression, the difference between similar symbols (e.g., slightly different impressions of the same letter) can be neglected; for lossless compression, this difference is taken into account by compressing one similar symbol using another as a template. Halftone images may be compressed by reconstructing the grayscale image used to generate the halftone and then sending this image together with a dictionary of halftone patterns.^[4] Overall, the algorithm used by JBIG2 to compress text is very similar to the JB2 compression scheme used in the DjVu file format for coding binary images.

PDF files of version 1.4 and above may contain JBIG2-compressed data. Open source decoders for JBIG2 include jbig2dec,^[5] the Java-based jbig2-imageio,^[6] and the decoder found in versions 2.00 and above of xpdf. An open source encoder is jbig2enc.^[7]

Technical details

A typical bi-level image consists mainly of textual and halftone data in which the same shapes appear repeatedly. The image is segmented into three kinds of regions: text, halftone, and generic. Each kind of region is coded differently, and the coding methods are described in the following subsections.

Text image data

Text coding is based on the nature of human visual interpretation. A human observer cannot tell the difference between two instances of the same character in a bi-level image even though they may not match exactly pixel by pixel. Therefore, only the bitmap of one representative character instance needs to be coded, instead of coding the bitmap of each occurrence of the same character individually. The coded representative of each character is then stored in a "symbol dictionary".^[8] There are two encoding methods for text image data: pattern matching and substitution (PM&S) and soft pattern matching (SPM). These methods are presented below.^[9]

Pattern matching and substitution: After image segmentation, each segmented pixel block is compared against the bitmaps already in the dictionary. If a match exists, the encoder codes the index of the corresponding representative bitmap in the dictionary and the position of the character on the page. The position is usually relative to another previously coded character. If no match is found, the segmented pixel block is coded directly and added to the dictionary. The typical procedure of the pattern matching and substitution algorithm is displayed in the left block diagram of the figure below. Although PM&S can achieve outstanding compression, substitution errors can occur during the process if the image resolution is low.
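The pattern matching and substitution loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of any particular encoder: the function name `pms_encode`, the `max_diff` threshold, the use of a plain pixel-difference count as the match criterion, and the "first match wins" search are all hypothetical choices made for the example.

```python
from typing import List, Tuple

Bitmap = Tuple[str, ...]  # each row is a string of '0'/'1' characters


def pixel_difference(a: Bitmap, b: Bitmap) -> int:
    """Count differing pixels between two bitmaps of identical size."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def pms_encode(symbols: List[Tuple[int, int, Bitmap]], max_diff: int = 1):
    """Encode (x, y, bitmap) symbols as a dictionary plus placements.

    Each placement is (dictionary index, dx, dy), where dx and dy are
    offsets from the previously coded symbol (assumed convention).
    """
    dictionary: List[Bitmap] = []
    placements = []
    prev_x, prev_y = 0, 0
    for x, y, bmp in symbols:
        match = None
        for idx, rep in enumerate(dictionary):
            same_size = len(rep) == len(bmp) and len(rep[0]) == len(bmp[0])
            if same_size and pixel_difference(rep, bmp) <= max_diff:
                match = idx  # substitute the stored representative
                break
        if match is None:
            dictionary.append(bmp)  # no match: store the bitmap itself
            match = len(dictionary) - 1
        placements.append((match, x - prev_x, y - prev_y))
        prev_x, prev_y = x, y
    return dictionary, placements
```

With a threshold of one pixel, two slightly different impressions of the same shape share one dictionary entry, which is where both the compression gain and the risk of substitution errors come from.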

Soft pattern matching: In addition to a pointer to the dictionary and the position of the character, refinement data is also coded, because it is needed to reconstruct the original character in the image. The use of refinement data makes the character-substitution error mentioned earlier highly unlikely. The refinement data describes the current character instance, which is coded using the pixels of both the current character and the matching character in the dictionary. Since the current character instance is known to be highly correlated with the matched character, the prediction of the current pixel is more accurate.
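To make the idea of coding a pixel "using the pixels of both the current character and the matching character" concrete, the following sketch packs a few already-coded pixels of the current bitmap and a few pixels of the matched dictionary bitmap into one context number. The function name `refinement_context` and the particular pixel positions chosen are assumptions for illustration; they are not the templates defined by the standard.

```python
def refinement_context(cur, ref, x, y):
    """Build a 9-bit context for pixel (x, y) of the current bitmap.

    cur and ref are lists of rows of 0/1 ints. Four bits come from
    pixels of cur that precede (x, y) in raster order; five bits come
    from the matched reference bitmap around the same position.
    Pixels outside either bitmap are treated as 0.
    """
    def px(img, xx, yy):
        if 0 <= yy < len(img) and 0 <= xx < len(img[0]):
            return img[yy][xx]
        return 0

    bits = [
        # causal neighbours in the bitmap being coded
        px(cur, x - 1, y - 1), px(cur, x, y - 1), px(cur, x + 1, y - 1),
        px(cur, x - 1, y),
        # neighbourhood in the matched dictionary bitmap
        px(ref, x, y - 1), px(ref, x - 1, y), px(ref, x, y),
        px(ref, x + 1, y), px(ref, x, y + 1),
    ]
    ctx = 0
    for b in bits:
        ctx = (ctx << 1) | b
    return ctx
```

An arithmetic coder would keep a separate probability estimate for each context value. Because the reference pixel at the same position is usually equal to the current pixel, contexts that include it tend to give sharply skewed, and therefore cheap to code, predictions.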

Halftones

Halftone images can be compressed using two methods. The first method is similar to the context-based arithmetic coding algorithm, which adaptively positions the template pixels in order to capture correlations between adjacent pixels. In the second method, descreening is performed on the halftone image so that the image is converted back to grayscale. The converted grayscale values are then used as indexes into a halftone bitmap dictionary of small, fixed-size bitmap patterns. This allows the decoder to render a halftone image by placing the indexed dictionary patterns next to one another.
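The decoder side of the second method can be illustrated with a small sketch. It assumes an axis-aligned grid of tiles and a toy pattern dictionary in which pattern k simply has its first k pixels set; the names `make_patterns` and `render_halftone` and the pattern design are hypothetical and chosen only to show how grayscale values act as dictionary indexes.

```python
def make_patterns(size=2):
    """Return size*size + 1 tile patterns; pattern k has k pixels set."""
    n = size * size
    patterns = []
    for k in range(n + 1):
        flat = [1 if i < k else 0 for i in range(n)]  # fill in row-major order
        patterns.append([flat[r * size:(r + 1) * size] for r in range(size)])
    return patterns


def render_halftone(gray, patterns):
    """Replace each grayscale index with its tile and stitch tiles together."""
    size = len(patterns[0])
    out = []
    for row in gray:
        for r in range(size):  # emit one pixel row of the tile row at a time
            line = []
            for g in row:
                line.extend(patterns[g][r])
            out.append(line)
    return out
```

For example, `render_halftone([[0, 4], [2, 1]], make_patterns(2))` produces a 4x4 bi-level image in which each 2x2 tile has as many set pixels as the corresponding grayscale value.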

Arithmetic entropy coding

All three region types (text, halftone, and generic) may use arithmetic coding. JBIG2 specifically uses the MQ coder.
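The benefit of context-dependent arithmetic coding can be estimated without implementing a coder at all. The sketch below is not the MQ coder; it only sums the ideal code length, -log2(p), of each pixel under a simple adaptive model conditioned on four neighbouring pixels. The function name `context_code_length`, the four-pixel template, and the count-based probability estimate are assumptions made for the example.

```python
import math


def context_code_length(img):
    """Estimate the bits needed to code a bi-level image (rows of 0/1).

    Each pixel is predicted from four previously coded neighbours.
    Per-context counts start at 1 for each value and adapt as
    pixels are coded, as an adaptive coder's statistics would.
    """
    h, w = len(img), len(img[0])

    def px(x, y):
        return img[y][x] if 0 <= y < h and 0 <= x < w else 0

    counts = {}  # context tuple -> [count of 0s, count of 1s]
    bits = 0.0
    for y in range(h):
        for x in range(w):
            ctx = (px(x - 1, y - 1), px(x, y - 1), px(x + 1, y - 1), px(x - 1, y))
            c = counts.setdefault(ctx, [1, 1])
            p = c[img[y][x]] / (c[0] + c[1])  # estimated probability of this pixel
            bits += -math.log2(p)
            c[img[y][x]] += 1  # adapt the model
    return bits
```

For a blank 8x8 image the estimate is about 6 bits instead of the 64 bits of the raw bitmap, because the single context that occurs quickly learns that white pixels are overwhelmingly likely.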

Patents

Patents for JBIG2 are owned by IBM and Mitsubishi. Free licenses should be available on request. The JBIG and JBIG2 patents are not the same.^[10]^[11]^[12]

Disadvantages

When used in lossy mode, JBIG2 compression can potentially alter text in a way that is not discernible as corruption. This is in contrast to some other algorithms, which simply degrade into a blur, making the compression artifacts obvious.^[13] Since JBIG2 tries to match up similar-looking symbols, the digits "6" and "8", for example, may be substituted for one another.
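A toy example shows why such substitutions are plausible at low resolution. The 3x5 glyphs below are made up for illustration, and the matching rule (a pixel-difference threshold called `max_diff`) is an assumed stand-in for whatever criterion a real encoder applies.

```python
# Hypothetical 3x5 bitmaps of the digits 6 and 8; they differ in one pixel.
SIX = ("111", "100", "111", "101", "111")
EIGHT = ("111", "101", "111", "101", "111")


def pixel_difference(a, b):
    """Count differing pixels between two bitmaps of identical size."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def is_match(a, b, max_diff):
    """Treat two symbols as the same if at most max_diff pixels differ."""
    return pixel_difference(a, b) <= max_diff
```

Here `is_match(SIX, EIGHT, 1)` is true, so an encoder tolerating a single differing pixel would store one bitmap for both digits and the decoded page would show the same digit in both places, with no visible artifact.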

In 2013, various substitutions (including replacing “6” with “8”) were reported to happen on some Xerox WorkCentre photocopier and printer machines, where numbers printed on scanned (but not OCRed) documents could potentially have been altered. This has been demonstrated on construction blueprints and some tables of numbers; the potential impact of such substitution errors in documents such as medical prescriptions was briefly mentioned.^[14]^[15]^[16] David Kriesel and Xerox are investigating this.^[17]^[18]

References

↑ Press release from the Joint Bi-level Image Experts Group
↑ "ITU-T Recommendation T.88 -- T.88 : Information technology - Coded representation of picture and audio information - Lossy/lossless coding of bi-level images". Retrieved 2011-02-19.
↑ "ISO/IEC 14492:2001 - Information technology -- Lossy/lossless coding of bi-level images". Retrieved 2011-02-19.
↑ JBIG2-the ultimate bi-level image coding standard, by F. Ono, W. Rucklidge, R. Arps, and C. Constantinescu, in pp. 140–143, Proceedings, 2000 International Conference on Image Processing, (Vancouver, BC, Canada), vol. 1.
↑ jbig2dec home page
↑ open source jbig2 plugin for Java's ImageIO
↑ jbig2enc home page
↑ F. Ono, W. Rucklidge, R. Arps, and C. Constantinescu, "JBIG2-the ultimate bi-level image coding standard," Image Processing, 2000. Proceedings. 2000 International Conference on , vol.1, pp.140-143 vol.1, 2000.
↑ P. Howard, F. Kossentini, B. Martins, S. Forchhammer, and W. Rucklidge, "The emerging JBIG2 standard," Circuits and Systems for Video Technology, IEEE Transactions on , vol.8, no.7, pp.838-848, Nov 1998.
↑ What is the patent situation with JBIG?
↑ What is JBIG2?, retrieved 2012-04-07
↑ JBIG2 patents, retrieved 2012-04-07
↑ Zhou Wang, Hamid R. Sheikh and Alan C. Bovik (2002). "No-reference perceptual quality assessment of JPEG compressed images" (PDF).
↑ "Xerox scanners/photocopiers randomly alter numbers in scanned documents". 2013-08-02. Retrieved 2013-08-04.
↑ "Confused Xerox copiers rewrite documents, expert finds". BBC News. 2013-08-06. Retrieved 2013-08-06.
↑ http://fontfeed.com/archives/xerox-scanners%E2%80%8A%E2%80%8Aphotocopiers-randomly-alter-numbers/
↑ "Xerox investigating latest mangling test findings". 2013-08-11. Retrieved 2013-08-11.
↑ Update on Scanning Issue: Software Patches To Come, Xerox (blog), 2013-08-11

This article is issued from Wikipedia - version of the Saturday, May 09, 2015. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.