H.262/MPEG-2 Part 2

H.262^[1] or MPEG-2 Part 2 (formally known as ITU-T Recommendation H.262 and ISO/IEC 13818-2,^[2] also known as MPEG-2 Video) is a video coding format developed and maintained jointly by ITU-T Video Coding Experts Group (VCEG) and ISO/IEC Moving Picture Experts Group (MPEG). It is the second part of the ISO/IEC MPEG-2 standard. The ITU-T Recommendation H.262 and ISO/IEC 13818-2 documents are identical. The standard is available for a fee from the ITU-T^[1] and ISO.

Overview

MPEG-2 Video is similar to MPEG-1, but also provides support for interlaced video (an encoding technique used in analog NTSC, PAL and SECAM television systems). MPEG-2 video is not optimized for low bit-rates (less than 1 Mbit/s), but outperforms MPEG-1 at 3 Mbit/s and above. All standards-conforming MPEG-2 Video decoders are fully capable of playing back MPEG-1 Video streams.^[3]

History

The ISO/IEC approval process was completed in November 1994.^[4] The first edition was approved in July 1995^[5] and published by ITU-T^[1] and ISO/IEC in 1996.^[6]

In 1996 it was extended by two amendments to include the Registration of Copyright Identifiers and the 4:2:2 Profile.^[1]^[7] ITU-T published these amendments in 1996 and ISO in 1997.^[6]

Further amendments were published later by ITU-T and ISO.^[1]^[8] The most recent edition of the standard was published in 2013 and incorporates all prior amendments.^[2]

Editions

H.262 / MPEG-2 Video editions^[8]
Edition	Release date	Latest amendment	ISO/IEC standard	ITU-T Recommendation
First edition	1995	2000	ISO/IEC 13818-2:1996^[6]	H.262 (07/95)
Second edition	2000	2010^[1]^[9]	ISO/IEC 13818-2:2000^[10]	H.262 (02/00)
Third edition	2013		ISO/IEC 13818-2:2013^[2]	H.262 (02/12)

Video coding

An HDTV camera generates a raw video stream of 149,299,200 (=24*1920*1080*3) bytes per second for 24fps video. This stream must be compressed if digital TV is to fit in the bandwidth of available TV channels and if movies are to fit on DVDs. Fortunately, video compression is practical because the data in pictures is often redundant in space and time. For example, the sky can be blue across the top of a picture and that blue sky can persist for frame after frame. Also, because of the way the eye works, it is possible to delete some data from video pictures with almost no noticeable degradation in image quality.
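The raw data rate quoted above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the function name and the default of 3 bytes per pixel (one byte per color component) are assumptions made for illustration.

```python
def raw_bytes_per_second(width, height, fps, bytes_per_pixel=3):
    """Uncompressed data rate, assuming one byte per color component."""
    return width * height * bytes_per_pixel * fps


rate = raw_bytes_per_second(1920, 1080, 24)
print(rate)             # 149299200
print(rate * 8 / 1e6)   # the same rate expressed in Mbit/s
```

At roughly 1,194 Mbit/s, the raw stream is far above the bit rates listed for the profiles and levels later in this article, which is why compression is needed.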

TV cameras used in broadcasting usually generate 50 pictures a second (in Europe) or 59.94 pictures a second (in North America). Digital television requires that these pictures be digitized so that they can be processed by computer hardware. Each picture element (a pixel) is then represented by one luma number and two chrominance numbers. These describe the brightness and the color of the pixel (see YCbCr). Thus, each digitized picture is initially represented by three rectangular arrays of numbers.

A common (and old) trick to reduce the amount of data is to separate each picture into two fields upon broadcast/encoding: the "top field," which is the odd numbered horizontal lines, and the "bottom field," which is the even numbered lines. Upon reception/decoding, the two fields are displayed alternately with the lines of one field interleaving between the lines of the previous field; this format is called interlaced video. The typical field rate is 50 (Europe/PAL) or 59.94 (US/NTSC) fields per second. If the video is not interlaced, then it is called progressive video and each picture is a frame. MPEG-2 supports both options.
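The split into fields can be sketched as follows, assuming a frame is stored as a list of rows and that lines are numbered from 1 as in the text (so the top field corresponds to list indices 0, 2, 4, ...). The function names are illustrative only.

```python
def split_fields(frame):
    """Split a frame (list of rows) into its top and bottom fields."""
    top = frame[0::2]      # lines 1, 3, 5, ... (odd-numbered lines)
    bottom = frame[1::2]   # lines 2, 4, 6, ... (even-numbered lines)
    return top, bottom


def weave(top, bottom):
    """Interleave two fields back into a full frame."""
    frame = []
    for top_line, bottom_line in zip(top, bottom):
        frame.append(top_line)
        frame.append(bottom_line)
    # A frame with an odd line count has one extra top-field line.
    if len(top) > len(bottom):
        frame.append(top[-1])
    return frame


frame = [[y * 10 + x for x in range(4)] for y in range(6)]
top, bottom = split_fields(frame)
assert weave(top, bottom) == frame
```

Weaving is lossless only when both fields were sampled at the same instant; in true interlaced video the two fields are captured at different times, which the sketch does not model.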

[Figure: comparison of four chroma subsampling schemes. The color images appear very similar; the lower row shows the resolution of the color information.]

Another common practice to reduce the data rate is to "thin out" or subsample the two chrominance planes. In effect, the remaining chrominance values represent the nearby values that are deleted. Thinning works because the eye better resolves brightness details than chrominance details. The 4:2:2 chrominance format indicates that half the chrominance values have been deleted. The 4:2:0 chrominance format indicates that three-quarters of the chrominance values have been deleted. If no chrominance values have been deleted, the chrominance format is 4:4:4. MPEG-2 allows all three options.
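The effect of the three chrominance formats on the amount of data can be illustrated with a short sketch. The table of divisors and the averaging filter below are assumptions chosen for illustration; the standard does not mandate how an encoder derives the remaining chrominance values.

```python
# (horizontal divisor, vertical divisor) applied to each chrominance plane
CHROMA_FORMATS = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}


def samples_per_picture(width, height, chroma_format):
    """Total luma plus chroma samples for one picture."""
    h_div, v_div = CHROMA_FORMATS[chroma_format]
    luma = width * height
    chroma = 2 * (width // h_div) * (height // v_div)
    return luma + chroma


def subsample_plane(plane, h_div, v_div):
    """Replace each h_div x v_div group by its average (one possible filter)."""
    rows, cols = len(plane), len(plane[0])
    out = []
    for y in range(0, rows, v_div):
        out_row = []
        for x in range(0, cols, h_div):
            group = [plane[yy][xx]
                     for yy in range(y, min(y + v_div, rows))
                     for xx in range(x, min(x + h_div, cols))]
            out_row.append(sum(group) // len(group))
        out.append(out_row)
    return out


for fmt in CHROMA_FORMATS:
    print(fmt, samples_per_picture(720, 576, fmt))
```

For a 720 × 576 picture, 4:2:0 carries half as many samples in total as 4:4:4, because each of the two chrominance planes shrinks to a quarter of its original size.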

MPEG-2 specifies that the raw frames be compressed into three kinds of frames: intra-coded frames (I-frames), predictive-coded frames (P-frames), and bidirectionally-predictive-coded frames (B-frames).

An I-frame is a compressed version of a single uncompressed (raw) frame. It takes advantage of spatial redundancy and of the inability of the eye to detect certain changes in the image. Unlike P-frames and B-frames, I-frames do not depend on data in the preceding or the following frames.

Briefly, the raw frame is divided into 8 pixel by 8 pixel blocks. The data in each block is transformed by the discrete cosine transform (DCT). The result is an 8 by 8 matrix of coefficients. The transform converts spatial variations into frequency variations, but it does not change the information in the block; the original block can be recreated exactly by applying the inverse discrete cosine transform. The advantage of the transform is that the image can now be simplified by quantizing the coefficients. Many of the coefficients, usually the higher frequency components, will then be zero. The penalty of this step is the loss of some subtle distinctions in brightness and color. If one applies the inverse transform to the matrix after it is quantized, one gets an image that looks very similar to the original image but that is not quite as nuanced.

Next, the quantized coefficient matrix is itself compressed. Typically, one corner of the quantized matrix is filled with zeros. By starting in the opposite corner of the matrix, then zigzagging through the matrix to combine the coefficients into a string, then substituting run-length codes for consecutive zeros in that string, and then applying Huffman coding to that result, one reduces the matrix to a smaller array of numbers. It is this array that is broadcast or that is put on DVDs. In the receiver or the player, the whole process is reversed, enabling the receiver to reconstruct, to a close approximation, the original frame.
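The block-coding steps described above (transform, quantization, zigzag scan, run-length coding) can be sketched in plain Python. This is a simplified illustration, not the procedure laid down by the standard: it assumes a single uniform quantizer step instead of a quantization matrix, uses an "EOB" marker of its own for trailing zeros, and omits the final Huffman (variable-length) coding stage. All function names are hypothetical.

```python
import math

N = 8  # block size used in the text


def dct_2d(block):
    """Forward 8x8 DCT-II with orthonormal scaling."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Uniform quantization with one step size (a simplification)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


def zigzag(matrix):
    """Scan along anti-diagonals, starting at the top-left (low-frequency) corner."""
    order = sorted(
        ((r, c) for r in range(N) for c in range(N)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )
    return [matrix[r][c] for r, c in order]


def run_length(seq):
    """Encode as (zero_run, value) pairs; trailing zeros collapse to 'EOB'."""
    pairs, run = [], 0
    for value in seq:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")
    return pairs


flat_block = [[128] * N for _ in range(N)]  # a block of uniform brightness
symbols = run_length(zigzag(quantize(dct_2d(flat_block), step=16)))
print(symbols)
```

For the uniform block used here, all of the energy ends up in the single top-left coefficient, so the 64 input values reduce to one (run, value) pair plus the end-of-block marker. Blocks with more detail produce more non-zero coefficients and therefore longer symbol lists.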

Typically, every 15th frame or so is made into an I-frame. P-frames and B-frames might follow an I-frame like this, IBBPBBPBBPBB(I), to form a Group Of Pictures (GOP); however, the standard is flexible about this.
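The example pattern can be generated from two numbers: the GOP length and the spacing between reference frames. The parameter names n and m below are assumptions for this sketch, and the regular structure it produces is only one of the arrangements the standard permits.

```python
def gop_pattern(n=12, m=3):
    """Frame types, in display order, for a GOP of n frames with a
    reference frame (I or P) every m frames."""
    types = []
    for i in range(n):
        if i == 0:
            types.append("I")
        elif i % m == 0:
            types.append("P")
        else:
            types.append("B")
    return "".join(types)


print(gop_pattern(12, 3))  # IBBPBBPBBPBB
```

Setting m to 1 yields a GOP without B-frames, such as "IPPP" for n = 4.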

Macroblocks

P-frames provide more compression than I-frames because they take advantage of the data in a previous I-frame or P-frame, known as a reference frame. To generate a P-frame, the previous reference frame is reconstructed, just as it would be in a TV receiver or DVD player. The frame being compressed is divided into 16 pixel by 16 pixel macroblocks. Then, for each of those macroblocks, the reconstructed reference frame is searched to find the 16 by 16 area that best matches the macroblock being compressed. The offset is encoded as a "motion vector." Frequently, the offset is zero. But, if something in the picture is moving, the offset might be something like 23 pixels to the right and 4 pixels up. The match between the two macroblocks will often not be perfect. To correct for this, the encoder takes the difference of all corresponding pixels of the two macroblocks, and on that macroblock difference then computes the strings of coefficient values as described above. This "residual" is appended to the motion vector and the result is sent to the receiver or stored on the DVD for each macroblock being compressed. Sometimes no suitable match is found. In that case, the macroblock is coded like an I-frame macroblock.
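The search for a matching macroblock can be sketched as an exhaustive block-matching search. The standard leaves the search strategy to the encoder, so everything here is an assumption made for illustration: the sum of absolute differences (SAD) as the matching cost, the small whole-pixel search window, and the function names.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
    return total


def best_motion_vector(cur, ref, cx, cy, size=16, search=4):
    """Try every whole-pixel offset within +/- search and keep the cheapest.

    Returns ((dx, dy), cost). The offset points from the macroblock in the
    current frame to the matching area in the reference frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate area falls outside the reference frame
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best


def residual(cur, ref, cx, cy, mv, size=16):
    """Pixel-wise difference between the macroblock and its prediction."""
    dx, dy = mv
    return [[cur[cy + j][cx + i] - ref[cy + dy + j][cx + dx + i]
             for i in range(size)] for j in range(size)]


WIDTH = HEIGHT = 32
ref = [[(7 * x + 13 * y) % 256 for x in range(WIDTH)] for y in range(HEIGHT)]
# Current frame: the reference content moved 2 pixels right and 1 pixel down.
cur = [[ref[max(y - 1, 0)][max(x - 2, 0)] for x in range(WIDTH)]
       for y in range(HEIGHT)]

mv, cost = best_motion_vector(cur, ref, cx=8, cy=8)
print(mv, cost)  # (-2, -1) 0
```

Because the synthetic motion is a pure translation, the residual for the chosen vector is all zeros. With real video the residual is usually small but non-zero, and it is this difference that is passed to the transform and quantization steps.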

The processing of B-frames is similar to that of P-frames except that B-frames use the picture in a subsequent reference frame as well as the picture in a preceding reference frame. As a result, B-frames usually provide more compression than P-frames. B-frames are never reference frames.
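Because a B-frame depends on a later reference frame, that reference frame has to reach the decoder before the B-frame does, so frames are transmitted in a different order from the one in which they are displayed. The sketch below shows one way to derive such an order; the function name and the string representation of frame types are assumptions for illustration.

```python
def coding_order(display_types):
    """Reorder frames so each B-frame follows both of its reference frames.

    display_types is a string such as "IBBPBBP" in display order.
    Returns a list of (display_index, frame_type) in transmission order.
    """
    order, pending_b = [], []
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append((index, frame_type))
        else:
            order.append((index, frame_type))  # the reference frame goes first
            order.extend(pending_b)            # then the B-frames preceding it
            pending_b = []
    # In a real stream, trailing B-frames would follow the next I-frame;
    # here they are simply appended.
    order.extend(pending_b)
    return order


print(coding_order("IBBPBBP"))
```

For the display order I B B P B B P, the transmission order becomes I P B B P B B (display indices 0, 3, 1, 2, 6, 4, 5).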

While the above generally describes MPEG-2 video compression, there are many details that are not discussed including details involving fields, chrominance formats, responses to scene changes, special codes that label the parts of the bitstream, and other pieces of information.

Video profiles and levels

MPEG-2 video supports a wide range of applications from mobile to high quality HD editing. For many applications, it is unrealistic and too expensive to support the entire standard. To allow such applications to support only subsets of it, the standard defines profiles and levels.

A profile defines sets of features such as B-pictures, 3D video, chroma format, etc. The level limits the memory and processing power needed, defining maximum bit rates, frame sizes, and frame rates.

An MPEG application then specifies its capabilities in terms of profile and level. For example, a DVD player may state that it supports up to Main profile at Main level (often written as MP@ML). This means the player can play back any MPEG stream encoded as MP@ML or less.

The tables below summarize the limitations of each profile and level, though there are constraints not listed here.^[1]^{:Annex E} Note that not all profile and level combinations are permissible, and scalable modes modify the level restrictions.

MPEG-2 Profiles
Abbr.	Name	Picture Coding Types	Chroma Format	Aspect Ratios	Scalable modes	Intra DC Precision
SP	Simple profile	I, P	4:2:0	square pixels, 4:3, or 16:9	none	8, 9, 10
MP	Main profile	I, P, B	4:2:0	square pixels, 4:3, or 16:9	none	8, 9, 10
SNR	SNR Scalable profile	I, P, B	4:2:0	square pixels, 4:3, or 16:9	SNR^[a]	8, 9, 10
Spatial	Spatially Scalable profile	I, P, B	4:2:0	square pixels, 4:3, or 16:9	SNR^[a], spatial^[b]	8, 9, 10
HP	High profile	I, P, B	4:2:2 or 4:2:0	square pixels, 4:3, or 16:9	SNR^[a], spatial^[b]	8, 9, 10, 11
422	4:2:2 profile	I, P, B	4:2:2 or 4:2:0	square pixels, 4:3, or 16:9	none	8, 9, 10, 11
MVP	Multi-view profile	I, P, B	4:2:0	square pixels, 4:3, or 16:9	Temporal^[c]	8, 9, 10

a. SNR-scalability sends the transform-domain differences to a lower quantization level of each block, raising the quality and bitrate when both streams are combined. A Main stream can be recreated losslessly.
b. Spatial-scalability encodes the difference between the HD and the upscaled SD streams, which is combined with the SD to recreate the HD stream. A Main stream cannot be recreated losslessly.
c. Temporal-scalability inserts extra frames between every base frame, to raise the frame rate or add a 3D viewpoint. This is the only MPEG-2 profile allowing adaptive frame references, a prominent feature of H.264/AVC. A Main stream may be recreated losslessly only if extended references are not used.

MPEG-2 Levels
Abbr.	Name	Frame rates (Hz)	Max horizontal resolution	Max vertical resolution	Max luminance samples per second (approximately height x width x framerate)	Max bit rate in Main profile (Mbit/s)
LL	Low Level	23.976, 24, 25, 29.97, 30	352	288	3,041,280	4
ML	Main Level	23.976, 24, 25, 29.97, 30	720	576	10,368,000, except in High profile, where constraint is 14,475,600 for 4:2:0 and 11,059,200 for 4:2:2	15
H-14	High 1440	23.976, 24, 25, 29.97, 30, 50, 59.94, 60	1440	1152	47,001,600, except that in High profile with 4:2:0, constraint is 62,668,800	60
HL	High Level	23.976, 24, 25, 29.97, 30, 50, 59.94, 60	1920	1152	62,668,800, except that in High profile with 4:2:0, constraint is 83,558,400	80
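The limits in the levels table can be applied mechanically to decide which level a given stream needs. The sketch below uses only the Main profile figures from the table; the dictionary layout and function names are hypothetical, the frame-rate check is reduced to a single maximum per level, and the High profile exceptions and other constraints in the standard are deliberately ignored.

```python
LEVELS = {
    # abbr: (max width, max height, max frame rate,
    #        max luminance samples per second, max Mbit/s in Main profile)
    "LL":   (352, 288, 30, 3_041_280, 4),
    "ML":   (720, 576, 30, 10_368_000, 15),
    "H-14": (1440, 1152, 60, 47_001_600, 60),
    "HL":   (1920, 1152, 60, 62_668_800, 80),
}


def fits_level(level, width, height, fps, mbit_per_s):
    """True if the stream parameters stay within every limit of the level."""
    max_w, max_h, max_fps, max_samples, max_bitrate = LEVELS[level]
    return (width <= max_w
            and height <= max_h
            and fps <= max_fps
            and width * height * fps <= max_samples
            and mbit_per_s <= max_bitrate)


def lowest_level(width, height, fps, mbit_per_s):
    """Smallest level that accommodates the stream, or None if none does."""
    for level in ("LL", "ML", "H-14", "HL"):
        if fits_level(level, width, height, fps, mbit_per_s):
            return level
    return None


print(lowest_level(720, 576, 25, 9.8))     # ML
print(lowest_level(1280, 720, 60, 18.3))   # HL
```

A 1280 × 720 stream at 60 frames per second needs High Level even though its frame size fits within High 1440, because its luminance sample rate (55,296,000 per second) exceeds the High 1440 limit.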

A few common MPEG-2 Profile/Level combinations are presented below, with particular maximum limits noted:

Profile @ Level	Resolution (px)	Framerate max. (Hz)	Sampling	Bitrate (Mbit/s)	Example Application
SP@LL	176 × 144	15	4:2:0	0.096	Wireless handsets
SP@ML	352 × 288	15	4:2:0	0.384	PDAs
SP@ML	320 × 240	24	4:2:0	0.384	PDAs
MP@LL	352 × 288	30	4:2:0	4	Set-top boxes (STB)
MP@ML	720 × 480	30	4:2:0	15	DVD (9.8 Mbit/s), SD DVB (15 Mbit/s)
MP@ML	720 × 576	25	4:2:0	15	DVD (9.8 Mbit/s), SD DVB (15 Mbit/s)
MP@H-14	1440 × 1080	30	4:2:0	60	HDV (25 Mbit/s)
MP@H-14	1280 × 720	30	4:2:0	60	HDV (25 Mbit/s)
MP@HL	1920 × 1080	30	4:2:0	80	ATSC (18.3 Mbit/s), SD DVB (31 Mbit/s), HD DVB (50.3 Mbit/s)
MP@HL	1280 × 720	60	4:2:0	80	ATSC (18.3 Mbit/s), SD DVB (31 Mbit/s), HD DVB (50.3 Mbit/s)
422P@ML	720 × 480	30	4:2:2	50	Sony IMX (I only), Broadcast Contribution (I&P only)
422P@ML	720 × 576	25	4:2:2	50	Sony IMX (I only), Broadcast Contribution (I&P only)
422P@H-14	1440 × 1080	30	4:2:2	80
422P@HL	1920 × 1080	30	4:2:2	300	Sony MPEG HD422 (50 Mbit/s), Canon XF Codec (50 Mbit/s), Convergent Design Nanoflash recorder (up to 160 Mbit/s)
422P@HL	1280 × 720	60	4:2:2	300

Applications

Some applications are listed below.

DVD-Video - a standard definition consumer video format. Uses 4:2:0 color subsampling and variable video data rate up to 9.8 Mbit/s.
MPEG IMX - a standard definition professional video recording format. Uses intraframe compression, 4:2:2 color subsampling and user-selectable constant video data rate of 30, 40 or 50 Mbit/s.
HDV - a tape-based high definition video recording format. Uses 4:2:0 color subsampling and 19.4 or 25 Mbit/s total data rate.
XDCAM - a family of tapeless video recording formats, which, in particular, includes formats based on MPEG-2 Part 2. These are standard definition MPEG IMX (see above), high definition MPEG HD, and high definition MPEG HD422. MPEG IMX and MPEG HD422 employ 4:2:2 color subsampling, while MPEG HD employs 4:2:0 color subsampling. Most subformats use a selectable constant video data rate from 25 to 50 Mbit/s, although there is also a variable bitrate mode with a maximum data rate of 18 Mbit/s.
XF Codec - a professional tapeless video recording format, similar to MPEG HD and MPEG HD422 but stored in a different container file.
HD DVD - defunct high definition consumer video format.
Blu-ray Disc - high definition consumer video format.
Broadcast TV - in some countries MPEG-2 Part 2 is used for digital broadcast in high definition. For example, ATSC specifies several scanning formats (480i, 480p, 720p, 1080i, 1080p) and frame/field rates at 4:2:0 color subsampling, with up to 19.4 Mbit/s data rate per channel.
Digital cable TV
Satellite TV

References

1. "H.262 : Information technology - Generic coding of moving pictures and associated audio information: Video". ITU-T Website. International Telecommunication Union - Telecommunication Standardization Sector (ITU-T). February 2000. Retrieved 13 August 2009.
2. ISO. "ISO/IEC 13818-2:2013 - Information technology -- Generic coding of moving pictures and associated audio information: Video". ISO. Retrieved 24 July 2014.
3. "MPEG-2 Video". Retrieved 24 July 2014.
4. P.N. Tudor (December 2005). "MPEG-2 Video compression". Retrieved 1 November 2009.
5. "H.262 (07/95) Information Technology – Generic Coding of Moving Picture and Associated Audio Information: Video" (PDF). ITU. Retrieved 3 November 2009.
6. ISO. "ISO/IEC 13818-2:1996 - Information technology -- Generic coding of moving pictures and associated audio information: Video". ISO. Retrieved 24 July 2014.
7. Leonardo Chiariglione - Convenor (October 2000). "Short MPEG-2 description". Retrieved 1 November 2009.
8. MPEG. "MPEG standards". chiariglione.org. Retrieved 24 July 2014.
9. ISO. "ISO/IEC 13818-2:2000/Amd 3 - New level for 1080@50p/60p". Retrieved 24 July 2014.
10. ISO. "ISO/IEC 13818-2:2000 - Information technology -- Generic coding of moving pictures and associated audio information: Video". ISO. Retrieved 24 July 2014.

External links

Official MPEG web site
MPEG-2 Video Encoding (H.262) - The Library of Congress

This article is issued from Wikipedia - version of the Monday, April 11, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.