Video coding format
A video coding format[1][2] (or sometimes video compression format) is a content representation format for storage or transmission of digital video content (such as in a data file or bitstream). Examples of video coding formats include MPEG-2 Part 2, MPEG-4 Part 2, H.264 (MPEG-4 Part 10), HEVC, Theora, Dirac, RealVideo RV40, VP8, and VP9. A specific software or hardware implementation capable of video compression and/or decompression to/from a specific video coding format is called a video codec; an example of a video codec is Xvid, which is one of several different codecs that implement encoding and decoding of video in the MPEG-4 Part 2 video coding format in software. As an analogy, a video coding format (specification) is to a codec (specific implementation) what the C programming language (specification) is to a compiler such as GCC (specific implementation).
Some video coding formats are documented by a detailed technical specification document known as a video coding specification. Some such specifications are written and approved by standardization organizations as technical standards, and are thus known as video coding standards. The term 'standard' is sometimes used for de facto standards as well as formal standards.
Video content encoded using a particular video coding format is normally bundled with an audio stream (encoded using an audio coding format) inside a multimedia container format such as AVI, MP4, FLV, RealMedia, or Matroska. As such, the user normally does not have an H.264 file, but instead has a .mp4 video file, which is an MP4 container containing H.264-encoded video, normally alongside AAC-encoded audio. Multimedia container formats can contain any one of a number of different video coding formats; for example, the MP4 container format can contain video in either the MPEG-2 Part 2 or the H.264 video coding format, among others. Another example is the initial specification for the file type WebM, which specified not only the container format (Matroska) but also exactly which video (VP8) and audio (Vorbis) compression formats are used inside the Matroska container, even though the Matroska container format itself is capable of containing other video coding formats (VP9 video and Opus audio support was later added to the WebM specification).
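The relationship between a container format and the coding formats inside it can be sketched as a small data model. This is only an illustration: the class and function names (Stream, ContainerFile, is_valid_webm) are hypothetical and do not correspond to any real library, and the check covers only the formats named in the paragraph above.

```python
from dataclasses import dataclass, field

# Coding formats permitted by the WebM specification, as listed in the text above.
WEBM_VIDEO = {"VP8", "VP9"}
WEBM_AUDIO = {"Vorbis", "Opus"}


@dataclass
class Stream:
    kind: str           # "video" or "audio"
    coding_format: str  # e.g. "H.264", "AAC", "VP8"


@dataclass
class ContainerFile:
    container_format: str  # e.g. "MP4", "Matroska"
    streams: list = field(default_factory=list)


def is_valid_webm(f: ContainerFile) -> bool:
    """Hypothetical check: a Matroska container holding only WebM-permitted formats."""
    if f.container_format != "Matroska":
        return False
    for s in f.streams:
        allowed = WEBM_VIDEO if s.kind == "video" else WEBM_AUDIO
        if s.coding_format not in allowed:
            return False
    return True


# A typical .mp4 file: MP4 container, H.264 video, AAC audio.
mp4 = ContainerFile("MP4", [Stream("video", "H.264"), Stream("audio", "AAC")])
# Matroska with VP9 and Opus satisfies the (later) WebM restrictions.
webm = ContainerFile("Matroska", [Stream("video", "VP9"), Stream("audio", "Opus")])
# Matroska itself can carry H.264, but such a file is not WebM.
mkv = ContainerFile("Matroska", [Stream("video", "H.264"), Stream("audio", "AAC")])

print(is_valid_webm(mp4), is_valid_webm(webm), is_valid_webm(mkv))
```

The point of the model is that the container and the coding formats are separate fields: the same Matroska container is acceptable or not as WebM depending only on what it carries.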
A video coding format does not dictate all algorithms used by a codec implementing the format. For example, a large part of how video compression typically works is by finding similarities between video frames (block-matching), and then achieving compression by copying previously coded similar subimages (e.g., macroblocks) and adding small differences when necessary. Finding optimal combinations of such predictors and differences is an NP-complete problem,[3] meaning that finding an optimal solution is practically infeasible. While the video coding format must of course support such compression across frames in the bitstream format, by not needlessly mandating specific algorithms for finding such block-matches and other encoding steps, the format leaves the codecs implementing the video coding specification some freedom to optimize and innovate in their choice of algorithms. For example, section 0.5 of the H.264 specification says that encoding algorithms are not part of the specification.[4] Free choice of algorithm also allows different space–time complexity trade-offs for the same video coding format, so a live feed can use a fast but space-inefficient algorithm, while a one-time DVD encoding for later mass production can trade long encoding time for space-efficient encoding.
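As a rough illustration of block-matching, the sketch below searches a previous frame for the block that best predicts a block of the current frame, using the sum of absolute differences (SAD) as the cost. The exhaustive search, the SAD metric, the function names and the toy frames are all assumptions chosen for clarity; real encoders are free to use other search strategies and cost measures.

```python
def sad(ref, cur, rx, ry, cx, cy, n):
    """Sum of absolute differences between the n x n block of `ref` at (rx, ry)
    and the n x n block of `cur` at (cx, cy)."""
    return sum(
        abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
        for j in range(n)
        for i in range(n)
    )


def best_match(ref, cur, cx, cy, n, search):
    """Exhaustively try every offset within +/- `search` pixels of (cx, cy).
    Returns (cost, dx, dy) for the lowest-cost candidate block in `ref`."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if 0 <= rx <= w - n and 0 <= ry <= h - n:
                cost = sad(ref, cur, rx, ry, cx, cy, n)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best


def frame_with_patch(px, py, size=8):
    """Toy frame: black background with a bright 2x2 patch at (px, py)."""
    f = [[0] * size for _ in range(size)]
    for j in range(2):
        for i in range(2):
            f[py + j][px + i] = 200
    return f


previous = frame_with_patch(2, 2)
current = frame_with_patch(3, 3)  # the patch moved one pixel right and down

# Wide search: finds the patch in the previous frame, leaving no residual.
print(best_match(previous, current, cx=3, cy=3, n=2, search=2))  # (0, -1, -1)
# No search at all: faster, but the residual to be coded is much larger.
print(best_match(previous, current, cx=3, cy=3, n=2, search=0))  # (600, 0, 0)
```

Both calls produce a valid prediction for the same block. The wider search costs more computation and leaves less difference data to store, which is the kind of trade-off the format leaves to the encoder.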
Lossless, lossy, and uncompressed video coding formats
Consumer video is generally compressed using lossy video codecs, since that results in significantly smaller files than lossless compression. While there are video coding formats designed explicitly for either lossy or lossless compression, some video coding formats such as Dirac and H.264 support both.
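The difference between lossless and lossy coding can be shown on a single row of pixel samples. This is a minimal sketch under stated assumptions: zlib stands in for a lossless coder, plain quantization of the samples stands in for a lossy one (real video codecs typically quantize transform coefficients instead), and the sample values are made up.

```python
import zlib

# One row of 8-bit luma samples (hypothetical values for illustration).
samples = bytes([52, 55, 61, 66, 70, 61, 64, 73])

# Lossless: the decoder recovers the input bit for bit.
restored = zlib.decompress(zlib.compress(samples))

# Lossy: keep only which multiple of `step` each sample falls into,
# discarding the low-order detail.
step = 8
quantized = [s // step for s in samples]            # what would be stored
reconstructed = bytes(q * step for q in quantized)  # what the decoder produces

max_error = max(abs(a - b) for a, b in zip(samples, reconstructed))

print(restored == samples)       # True: nothing was lost
print(reconstructed == samples)  # False: detail was discarded
print(max_error)                 # 7, always below `step`
```

The quantized values span a smaller range than the originals and therefore need fewer bits, at the cost of a bounded reconstruction error.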
Uncompressed video formats, such as Clean HDMI, are a form of lossless video used in some circumstances, such as when sending video to a display over an HDMI connection. Some high-end cameras can also capture video directly in this format.
Intra-frame video coding formats
One subclass of relatively simple video coding formats is the intra-frame video formats, in which compression is applied to each picture in the video stream in isolation, and no attempt is made to take advantage of correlations between successive pictures over time for better compression. One example is Motion JPEG, which is simply a sequence of individually JPEG-compressed images. This approach is quick and simple, at the expense of the encoded video being much larger than with a video coding format that supports inter-frame coding.
Because inter-frame compression copies data from one frame to another, if the original frame is simply cut out (or lost in transmission), the following frames cannot be reconstructed properly. By contrast, making 'cuts' in intra-frame-compressed video while video editing is almost as easy as editing uncompressed video: one finds the beginning and ending of each frame, and simply copies bit-for-bit each frame that one wants to keep, and discards the frames one does not want. Another difference between intra-frame and inter-frame compression is that, with intra-frame systems, each frame uses a similar amount of data. In most inter-frame systems, certain frames (such as "I frames" in MPEG-2) are not allowed to copy data from other frames, so they require much more data than other frames nearby.[5]
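The dependency between frames can be illustrated with a toy coder. This is a sketch, not any real format: each frame is a short list of samples, the first frame is stored whole (labelled "I"), and every later frame is stored only as differences from its predecessor (labelled "P", a name borrowed here for illustration).

```python
def encode(frames):
    """Toy inter-frame coder: the first frame is stored whole ("I");
    each later frame is stored as per-sample differences from its predecessor ("P")."""
    coded = [("I", list(frames[0]))]
    for prev, cur in zip(frames, frames[1:]):
        coded.append(("P", [c - p for p, c in zip(prev, cur)]))
    return coded


def decode(coded):
    """Rebuild frames in order; a "P" frame with no decoded predecessor yields None."""
    out, prev = [], None
    for kind, data in coded:
        if kind == "I":
            prev = list(data)
        elif prev is not None:
            prev = [p + d for p, d in zip(prev, data)]
        out.append(list(prev) if prev is not None else None)
    return out


frames = [[10, 10, 10, 10], [10, 11, 10, 10], [10, 11, 12, 10]]

inter = encode(frames)
intra = [("I", list(f)) for f in frames]  # every frame coded in isolation

print(decode(inter) == frames)           # True: the full stream decodes
print(decode(inter[1:]))                 # [None, None]: cutting the I frame breaks the rest
print(decode(intra[1:]) == frames[1:])   # True: intra-only video can be cut anywhere

# Non-zero values per coded frame: the I frame carries far more data.
print([sum(1 for v in data if v) for _, data in inter])  # [4, 1, 1]
```

In the inter-coded stream the later frames are small but useless without the frame they refer to, while the intra-coded stream spends a similar amount of data on every frame and tolerates arbitrary cuts.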
Profiles and levels
A video coding format can define optional restrictions on encoded video, called profiles and levels. It is possible to have a decoder which only supports decoding a subset of profiles and levels of a given video format, for example to make the decoder program/hardware smaller, simpler, or faster.
A profile restricts which encoding techniques are allowed. For example, the H.264 format includes the profiles baseline, main and high (and others). While P-slices (which can be predicted based on preceding slices) are supported in all profiles, B-slices (which can be predicted based on both preceding and following slices) are supported in the main and high profiles but not in baseline.[6]
A level is a restriction on parameters such as maximum resolution and data rates.[6]
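A decoder's conformance check against a profile and a level might be sketched as follows. The profile table contains only what is stated above for H.264 (P-slices in all three profiles, B-slices in main and high). The level names and their limits are hypothetical placeholders and are not taken from the H.264 specification; StreamInfo and conforms are likewise illustrative names.

```python
from dataclasses import dataclass

# Slice types allowed per profile, limited to the facts stated in the text.
PROFILE_SLICE_TYPES = {
    "baseline": {"P"},
    "main": {"P", "B"},
    "high": {"P", "B"},
}

# Hypothetical level limits for illustration only (NOT real H.264 levels).
LEVEL_LIMITS = {
    "level_a": {"max_pixels": 640 * 480, "max_kbps": 2_000},
    "level_b": {"max_pixels": 1920 * 1080, "max_kbps": 20_000},
}


@dataclass
class StreamInfo:
    profile: str
    slice_types: set  # slice types actually used by the encoded video
    width: int
    height: int
    kbps: int


def conforms(stream: StreamInfo, level: str) -> bool:
    """True if the stream uses only techniques its profile allows
    and stays within the resolution and data-rate limits of `level`."""
    allowed = PROFILE_SLICE_TYPES[stream.profile]
    limits = LEVEL_LIMITS[level]
    return (
        stream.slice_types <= allowed
        and stream.width * stream.height <= limits["max_pixels"]
        and stream.kbps <= limits["max_kbps"]
    )


ok = StreamInfo("main", {"P", "B"}, 1280, 720, 5_000)
bad_profile = StreamInfo("baseline", {"P", "B"}, 320, 240, 500)

print(conforms(ok, "level_b"))           # True
print(conforms(ok, "level_a"))           # False: exceeds the level's limits
print(conforms(bad_profile, "level_a"))  # False: B-slices are not allowed in baseline
```

Keeping the two tables separate mirrors the distinction in the text: the profile governs which techniques may appear, while the level bounds the parameters of the video.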
References and notes
- ↑ The term "video coding" can be seen in e.g. the names Advanced Video Coding, High Efficiency Video Coding, and Video Coding Experts Group.
- ↑ http://654lab.webstarts.com/uploads/csvt_overview.pdf
- ↑ "Chapter 3 : Modified A* Prune Algorithm for finding K-MCSP in video compression" (PDF). Shodhganga.inflibnet.ac.in. Retrieved 6 January 2015.
- ↑ "SERIES H: AUDIOVISUAL AND MULTIMEDIA SYSTEMS : Infrastructure of audiovisual services – Coding of moving video : Advanced video coding for generic audiovisual services". Itu.int. Retrieved 6 January 2015.
- ↑ Jaiswal, R.C. (2009). Audio-Video Engineering. Pune, Maharashtra: Nirali Prakashan. p. 3.55. ISBN 9788190639675.
- ↑ Jan Ozer. "Encoding options for H.264 video". Adobe.com. Retrieved 6 January 2015.