As a sequel to the JPEG standards committee, the Moving Picture Experts Group (MPEG) was set up in the late 1980s to agree standards for video sequence compression.
Their first standard was MPEG-1, designed for CD-ROM applications at 1.5 Mb/s, and their more recent standard, MPEG-2, is aimed at broadcast quality TV signals at 4 to 10 Mb/s and is also suitable for high-definition TV (HDTV) at 20 Mb/s. We shall not go into the detailed differences between these standards, but simply describe some of their important features. MPEG-2 is used for digital TV and DVD in the UK and throughout the world.
MPEG coders all use the MCPC structure shown in the previous figure, and employ the DCT as the basic transform process. So in many respects they are similar to H.261 coders, except that they operate with higher resolution frames and higher bit rates.
The main difference from H.261 is the concept of a Group of Pictures (GOP) Layer in the coding hierarchy, shown in the accompanying figure. However, we describe the other layers first:
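The GOP Layer contains a small number of frames (typically 12) coded so that they can be decoded completely as a unit, without reference to frames outside of the group. There are three types of frame: I frames, which are decoded without needing data from any other frame; P frames, which are coded relative to a prediction from the preceding I or P frame; and B frames, which are predicted bi-directionally from the neighbouring I / P frames.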
The main purpose of the GOP is to allow editing and splicing of video material from different sources and to allow rapid forward or reverse searching through sequences. A GOP usually represents about half a second of the image sequence.
The accompanying figure shows a typical GOP and how the coded frames depend on each other. The first frame of the GOP is always an I frame, which may be decoded without needing data from any other frame. At regular intervals through the GOP, there are P frames, which are coded relative to a prediction from the I frame or previous P frame in the GOP. Between each pair of I / P frames are one or more B frames.
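The dependency pattern just described can be sketched in a few lines of Python. This is only an illustration, not part of any MPEG specification: the function name `gop_dependencies`, the 12-frame length and the spacing of 3 between I / P frames are assumptions chosen to match the typical figures quoted in this section, and B frames at the end of the GOP are assumed to use only the preceding I / P frame, so that the group stays self-contained.

```python
def gop_dependencies(n_frames=12, anchor_spacing=3):
    """List (index, type, reference indices) for each frame, in display order.

    Assumed layout (hypothetical): frame 0 is the I frame, every
    anchor_spacing-th frame after it is a P frame, all others are B frames.
    """
    types = [
        "I" if i == 0 else "P" if i % anchor_spacing == 0 else "B"
        for i in range(n_frames)
    ]
    # I and P frames are the only frames that others are predicted from.
    anchors = [i for i, t in enumerate(types) if t != "B"]

    deps = []
    for i, t in enumerate(types):
        if t == "I":
            refs = []  # decodable on its own
        else:
            previous_anchor = max(a for a in anchors if a < i)
            refs = [previous_anchor]
            if t == "B":
                # A B frame may also use the next I/P frame, if the GOP has one.
                later = [a for a in anchors if a > i]
                refs += later[:1]
        deps.append((i, t, refs))
    return deps


if __name__ == "__main__":
    for index, frame_type, refs in gop_dependencies():
        print(index, frame_type, refs)
```

With the default arguments this gives one I frame, three P frames and eight B frames, which is the mix assumed in the bit-count discussion that follows.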
The I frame in each GOP requires the most bits per frame and provides the initial reference for all other frames in the GOP. Each P frame typically requires about one third of the bits of an I frame, and there may be 3 of these per GOP. Each B frame requires about half the bits of a P frame and there may be 8 of these per GOP. Hence the coded bits are split about evenly between the three frame types.
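As a rough check on this arithmetic, the following sketch computes the share of a GOP's bits taken by each frame type, using only the approximate ratios quoted above (a P frame costs one third of an I frame, a B frame half of a P frame). The function name `gop_bit_shares` is hypothetical, and real coders vary these ratios with picture content.

```python
from fractions import Fraction


def gop_bit_shares(n_i=1, n_p=3, n_b=8):
    """Fraction of the GOP's bits used by each frame type.

    Costs are expressed relative to one I frame, using the rough
    ratios from the text: P = I / 3 and B = P / 2.
    """
    i_cost = Fraction(1)
    p_cost = i_cost / 3
    b_cost = p_cost / 2

    totals = {"I": n_i * i_cost, "P": n_p * p_cost, "B": n_b * b_cost}
    gop_total = sum(totals.values())
    return {frame_type: bits / gop_total for frame_type, bits in totals.items()}


if __name__ == "__main__":
    for frame_type, share in gop_bit_shares().items():
        print(f"{frame_type}: {float(share):.0%}")
```

For the 1 + 3 + 8 frame mix the shares come out at 30% for the I frame, 30% for the P frames and 40% for the B frames, which is what "about evenly" means here.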
B frames require fewer bits than P frames mainly because bi-directional prediction allows uncovered background areas to be predicted from a subsequent frame. The motion-compensated prediction in a B frame may be forward, backward, or a combination of the two (selected in the macroblock layer). Since no other frames are predicted from them, B frames may be coarsely quantised in areas of high motion and comprise mainly motion prediction information elsewhere.
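The per-macroblock choice between forward, backward and combined prediction can be illustrated as follows. This is a minimal sketch under stated assumptions: the standards define how the chosen mode is signalled, not how an encoder picks it, so the sum-of-absolute-differences (SAD) criterion and the names `sad` and `choose_b_mode` are our own. Blocks are given as flat lists of pixel values, and the two motion-compensated predictions are assumed to have been formed already.

```python
def sad(block, prediction):
    """Sum of absolute differences between a block and a prediction."""
    return sum(abs(a - b) for a, b in zip(block, prediction))


def choose_b_mode(block, forward_pred, backward_pred):
    """Pick the prediction mode with the lowest SAD for one macroblock.

    The combined mode is taken as the rounded average of the forward
    and backward predictions (an assumption for this sketch).
    """
    interpolated = [(f + b + 1) // 2 for f, b in zip(forward_pred, backward_pred)]
    candidates = {
        "forward": forward_pred,
        "backward": backward_pred,
        "interpolated": interpolated,
    }
    mode = min(candidates, key=lambda name: sad(block, candidates[name]))
    return mode, candidates[mode]


if __name__ == "__main__":
    # Uncovered background: absent from the earlier frame, visible in the later one.
    block = [10, 20, 30, 40]
    from_previous = [90, 90, 90, 90]
    from_next = [10, 20, 30, 41]
    print(choose_b_mode(block, from_previous, from_next)[0])
```

In the example the block matches the later frame far better than the earlier one, so backward prediction is selected, which is the uncovered-background case mentioned above.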
In order to keep all frames in the coded bit stream causal, B frames are always transmitted after the I / P frames to which they refer, as shown at the bottom of the figure.
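The reordering from display order to transmission order can be sketched as below. The function name `transmission_order` and the example pattern are assumptions for illustration; in particular, B frames at the end of the GOP, which have no later I / P frame inside the group, are simply sent last here.

```python
def transmission_order(display_types):
    """Return display indices in the order the frames would be sent.

    Each I/P frame is sent before the B frames that precede it in
    display order, so that every B frame follows both of its references.
    """
    order = []
    waiting_b = []  # B frames held back until their later reference is sent
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            waiting_b.append(index)
        else:
            order.append(index)
            order.extend(waiting_b)
            waiting_b = []
    order.extend(waiting_b)  # trailing B frames (assumed forward-predicted only)
    return order


if __name__ == "__main__":
    display = list("IBBPBBPBBPBB")  # assumed 12-frame GOP in display order
    sent = transmission_order(display)
    print(sent)
    print("".join(display[i] for i in sent))
```

For the pattern `IBBPBBPBBPBB` the frames are sent in the order 0, 3, 1, 2, 6, 4, 5, 9, 7, 8, 10, 11, so the decoder must hold each I / P frame until the B frames displayed before it have been decoded and shown.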
One of the main ways that the H.263 (enhanced H.261) standard is able to code at very low bit rates is the incorporation of the B frame concept.
Considerable research work at present is being directed towards more sophisticated motion models, which are based more on the outlines of objects rather than on simple blocks. These will form the basis of extensions to the new low bit-rate video standard, MPEG-4 (there is no MPEG-3 standard: work under that name was folded into MPEG-2, and the "MP3" audio format is MPEG-1 Audio Layer III).