JPEG 1, published jointly by ITU-T and ISO/IEC, currently comprises several parts. Part 1, Requirements and guidelines, specifies the core coding system, consisting of the well-known Huffman-coded, DCT-based lossy image format, but also including the arithmetic coding option, lossless coding and hierarchical coding. Another part specifies conformance testing, providing test procedures and test data to test JPEG 1 encoders and decoders for conformance. A further part registers known application markers, SPIFF tags, profiles, compression types and registration authorities. Color profiles can be embedded in JPEG files, but many applications are not able to deal with them and simply ignore them.
A JPEG image consists of a sequence of segments, each beginning with a marker, which in turn begins with a 0xFF byte followed by a byte indicating what kind of marker it is. Some markers consist of just those two bytes; others are followed by two bytes (high byte first) indicating the length of marker-specific payload data that follows.
The length includes the two bytes for the length, but not the two bytes for the marker. Some markers are followed by entropy-coded data; the length of such a marker does not include the entropy-coded data. Within the entropy-coded data, after any 0xFF byte, a 0x00 byte is inserted by the encoder before the next byte, so that there does not appear to be a marker where none is intended, preventing framing errors.
Decoders must skip this 0x00 byte. This technique, called byte stuffing, is applied only to the entropy-coded data, not to marker payload data.
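A minimal Python sketch of how these marker and byte-stuffing conventions can be handled; the function names are illustrative, and the parser assumes a well-formed stream (it does not handle fill bytes or restart markers inside the entropy-coded data):

```python
def iter_marker_segments(data: bytes):
    """Yield (marker, payload) pairs from the start of a JPEG byte stream.

    Sketch only: assumes a well-formed stream and stops at SOS (0xFFDA),
    where the entropy-coded data begins.
    """
    standalone = {0x01, 0xD8, 0xD9} | set(range(0xD0, 0xD8))  # TEM, SOI, EOI, RST0-RST7
    i = 0
    while i + 1 < len(data):
        assert data[i] == 0xFF, "expected a marker"
        marker = data[i + 1]
        i += 2
        if marker in standalone:
            yield marker, b""                      # no length field, no payload
        else:
            length = int.from_bytes(data[i:i + 2], "big")  # counts itself, not the marker
            yield marker, data[i + 2:i + length]
            i += length
        if marker == 0xDA:                          # SOS: entropy-coded data follows
            break

def stuff_bytes(entropy_data: bytes) -> bytes:
    """Encoder side: insert 0x00 after every 0xFF within entropy-coded data."""
    return entropy_data.replace(b"\xff", b"\xff\x00")

def unstuff_bytes(stuffed: bytes) -> bytes:
    """Decoder side: drop the stuffed 0x00; real markers (0xFF followed by
    a nonzero byte, e.g. restart markers) are left untouched."""
    return stuffed.replace(b"\xff\x00", b"\xff")
```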
Since several vendors might use the same APPn marker type, application-specific markers often begin with a standard or vendor name (e.g., "Exif" or "Adobe") or some other identifying string. At a restart marker, block-to-block predictor variables are reset, and the bitstream is synchronized to a byte boundary. Restart markers provide means for recovery after bitstream errors, such as transmission over an unreliable network or file corruption.
Since the runs of macroblocks between restart markers may be independently decoded, these runs may be decoded in parallel. The encoding process consists of several steps, and the decoding process reverses them; in the remainder of this section, the encoding and decoding processes are described in more detail.
Many of the options in the JPEG standard are not commonly used, and as mentioned above, most image software uses the simpler JFIF format when creating a JPEG file, which among other things specifies the encoding method.
Here is a brief description of one of the more common methods of encoding when applied to an input that has 24 bits per pixel (eight each of red, green, and blue). This particular option is a lossy data compression method.
First, the image is converted from RGB into a different color space called YCbCr. It has three components, Y, Cb and Cr: the Y component represents the brightness of a pixel, while the Cb and Cr components represent the chrominance, split into blue and red components. The YCbCr color space conversion allows greater compression without a significant effect on perceptual image quality (or greater perceptual image quality for the same compression). The compression is more efficient because the brightness information, which is more important to the eventual perceptual quality of the image, is confined to a single channel, more closely representing the human visual system.
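The JFIF specification defines the RGB-to-YCbCr conversion used by most JPEG files (ITU-R BT.601 coefficients, full range). A sketch using NumPy, with an illustrative function name:

```python
import numpy as np

def rgb_to_ycbcr(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 uint8 RGB image to full-range YCbCr (JFIF/BT.601)."""
    m = np.array([[ 0.299,     0.587,     0.114   ],   # Y
                  [-0.168736, -0.331264,  0.5     ],   # Cb
                  [ 0.5,      -0.418688, -0.081312]])  # Cr
    ycbcr = rgb.astype(np.float64) @ m.T
    ycbcr[..., 1:] += 128.0          # center the chroma channels
    return np.clip(np.round(ycbcr), 0, 255).astype(np.uint8)
```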
However, some JPEG implementations in "highest quality" mode do not apply this step and instead keep the color information in the RGB color model, where the image is stored in separate channels for red, green and blue luminance. This results in less efficient compression, and would not likely be used if file size were an issue. Due to the densities of color- and brightness-sensitive receptors in the human eye, humans can see considerably more fine detail in the brightness of an image (the Y component) than in the color of an image (the Cb and Cr components).
Using this knowledge, encoders can be designed to compress images more efficiently. The transformation into the YCbCr color model enables the next step, which is to reduce the spatial resolution of the Cb and Cr components (called "downsampling" or "chroma subsampling").
The ratios at which the downsampling can be done in JPEG are 4:4:4 (no downsampling), 4:2:2 (reduction by a factor of 2 in the horizontal direction), and most commonly 4:2:0 (reduction by a factor of 2 in both the horizontal and vertical directions). For the rest of the compression process, Y, Cb and Cr are processed separately and in a very similar manner.
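A sketch of 4:2:0 subsampling of a chroma plane by averaging 2×2 neighborhoods (averaging is one common choice; some encoders simply drop samples instead):

```python
import numpy as np

def subsample_420(chroma: np.ndarray) -> np.ndarray:
    """Reduce a chroma plane by 2 in both directions by averaging 2x2 blocks.

    Assumes even dimensions for brevity; a real encoder pads or replicates edges.
    """
    h, w = chroma.shape
    c = chroma.astype(np.float64).reshape(h // 2, 2, w // 2, 2)
    return np.round(c.mean(axis=(1, 3))).astype(np.uint8)
```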
After subsampling, each channel is split into 8×8 blocks. If the data for a channel does not represent an integer number of blocks, the encoder must fill the remaining area of the incomplete blocks with some form of dummy data. Filling the edge pixels with a fixed color (typically black) creates ringing artifacts along the visible part of the border; repeating the edge pixels is a common technique that reduces the visible border, but it can still create artifacts.
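A sketch of edge-replication padding with NumPy; mode="edge" repeats the last row and column, the technique described above:

```python
import numpy as np

def pad_to_blocks(plane: np.ndarray, block: int = 8) -> np.ndarray:
    """Pad a channel so its dimensions are multiples of the block size.

    Replicating the edge pixels tends to produce fewer visible artifacts at
    the border than padding with a constant color.
    """
    h, w = plane.shape
    pad_h = (-h) % block
    pad_w = (-w) % block
    return np.pad(plane, ((0, pad_h), (0, pad_w)), mode="edge")
```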
Before computing the discrete cosine transform (DCT) of each 8×8 block, its gray values are shifted from a positive range to one centered on zero. For an 8-bit image, each pixel has 256 possible values, in the range [0, 255]. To center on zero, half the number of possible values, or 128, is subtracted from each pixel value, yielding values in the range [−128, 127]. Applying the two-dimensional DCT to such a level-shifted block, and rounding the results to the nearest integer, typically produces a rather large value in the top-left corner; this is the DC coefficient. The remaining 63 coefficients are called the AC coefficients. The advantage of the DCT is its tendency to aggregate most of the signal in one corner of the result. The quantization step to follow accentuates this effect while simultaneously reducing the overall size of the DCT coefficients, resulting in a signal that is easy to compress efficiently in the entropy stage.
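A direct, unoptimized sketch of the level shift and the 8×8 forward DCT-II; production codecs use fast factored implementations, but the resulting coefficients are the same, with the DC coefficient at index [0, 0]:

```python
import numpy as np

def forward_dct_8x8(block: np.ndarray) -> np.ndarray:
    """Level-shift an 8x8 block of 8-bit samples and apply the 2-D DCT-II."""
    shifted = block.astype(np.float64) - 128.0            # center values around zero
    u = np.arange(8).reshape(8, 1)
    x = np.arange(8).reshape(1, 8)
    basis = np.cos((2 * x + 1) * u * np.pi / 16)           # basis[u, x] cosine terms
    c = np.full(8, 1.0); c[0] = 1 / np.sqrt(2)             # normalization factors C(u)
    # F[u, v] = 1/4 * C(u) C(v) * sum_x sum_y f(x, y) cos(...) cos(...)
    return 0.25 * np.outer(c, c) * (basis @ shifted @ basis.T)
```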
Because the DCT coefficients can span a wider range than the original 8-bit samples, the codec may be forced to temporarily use 16-bit bins to hold these coefficients, doubling the size of the image representation at this point; they are typically reduced back to 8-bit values by the quantization step. The temporary increase in size at this stage is not a performance concern for most JPEG implementations, because typically only a very small part of the image is stored in full DCT form at any given time during the image encoding or decoding process. The human eye is good at seeing small differences in brightness over a relatively large area, but not so good at distinguishing the exact strength of a high-frequency brightness variation.
This allows one to greatly reduce the amount of information in the high frequency components. This is done by simply dividing each component in the frequency domain by a constant for that component, and then rounding to the nearest integer. This is the main lossy operation in the whole process.
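A sketch of the quantization step, using the example luminance table from Annex K of the JPEG specification; real encoders usually scale such a table according to the requested quality setting:

```python
import numpy as np

# Example luminance quantization table from Annex K of the JPEG standard.
Q_LUMA = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def quantize(dct_coeffs: np.ndarray, q: np.ndarray = Q_LUMA) -> np.ndarray:
    """Element-wise division by the quantization table, then rounding."""
    return np.round(dct_coeffs / q).astype(np.int32)

def dequantize(quantized: np.ndarray, q: np.ndarray = Q_LUMA) -> np.ndarray:
    """Decoder side: multiply back; the rounding error is not recoverable."""
    return quantized * q
```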
As a result, it is typically the case that many of the higher-frequency components are rounded to zero, and many of the rest become small positive or negative numbers, which take many fewer bits to store. Note that the quantization is an element-by-element division; it is in no way a matrix multiplication. Entropy coding is a special form of lossless data compression.
It involves arranging the image components in a "zigzag" order, employing a run-length encoding (RLE) algorithm that groups similar frequencies together, inserting length-coding zeros, and then using Huffman coding on what is left.
The JPEG standard also allows, but does not require, the use of arithmetic coding, which is mathematically superior to Huffman coding. However, this feature is rarely used, as it is covered by patents and because it is much slower to encode and decode compared to Huffman coding. In zigzag order, the quantized coefficients of the i-th block Bi are encoded in the sequence Bi(0,0), Bi(0,1), Bi(1,0), Bi(2,0), Bi(1,1), Bi(0,2), Bi(0,3), Bi(1,2), and so on.
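A sketch that generates the zigzag visiting order for an 8×8 block; the first positions it produces match the sequence Bi(0,0), Bi(0,1), Bi(1,0), Bi(2,0), Bi(1,1), ... given above:

```python
import numpy as np

def zigzag_positions(n: int = 8):
    """Return the (row, col) positions of an n x n block in zigzag scan order."""
    def key(rc):
        r, c = rc
        d = r + c                    # anti-diagonal index
        # Odd anti-diagonals run top-right to bottom-left, even ones the reverse.
        return (d, r if d % 2 else c)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def zigzag(block: np.ndarray) -> np.ndarray:
    """Flatten an 8x8 coefficient block into its 64-element zigzag sequence."""
    return np.array([block[r, c] for r, c in zigzag_positions(block.shape[0])])
```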
This encoding mode is called baseline sequential encoding. JPEG also supports progressive encoding. While sequential encoding encodes the coefficients of a single block at a time (in a zigzag manner), progressive encoding encodes similar-positioned coefficients of all blocks in one go, followed by the next-positioned coefficients of all blocks, and so on. Thus the Bi(0,0) coefficient of every block is encoded first; this is followed by the Bi(0,1) coefficient of all blocks, then the Bi(1,0) coefficient of all blocks, then the Bi(2,0) coefficient of all blocks, and so on.
Note that once all similar-positioned coefficients have been encoded, the next position to be encoded is the one occurring next in the zigzag traversal. Progressive JPEG encoding has been found to usually give better compression than baseline sequential JPEG, thanks to the ability to use different Huffman tables (see below) tailored for the different frequencies on each "scan" or "pass" (which includes similar-positioned coefficients), though the difference is not too large.
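Real progressive JPEG additionally splits the coefficients into spectral bands and bit planes; the sketch below only contrasts the two iteration orders described above (block-major versus coefficient-position-major):

```python
def sequential_scan(blocks, zigzag_positions):
    """Baseline sequential: all 64 coefficients of one block, then the next block."""
    for block in blocks:                       # block-major order
        for r, c in zigzag_positions:
            yield block[r][c]

def progressive_scan(blocks, zigzag_positions):
    """Progressive: one coefficient position for every block, then the next position."""
    for r, c in zigzag_positions:              # coefficient-position-major order
        for block in blocks:
            yield block[r][c]
```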
In the rest of this article, it is assumed that the coefficient pattern generated is due to sequential mode. JPEG has a special Huffman code word for ending the sequence prematurely when the remaining coefficients are zero. JPEG's other code words represent combinations of (a) the number of significant bits of a coefficient, including sign, and (b) the number of consecutive zero coefficients that precede it. In our example block, most of the quantized coefficients are small numbers that are not preceded immediately by a zero coefficient.
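A simplified sketch of how the 63 AC coefficients of one block (already in zigzag order) are turned into (zero-run, size) symbols plus amplitudes for Huffman coding; the special symbols are end-of-block (0, 0) and the sixteen-zero run ZRL (15, 0). The DC coefficient is coded separately, as a difference from the previous block's DC value.

```python
def ac_symbols(ac_zigzag):
    """Turn one block's 63 AC coefficients (zigzag order) into
    ((runlength, size), amplitude) pairs; these symbols are what gets Huffman-coded."""
    symbols, run = [], 0
    for coeff in ac_zigzag:
        coeff = int(coeff)
        if coeff == 0:
            run += 1
            if run == 16:
                symbols.append(((15, 0), None))   # ZRL: a run of sixteen zeros
                run = 0
        else:
            size = abs(coeff).bit_length()        # bits needed for the magnitude
            symbols.append(((run, size), coeff))
            run = 0
    if run:
        symbols.append(((0, 0), None))            # EOB: the rest of the block is zero
    return symbols
```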
These more-frequent cases will be represented by shorter code words. The JPEG standard provides general-purpose Huffman tables; encoders may also choose to generate Huffman tables optimized for the actual frequency distributions in images being encoded. The resulting compression ratio can be varied according to need by being more or less aggressive in the divisors used in the quantization phase.
Ten to one compression usually results in an image that cannot be distinguished by eye from the original. The appropriate level of compression depends on the use to which the image will be put.
Those who use the World Wide Web may be familiar with the irregularities known as compression artifacts that appear in JPEG images. These are due to the quantization step of the JPEG algorithm. They are especially noticeable around sharp corners between contrasting colours (text is a good example, as it contains many such corners).
They can be reduced by choosing a lower level of compression; they may be eliminated by saving an image using a lossless file format, though for photographic images this will usually result in a larger file size.
Images created with ray-tracing programs, for example, show noticeable blocky shapes on the terrain. Compression artifacts are acceptable when the images are used for visualization purposes, but subsequent processing of such images usually results in unacceptable artifacts.
Some programs allow the user to vary the amount by which individual blocks are compressed. Stronger compression is applied to areas of the image that show fewer artifacts. In this way it is possible to manually reduce JPEG file size with less loss of quality. Since the quantization stage always results in a loss of information, the JPEG standard is always a lossy compression codec. Information is lost both in quantizing and in rounding of the floating-point numbers. Even if the quantization matrix is a matrix of ones, information will still be lost in the rounding step.
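A sketch using SciPy (whose orthonormal DCT on an 8×8 block matches the JPEG normalization) showing that, even with a quantization table of all ones, rounding the coefficients alone already makes the round trip inexact:

```python
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8)).astype(np.float64)

coeffs = dctn(block - 128.0, norm="ortho")          # forward DCT on the level-shifted block
quantized = np.round(coeffs)                        # quantization table of all ones
restored = idctn(quantized, norm="ortho") + 128.0   # dequantize (x1) and invert

# The reconstructed samples generally differ slightly from the originals.
print(np.max(np.abs(np.round(restored) - block)))   # typically a small nonzero value
```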
The error is most noticeable in the bottom-left corner where the bottom-left pixel becomes darker than the pixel to its immediate right. The JPEG encoding does not fix the precision needed for the output compressed image.
On the contrary, the JPEG standard (as well as the derived MPEG standards) has very strict precision requirements for decoding, covering all parts of the decoding process (variable-length decoding, inverse DCT, dequantization, renormalization of outputs); the deviation of a conforming implementation's output from that of the reference algorithm must not exceed specified bounds. These requirements are tested on a large set of randomized input images, to handle the worst cases. See the relevant IEEE standard for reference.
This has consequences for the implementation of decoders, and it is extremely critical because some encoding processes (notably those used for encoding sequences of images, as in MPEG) need to be able to construct, on the encoder side, a reference decoded image.
In order to support 8-bit precision per pixel component output, dequantization and inverse DCT transforms are typically implemented with additional bits of intermediate precision in optimized decoders. JPEG compression artifacts blend well into photographs with detailed non-uniform textures, allowing higher compression ratios. As the compression ratio increases, high-frequency textures are affected first, and contrasting lines become more and more fuzzy.
Very high compression ratios severely degrade the quality of the image, although the overall colors and image form remain recognizable. However, the precision of colors suffers less, to the human eye, than the precision of contours (based on luminance). This justifies first transforming images into a color model that separates the luminance from the chromatic information, and subsampling the chromatic planes (which may also use lower-quality quantization), in order to preserve the precision of the luminance plane with more information bits.
For comparison, storing the image below as an uncompressed 24-bit RGB bitmap would require three bytes per pixel, excluding all other information headers. The file sizes indicated below include the internal JPEG information headers and some metadata. Grayscale images need somewhat fewer bits per pixel, since only the luminance channel is stored. For most applications the quality factor should not be set too low: the image at the lowest quality setting uses only a small fraction of a bit per pixel, while the medium-quality photo uses only a few percent of the storage required for the uncompressed image. However, once a certain threshold of compression is passed, compressed images show increasingly visible defects.
See the article on rate distortion theory for a mathematical explanation of this threshold effect.