This article contains text copied from Wikipedia under the terms of the GFDL. It needs to be edited to have a Computer vision focus.
[Figure: A photo of a flower compressed with successively higher compression ratios from left to right.]

In computing, JPEG is a commonly used standard method of compressing photographic images. The file format that employs this compression is also commonly called JPEG; its file extensions include .jpeg, .jfif, .jpg, .JPG, and .JPE, with .jpg the most common on all platforms.

The name stands for Joint Photographic Experts Group, the committee that created the standard. JPEG itself specifies only how an image is transformed into a stream of bytes, not how those bytes are encapsulated in any particular storage medium. A further standard, JFIF (JPEG File Interchange Format), written by Eric Hamilton of C-Cube Microsystems and popularized by the software of the Independent JPEG Group, specifies how to produce a file suitable for computer storage and transmission (such as over the Internet) from a JPEG stream. In common usage, a "JPEG file" generally means a JFIF file, or sometimes an Exif JPEG file. There are, however, other JPEG-based file formats, such as JNG.

JPEG/JFIF is the most common format used for storing and transmitting photographs on the World Wide Web. It is not as well suited to line drawings and other textual or iconic graphics, because its compression method performs badly on these types of images. The PNG and GIF formats are in common use for that purpose; GIF, having only 8 bits per pixel, is not well suited to colour photographs, but PNG may have as much detail as JPEG or more.

The MIME media type for JFIF is image/jpeg (defined in RFC 1341).

Encoding

Many of the options in the JPEG standard are little used. Here is a brief description of one of the more common methods of encoding when applied to an input that has 24 bits per pixel (eight each of red, green, and blue). This particular option is a lossy data compression method.

Color Space Transformation

First, the image is converted from RGB into a different color space called YUV (strictly speaking YCbCr, with Cb and Cr playing the roles of U and V). This is similar to the color space used by NTSC and PAL color television transmission. The Y component represents the brightness of a pixel, and the U and V components together represent the hue and saturation. This conversion is useful because the human eye can see more detail in the Y component than in the others.
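The conversion can be sketched for a single pixel as follows. This is a minimal illustration that assumes the full-range constants commonly used with JFIF files; the function name `rgb_to_ycbcr` is hypothetical, and Cb and Cr correspond to the U and V components described above.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr.

    The constants are the ones commonly used with JFIF files (an assumption
    of this sketch; the article itself does not list them).
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round to integers and clamp to the 8-bit range 0..255.
    return tuple(min(255, max(0, int(round(v)))) for v in (y, cb, cr))


print(rgb_to_ycbcr(255, 255, 255))  # (255, 128, 128): white has neutral chroma
print(rgb_to_ycbcr(0, 0, 0))        # (0, 128, 128): so does black
```

Grey pixels of any brightness map to Cb = Cr = 128, so all of their information sits in the Y component.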

Downsampling

The above transformation enables the next step, which is to reduce the spatial resolution of the U and V components (called "downsampling" or "chroma subsampling"). The ratios at which the downsampling can be done in JPEG are 4:4:4 (no downsampling), 4:2:2 (decimation by a factor of 2 in the horizontal direction), and most commonly 4:2:0 (decimation by a factor of 2 in both the horizontal and vertical directions). For the rest of the compression process, Y, U and V are processed separately and in a very similar manner.
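A 4:2:0 reduction of one chroma plane might look like the sketch below. Averaging each 2x2 group of samples is one possible filter and is an assumption here, as is the function name `subsample_420`; encoders are free to use other filters.

```python
def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 groups.

    `plane` is a list of rows with even width and height (an assumption
    that keeps the sketch short; real encoders pad odd-sized planes).
    """
    out = []
    for y in range(0, len(plane), 2):
        row = []
        for x in range(0, len(plane[0]), 2):
            total = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            row.append((total + 2) // 4)  # rounded average of four samples
        out.append(row)
    return out


u_plane = [
    [100, 102, 50, 54],
    [104, 106, 52, 56],
]
print(subsample_420(u_plane))  # [[103, 53]]
```

The output plane has a quarter as many samples as the input, which is where the saving comes from.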

Discrete Cosine Transform

[Figure: The 8x8 subimage shown in 8-bit greyscale.]

Next, each component (Y, U, V) of the image is "tiled" into sections of eight by eight pixels each, then each tile is converted to frequency space using a two-dimensional discrete cosine transform (DCT).
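The tiling step can be sketched as below. The function name `split_into_blocks` is hypothetical, and padding the right and bottom edges by repeating the last column and row is an assumed policy for images whose dimensions are not multiples of eight.

```python
def split_into_blocks(plane, size=8):
    """Split a 2-D list into size x size tiles, in row-major tile order.

    Edges are padded by repeating the last row/column (an assumed policy).
    """
    height, width = len(plane), len(plane[0])
    blocks = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            block = [
                [plane[min(top + i, height - 1)][min(left + j, width - 1)]
                 for j in range(size)]
                for i in range(size)
            ]
            blocks.append(block)
    return blocks


# A small test plane with 9 rows and 10 columns.
plane = [[x + 10 * y for x in range(10)] for y in range(9)]
blocks = split_into_blocks(plane)
print(len(blocks))  # 4 tiles: two across and two down
```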

If one such 8x8 8-bit subimage is:


\begin{bmatrix}
 52 & 55 & 61 &  66 &  70 &  61 & 64 & 73 \\
 63 & 59 & 55 &  90 & 109 &  85 & 69 & 72 \\
 62 & 59 & 68 & 113 & 144 & 104 & 66 & 73 \\
 63 & 58 & 71 & 122 & 154 & 106 & 70 & 69 \\
 67 & 61 & 68 & 104 & 126 &  88 & 68 & 70 \\
 79 & 65 & 60 &  70 &  77 &  68 & 58 & 75 \\
 85 & 71 & 64 &  59 &  55 &  61 & 65 & 83 \\
 87 & 79 & 69 &  68 &  65 &  76 & 78 & 94
\end{bmatrix}


then subtracting 128 from each entry (so that the values are centered on zero) results in


\begin{bmatrix}
 -76 & -73 & -67 & -62 & -58 & -67 & -64 & -55 \\
 -65 & -69 & -73 & -38 & -19 & -43 & -59 & -56 \\
 -66 & -69 & -60 & -15 &  16 & -24 & -62 & -55 \\
 -65 & -70 & -57 &  -6 &  26 & -22 & -58 & -59 \\
 -61 & -67 & -60 & -24 &  -2 & -40 & -60 & -58 \\
 -49 & -63 & -68 & -58 & -51 & -60 & -70 & -53 \\
 -43 & -57 & -64 & -69 & -73 & -67 & -63 & -45 \\
 -41 & -49 & -59 & -60 & -63 & -52 & -50 & -34
\end{bmatrix}


and then taking the DCT and rounding to the nearest integer results in


\begin{bmatrix}
 -415 & -30 & -61 &  27 &  56 & -20 & -2 &  0 \\
    4 & -22 & -61 &  10 &  13 &  -7 & -9 &  5 \\
  -47 &   7 &  77 & -25 & -29 &  10 &  5 & -6 \\
  -49 &  12 &  34 & -15 & -10 &   6 &  2 &  2 \\
   12 &  -7 & -13 &  -4 &  -2 &   2 & -3 &  3 \\
   -8 &   3 &   2 &  -6 &  -2 &   1 &  4 &  2 \\
   -1 &   0 &   0 &  -2 &  -1 &  -3 &  4 & -1 \\
    0 &   0 &  -1 &  -4 &  -1 &   0 &  1 &  2
\end{bmatrix}


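Note the rather large value in the top-left corner. This is the DC (direct current) coefficient, which is proportional to the average value of the block. The remaining 63 entries are the AC coefficients.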
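The level shift and transform can be reproduced with a direct (unoptimized) implementation. The sketch below assumes the orthonormal 8x8 DCT-II, which is the normalization that yields the DC value quoted above; the name `dct2` is hypothetical, and practical codecs use fast factorizations instead of this four-level loop.

```python
import math

ORIGINAL = [
    [52, 55, 61, 66, 70, 61, 64, 73],
    [63, 59, 55, 90, 109, 85, 69, 72],
    [62, 59, 68, 113, 144, 104, 66, 73],
    [63, 58, 71, 122, 154, 106, 70, 69],
    [67, 61, 68, 104, 126, 88, 68, 70],
    [79, 65, 60, 70, 77, 68, 58, 75],
    [85, 71, 64, 59, 55, 61, 65, 83],
    [87, 79, 69, 68, 65, 76, 78, 94],
]


def dct2(block):
    """Orthonormal two-dimensional DCT-II of an 8x8 block (direct form)."""
    n = 8

    def c(k):
        # Normalization factor: 1/sqrt(2) for the zero frequency, else 1.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):          # vertical frequency (row of the result)
        for v in range(n):      # horizontal frequency (column of the result)
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


shifted = [[value - 128 for value in row] for row in ORIGINAL]
coeffs = dct2(shifted)
print(round(coeffs[0][0]))  # -415, the DC coefficient
```

With this normalization the DC coefficient is one eighth of the sum of the shifted samples, here -3323 / 8 = -415.375.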

Quantization

The human eye is fairly good at seeing small differences in brightness over a relatively large area, but not so good at distinguishing the exact strength of a high frequency brightness variation. This fact allows one to get away with greatly reducing the amount of information in the high frequency components. This is done by simply dividing each component in the frequency domain by a constant for that component, and then rounding to the nearest integer. This is the main lossy operation in the whole process. As a result of this, it is typically the case that many of the higher frequency components are rounded to zero, and many of the rest become small positive or negative numbers.

A common quantization matrix (the example luminance table given in the JPEG standard) is:


\begin{bmatrix}
 16 & 11 & 10 & 16 & 24 & 40 & 51 & 61 \\
 12 & 12 & 14 & 19 & 26 & 58 & 60 & 55 \\
 14 & 13 & 16 & 24 & 40 & 57 & 69 & 56 \\
 14 & 17 & 22 & 29 & 51 & 87 & 80 & 62 \\
 18 & 22 & 37 & 56 & 68 & 109 & 103 & 77 \\
 24 & 35 & 55 & 64 & 81 & 104 & 113 & 92 \\
 49 & 64 & 78 & 87 & 103 & 121 & 120 & 101 \\
 72 & 92 & 95 & 98 & 112 & 100 & 103 & 99
\end{bmatrix}


Using this quantization matrix with the DCT coefficient matrix from above results in:


\begin{bmatrix}
 -26 & -3 & -6 &  2 &  2 & -1 & 0 & 0 \\
    0 & -2 & -4 &  1 &  1 &  0 & 0 & 0 \\
  -3 &  1 &  5 & -1 & -1 &  0 & 0 & 0 \\
  -4 &  1 &  2 & -1 &  0 &  0 & 0 & 0 \\
   1 &  0 &  0 &  0 &  0 &  0 & 0 & 0 \\
   0 &  0 &  0 &  0 &  0 &  0 & 0 & 0 \\
   0 &  0 &  0 &  0 &  0 &  0 & 0 & 0 \\
   0 &  0 &  0 &  0 &  0 &  0 & 0 & 0
\end{bmatrix}


For example, using the DC coefficient of -415 and rounding to the nearest integer


\mathrm{round}\left(
 \frac{-415}{16}
\right)
=
\mathrm{round}\left(
 -25.9375
\right)
=
-26
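The division and rounding can be sketched as below, using the rounded DCT coefficients and the quantization matrix from the text. Ties are rounded away from zero here, which is an assumption chosen because it reproduces entries such as -20 / 40 becoming -1; Python's built-in `round` rounds ties to the nearest even integer instead. The helper names are hypothetical.

```python
import math

DCT = [
    [-415, -30, -61, 27, 56, -20, -2, 0],
    [4, -22, -61, 10, 13, -7, -9, 5],
    [-47, 7, 77, -25, -29, 10, 5, -6],
    [-49, 12, 34, -15, -10, 6, 2, 2],
    [12, -7, -13, -4, -2, 2, -3, 3],
    [-8, 3, 2, -6, -2, 1, 4, 2],
    [-1, 0, 0, -2, -1, -3, 4, -1],
    [0, 0, -1, -4, -1, 0, 1, 2],
]
Q_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def round_half_away(x):
    """Round to the nearest integer, with ties going away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round the result."""
    return [[round_half_away(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]


quantized = quantize(DCT, Q_TABLE)
print(quantized[0])  # [-26, -3, -6, 2, 2, -1, 0, 0]
```

The three bottom rows come out as all zeros, which is what makes the later entropy coding stage effective.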


Entropy coding

Entropy coding is a form of lossless data compression. It involves arranging the quantized coefficients in a "zigzag" order that groups similar frequencies together, run-length coding the zeros, and then using Huffman coding on what is left. The JPEG standard also allows, but does not require, the use of arithmetic coding, which typically makes files about 5% smaller than Huffman coding does. However, this feature is rarely used, as it is covered by patents and is much slower to encode and decode than Huffman coding.

The zig-zag sequence for the above quantized coefficients would be:

-26, -3, 0, -3, -2, -6, 2, -4, 1 -4, 1, 1, 5, 1, 2, -1, 1, -1, 2, 0, 0, 0, 0, 0, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

JPEG has a special Huffman code word for ending the sequence prematurely when the remaining coefficients are zero. Using this special code word, EOB, the sequence becomes

-26, -3, 0, -3, -2, -6, 2, -4, 1, -4, 1, 1, 5, 1, 2, -1, 1, -1, 2, 0, 0, 0, 0, 0, -1, -1, EOB
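The scan order and the EOB truncation can be sketched as follows. The function names are hypothetical, and the sketch stops at the symbol sequence: the run-length and Huffman stages that follow in a real encoder are not shown.

```python
# Quantized coefficients whose zig-zag scan gives the sequence listed above.
QUANTIZED = [
    [-26, -3, -6, 2, 2, -1, 0, 0],
    [0, -2, -4, 1, 1, 0, 0, 0],
    [-3, 1, 5, -1, -1, 0, 0, 0],
    [-4, 1, 2, -1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]


def zigzag_order(n=8):
    """Return (row, col) pairs in zig-zag scan order for an n x n block."""
    # Walk the anti-diagonals; odd diagonals run top to bottom,
    # even diagonals run bottom to top.
    return sorted(
        ((i, j) for i in range(n) for j in range(n)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]),
    )


def scan_with_eob(block):
    """Zig-zag scan a block and replace the trailing run of zeros by EOB."""
    seq = [block[i][j] for i, j in zigzag_order(len(block))]
    trimmed = list(seq)
    while trimmed and trimmed[-1] == 0:
        trimmed.pop()
    if len(trimmed) < len(seq):   # only needed when zeros were dropped
        trimmed.append("EOB")
    return trimmed


print(scan_with_eob(QUANTIZED))
```

For this block the 64 coefficients shrink to 26 values followed by the EOB marker.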

Compression ratio and artifacts

The resulting compression ratio can be varied according to need by being more or less aggressive in the divisors used in the quantization phase. Ten to one compression usually results in an image that can't be distinguished by eye from the original. 100 to one compression is usually possible, but will look distinctly artifacted compared to the original. The appropriate level of compression depends on the use to which the image will be put.
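One widely used way of making the divisors more or less aggressive is to scale a base quantization table by a "quality" setting. The sketch below follows the scaling convention of the Independent JPEG Group's libjpeg software as an assumption; it is not part of the JPEG standard itself, and the function name `scale_table` is hypothetical.

```python
BASE_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def scale_table(base, quality):
    """Scale a quantization table for a quality setting from 1 to 100.

    Assumed convention: quality 50 leaves the table unchanged, lower values
    enlarge the divisors, higher values shrink them.
    """
    quality = max(1, min(100, quality))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Divisors are kept within 1..255 so they fit in eight bits.
    return [[max(1, min(255, (q * scale + 50) // 100)) for q in row]
            for row in base]


print(scale_table(BASE_TABLE, 25)[0][0])   # 32: coarser than the base value 16
print(scale_table(BASE_TABLE, 100)[0][0])  # 1: almost no quantization
```

Larger divisors round more coefficients to zero, which raises the compression ratio and the visibility of artifacts together.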

Those who use the World Wide Web may be familiar with the irregularities known as compression artifacts that appear in JPEG digital images. These are due to the quantization step of the JPEG algorithm. They are especially noticeable around eyes in pictures of faces. They can be reduced by choosing a lower level of compression; they may be eliminated by saving an image using a lossless file format, though for photographic images this will usually result in a larger file size.

Decoding

Decoding to display the image consists of doing all the above in reverse.

Taking the quantized DCT coefficient matrix (after entropy decoding and adding the difference of the DC coefficient back in)


\begin{bmatrix}
 -26 & -3 & -6 &  2 &  2 & -1 & 0 & 0 \\
    0 & -2 & -4 &  1 &  1 &  0 & 0 & 0 \\
  -3 &  1 &  5 & -1 & -1 &  0 & 0 & 0 \\
  -4 &  1 &  2 & -1 &  0 &  0 & 0 & 0 \\
   1 &  0 &  0 &  0 &  0 &  0 & 0 & 0 \\
   0 &  0 &  0 &  0 &  0 &  0 & 0 & 0 \\
   0 &  0 &  0 &  0 &  0 &  0 & 0 & 0 \\
   0 &  0 &  0 &  0 &  0 &  0 & 0 & 0
\end{bmatrix}


and multiplying it, entry by entry, by the quantization matrix from above results in


\begin{bmatrix}
 -416 & -33 & -60 &  32 &  48 & -40 & 0 & 0 \\
    0 & -24 & -56 &  19 &  26 &   0 & 0 & 0 \\
  -42 &  13 &  80 & -24 & -40 &   0 & 0 & 0 \\
  -56 &  17 &  44 & -29 &   0 &   0 & 0 & 0 \\
   18 &   0 &   0 &   0 &   0 &   0 & 0 & 0 \\
    0 &   0 &   0 &   0 &   0 &   0 & 0 & 0 \\
    0 &   0 &   0 &   0 &   0 &   0 & 0 & 0 \\
    0 &   0 &   0 &   0 &   0 &   0 & 0 & 0
\end{bmatrix}


whose top-left portion closely resembles that of the original DCT coefficient matrix. Taking the inverse DCT results in an image with values (still shifted down by 128)

[Figure: The original subimage (top) and the decompressed subimage (bottom). Notice the slight differences between them, which are most readily seen in the bottom-left corner.]

\begin{bmatrix}
 -68 & -65 & -73 & -70 & -58 & -67 & -70 & -48 \\
 -70 & -72 & -72 & -45 & -20 & -40 & -65 & -57 \\
 -68 & -76 & -66 & -15 &  22 & -12 & -58 & -61 \\
 -62 & -72 & -60 &  -6 &  28 & -12 & -59 & -56 \\
 -59 & -66 & -63 & -28 &  -8 & -42 & -69 & -52 \\
 -60 & -60 & -67 & -60 & -50 & -68 & -75 & -50 \\
 -54 & -46 & -61 & -74 & -65 & -64 & -63 & -45 \\
 -45 & -32 & -51 & -72 & -58 & -45 & -45 & -39
\end{bmatrix}


and adding 128 to each entry


\begin{bmatrix}
  60 & 63 & 55 &  58 &  70 &  61 & 58 & 80 \\
  58 & 56 & 56 &  83 & 108 &  88 & 63 & 71 \\
  60 & 52 & 62 & 113 & 150 & 116 & 70 & 67 \\
  66 & 56 & 68 & 122 & 156 & 116 & 69 & 72 \\
  69 & 62 & 65 & 100 & 120 &  86 & 59 & 76 \\
  68 & 68 & 61 &  68 &  78 &  60 & 53 & 78 \\
  74 & 82 & 67 &  54 &  63 &  64 & 65 & 83 \\
  83 & 96 & 77 &  56 &  70 &  83 & 83 & 89
\end{bmatrix}


This is the decompressed subimage. Comparing it with the original subimage (see also the figure above) by taking the difference (original - decompressed) results in the error values


\begin{bmatrix}
 -8 &  -8 &  6 &  8 &  0 &   0 &  6 & -7 \\
  5 &   3 & -1 &  7 &  1 &  -3 &  6 &  1 \\
  2 &   7 &  6 &  0 & -6 & -12 & -4 &  6 \\
 -3 &   2 &  3 &  0 & -2 & -10 &  1 & -3 \\
 -2 &  -1 &  3 &  4 &  6 &   2 &  9 & -6 \\
 11 &  -3 & -1 &  2 & -1 &   8 &  5 & -3 \\
 11 & -11 & -3 &  5 & -8 &  -3 &  0 &  0 \\
  4 & -17 & -8 & 12 & -5 &  -7 & -5 &  5
\end{bmatrix}


with an average absolute error of about 5 per pixel (i.e., \frac{1}{64} \sum_{x=1}^8 \sum_{y=1}^8 |e(x,y)| = 4.8125 ). The error is most noticeable in the bottom-left corner, where the bottom-left pixel becomes darker than the pixel to its immediate right.
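The decoding steps for this block can be sketched end to end: dequantize, apply the inverse DCT, add 128, and measure the error against the original. The sketch assumes the orthonormal inverse DCT matching the forward transform used earlier; the function names are hypothetical, and individual pixels may differ by one from the matrices above depending on how intermediate values are rounded.

```python
import math

# Quantized coefficients; entry [1][1] is -2, matching the -24 (= -2 * 12)
# in the dequantized matrix above.
QUANTIZED = [
    [-26, -3, -6, 2, 2, -1, 0, 0],
    [0, -2, -4, 1, 1, 0, 0, 0],
    [-3, 1, 5, -1, -1, 0, 0, 0],
    [-4, 1, 2, -1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0] * 8, [0] * 8, [0] * 8,
]
Q_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]
ORIGINAL = [
    [52, 55, 61, 66, 70, 61, 64, 73],
    [63, 59, 55, 90, 109, 85, 69, 72],
    [62, 59, 68, 113, 144, 104, 66, 73],
    [63, 58, 71, 122, 154, 106, 70, 69],
    [67, 61, 68, 104, 126, 88, 68, 70],
    [79, 65, 60, 70, 77, 68, 58, 75],
    [85, 71, 64, 59, 55, 61, 65, 83],
    [87, 79, 69, 68, 65, 76, 78, 94],
]


def dequantize(block, table):
    """Multiply each quantized coefficient by its table entry."""
    return [[b * q for b, q in zip(brow, qrow)]
            for brow, qrow in zip(block, table)]


def idct2(coeffs):
    """Orthonormal two-dimensional inverse DCT of an 8x8 block."""
    n = 8
    scale = [1 / math.sqrt(2)] + [1.0] * (n - 1)
    cos = [[math.cos((2 * x + 1) * u * math.pi / 16) for u in range(n)]
           for x in range(n)]
    return [[0.25 * sum(scale[u] * scale[v] * coeffs[u][v]
                        * cos[x][u] * cos[y][v]
                        for u in range(n) for v in range(n))
             for y in range(n)]
            for x in range(n)]


dequantized = dequantize(QUANTIZED, Q_TABLE)
shifted = idct2(dequantized)
# Undo the level shift and clamp to the 8-bit range.
decoded = [[min(255, max(0, round(v) + 128)) for v in row] for row in shifted]
mae = sum(abs(o - d) for orow, drow in zip(ORIGINAL, decoded)
          for o, d in zip(orow, drow)) / 64
print(dequantized[0])  # [-416, -33, -60, 32, 48, -40, 0, 0]
print(round(mae, 2))   # mean absolute error per pixel
```

The printed error should land close to the figure of about 5 per pixel quoted above.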

Usage

JPEG is at its best on photographs and paintings of realistic scenes with smooth variations of tone and color. In this case it usually compresses much better than purely lossless methods while still giving a good-looking image. It also produces a much higher quality image than other common formats such as GIF, which are lossless for drawings and iconic graphics but require severe color quantization for full-color images.

Photographs

JPEG compression artifacts blend well into photographs with detailed non-uniform textures, allowing higher compression ratios.

[Figure: The same photograph of a donkey saved at three quality settings.]

Low quality (10%), filesize 1.7 KB.

Mid quality (50%), filesize 5.7 KB.

Full quality (100%), filesize 36 KB.


The mid-quality photo uses only about one sixth of the storage space of the full-quality photo but has a noticeable loss of detail and visible artifacts. Once a certain threshold of compression is passed, compressed images show increasingly visible defects. See the article on rate distortion theory for a mathematical explanation of this threshold effect.

Other lossy encoding formats

Newer lossy methods, particularly wavelet compression, perform even better in these cases. However, JPEG is a well established standard with plenty of software available, including free software, so it continues to be heavily used as of 2005. Also, many wavelet algorithms are patented, making it difficult or impossible to use them freely in many software projects.

The JPEG committee has now created its own wavelet-based standard, JPEG 2000, which is intended to eventually supersede the original JPEG standard.

Potential patent issues

In 2002 Forgent Networks asserted that it owned and would enforce patent rights on the JPEG technology, arising from a patent that had been filed in 1986 (US patent 4,698,672). The announcement created a furor reminiscent of Unisys' attempts to assert its rights over the GIF image compression standard.

The JPEG committee investigated the patent claims in 2002 and found that they were invalidated by prior art. [1] Nevertheless, between 2002 and 2004 Forgent was able to obtain about $90 million by licensing their patent to some 30 companies. In April 2004 Forgent sued 31 other companies to enforce further license payments. In July of the same year, a consortium of 21 large computer companies filed a countersuit, with the goal of invalidating the patent.

The JPEG committee has as one of its explicit goals that their standards be implementable without payment of license fees, and they have secured appropriate license rights for their upcoming JPEG 2000 standard from over 20 large organizations.
