Computer vision

A pixel, or picture element, in the context of computer vision, is the numerical value of the scalar (gray scale or index) or vector (color or multispectral) information at one point in a picture, or image. An image is typically represented as an array of pixels.
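As a sketch of this representation, a tiny grayscale image can be held as a nested list of scalar values, and a colour image as a grid of RGB triples (illustrative Python only, not tied to any particular library; the variable names are made up for the example):

```python
# A 2x3 grayscale image: each pixel is a single scalar intensity (0-255).
gray_image = [
    [0, 128, 255],
    [64, 192, 32],
]

# The same grid as a colour image: each pixel is an (R, G, B) vector.
color_image = [
    [(0, 0, 0), (128, 128, 0), (255, 255, 255)],
    [(64, 0, 0), (0, 192, 0), (0, 0, 32)],
]

height = len(gray_image)
width = len(gray_image[0])
print(height, width)        # 2 3
print(gray_image[1][2])     # scalar pixel at row 1, column 2 -> 32
print(color_image[0][1])    # vector pixel -> (128, 128, 0)
```

Indexing by row and column in this way is the usual convention for pixel arrays, with the scalar case covering gray-scale or index data and the tuple case covering colour or multispectral data.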

In other contexts, such as displays, cameras, and printers, the notion of a pixel may have a somewhat different definition.

The following is an earlier version of this page, from the page history:

A pixel (a portmanteau of picture element) is one of the many tiny dots that make up the representation of a picture in a computer's memory. Usually the dots are so small and so numerous that, when printed on paper or displayed on a computer monitor, they appear to merge into a smooth image. The colour and intensity of each dot are chosen individually by the computer to represent a small area of the picture. Pixel is sometimes abbreviated px or pel (for picture element), although pel sometimes refers to sub-pixels.


Pixels are generally thought of as the smallest complete element of an image, but the definition is highly context-sensitive. For example, we can speak of pixels in a visible image (e.g. a printed page), pixels carried by one or more electronic signals, pixels represented by one or more digital values, or pixels on a display device. This list is not exhaustive, and depending on context there are several synonyms which are accurate in particular settings, e.g. pel, sample, byte, bit, dot, spot, superset, triad, stripe set, window, etc. We can also speak of pixels in the abstract, in particular when using pixels as a measure of resolution, e.g. 2400 pixels per inch or 640 pixels per line. "Dots" is often used to mean pixels, especially by computer sales and marketing people, and gives rise to the abbreviation DPI, or dots per inch.

Note that a pixel may be composed of sub-parts, or sub-pixels. For example, a pixel on a colour display may be composed of red, green and blue sub-parts (sub-pixels, sub-pels, etc.), the three of which may be referred to as a triad. A pixel in a video signal may be composed of RGB parts; of Y, R-Y, B-Y; of Y, I, Q; of Y, C, M; of subcarrier-modulated Y; of composite video; or of separate signals such as the individual sub-pixels above. Unskilled people, and sometimes skilled people, incorrectly use pixel and image element interchangeably, or use pixel to refer to sub-parts. The unskilled do so unknowingly; the skilled know better but do so anyway because the meaning is clear from context. Many dictionaries also get it wrong.

Typical pixels of concern in laser printers are those made up of sub-pels in the screening process, those made up of yellow, cyan and magenta sub-pels in colour printing, and those which are simply dots of black toner in black-and-white printers. Typical pixels of concern in television systems are the samples of composite video signals (a single digital value having Y and colour-subcarrier components); those carried by three electronic signals or three digital values, either Y, R-Y, B-Y or R, G, B, depending on where in the TV we are looking; and those displayed on the TV screen, which are made up of R, G and B colour sub-pixels. Note that Y, R-Y and B-Y values are often carried as two electronic signals in television applications: Y in one, and time-multiplexed R-Y and B-Y in the other.

"Image element" is a broader term than "pixel" and is also highly context-sensitive. Image elements include complete pixels, the various sub-parts of pixels, and other elements of images which are not pixel-related, such as DCT coefficients. For example, it is correct to say that the red part of an RGB pixel is an image element, but it is not normally considered correct to refer to the red part as a pixel itself (although people outside the television industry often do). The statement that a pixel is the smallest part of an image is incorrect if the image is made up of pixels having sub-parts, but correct if the pixel is the smallest element (particularly in black-and-white images, or when a single video signal has been sampled). Consequently, pixels and image elements are essentially the same in technologies where the pixel is the smallest part, but not in technologies where the pixel is made up of sub-parts. This tends to confuse unskilled readers who cannot infer the intended meaning from context.
The more pixels used to represent an image, the closer the result will resemble the original. The number of pixels in an image is called the resolution. This can be expressed as a single number, as in a "three-megapixel" digital camera, which has three million pixels, or as a pair of numbers, as in a "640 by 480 display", which has 640 pixels from side to side and 480 from top to bottom (as in a VGA display), and therefore has a total of 640 × 480 = 307,200 pixels.

The coloured dots that form a digitized image (such as a JPG file used on a web page) are also called pixels. Depending on how a computer displays an image, these may not be in one-to-one correspondence with screen pixels. In areas where the distinction is important, the dots in the image file may be called texels.

In computer programming, an image composed of pixels is known as a bitmapped image or a raster image. The word raster originates from analogue television technology. Bitmapped images are used to encode digital video and to produce computer-generated art.

Since the resolution of the computer display can be adjusted from the operating system, a pixel is a purely relative measurement. A modern computer display is designed with a native resolution, which refers to the perfect match between pixels and triads; the native resolution produces the sharpest picture the display is capable of. However, since the user can adjust the resolution, the monitor must be able to display other resolutions, which is accomplished by drawing each pixel with more than one triad. This usually results in a fuzzy picture. For example, a display with a native resolution of 1280×1024 will look best set at 1280×1024, will display 800×600 adequately by drawing each pixel with more physical triads, and will be unable to display 1600×1200 at all, due to the lack of physical triads. A non-native resolution is usually displayed better on a CRT than on an LCD.
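The resolution arithmetic above can be checked directly (a minimal sketch; the figures are the ones quoted in the text):

```python
# A "640 by 480" VGA display: the total pixel count is width x height.
width, height = 640, 480
total = width * height
print(total)  # 307200
```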
This is because a CRT can display pixels at various sizes, whereas the pixels of an LCD have a fixed size. Non-native resolutions have to be approximated by software in the LCD screen, using multiple fixed-size "physical pixels" to display a single "logical pixel". This often makes the screen look jagged and blurry.

Pixels are either square or non-square (rectangular). A number called the aspect ratio describes the shape of a pixel: for example, a 1.25:1 aspect ratio means that each pixel is 1.25 times wider than it is high. Pixels on computer monitors are usually square, but pixels used in digital video often have non-square shapes, such as the D1 aspect ratio.

Each pixel in a monochrome image has its own brightness. Zero usually represents black, and the maximum value possible represents white. For example, in an eight-bit image, the maximum unsigned value that can be stored in eight bits is 255, so this is the value used for white. In a colour image, each pixel has its own brightness and colour, usually represented as a triplet of red, green and blue intensities (see RGB). Full-colour LCD flat panels and CRT monitors use pixels made of three sub-pixels.
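A small sketch of the brightness and aspect-ratio conventions just described (8-bit values, with 0 as black and 255 as white; the names are illustrative):

```python
# 8-bit monochrome: the maximum unsigned 8-bit value represents white.
BLACK, WHITE = 0, 2 ** 8 - 1
print(WHITE)  # 255

# A colour pixel as an (R, G, B) triplet of 8-bit intensities.
red_pixel = (255, 0, 0)

# Pixel aspect ratio: pixel width divided by pixel height.
# A 1.25:1 pixel is 1.25 times wider than it is high.
pixel_width, pixel_height = 1.25, 1.0
aspect_ratio = pixel_width / pixel_height
print(aspect_ratio)  # 1.25
```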
The number of distinct colours that can be represented by a pixel depends on the number of bits per pixel (BPP). Common values are
  • 8 bpp (256 colours),
  • 16 bpp (65,536 colours, known as Highcolour),
  • 24 bpp (16,777,216 colours, known as Truecolour).

Images composed of 256 colours or fewer are usually stored in the computer's video memory in chunky or planar format, where a pixel in memory is an index into a list of colours called a palette. These modes are therefore sometimes called indexed modes. While only 256 colours are displayed at once, those 256 colours are picked from a much larger palette, typically of 16 million colours. Changing the values in the palette permits a kind of animation effect; the animated startup logo of Windows 95 and Windows 98 is probably the best-known example of this kind of animation.

For depths larger than 8 bits, the number is the total of the three RGB (red, green and blue) components. A 16-bit depth is usually divided into five bits for each of red and blue, and six bits for green (the eye being more sensitive to green). A 24-bit depth allows 8 bits per component. On some systems, 32-bit depth is available: each 24-bit pixel has an extra 8 bits describing its opacity. On older systems, 4 bpp (16 colours) is also common.

When an image file is displayed on a screen, the number of bits per pixel is expressed separately for the raster file and for the display. Some raster file formats have a greater bit-depth capability than others. The GIF format, for example, has a maximum depth of 8 bits, while TIFF files can handle 48-bit pixels. No display can show 48 bits of colour, so this depth is typically used for specialized professional applications with film scanners and printers; such files are rendered on screen at 24-bit depth.

Other objects derived from the pixel, such as the voxel (volume element), texel (texture element) and surfel (surface element), have been created for other computer graphics uses.
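The 16-bit 5-6-5 split and the indexed-palette idea can both be sketched in a few lines (illustrative only; real frame-buffer layouts and byte orders vary):

```python
# Pack 8-bit R, G, B into a 16-bit 5-6-5 pixel: 5 bits red, 6 green, 5 blue.
def pack_rgb565(r, g, b):
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

white = pack_rgb565(255, 255, 255)
print(hex(white))  # 0xffff - all 16 bits set

# Distinct colours per depth: 2**bpp values per pixel.
for bpp in (8, 16, 24):
    print(bpp, 2 ** bpp)  # 256, 65536, 16777216

# Indexed (palette) mode: each stored pixel is an index into a colour table.
palette = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)]
indexed_image = [1, 1, 0, 3]                   # pixels stored as indices
decoded = [palette[i] for i in indexed_image]  # expanded to RGB for display
print(decoded[0])  # (255, 0, 0)
```

Changing an entry in `palette` would instantly recolour every pixel that references it, which is the mechanism behind the palette-animation effect mentioned above.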
==Sub-pixel==

On both full-colour LCD flat panels and CRT monitors, each pixel is constructed from three sub-pixels for the three colours, spaced closely together. Each single-colour sub-pixel is brightened according to its component of the colour triplet, and, due to their proximity, they create the illusion of being one specially-tinted pixel.

A recent technique for increasing the apparent resolution of a colour display, named subpixel rendering, uses knowledge of pixel geometry to manipulate the three coloured sub-pixels separately; it seems to be most effective on LCD displays set at native resolution. This is a form of anti-aliasing, and is mostly used to improve the appearance of text. Microsoft's ClearType™, available in Windows XP, is an example of this.

==Megapixel==

A megapixel is 1 million pixels, and is usually used to express the resolution capabilities of digital cameras. For example, a camera that can take pictures at 2048×1536 pixels is commonly said to have "3.1 megapixels" (2048 × 1536 = 3,145,728).

Some digital cameras (digicams) use CCDs, which record brightness levels. Digital cameras that do not use Foveon X3 sensors have red, green and blue colour filters, so that each pixel can record the brightness of a single primary colour; their pixels are thus similar to sub-pixels. The camera interpolates the colour information to create the final image, so an 'x'-megapixel image from such a camera can have as little as one quarter of the colour resolution of the same image taken by a scanner, while the detail resolution is unimpaired. A picture of a blue or red object will therefore tend to look fuzzy compared to the same object in shades of grey; green objects appear less fuzzy, since green is allocated more pixels (due to the eye's increased sensitivity to green). See [1] for a more detailed discussion.
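The megapixel figure quoted above is just the pixel count divided by one million:

```python
# A 2048x1536 sensor: pixel count and the marketing "megapixel" figure.
sensor_width, sensor_height = 2048, 1536
pixels = sensor_width * sensor_height
print(pixels)                        # 3145728
print(round(pixels / 1_000_000, 1))  # 3.1 megapixels
```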

