# What is a pixel?

I wanted to write about premultiplied alpha, but realized I need to explain some things about filtering first. And then I realized that to properly explain filtering, I need to start with an even more fundamental theoretical question: what is a pixel?

From Wikipedia:

"A pixel is the smallest item of information in an image. Pixels are normally arranged in a 2-dimensional grid, and are often represented using dots or squares. Each pixel is a sample of an original image, where more samples typically provide more-accurate representations of the original."

That's not a bad definition, but it's a little vague, especially the "often represented using dots or squares" part. Trouble is, when we try to make this more precise, we realize there are actually several different kinds of pixels in common use!

1. A pixel is a tiny square. Images are grids of tightly packed pixels, with no gaps or overlap between adjacent squares.

2. A pixel is a geometric point of zero size. When a continuously varying analog source image is converted to digital format, its color is sampled at each pixel location. Color values from in between these locations are not recorded.

3. A pixel is a circular dot with soft edges. These dots may or may not be larger than the pixel grid spacing: if they are larger, adjacent pixels will overlap. When digitizing an image, all colors within the pixel region are combined, using a falloff curve that weights their contribution according to distance from the pixel center.

Which version is true?

1. This matches what we see in programs like Photoshop, where we can zoom into an image and see the pixels scale up to large squares. It also matches what we are used to from early computers and game machines, where low resolutions had large and blocky pixels. But it's not really true in most cases today.

2. Matches how mathematicians like to view the world, but no actual hardware works like this.

3. Digital cameras, scanners, monitors, and printers all work this way, or some variant of it. Unfortunately there is no consistency about how much pixels overlap or what falloff curve is used. Some devices vary the concept even further: for instance, an LCD uses separate dots with different center locations for the red, green, and blue color components.
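
To make the distinction concrete, here is a minimal Python sketch of digitizing a continuous image under each of the three models. The `source` test pattern, the sample counts, and the Gaussian falloff are all illustrative assumptions; real devices differ in exactly these details.

```python
import math

def source(x, y):
    """Stand-in for a continuous analog image (a made-up test pattern)."""
    return 0.5 + 0.5 * math.sin(3.0 * x) * math.cos(3.0 * y)

def sample_point(x, y):
    """Definition 2: a pixel is a zero-sized point sample."""
    return source(x, y)

def sample_square(x, y, size=1.0, n=8):
    """Definition 1: a pixel is a tiny square; average everything
    inside it with equal weight (a box filter)."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            sx = x + ((i + 0.5) / n - 0.5) * size
            sy = y + ((j + 0.5) / n - 0.5) * size
            total += source(sx, sy)
    return total / (n * n)

def sample_soft_dot(x, y, radius=1.0, n=8):
    """Definition 3: a soft-edged dot; weight nearby color by distance
    from the pixel center (a Gaussian falloff here, though real
    devices use all sorts of curves)."""
    total = weight = 0.0
    for i in range(n):
        for j in range(n):
            sx = x + ((i + 0.5) / n - 0.5) * 2.0 * radius
            sy = y + ((j + 0.5) / n - 0.5) * 2.0 * radius
            w = math.exp(-((sx - x) ** 2 + (sy - y) ** 2) / (0.5 * radius * radius))
            total += w * source(sx, sy)
            weight += w
    return total / weight

# The same pixel location gives three different values:
# sample_point(0.5, 0.5), sample_square(0.5, 0.5), sample_soft_dot(0.5, 0.5)
```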

Why does this matter?

Much of the time it does not. Many people write code assuming the mathematically elegant zero-sized pixels of #2, and this often works fine even though it doesn't truly match their hardware.

One time it does matter is when quantizing an image: converting analog to digital, or shrinking a digital image to a lower resolution. Any time we do such things, we must choose how the new pixel values will be calculated, which requires a specific definition of exactly what a pixel is.
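
To illustrate that choice, here is a sketch of two ways to shrink a grayscale image by an integer factor. The list-of-rows image representation is an assumption for illustration; which function is "correct" depends entirely on which pixel definition you pick.

```python
def shrink_nearest(image, factor):
    """Shrink by point sampling (definition 2): keep one pixel per
    factor-by-factor block and throw the rest away. Prone to aliasing."""
    h, w = len(image) // factor, len(image[0]) // factor
    return [[image[y * factor][x * factor] for x in range(w)]
            for y in range(h)]

def shrink_box(image, factor):
    """Shrink by averaging each factor-by-factor block (a box filter,
    i.e. definition 1's tiny-square pixels)."""
    h, w = len(image) // factor, len(image[0]) // factor
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            block = [image[y * factor + j][x * factor + i]
                     for j in range(factor) for i in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out
```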

Another time this matters is when scaling a digital image up to a higher resolution. In some ways this is impossible. We are trying to add more pixels, but these pixel values aren't stored in the source image, and we can't recreate missing data out of thin air! But this is too common and useful an operation to just throw our hands in the air and give up because of a minor technical issue like the problem being fundamentally unsolvable 🙂 So, we guess. The better we guess, the better the resulting scaled image will look. The more we know about how the source image was created, the more accurate a guess we can make, but in practice we usually know very little, as images don't typically include data describing what pixel falloff curves they were created from.
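
One common guess is bilinear interpolation, which assumes the source varied linearly between the stored samples. This sketch (an illustration, not the only reasonable filter) enlarges a grayscale image by an integer factor, using the same list-of-rows representation as above.

```python
def enlarge_bilinear(image, factor):
    """Enlarge by guessing in-between values: blend the four nearest
    stored samples, weighted by distance. This is only a guess about
    data the source image never recorded."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h * factor):
        sy = y / factor          # position in source coordinates
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(w * factor):
            sx = x / factor
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out
```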

So we muddle through with no real standard for what a pixel actually is, and everybody just makes the best guesses they can. Or more often, ignores the question entirely...

1. DFHanson says:

"But this is too common and useful an operation to just throw our hands in the air and give up because of a minor technical issue like the problem being fundamentally unsolvable."

This is an awesome perk to being a software developer; we get to solve the fundamentally unsolvable. 🙂

2. You mean we guess and hope that no one notices the difference.

Is there a reason why no one can agree, and why data like "pixel falloff curves" are not included in image formats?

3. edwinb says:

definition 3 is a bit woolly.

consider a Bayer pattern (repetitive 4-square areas of 3 colours)

a single image capture results in a certain quality of captured image with a resolution and file size.

without moving the subject, repeat the image capture 4 times with the array moved by one area each time, so that the information captured by the 4 areas exactly overlaps.

the resultant file is exactly the same size but the quality is far superior due to real rather than interpolated data.

these are square pixels with no falloff curve

4. ShawnHargreaves says:

> Is there a reason why no one can agree, and why data like "pixel falloff curves" are not included in image formats?

I think it’s just considered too much work (both hard to implement, and a big runtime performance cost for such fancy filtering conversions) for too subtle a gain in image quality.

5. ShawnHargreaves says:

> without moving the subject, repeat the image capture 4 times with the array moved by one area each time, so that the information captured by the 4 areas exactly overlaps.

That’s the same thing as sampling at a higher resolution and then downsampling the results.

The trick is knowing the right places to put the samples, and the right way to downsample them. Sure you can get great results with an offset of exactly one area, but you can also get terrible results if your offset doesn’t match the tiling frequency of the source data.

6. "We are trying to add more pixels, but these pixel values aren’t stored in the source image, and we can’t recreate missing data out of thin air!"

Wrong – you just press the "Enhance" button a few times and it magically works. 🙂

7. Charibo says:

I used to explain a pixel as part of a rasterized triangle (in other terminology one would call it a "fragment", but these two terms are imo equal in meaning). 🙂

8. KentDub says:

Very good post. I’m not sure that square or circular are the correct words. Rectangle or ellipse seem better. With pixel and device aspect ratios another layer of complexity is added (for example, DV video is 0.9 or 1.2, HD Anamorphic is 1.33, and HD Pro is 1.5). The representation of a pixel and the actual uses of pixels are very different – as you point out (Wouldn’t it be nice if there were actually no gaps between them?) Just my two cents… Keep up the great posts.

9. Tim says:

Numbers 1 and 3 are actually the same as 2 if you apply a filter to the continuous intensity before sampling it.

#2 is definitely the best definition. Just think of the camera as perfectly sampling a filtered version of the image.

And enlarging images isn’t impossible, it’s just that we have to assume frequencies above the Nyquist cutoff are zero, since the original gives no information about them. Once you assume that, it is a simple matter of applying sinc interpolation. All other scaling algorithms use prior knowledge about the nature of the image being scaled.
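
A minimal 1-D sketch of the sinc interpolation Tim describes, under his assumption that all frequencies above the Nyquist cutoff are zero. A practical version would work in 2-D and window the kernel to keep the sum finite; this only illustrates the idea.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def enlarge_sinc_1d(samples, factor):
    """Whittaker-Shannon reconstruction of a 1-D signal: every output
    value is a sinc-weighted sum of all the original samples."""
    n = len(samples)
    out = []
    for i in range(n * factor):
        t = i / factor  # position in original sample coordinates
        out.append(sum(samples[k] * sinc(t - k) for k in range(n)))
    return out
```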