Why does the Internet Explorer animated logo arrange its frames vertically?


If you ever tried to build a custom animated logo for Internet Explorer, you certainly noticed that the frames of the animation are arranged vertically rather than horizontally. Why is that?

Because it's much more efficient.

Recall that bitmaps are stored as a series of rows of pixels. In other words, if you number the pixels of a bitmap like this:

123
456
789

then the pixels are stored in memory in the order 123456789. (Note: I'm assuming a top-down bitmap, but the same principle applies to bottom-up bitmaps.) Now observe what happens if you store your animation strip horizontally:

12 34 56 78
AB CD EF GH

These pixels are stored in memory in the order 12345678ABCDEFGH. To draw the first frame requires pixels 1, 2, A and B. The second frame takes 3, 4, C, and D. And so on. Observe that the pixels required for each frame are not contiguous in memory. This means that they occupy different cache lines at least, and for a bitmap of any significant size, they also span multiple memory pages.

Now consider a vertically-arranged animation strip:

12
34
56
78
AB
CD
EF
GH

Again, the pixels are stored in memory in the order 12345678ABCDEFGH, [typo fixed, 15 Aug] but this time, the pixels of the first frame are 1, 2, 3 and 4; the second frame consists of 5, 6, 7, and 8; and so on. This time, all the pixels for a single frame are adjacent in memory. This means that they can be packed into a small number of cache lines, and reading the pixels for a single image will not force you to jump across multiple pages.
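If you'd rather see the two layouts side by side, here is a minimal Python sketch of the toy example above: four 2x2 frames stored in a row-major strip. The names `frame_offsets` and `frame_pixels` are made up for illustration; they are not part of any real API.

```python
MEMORY = "12345678ABCDEFGH"  # the storage order used in the text


def frame_offsets(frame, frame_w, frame_h, frame_count, vertical):
    """Return the linear pixel offsets of one frame in a row-major strip."""
    # A vertical strip is one frame wide; a horizontal strip is all frames wide.
    stride = frame_w if vertical else frame_w * frame_count
    # Top-left corner of the requested frame within the strip.
    x0 = 0 if vertical else frame * frame_w
    y0 = frame * frame_h if vertical else 0
    return [(y0 + y) * stride + (x0 + x)
            for y in range(frame_h)
            for x in range(frame_w)]


def frame_pixels(frame, vertical):
    """Pixels of one 2x2 frame, read out of the 16-pixel toy strip."""
    return "".join(MEMORY[i] for i in frame_offsets(frame, 2, 2, 4, vertical))


print(frame_offsets(0, 2, 2, 4, vertical=False))  # [0, 1, 8, 9]: two separate runs
print(frame_offsets(0, 2, 2, 4, vertical=True))   # [0, 1, 2, 3]: one contiguous run
print(frame_pixels(1, vertical=False))            # 34CD
print(frame_pixels(1, vertical=True))             # 5678
```

The only thing that changes between the two layouts is the stride and the frame's starting corner, and that alone decides whether a frame's pixels form one run or several.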

Let's illustrate with some pictures: Let's say that the large animation is a series of twelve 38x38 frames, for a total bitmap dimension of 38x456. Let's assume further, for the sake of example, that it's a 32bpp bitmap and that the page size is 4KB.

If the bitmap were stored as a horizontal strip (456x38), then the memory layout would look like this, where I've color-coded each memory page.

Observe that no matter which frame you draw, you will have to touch every single page, since each frame contains a few bytes from each page.

Storing the bitmap vertically, on the other hand, arranges the pixels like so:

Notice that with the vertical strip, each frame touches only two or three pages; compare the horizontal strip, where each frame touches seventeen pages. This is quite a savings especially when you realize that most of the time, the only frame being drawn is the first one. The other frames are used only during animation. In other words, this simple change trimmed 60KB out of the normal working set.
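The page counts can be checked with a short Python sketch. It uses the figures from the example (twelve 38x38 frames, 32bpp, 4KB pages) and additionally assumes that the pixel data starts on a page boundary and that rows carry no padding. The function name `pages_touched` is hypothetical.

```python
PAGE_SIZE = 4096        # 4KB pages, as assumed in the text
BYTES_PER_PIXEL = 4     # 32bpp
FRAME_W = FRAME_H = 38
FRAME_COUNT = 12


def pages_touched(frame, vertical):
    """Set of page indices read when drawing one frame of the strip.

    Assumes the pixel data begins at a page boundary and rows are unpadded.
    """
    stride = (FRAME_W if vertical else FRAME_W * FRAME_COUNT) * BYTES_PER_PIXEL
    x0 = 0 if vertical else frame * FRAME_W
    y0 = frame * FRAME_H if vertical else 0
    pages = set()
    for y in range(FRAME_H):
        start = (y0 + y) * stride + x0 * BYTES_PER_PIXEL
        end = start + FRAME_W * BYTES_PER_PIXEL - 1  # last byte of this frame row
        pages.update(range(start // PAGE_SIZE, end // PAGE_SIZE + 1))
    return pages


horizontal = [len(pages_touched(f, vertical=False)) for f in range(FRAME_COUNT)]
vertical = [len(pages_touched(f, vertical=True)) for f in range(FRAME_COUNT)]
print(horizontal)  # every frame touches 17 pages
print(vertical)    # every frame touches 2 or 3 pages

# Pages no longer needed when only the first frame is drawn, in KB.
saved = (horizontal[0] - vertical[0]) * PAGE_SIZE // 1024
print(saved)       # 60
```

Under those assumptions the first frame goes from seventeen pages to two, which is where the 60KB comes from.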

Comments (17)
  1. Anonymous says:

    You guys must have some pretty amazing tools to be able to spot a 60k drop in working set due to page fragmentation. What sort of profiling setups do you have? The best I’ve used is Massif, which gives you a pretty space/time graph, but given what I’ve seen of most apps 60k for the spinning logo would be lost in the noise.

  2. Anonymous says:

    Mike, I think it’s not a drop in memory usage, it’s a drop in the number of pages that are touched while drawing a page. In other words, just making your data cache friendly – this doesn’t reduce memory usage, but can speed things up by orders of magnitude.

  3. Anonymous says:

    So why do toolbars take horizontal strips?

  4. Anonymous says:

    Hm. Personally, I’ve always used vertical animation strips when making animations for my programs… including a couple little games. You make it sound as if the standard was horizontal, and the vertical ones are freakish.

  5. Anonymous says:

    I remember back in the days of software rasterizers for games, you saw a similar effect due to L1 cache coherency. If a polygon happened to align on screen such that the source texture data was traversed in order, then performance was measurably better than the same polygon rotated by 90 degrees. On some games we worked on, we considered even going so far as to store two versions of a source texture, and picking the one closest to the alignment of the polygon (we never did this in the end due to the extra memory cost outweighing caching benefits).

    These days, modern GPUs tackle a similar problem. They don’t actually store textures in the obvious memory order, but instead store them with the pixels mashed up in what is usually called a "swizzle" pattern. A swizzle pattern isn’t very intuitive geometrically, but in terms of the texture coordinates it amounts to interleaving the individual bits of the coordinates. If your original texel was at coordinate (X,Y), and the bit representation of X is xxxxxx and Y is yyyyyy, then the actual data is stored at the memory offset given by xyxyxyxyxyxy (interleaving bitwise). The net result of this is that texels which are close to each other in the original image are clustered close to each other in memory, regardless if they were close to each other horizontally or vertically, and therefore traversing the texture in any direction is roughly similar in cost. This is a considerable performance win. (Most GPUs are capable of reading plain old linear layouts as well, but the performance is measurably poorer.)

    Software rasterizers could benefit from the same technique, except that the cost of doing the bit interleaving in software usually outweighs the cache coherency benefits (plus, there are relatively few applications for high-performance software rasterization these days, so nobody really takes the time).

  6. waleri says:

    Alas, image lists use horizontal bitmaps…

  7. Anonymous says:

    "Again, the pixels are stored in memory in the order 1245678ABCDEFGH, but …"

    I never liked pixel 3 much anyway.

  8. Anonymous says:

    Sorry this is off topic, but triggered by your reference to "quite a savings".

    In the UK, we make "a saving" – savings go into banks. Does anyone know where the plural crept in across the Atlantic? I’ve always been puzzled by this.

  9. Anonymous says:

    Tony, the software way would be, instead of usual order VVVVVVVVUUUUUUUU, using UUUUVVVVVVVVUUUU bitorder which is very easy to generate on the fly while doing fixed-point interpolation. Look for "fatmap2.txt" for more info :)

    There is also a way to use normal coordinates for textures: swizzle the screen traversal!!! :P

  10. Anonymous says:

    Antonio/Tony: yes, and in fact old "fast rotozoomers" did exactly that. Don’t draw the screen line by line, but instead draw it in blocks (say, 8×8 pixels).

  11. Anonymous says:

    Sometimes we get so used to things being the way they are we stop questioning them. We always have the…

  15. Anonymous says:

    That’s just the interchange format.
