Even with the pictures and videos so commonplace on PCs, many of us spend most of our time looking at and interacting with text. Yet few of us stop to think about the depth of technology required to render text well and that this is an area that continues to benefit from improved technology in displays, graphics cards, as well as the APIs available to developers. In Windows 7, The support for text and fonts in GDI continues to provide the foundation for compatibility and application support. Building on the foundation of the modern DirectX graphics infrastructure, Windows 7 enhances the text output available to developers with DirectWrite. This is a new API subsystem and one that over time you will see adopted more broadly by applications from Microsoft, independent software developers, and within Windows itself. This post will also talk about improvements to ClearType and the Fonts, both available as part of the improvements to the GDI-based text APIs. This work was introduced at the PDC (pointers towards the end of the post). This post is by Worachai Chaoweeraprasit, a development lead on our Graphics feature team. –Steven
One of the high-level goals of Windows 7 is to have even better graphics – graphics with higher fidelity. To that end, my team is looking into how to improve one of the most basic graphic elements in Windows, and that is text – the thing that’s always right in your face, but we hope you’ll never actually see it.
The need for good text
About 80% of the time people spend with their PC is to either read or write. This should come as no surprise when you realize that text is essentially how the machine talks back to you, and until we have a technology that would allow it to interject thought directly into our brains, text would probably continue to be the way we receive information from the computer screen.
Studies have shown that good text leads to better productivity. Essentially we are wired as human to be incredibly good at capturing words and making a smooth, rapid transition between them – the basis of reading. We’re so good at it that we can do it unconsciously with incredible speed given that the text is optimized for that process. This might explain why many can sink in to a good book for hours, but some quickly become tired after staring at the computer screen for a while. Any visual-related factor that could disrupt the reading process effectively slows us down. Good text, therefore, is text that is tuned to support the human reading process with minimal distraction possible.
The evenness of the white surrounding each letter, word, line, and paragraph plays a huge role in keeping the pace of reading while the black elements holds our attention together. A line too long, a word too tight, a paragraph too uneven, any of these conditions take us farther and farther away from the message being delivered but closer and closer to the mere medium delivering it. The art of text is essentially to make the actual text itself disappears before your eyes, so that the ideas it delivers reappear in your head. The study of how to prepare proper text is known as typography. And, as a typographer would say: good typography is not to be seen; only the bad ones are. As a platform, the role of Windows is to deliver great presentation of text and offering software developers great tools for creating the best presentation possible in the context of the software they develop.
Improving current techniques
People tend to develop habits and often over time these become the preferred way of getting things done. The more mundane the activity is, the easier we become attach to it, and the harder we’re willing to change. When it comes to text on your screen, the same screen you look at days in and days out. It could quickly become awkward if that completely changes overnight – even for the better. So, how do we go about improving on what we all become so used to? We want to make sure to support what is there and improve it, while supporting existing methods. But, before we get to understand the improvement, let’s first take a closer look into the current implementation really is and what challenges it presents over the years.
The current implementation is the product of text rendering design based on device pixel. The dimension of text at a certain size eventually translates into a fixed number of pixels in horizontal and vertical direction on the device surface. A 10-point text would translate to roughly 80 pixels height on a typical printer device of 600 dpi, while the same text would merely acquire 13 pixels on a 96 dpi monitor. This physical screen condition was hardly adequate for the quality we’re seeking for good text on screen.
Fortunately, the advent of ClearType during the past decade has largely improved the clarity aspect of quality. ClearType leverages the anatomy of the LCD pixel structure and takes advantage of the human visual system to distribute the energy typically emit to a whole display pixel, across the neighboring sub-pixels in the LCD’s typical 3-color channels making up each individual pixel, to create the visual illusion of higher resolution raster quality on a lower resolution device. As the result, ClearType text looks significantly sharper than the typical text on an LCD display, mitigating a large portion of the quality problem on a display technology that would become hugely popular a few years later.
Another pleasant design of the original ClearType in Windows was that it has improved the clarity of text without breaking application compatibility – that is, it doesn’t change the actual size of each individual glyph in either direction, nor did it change the distance between the two adjacent ones. This is the reason one could turn it on or off at will without having to “store” the selected option in the document or application. It is entirely per-user rendering preference. In Windows 7 we also improved the ClearType Text Tuner in keeping with our theme of being in control of your PC experience, by providing even more granular choices when tuning ClearType (and of course you can still turn it off).
But like many other things in the world, the coin comes in two faces. While it is able to preserve backward compatibility, it is limited by its own leverage unable to advance the state of the art. The width and height of the individual glyph and the nominal distances between the adjacent two remain fixed to the rounded number of screen pixels at a given size.
One of the graphics improvements we made in Windows 7, therefore, is to move from the physical pixel model of the past, and instead creating a new design around what we call the “device independent pixel” unit (or “DIP”), a “virtual pixel” that is one-ninety-sixth an inch in floating-point data type. In this model, a glyph (or any other geometric primitive for that matter) can size to fractional pixels, and be positioned anywhere in between the two pixels. The new ClearType improvement allows sizing and placement of glyph to the screen’s sub-pixel nearest to its ideal condition, creating a more natural looking word shape and making text on screen looks a lot closer to print quality.
The following figure shows the side-by-side comparison of the same word between the original or today’s ClearType (above) and the Windows 7 improvement – Natural ClearType (below), which does require calling the new APIs to render. Notice the width of the letters in the word and the spacing between them, as well as how the more consistent width and spacing improves the overall appearance of the entire word. Note that all the letters are placed with its nominal spacing and there is no kerning adjustment being applied here. A great article by Kevin Larson – a researcher in the Advanced Reading Technology team, discusses in details the scientific aspect of word recognition.
The ability to be more precise in approximating the screen placement of natural text also lends itself to a very nice side-effect, and that is the fact that text can now be placed on the line with no regards to the actual display device’s resolution. It means a UI designer can design an application UI knowing it’ll look the same on all other screens as it appears on his or her screen regardless of what type of display device the users might have. This fact is also particularly handy for software localization where the translated text produces the same layout everywhere.
This improvement could also offer a more realistic view of a print document on screen, or make the screen document looks closer to its print counterpart. It could also improve the quality of document zooming. Imagine document zoom that could go in and out in the same manner as what you would see when pulling the actual print page closer and farther away from your sight. It could mean a more joyful experience for online reading.
Fonts and Font Management
The Font is the heart and soul to typography, much like photo is to photography. A lot more fonts are shipped with Windows these days while even more are developed around the world. Windows Vista shipped with 40% more fonts comparing to Windows XP. Windows 7 is expected to ship with 40+ new fonts, just to underscore this trend. We’ve also added some additional viewing/categorization capabilities using the Windows 7 Explorer to improve working with a large library of related fonts.
The default common controls’ font dialog and the font chunk in Windows 7 Ribbon are also updated to be more intelligently selective of what fonts to be present to the user of the current user’s profile. Depending on a number of settings including the current UI language, the user locale, and the current set of keyboard input locales, the font list hides fonts of languages not typically used by the user of different culture and locale. For example, all the international fonts are automatically hidden away from a typical English user to reduce clutter and promote better productivity in common system applications such as NotePad, WordPad and Paint. Third-party application utilizing the Ribbon or the common controls’ font dialog could also have the same benefit. The user still retains the option of selecting any desired font back to the view by explicitly marking it in the Windows 7 Control Panel’s Font applet.
|Operating System||Fonts shipped “in-box”|
|Windows XP SP2||133|
|Windows 7||235 (currently planned)|
This growth, however, introduces some new opportunities for improvement. We’ve long treated fonts as system-wide resources. It gets “installed” on the machine and kept in a single flat namespace managed by the core part of the operating system. It may be interesting to some that the font named “Arial Black” isn’t really in the same grouping as “Arial Narrow” or “Arial”. This is because as far as the operating system is concerned, they are just different fonts with different names. And because font is uniquely identified by its name, you can’t have multiple versions of the same font at the same time.
Because font is system resource, non-traditional usage of font such as font embedded within the document, and font used exclusively in an application is done through the mechanism known as private installation, which involves making sure the font name is unique before installing it programmatically but doing so by hiding it from others to see. Private font is just like font installed publicly as far as the operating system internal is concerned.
An important improvement in Windows 7’s new font system is the notion of “font collection” which allows partitioning of fonts sharing the same usage into a separate namespace. The system collection is similar to what exists today and is created and managed by the system whereas custom collection can be created and managed, as many as needed, entirely by the application program. This allows document to have its own set of fonts local to it, and third-party application or plug-in to ship with its own font used exclusively within the program. This partitioning not only reduces unnecessary system-wide font update and allows update to happen only locally as needed, it also allows access to multiple versions of the same font in different collections.
The new font system also improves the way fonts are organized within the collection. It supports the notion of weight-width-slope variation where fonts with the same stylistic root but vary in its weight (thin, light, bold, black, etc.), width (wide, narrow, etc.), or slope (italic, oblique) are grouped together in the same font family. For instance, “Arial Narrow” becomes a variation or face in the “Arial” family. This grouping model is advocated by the CSS recommendation.
Fonts also represent art and artistic expression. The technology helping create font is therefore the artist’s tool of expression. An important technology called OpenType emerged during the past decade. It enables new ways type design can be realized. OpenType allows designer to define how glyphs interact and transform in stages. The designer then exposes this function as an executable unit known as the “font feature” for application programmable access.
OpenType was an offshoot of the TrueType Open technology Microsoft developed in 1994-95. The TrueType Open technology added the GSUB, GPOS, BASE, JSTF, and GDEF tables to the TrueType format. The primary usage at the time was to help with the creation of Arabic font due to the inherent complexity of the task. Microsoft chose to rename the technology to OpenType in 1996 and Adobe added their CFF glyph outline format to the technology in the same year. Today OpenType is used to improve readability of text as well as to express new and exciting type design in various languages.
However, despite its long-time presence and availability, the usage of OpenType in the Windows world remains largely in specialized programs. The Windows native graphics system has not fully embraced OpenType for its mainstream usage of text. This absence discourages many designers as there is no standard way in Windows to test the feature they produce. Likewise, its limited exposure doesn’t encourage discoverability for mainstream application developers. Improving this and transitioning to this improved rendering technology is a multi-step and multi-release investment done so as to maximize the benefit while minimizing the disruption that might be introduced as incompatibilities. Windows 7 takes another step on this path. We know for many that care deeply about this area there is a strong desire to move faster. We are doing our best to balance the speed of transition with the desire to maintain compatibility.
Windows 7 new text system not only uses available OpenType features internally but also allows access to any feature made available in the font in the high level programming interface, making it easier for application developer to discover and exercise the font feature in mainstream scenario. Windows 7 also ships with a brand new OpenType font “Gabriola” developed by a well-respected typographer John Hudson. Gabriola makes heavy use of contextual letterforms and offers an unprecedented number of stylistic sets for different usages of the font in different occasions. The figure below enumerates all stylistic sets available in this font; notice the subtleties and not-so-subtle way to distinguish each stylistic set.
The figure below also demonstrates the power of contextual letterforms in the eighth rendition of Gabriola’s stylistic set (“ss07”) to produce different ways the same word is rendered depending on where it’s at in the line.
Rendering text is complex and involved, even though it seems like something that should be straight forward. There are probably hundreds of ways to format text in a document and often many paths that ultimately yield the same results. HTML/CSS is a complex standard and is a great example of the richness of how text may be formatted and typeset. Underneath the formatting logic lies the language requirement – the rule of writing for the language. Windows has long been supporting Unicode – another complex standard for global data interchange. Windows supports an increasing number of Unicode script in every single release. The mapping from the input text to the final glyphs in the font requires intricate transformation, which involves parsing of font data and analyzing the language writing pattern. Once the glyph is finalized, it is then rasterized, merged and filtered into the final visual on the display device.
Due to this staging nature, different types of applications require different support from the text system. While a typical application such as the legendary “Hello world” type application may be satisfied with only the ability to get some text out showing to the user. The same level of support is hardly adequate for document preparation system such as Microsoft Word and Adobe InDesign. Some of the more mature application code bases may also have to deal with different graphics systems. This makes it harder in practice for a text system that tie to a particular graphics model to really be widely useful across the wide variety of applications in the Windows ecosystem.
It became obvious to us early on during the planning stage of Windows 7 that text processing is not homogeneous, and different types of applications have different needs and requires different levels of support. The appropriate level of programming access to the text functionality is as important as the functionality itself. The new text system in Windows 7 is assembled into a self-sufficient system called DirectWrite. The API is provided in four layers – the interfaces for font data, rendering support, language processing, and typesetting, each built upon the others with the lower layer makes no requirement to the upper one, and none depends on a specific graphics model. To illustrate the latter point, the figure below shows a sample application that uses the new typesetting interface and language processor while the final rendering happens as an extruded filled 3D geometry from the 2D graphics environment also new to Windows 7 called Direct2D. Both systems were introduced in PDC 2008 as the new graphic foundation in Windows 7.
DirectWrite preserves developer’s investment in existing technologies such as GDI and GDI+ in three important aspects. First, the previously described layering design of DirectWrite allows for the clean separation between the two fundamental processes of placing and rendering of text. It enables applications to use DirectWrite to place text while having it rendered onto traditional graphic surfaces such as GDI and GDI+. The reverse scenario in which the application may use GDI to place text while having it rendered through DirectWrite is also naturally supported. The second aspect of compatibility comes from the fact that DirectWrite also supports all existing methods for placing and rendering text found in GDI. A DirectWrite application can use DirectWrite to place and render text in the same manner as GDI does without actually using GDI. Text placed and rendered under this compatibility mode is indistinguishable from GDI text from the user’s point of view, and as such preserving existing layout of application UI and text document. Lastly, DirectWrite exposes a set of APIs that interoperate with GDI. An application selecting a GDI font object can turn it into a DirectWrite’s font object and vice versa. Since the font system is at the low end of the DirectWrite API layer, it provides a natural interoperability point that is fundamental enough to ensure high degree of data preservation and correctness. Once the application is able to acquire a DirectWrite’s font object, it can in turn use it in any other DirectWrite API requiring a DirectWrite font from that point onward. The conversion from a DirectWrite’s font object back to a GDI font object allows the rest of the GDI-based application to function with no change while still being able to reap the benefit of using DirectWrite’s new and improved font model. As in some real world examples, the XPS print rasterizer in Windows 7 is implemented on top of DirectWrite and utilizes DirectWrite’s interoperability API to convert back to a GDI font as part of the conversion of an XPS-based print job for a non-XPS printer driver. The Windows 7 XPS Viewer also uses DirectWrite alongside the GDI+ graphic rendering for its onscreen display.
There’s a lot more to the details of the API. In the PDC session linked to above, Leonardo Blanco and Kam VedBrat go into the details of DirectWrite and Direct2D and how to develop applications such as this.
The world has changed a lot since the first text APIs of Windows GDI, such as TextOut or ExtTextOut in Windows NT 3.1 (or the subsequent API additions). The evolution of support for text is a critical part of the underpinnings of Windows 7. We continue to improve this most “basic” element of a graphical operating system so that regardless of the language, script, or device used to render text, Windows will offers a great set of tools and APIs for developers and a great experience for end-users.