RichEdit Property Sets


RichEdit has many character-format properties, most of which are documented for ITextFont2 and CHARFORMAT2. The OpenType specification defines many more character-format properties, called OpenType features, each consisting of a 32-bit identifier (id) and a 32-bit value. For example, the Gabriola font has stylistic set 6, which displays “Gabriola is graceful” as

[Figure: “Gabriola is graceful” rendered with Gabriola stylistic set 6]

Variable fonts are the latest addition to the OpenType specification and the variable-font axis coordinates are also specified by an id-value pair. For example, the experimental HoloFont font has three axes, ‘wght’, ‘wdth’, and ‘opsz’, the first two of which are illustrated in

[Figure: HoloFont text rendered at various ‘wght’ and ‘wdth’ axis coordinates]

HoloFont was designed by John Hudson and Ross Mills of Tiro Typeworks Ltd.

You can try out variable fonts by checking out this site and you can find myriad variable-font articles and talks here. Variable fonts present a user-interface (UI) challenge. One technique is to use a slide bar to choose an axis coordinate. AI might provide good default values. If the traditional font drop downs are used, you can be confronted with a zillion choices. HoloFont has 9 weights × 5 widths × 6 optical sizes = 270 entries which all appear in the current Word drop-down font list! And that’s tiny compared to the continua of possible axis coordinate values. To illustrate this quandary, here are the first few entries in the HoloFont font drop-down list

Narrow Thin
Narrow ExtraLight
Narrow Light
Narrow SemiLight
Narrow
Narrow SemiBold
Narrow Bold
Narrow ExtraBold
Narrow Black
SemiNarrow Thin
SemiNarrow ExtraLight
SemiNarrow Light
SemiNarrow SemiLight
SemiNarrow
SemiNarrow SemiBold
SemiNarrow Bold
SemiNarrow ExtraBold
SemiNarrow Black
Thin
ExtraLight
Light
SemiLight
Regular
SemiBold
Bold
ExtraBold
Black
SemiWide Thin
SemiWide ExtraLight
SemiWide Light
SemiWide SemiLight
SemiWide
SemiWide SemiBold
SemiWide Bold
SemiWide ExtraBold
SemiWide Black

Clearly such detailed font drop-down lists are impractical, so maybe we should use slide bars or drag selected text handles.

OpenType properties that are used in shaping complex scripts like Arabic are invoked automatically by DirectWrite and Uniscribe. But many other OpenType properties, including these examples, are discretionary and must be present in the backing store to work. In addition, it’s desirable to be able to add other kinds of properties. The CHARFORMAT2::dwCookie allows a client to attach one 32-bit value to a text run, but there’s a need to attach multiple properties such as spelling, grammar, and other proofing-error annotations along with other client properties.

To handle all these properties, the latest Office 365 RichEdit implements property sets as described in the remainder of this post. The D2D/DirectWrite RichEdit mode (but not the GDI/Uniscribe mode) displays the OpenType properties as illustrated in the figures above. The following, admittedly technical, discussion describes the property-set object model, the RTF and binary file format additions for property sets, how to display variable-font and other OpenType features using DirectWrite, and the OpenType variable-font (fvar) table.

Kinds of Properties

The kinds of RichEdit character format properties are summarized in the table

ID Range                   Usage
0..0xFFFF                  Properties not in property sets
0x10000..0x1FFFF           RichEdit temporary properties such as proofing errors
0x20000..0x2FFFF           Client temporary properties
0x30000..0x3FFFF           RichEdit persisted properties
0x40000..0x2020201F        Reserved; returns E_INVALIDARG if used
0x20202020..0x7E7E7E7E     OpenType features/axis (if 0x80808080 mask = 0; else invalid)
0x7E7E7E7F..0xFFFFFFFF     Reserved; returns E_INVALIDARG if used

There are no persisted client properties since they are client-specific and could be misinterpreted if read by a different client.
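The ranges in the table can be checked with a few comparisons. The following is only a sketch of that classification, not RichEdit's code: the enum and the function name ClassifyPropId are hypothetical, and the OpenType test applies just the 0x80808080 mask given in the table.

```c
#include <stdint.h>

/* Hypothetical names for the ID ranges listed in the table. */
typedef enum PropIdKind {
    PROP_NOT_IN_SET,        /* 0..0xFFFF */
    PROP_RICHEDIT_TEMP,     /* 0x10000..0x1FFFF */
    PROP_CLIENT_TEMP,       /* 0x20000..0x2FFFF */
    PROP_RICHEDIT_PERSIST,  /* 0x30000..0x3FFFF */
    PROP_OPENTYPE,          /* 0x20202020..0x7E7E7E7E with (id & 0x80808080) == 0 */
    PROP_INVALID            /* reserved range, or a byte with its high bit set */
} PropIdKind;

/* Classify a 32-bit property id according to the table. */
static PropIdKind ClassifyPropId(uint32_t id)
{
    if (id <= 0xFFFFu)  return PROP_NOT_IN_SET;
    if (id <= 0x1FFFFu) return PROP_RICHEDIT_TEMP;
    if (id <= 0x2FFFFu) return PROP_CLIENT_TEMP;
    if (id <= 0x3FFFFu) return PROP_RICHEDIT_PERSIST;
    if (id < 0x20202020u || id > 0x7E7E7E7Eu)
        return PROP_INVALID;                    /* reserved ranges */
    return (id & 0x80808080u) ? PROP_INVALID : PROP_OPENTYPE;
}
```

A caller that mirrors the table would presumably map PROP_INVALID to E_INVALIDARG.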

Property Set Object Model

The client APIs for setting and getting properties are ITextFont2::SetProperty (id, value) and ITextFont2::GetProperty (id, pvalue). The id’s for these methods are given by xxxx, where xxxx is an OpenType feature tag, an OpenType variable-font axis tag (see MakeTag() below) or an annotation id defined in the table at the end of the preceding section. Since OpenType x’s belong to a limited set of ASCII characters in the U+0020..U+007E range, there’s plenty of room in the 32-bit id space to define other properties. Common properties like font weight are already represented as CCharFormat::_wWeight and in principle don’t need to be members of a property set. Since by default there are no properties in a property set, calling ITextFont2::SetProperty(id, tomDefault) deletes the property id if it exists.

Note that id values < 0x10000 are reserved for other purposes, such as tomFontStretch (0x33E) to define a font’s stretch value. These values are well below the first possible OpenType id 0x20202020 (4 spaces). The largest OpenType tag is 0x7E7E7E7E, which gives 94^4 = 78,074,896 tags, although most of them will never be used or are used for other purposes such as ‘MATH’ for the math table. This leaves 256^4 − 94^4 = 4,294,967,296 − 78,074,896 = 4,216,892,400 IDs for other purposes.

OpenType tags are constructed in the order given by the macro

#define MakeTag(a, b, c, d)   (((d)<<24) | ((c)<<16) | ((b)<<8) | (a))

For example, the variable-font weight axis tag ‘wght’ has the value 0x74686777.
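To make the byte order concrete, here is a small sketch that builds a tag in the same order as MakeTag() and unpacks it again. The function names MakeTag32 and TagToString are made up for this illustration. The first character lands in the low byte, so on a little-endian machine the four bytes sit in memory in reading order.

```c
#include <stdint.h>

/* Same byte order as MakeTag(): first character in the low byte. */
static uint32_t MakeTag32(char a, char b, char c, char d)
{
    return ((uint32_t)(unsigned char)d << 24) |
           ((uint32_t)(unsigned char)c << 16) |
           ((uint32_t)(unsigned char)b << 8)  |
            (uint32_t)(unsigned char)a;
}

/* Unpack a tag into a NUL-terminated string; buf must hold 5 bytes. */
static void TagToString(uint32_t tag, char buf[5])
{
    for (int i = 0; i < 4; i++)
        buf[i] = (char)((tag >> (8 * i)) & 0xFF);
    buf[4] = '\0';
}
```

With these helpers, MakeTag32('w', 'g', 'h', 't') gives the 0x74686777 quoted above and TagToString() turns it back into "wght".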

Internally it’s useful to mark OpenType feature tags with a bit (tomOpenTypeFeature, 0x00800000) to distinguish them from variable-font axis tags. This bit cannot be confused with annotation id’s, which have values of 0x3FFFF or less. The feature tags are defined by the DWRITE_FONT_FEATURE_TAG enum defined in dwrite.h. The variable-font axis tags are defined by the font’s fvar table discussed below and in principle can be any combination of ASCII letters. So, if a tag isn’t a feature tag, we assume that it’s a variable-font axis tag and let DirectWrite accept or reject it.
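As an illustration of the marking scheme, the sketch below sets, tests, and clears the feature bit. Only the bit value 0x00800000 comes from the text; the helper names are hypothetical. Since every byte of a valid tag is in 0x20..0x7E, the bit is always clear in an unmarked tag, so clearing it recovers the original.

```c
#include <stdint.h>

#define tomOpenTypeFeature 0x00800000u  /* bit value quoted in the text */

/* Mark a tag as an OpenType feature (as opposed to a variable-font axis). */
static uint32_t MarkFeatureTag(uint32_t tag)
{
    return tag | tomOpenTypeFeature;
}

/* Nonzero if the id carries the feature mark. */
static int IsMarkedFeatureTag(uint32_t id)
{
    return (id & tomOpenTypeFeature) != 0;
}

/* Recover the plain OpenType tag before handing it to a font API. */
static uint32_t UnmarkFeatureTag(uint32_t id)
{
    return id & ~tomOpenTypeFeature;
}
```

For example, the ‘ss06’ tag 0x36307373 becomes 0x36B07373 when marked.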

Property Set RTF

In RTF, property sets are encoded similarly to the {\colortbl…} for colors and have the form

{\*\propsets id value…; …}

Here the id and value are 32-bit values that are encoded for all properties in a property set. Each property set is ended by a semicolon. This format is repeated for all property sets used in the text. If an id starts with an ASCII letter and consists of 4 ASCII letters, it is written as a character string. For example, the id ‘wdth’ is written as such for the 32-bit id value 0x68746477. If any byte in the id isn’t an ASCII letter, the id is written as a 32-bit integer. These choices make it easier to read property IDs. A value with no fractional part is written as an integer. A value with a fractional part is written as a decimal fixed-point number, e.g., 123.545. Any other combination is invalid and ends reading the RTF stream. The property set table {\*\propsets …} is stored in the RTF header following {\fonttbl …} and {\colortbl …} (if they are present).

An example with two property sets containing variable-font id’s is

{\*\propsets wght 800 wdth 104;wght 400;}

This syntax is a slightly simplified version of the variable-font CSS syntax used in web applications.

In the RTF body, a reference to the Nth property set in the \propsets table is given by \psN (like \cfN for choosing the Nth color in the \colortbl). Here N is 0-based, that is, \ps0 refers to the property set immediately following \propsets.
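The \propsets table described above could be written along the following lines. This is a sketch under stated assumptions rather than the RichEdit writer: the Property and PropertySet structs and both function names are hypothetical, values are limited to plain integers (fractional values are not handled), and ids that aren’t four ASCII letters are written as unsigned decimal integers.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct Property    { uint32_t id; int32_t value; } Property;           /* hypothetical */
typedef struct PropertySet { const Property *props; size_t count; } PropertySet; /* hypothetical */

/* Nonzero if all four bytes of id are ASCII letters. */
static int IsLetterTag(uint32_t id)
{
    for (int i = 0; i < 4; i++) {
        unsigned char ch = (unsigned char)(id >> (8 * i));
        if (!((ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z')))
            return 0;
    }
    return 1;
}

/* Write "id value id value;" for one set. Returns chars written or -1. */
static int WritePropSet(char *buf, size_t cb, const Property *props, size_t count)
{
    size_t used = 0;
    for (size_t i = 0; i < count; i++) {
        uint32_t id = props[i].id;
        const char *sep = i ? " " : "";
        int n;
        if (IsLetterTag(id))    /* letter tags are written as text, low byte first */
            n = snprintf(buf + used, cb - used, "%s%c%c%c%c %ld", sep,
                         (int)(id & 0xFF), (int)((id >> 8) & 0xFF),
                         (int)((id >> 16) & 0xFF), (int)(id >> 24),
                         (long)props[i].value);
        else                    /* anything else is written as an integer */
            n = snprintf(buf + used, cb - used, "%s%lu %ld", sep,
                         (unsigned long)id, (long)props[i].value);
        if (n < 0 || (size_t)n >= cb - used)
            return -1;
        used += (size_t)n;
    }
    if (used + 2 > cb)
        return -1;
    buf[used++] = ';';          /* each property set ends with a semicolon */
    buf[used] = '\0';
    return (int)used;
}

/* Write the whole {\*\propsets ...} group. Returns chars written or -1. */
static int WritePropSetsTable(char *buf, size_t cb, const PropertySet *sets, size_t setCount)
{
    static const char header[] = "{\\*\\propsets ";
    size_t used = strlen(header);
    if (used + 2 > cb)
        return -1;
    memcpy(buf, header, used);
    for (size_t i = 0; i < setCount; i++) {
        int n = WritePropSet(buf + used, cb - used, sets[i].props, sets[i].count);
        if (n < 0)
            return -1;
        used += (size_t)n;
    }
    if (used + 2 > cb)
        return -1;
    buf[used++] = '}';
    buf[used] = '\0';
    return (int)used;
}
```

Given one set with ‘wght’ 800 and ‘wdth’ 104 and a second set with ‘wght’ 400, this writer reproduces the example table shown earlier in this section.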

Property Set Binary Format

The property id-value pair is written in the binary format as opyidProperty (0x8A), optProperty (opt8Bytes) followed by the 32-bit id and value. CPropertySet is written as opyidPropertySet (0x89), optPropSet (optArray) followed by the set’s opyidProperty’s. The array of property sets CPropertySets is written as opyidPropertySets (0x88), optPropertySets (optArray) followed by the opyidPropertySet’s. These constants are defined in rebinary.h.

Rendering Variable-Fonts and OpenType Features

In addition to backing-store enhancements, the display routines need to pass active variable-font axis coordinates and OpenType features to DirectWrite. See OpenType Variable Fonts for information about the DirectWrite APIs for this. To create a font specified in part by axis coordinates, RichEdit gets an IDWriteFontFace5 (see dwrite_3.h) with the desired axis coordinates in place of the usual IDWriteFontFace. It does this by calling IDWriteFontFace::QueryInterface() to get an IDWriteFontFace5 interface, calling IDWriteFontFace5::GetFontResource() to get an IDWriteFontResource interface, releasing the IDWriteFontFace5 and calling IDWriteFontResource::CreateFontFace() to get a new IDWriteFontFace5 with the desired axis coordinates. Then it uses this IDWriteFontFace5 instead of the original IDWriteFontFace.

To pass OpenType features to DirectWrite, copy them into a std::vector<DWRITE_TYPOGRAPHIC_FEATURES> and pass them to IDWriteTextAnalyzer1::GetGlyphs() and IDWriteTextAnalyzer1::GetGlyphPlacements(). Some font features, such as Gabriola’s stylistic set 6 ‘ss06’ introduce glyphs with ascents and/or descents that exceed the standard typo ascents and descents as discussed in High Fonts and Math Fonts. To display such large glyphs with no clipping, the rendering software needs to calculate the line ascent and descent from the glyph ink, rather than from the usual font values. This is the approach used with the LineServices math handler.

OpenType Variable Font Axes

The variable font axes are defined in the OpenType fvar table, which has the header

struct FvarHeader             // Variable font fvar table header
{
   OTUint16 majorVersion;     // Major version of fvar table (1)
   OTUint16 minorVersion;     // Minor version of fvar table (0)
   OTUint16 axesArrayOffset;  // Byte offset from table start to first VariationAxisRecord
   OTUint16 reserved;         // Permanently reserved (2)
   OTUint16 axisCount;        // Count of VariationAxisRecord's
   OTUint16 axisSize;         // BYTE count of VariationAxisRecord (20 for this version)
   OTUint16 instanceCount;    // Count of InstanceRecord's
   OTUint16 instanceSize;     // BYTE count of InstanceRecord
};                            //  (axisCount*sizeof(DWORD) + (4 or 6))

Types like OTUint16 that begin with OT describe big-endian quantities (2 bytes for OTUint16, 4 bytes for OTUint32 and OTFixed) that need reverse ordering to work with our little-endian machine architecture. The header is followed by axisCount VariationAxisRecord’s defined by

struct VariationAxisRecord
{
   OTUint32 axisTag;          // Tag identifying axis design variation
   OTFixed  minValue;         // Minimum coordinate value (16.16 format)
   OTFixed  defaultValue;     // Default coordinate value
   OTFixed  maxValue;         // Maximum coordinate value
   OTUint16 flags;            // Axis qualifiers (hidden if 1)
   OTUint16 axisNameID;       // ID for 'name' table entry that provides axis display name
};
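The header and axis records above can be read from the raw table bytes roughly as follows. This is a minimal sketch, not RichEdit’s parser: AxisInfo, ReadFvarAxes, and the ReadBE helpers are hypothetical names, and the numeric fields are assembled byte by byte so the code doesn’t depend on the host byte order. The tag bytes are stored in reading order in the file, so the first byte becomes the low byte of the MakeTag() form.

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble big-endian values from a byte buffer. */
static uint16_t ReadBE16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t ReadBE32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

typedef struct AxisInfo {       /* hypothetical host-order axis record */
    uint32_t tag;               /* in MakeTag() order */
    int32_t  minValue;          /* 16.16 fixed point */
    int32_t  defaultValue;
    int32_t  maxValue;
    uint16_t flags;
    uint16_t nameID;
} AxisInfo;

/* Copy up to maxAxes axis records into axes[].
   Returns the number copied, or -1 if the table looks malformed. */
static int ReadFvarAxes(const uint8_t *table, size_t cb, AxisInfo *axes, int maxAxes)
{
    if (cb < 16 || ReadBE16(table) != 1)        /* 16-byte header, majorVersion 1 */
        return -1;
    size_t offset    = ReadBE16(table + 4);     /* axesArrayOffset */
    size_t axisCount = ReadBE16(table + 8);
    size_t axisSize  = ReadBE16(table + 10);
    if (axisSize < 20 || offset + axisCount * axisSize > cb)
        return -1;

    int n = 0;
    for (size_t i = 0; i < axisCount && n < maxAxes; i++, n++) {
        const uint8_t *p = table + offset + i * axisSize;
        axes[n].tag = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                      ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
        axes[n].minValue     = (int32_t)ReadBE32(p + 4);
        axes[n].defaultValue = (int32_t)ReadBE32(p + 8);
        axes[n].maxValue     = (int32_t)ReadBE32(p + 12);
        axes[n].flags        = ReadBE16(p + 16);
        axes[n].nameID       = ReadBE16(p + 18);
    }
    return n;
}
```

Stepping by axisSize rather than by a fixed 20 bytes lets the loop tolerate a later table version with larger records.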

The axisTag’s have the same MakeTag() form as the regular OpenType tags. Since they are accessed via the OpenType fvar table, they are in a different namespace from the regular OpenType tags. We don’t know of any tag conflicts between the two name spaces, so it’s probably okay not to mark the axis tags differently. But internally we mark OpenType feature tags by setting the high bit of byte 2 (OR in tomOpenTypeFeature), since the tags consist of ASCII symbols in the range 0x20..0x7E. This marking avoids sending OpenType tags to the wrong DirectWrite APIs.

The VariationAxisRecord’s are followed, in turn, by the InstanceRecord’s defined by

struct InstanceRecord
{
   OTUint16 subfamilyNameID;       // ID for 'name' table entry giving subfamily name
   OTUint16 flags;                 // Reserved for future use (0)
   OTFixed coordinates[axisCount]; // axisCount coordinates, one per axis
   OTUint16 postScriptNameID;      // Optional. ID for 'name' table entry giving PostScript name
};

At some point, it might be worth dealing with the InstanceRecord’s, but it’s certainly easier to use axis coordinates than handle myriad localizable font names (see the HoloFont discussion in the introduction). RichEdit could export a facility for translating between the two, but probably such a facility should be delegated to the font picker. The localizable font names are designed to help end users recognize the nature of a variable font instance, but they aren’t efficient at the RichEdit level. They also aren’t usable for variable-font animations, since such animations vary axis coordinates continuously.

4-28 Decimal Floating-Point Format

The OpenType “fvar” table described in the previous section defines the min, max, and default variable-font axis coordinate values using the OpenType 16.16 numeric format. The integer part of the value is given by shifting right 16 bits, i.e., dividing by 65536. If the fractional part is nonzero, store the value in a floating-point variable and divide by 65536. In applications, coordinates are easier to read when the fractional part is 0 if only the integer part is displayed. Since purely fractional coordinates (values < 1) are useless, if the absolute value is less than 65536, the value can be understood to be an integer without a fractional part.

The OpenType 16.16 format is a binary fixed-point format that may encounter roundoff when converted to decimal, e.g., 800.1 → 800.100006. This roundoff is ugly in RTF, CSS, and dialog boxes. So we need a decimal floating-point format that doesn’t have such roundoff. The IEEE 754-2008 decimal floating-point encoding defines decimal32 with 20 bits of precision, a sign bit and the large exponent range of 10^192. OpenType variable-font axis coordinates need at most four decimal places. The sign bit is used for the slant (slnt) standard axis and can be used for custom axes.
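The 800.1 roundoff mentioned above is easy to reproduce. The sketch below uses two hypothetical helpers, DoubleToFixed and FixedToDouble, to round-trip a coordinate through 16.16. The nearest 16.16 value to 800.1 is 52435354, which converts back to 800.100006103515625.

```c
#include <stdint.h>

/* 16.16 fixed point to double; exact, since it divides by a power of 2. */
static double FixedToDouble(int32_t fixed)
{
    return fixed / 65536.0;
}

/* Double to 16.16, rounding to the nearest representable value.
   Assumes |v| < 32768 so the result fits in 32 bits. */
static int32_t DoubleToFixed(double v)
{
    double scaled = v * 65536.0;
    return (int32_t)(scaled + (scaled >= 0 ? 0.5 : -0.5));
}
```

Printing FixedToDouble(DoubleToFixed(800.1)) with six decimal places gives the 800.100006 quoted in the text, while whole numbers such as 800 survive the round trip unchanged.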

If the value has no fractional part, we store it as a standard 2’s complement integer rather than in the high word of 16.16 for readability in RTF, CSS and dialog boxes. To convert it to the 16.16 format, multiply by 65536. But if the value has a fractional part, we use the following signed 4-28 decimal floating-point format

Bits      Field
31        s (sign)
30..28    n (decimal divide value)
27..0     significand

If the number is negative, the sign bit 31 is 1. Bits 0..27 are the significand. The decimal divide value n, stored in bits 28..30, is defined by

n      divide significand by
000    (not floating point)
001    10
010    100
011    1000
100    10000
101    100000
110    1000000
111    (not floating point)

n must have at least one 0 bit to distinguish the format from a negative 2’s complement integer and at least one 1 bit to distinguish it from a positive integer.

This gives 28 bits of precision with a maximum value of (2^28 – 1)/10 = 26843545.5 with one decimal place and a minimum value of 0.000001 with six decimal places. These limits are beyond the values used for OpenType variable font-axis coordinates, which typically range between 1 and 1000. The 4-28 decimal floating-point format is easy to use and displays the original fixed-point values with no round-off error. To convert it to the 16.16 format, store the 28-bit significand field in a double variable, divide by the number corresponding to n, multiply by 65536 and round to the nearest integer. For the DWrite APIs, store the 28-bit significand field in a double, divide by the number corresponding to n and cast the result to a FLOAT.

In C, the 4-28 decimal floating-point format of the value x is recognized by the function IsDecimalFloat(x) defined by

#define IsDecimalFloat(x)       IN_RANGE(1, ((x) >> 28) & 7, 6)

where IN_RANGE() is defined by

#define IN_RANGE(n1, b, n2)     ((unsigned)((b) - (n1)) <= (unsigned)((n2) - (n1)))

The divide factor in the n table is given by pow(10, (x >> 28) & 7), or (x >> 28) & 7 can be used as a table index.
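Putting the pieces of this section together, a decoder for the 4-28 format might look like the sketch below. It treats the n field as covering the table values 001..110 and uses a lookup table instead of pow(). The function names and the kDivide table are hypothetical, and the code assumes the sign bit applies to the magnitude held in the significand, as the description of bit 31 suggests.

```c
#include <assert.h>
#include <stdint.h>

#define IN_RANGE(n1, b, n2)  ((unsigned)((b) - (n1)) <= (unsigned)((n2) - (n1)))
#define IsDecimalFloat(x)    IN_RANGE(1, ((x) >> 28) & 7, 6)

/* Divide factors indexed by n; entries 0 and 7 are never used for decoding. */
static const double kDivide[8] = { 1, 10, 100, 1000, 10000, 100000, 1000000, 1 };

/* Pack a sign, a divide selector n (1..6) and a 28-bit significand. */
static uint32_t MakeDecimalFloat(int negative, unsigned n, uint32_t significand)
{
    assert(n >= 1 && n <= 6 && significand < (1u << 28));
    return (negative ? 0x80000000u : 0u) | ((uint32_t)n << 28) | significand;
}

/* Decode a property value: 4-28 decimal float, else a 2's complement integer. */
static double PropValueToDouble(uint32_t x)
{
    if (!IsDecimalFloat(x))
        return (double)(int32_t)x;
    double v = (double)(x & 0x0FFFFFFFu) / kDivide[(x >> 28) & 7];
    return (x & 0x80000000u) ? -v : v;
}

/* Convert a property value to OpenType 16.16, rounding to nearest. */
static int32_t PropValueToFixed(uint32_t x)
{
    double scaled = PropValueToDouble(x) * 65536.0;
    assert(scaled < 2147483648.0 && scaled > -2147483649.0);  /* must fit 16.16 */
    return (int32_t)(scaled + (scaled >= 0 ? 0.5 : -0.5));
}
```

For instance, 800.1 is stored as significand 8001 with n = 1, that is 0x10001F41, and decodes back to 800.1 with no stray digits. For the DirectWrite APIs the result of PropValueToDouble() would be cast to a FLOAT.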

