How do I find the bounding box for a character in a font?


A customer had the following question:

I'm looking for a way to get the height of text which consists entirely of digits, for example "123", as these characters do not have any descent or internal leading. I expected functions like Get­Text­Extent to return the character's ascent minus the internal leading, but in fact it returns the ascent plus the descent plus the internal leading. I considered getting the font metrics and taking the TEXT­METRICS.tmAscent, but I'm worried that numbers in other languages might have a nonzero descent and internal leading. Is there a function I can call to return the "real" height of the text?

Well, first of all, this question makes an assumption about digits that isn't even true in English. Fonts developed in recent years tend to keep all digits the same height (and often the same width), but fonts designed before the advent of computers (or computer fonts which were inspired by old-timey fonts) will often vary the height, and sometimes even have digits with descenders. Here's an example from the font Georgia:

058

Observe that the number zero is six pixels tall, whereas the number eight is nine pixels tall, and the number five has a two-pixel descender!

Okay, so you're going to have to take the descent into account for all languages, including English. Internal leading is the space above a character to separate it from elements above it. For example, you need some space above a capital T so that the horizontal bar remains readable. Again, the assumption that English doesn't need internal leading is false.

Okay, but what about the original question? Well, when I heard this question, my first thoughts went back to the early days of Win32 when the coolest new GDI feature was paths, and everybody was showing off the fancy text effects you could pull off with the aid of paths. My initial instinct was therefore to use the same technique as those cool demos by combining Begin­Path, Text­Out, and End­Path. Once I had a path, I could get its dimensions by using Path­To­Region and Get­Rgn­Box.

Fortunately, it turns out that there's an easier way. The Get­Glyph­Outline function returns the glyph metrics, which describe the bounding box of the pixels of a character.

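// Create an identity matrix. Each FIXED member is { fract, value }.
static const MAT2 c_mat2Identity = {
    { 0, 1 }, /* eM11 = 1.0 */
    { 0, 0 }, /* eM12 = 0.0 */
    { 0, 0 }, /* eM21 = 0.0 */
    { 0, 1 }, /* eM22 = 1.0 */
};

// Ask only for the metrics of the digit zero; no bitmap or outline is retrieved.
GLYPHMETRICS gm;
if (GetGlyphOutline(hdc, L'0', GGO_METRICS, &gm, 0, NULL,
                    &c_mat2Identity) == GDI_ERROR) {
    // The call failed, so the contents of gm are not valid.
}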

The dimensions of the character are returned in the GLYPH­METRICS structure, and in particular, you can derive the bounding box from the gmptGlyph­Origin, gmBlack­BoxX, and gmBlack­BoxY members.
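Here is a minimal sketch of that derivation, assuming the usual GDI convention that gmptGlyph­Origin is the upper left corner of the black box relative to the character origin on the baseline, with y increasing upward. The types GlyphPoint, GlyphMetricsLite, and InkRect and the functions GlyphInkRect and InkHeight are hypothetical names invented for this illustration; they mirror only the GLYPH­METRICS members used here so that the arithmetic can be read (and compiled) without the Windows headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for POINT and GLYPHMETRICS (illustration only). */
typedef struct { long x, y; } GlyphPoint;
typedef struct {
    unsigned   gmBlackBoxX;      /* width of the black box   */
    unsigned   gmBlackBoxY;      /* height of the black box  */
    GlyphPoint gmptGlyphOrigin;  /* upper left corner, relative to the
                                    character origin, y pointing up */
} GlyphMetricsLite;

typedef struct { long left, top, right, bottom; } InkRect;

/* Bounding box of one glyph in device coordinates (y pointing down),
   given the pen position on the baseline. */
InkRect GlyphInkRect(const GlyphMetricsLite *gm, long penX, long baselineY)
{
    InkRect rc;
    assert(gm != NULL);
    rc.left   = penX + gm->gmptGlyphOrigin.x;
    rc.top    = baselineY - gm->gmptGlyphOrigin.y;  /* flip the y axis */
    rc.right  = rc.left + (long)gm->gmBlackBoxX;
    rc.bottom = rc.top + (long)gm->gmBlackBoxY;
    return rc;
}

/* Height of the union of the black boxes of several glyphs that share
   a baseline. Returns 0 if count is 0. */
long InkHeight(const GlyphMetricsLite *glyphs, size_t count)
{
    long top = 0, bottom = 0;
    size_t i;
    for (i = 0; i < count; i++) {
        InkRect rc = GlyphInkRect(&glyphs[i], 0, 0);
        if (i == 0 || rc.top < top) top = rc.top;
        if (i == 0 || rc.bottom > bottom) bottom = rc.bottom;
    }
    return bottom - top;
}
```

To answer the customer's original question with this sketch, you would fill one structure per digit from Get­Glyph­Outline and pass the array to InkHeight. A digit with a descender pushes the bottom edge below the baseline, and a taller digit pushes the top edge up, so the result covers both cases.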

Comments (16)
  1. Anonymous Coward says:

    Before using techniques like these, first ask yourself if you really need it. For example, if the numbers are likely to change, the misuse of this method can lead to jumping baselines and other visual instabilities.

  2. configurator says:

    I know you're not a .net expert, but is this what the .Net MeasureString function does? (msdn.microsoft.com/…/system.drawing.graphics.measurestring.aspx)

    I found it to be slightly wrong a lot of times, being either a few (3-10) pixels too narrow or too wide.

  3. Gwyn says:

    Another possibility which I have used is DrawText specifying the DT_CALCRECT format.

  4. Richard says:

    "Text" figures, also known as "old style" figures (including in Word, which finally, in 2010, supports OpenType selection of digit style), are *designed* to be mixed in with text and work better in blocks of text.

    "Lining" (or "titling") figures, with the same height, baseline and width, are designed for titles or for use alongside lots of other numbers (e.g. in a table).

    Support is included in some modern fonts. For instance, the Office 2007/Vista fonts Constantia, Corbel and Cambria all include both lining and text figure glyphs.

  5. Drak says:

    Ah, this might come in handy at some point!

    By the way, well done on the 'Leave a comment'-at-the-bottom issue and all-replies-on-one-page issue Raymond.

    In the "Why doesn't the Windows Vista copy progress dialog show the names of the files being copied?" post it looks like your replies don't have a yellow block around them anymore, is this intentional?

    [The yellow block is back! Thanks for pointing it out. -Raymond]
  6. Stubie says:

    Here's a related question:

    How can I find out how far (or if) text extends to the left (or right) of the text extent, or even the calculated rectangle (using DrawText() with DT_CALCRECT) with ClearType? For example, using a font like Tahoma, with a font height of -11, the numbers 4, 6, 8 and 9 extend a pixel to the left of the specified rectangle if using DrawText() with DT_NOCLIP. If I want to erase the text, and draw new text, I have to know that it previously had extended an extra pixel to the left, otherwise I'll leave behind traces of the previous text. Something like GetCharABCWidthsFloat() gives no indication that it'll extend a bit extra to the left.

  7. Mike Dimmick says:

    @configurator: quite likely you're measuring with GDI+ and rendering with GDI. GDI is the older technology designed for best readability at the exact size given, which means that the relative character widths and spacing change when the graphics are scaled, while GDI+ text rendering is designed to scale more consistently for zoom and changed DPI.

    Controls that simply wrap a Windows control such as, well, almost all of them including Label, Button, TextBox use GDI rendering – but System.Drawing.Graphics.DrawString uses GDI+. Your window will often look inconsistent if you have a mix. Use System.Windows.Forms.TextRenderer to measure and draw strings with GDI (added in .NET 2.0 for this exact reason).

  8. Random832 says:

    Actually – lining figures merely all have the same height – the same-width thing is a separate distinction – 'tabular' vs proportional. On my system a couple dozen fonts* have proportional lining figures as their default style. Tabular text figures seem to be rare, but Cooper Black has them, and Papyrus seems to aim for them and miss [the digit 4 is 3.6% wider]

    Lining/titling figures are meant to appear in headlines [where height uniformity is important – all letters will be in uppercase in such contexts, leading to the alternate name "uppercase numerals", or more commonly 'lowercase' for text figures], whereas tabular figures [whose widths match not only one another, but often also the currency symbols and plus/minus, and the 'figure space' and 'figure dash' characters] are meant to appear in tables.

    *Falling mostly into the two categories of "Display fonts" [the clearest illustration of this principle is that Gill Sans MT Ext Condensed Bold has them whereas other Gill Sans varieties have tabular figures] and "Handwriting imitation" [though a surprising number of these had tabular figures] – Berlin Sans was the only example of a normal text font with proportional lining figures, though several (Most prominently Georgia and three of the C family) had proportional text figures.

  9. Drak says:

    @Mike:

    Thanks for that piece of info.. Now where do I store it so that next time I'm using GDI to draw a string I can still find it.. (Damn my memory, it's so full of holes)

  10. David Walker says:

    Slightly (but not much) off-topic, I really, really don't like reading text where the numbers are all "misaligned" vertically.  It looks like something is wrong!  I know that things were done this way historically, but it just takes me out of the flow of reading to see numbers jumping up and down.  (Of course, when letters do that, it doesn't bother me.  Hmm…..)

  11. David Walker says:

    No mention of kerning pairs here, but that is letter-pair-specific (I don't mean two or three-letter glyphs, but true kerning pairs).

  12. Gabest says:

    Not just simple kerning, there are languages that change the shape of the characters depending on which two are paired up, Arabic and other crazy middle-east/asian languages.

  13. Hans Passant says:

    So does True-Type hinting.

  14. rs says:

    @Stubie

    I think SetBoundsRect/GetBoundsRect is what you are looking for.

  15. Georgia is an interesting font. Not only do the digits have different heights, but they have different *widths* as well. Most other proportional computer fonts I've seen have monospaced digits, and it's not uncommon for code to assume that digits are monospaced even in a proportional font.

    As soon as someone selects Georgia, that assumption breaks.

  16. Mihai says:

    This will work for digits, it is definitely a bad idea to use it for general text.

    For complex scripts even the concept of "bounding box for a character" is broken, because characters might change shape based on context.

    Heck, there are examples even in the Latin script. Take the 'fi' ligature, (or fl, or ffi/ffl in some advanced fonts).

    The width (and the height for some fonts) of 'fi' is different than the one for 'if'.
