Underlining the Power of Windows UI Automation


This post describes how you can leverage the Windows UI Automation (UIA) API to help your customers interact with text shown in an app.

Apology up-front: When I uploaded this post to the blog site, the images did not get uploaded with the alt text that I'd set on them. So any images are followed by a title.

Introduction

Some time back, an organization which works with people with low vision asked me whether it would be practical for me to build a tool which provides customizable feedback to indicate where keyboard focus is, and where the text insertion point is. The result is Herbi HocusFocus at the Microsoft Store, and I described the app's journey in the following posts:

 

(Note that for the rest of this post, I'm going to refer to the "text insertion point" as the caret, given that in some apps, the caret can appear in read-only text, and as such, isn't actually a text insertion point.)

The organization got back to me recently to let me know that some of their clients would like a line shown underneath the sentence that they're reading, and wondered whether it would be practical for me to update my app to have that line shown. This seemed like the sort of feature where in some high-profile target apps, the Windows UI Automation (UIA) API could really help. So I spent a few hours adding a first version of the feature to the app, and uploaded it to the Microsoft Store. I knew there'd be certain constraints with what I'd built, but I was hoping it would work well enough to generate feedback, and help me prioritize my next steps.

Certainly the initial response has been encouraging, as I've been told that the update is receiving very positive reactions. From the organization:

"Awesome! I went and tried the feature right away and it's certainly a very great addition for people with cerebral visual impairment and people who suffer from acquired brain impairment/injury."

 

One handy aspect of the new feature is that the visuals shown for the underline were very straightforward for me to implement. While the other custom visuals shown by the app involve dealing with transparency, this new feature simply moves a window around. The window's positioned below the current line in the text, and is as wide as the line. It has a fixed short height, and has a background of the same customizable color as the caret highlight. Given how straightforward that was, it's really only the UIA interaction that's of interest here.
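The positioning described above comes down to a little rectangle arithmetic. Here is a minimal sketch of it, assuming the bounding rectangle of the current line is already known. The class name, method name and the height of 4 pixels are all made up for illustration, and are not taken from the app.

```csharp
using System.Drawing;

public static class UnderlineLayout
{
    // Hypothetical fixed height for the underline window, in pixels.
    public const int UnderlineHeight = 4;

    // Returns bounds for the underline window: directly below the line's
    // bounding rectangle, as wide as the line, and with a fixed short height.
    public static Rectangle GetUnderlineBounds(Rectangle lineBounds)
    {
        return new Rectangle(
            lineBounds.Left,
            lineBounds.Bottom,
            lineBounds.Width,
            UnderlineHeight);
    }
}
```

A caller would then assign the returned rectangle to the Bounds of the window being used as the underline.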

This has been a reminder for me of how a feature that's relatively straightforward to add using UIA can have an important impact on people's lives. As such, I'd encourage all devs to consider how you can leverage UIA to add exciting features to your own apps which could help your customers in new ways.

 

So what are the constraints in my new feature?

Before describing how I leveraged UIA to add the new feature, it's worth calling out a few scenarios which won't be impacted by my recent work. Depending on the feedback I get, I can consider which of these constraints are most impactful to people using my app, and so consider what I can do about that.

The UIA Text pattern needs to be available

In order for my app to learn where the current line is, the target app needs to support the UIA Text pattern. The Text pattern provides a great deal of information about the text shown in an app, and I talked about the Text pattern in the series of posts starting at "So how will you help people work with text?". However, not all apps support the Text pattern. So if the target app doesn't support the Text pattern, then there'll be no underline shown in the app.

The Inspect SDK tool can report whether a text-related part of an app's UI supports the Text pattern. If the Text pattern is supported, then the element's IsTextPatternAvailable property is true.

Figure 1: The Inspect SDK tool reporting that the editable area in WordPad supports the UIA Text pattern.

 

The app only works with editable text

The app tracks changes to the caret position in an app, and so won't work in apps where there is no caret. This will typically be the case with apps that show read-only text, and provide no way to move the caret with the keyboard in order to select text.

 

Checking the UIA ControlType of the element

I wanted to ease into the new feature somewhat, so I could feel confident that things were working as required in a few well-known scenarios, before opening it up to work in as many places as it can. As such, I checked the ControlType of the UIA element that claims to support the Text pattern, and decided to only show the line beneath the current line of text if the ControlType was Document or Edit. I expect I'll remove this constraint at some point.

You might be wondering what types of control other than Document or Edit would raise a UIA TextSelectionChanged event? Well, say an app presents read-only text, but provides a way to move the caret through the text with the keyboard, in order to select text. I've found one app which does this, and the ControlType of the element supporting the Text pattern is Pane.
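The control type check can be kept in one small helper, so that loosening the constraint later is a one-line change. The sketch below is only an illustration: the class name and the allowPane opt-in are hypothetical and not part of the app, while the three numeric values are the control type ids defined in UIAutomationClient.h.

```csharp
public static class ControlTypeFilter
{
    // Control type ids from UIAutomationClient.h.
    public const int UIA_EditControlTypeId = 50004;
    public const int UIA_DocumentControlTypeId = 50030;
    public const int UIA_PaneControlTypeId = 50033;

    // Returns true if the underline feature should run for this control type.
    // allowPane is a hypothetical opt-in for apps which expose read-only,
    // caret-navigable text through a Pane element.
    public static bool IsSupported(int controlTypeId, bool allowPane = false)
    {
        switch (controlTypeId)
        {
            case UIA_EditControlTypeId:
            case UIA_DocumentControlTypeId:
                return true;

            case UIA_PaneControlTypeId:
                return allowPane;

            default:
                return false;
        }
    }
}
```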

 

A regrettable use of legacy technology

When I first added the feature of highlighting the caret position a few years ago, I made a poor choice. While all the focus-tracking action I took used only UIA, the caret tracking used a mix of UIA, and some legacy technology called winevents. I only included some use of a winevent because I was familiar with how that related to the caret position, and it seemed convenient for me at the time. I did it despite knowing how winevents are desktop-only technology, and so my feature wouldn't work if in the future it's downloaded through the Store to other platforms. And I did it despite UIA supporting a TextSelectionChanged event, which an app can raise whenever the caret moves. Well, I'm regretting doing that now.

It seems I've found a situation where an app raises the UIA TextSelectionChanged event, yet my event handler for the legacy winevent doesn't get called. So my app doesn't realize the caret has moved. So this means I need to ditch my use of the legacy winevent, and move to only use UIA for tracking the caret. This is probably something I would have done anyway at some point, but I now have a growing urgency. I wouldn't say this is my biggest regret in life, but I am kicking myself rather. I have little enough time to work on my app as it is, so to be adding to my workload simply because I chose to use legacy technology, really doesn't help. It's UIA-only for me now.

 

Using UIA to determine the bounding rectangle of the line of text where the caret is

So the goal with the new feature is to underline the line of text which currently contains the caret. That involves two steps. The first is to realize when the caret's moved, and the second is to get the bounding rectangle of the line of text where the caret is. My app already had code to react to the caret moving, and like I said above, that's not done by my app today in a way I'd recommend. Instead, I'd recommend that an app registers for notifications when the caret moves, by calling IUIAutomation::AddAutomationEventHandler(), passing in UIA_Text_TextSelectionChangedEventId. Your event handler will then get called as your customer moves the caret around the target app.

Note: Beware of what threads are being used in this sort of situation. Calling back into UIA from inside the event handler can cause unexpected delays, and I often avoid that by requesting when I register for the event, that certain data of interest relating to the element that raised the event, is to be cached with the event. Also, the WinForms app's UI typically won't be updatable from inside the event handler, so I may call BeginInvoke() off the app's main form in order to have the UI update made on the UI thread. Even having done that, I did once in a while find a COM interop exception thrown when trying to update the UI. I've not had a chance to figure out the cause of that yet, so I do have some exception handling in the code at the moment.

Ok, so let's say I know the caret's moved, and I need to find the bounding rect of the line of text which now contains the caret. I could take action every time the caret moves, to get the UIA element containing the caret, check whether it supports the UIA Text pattern, and if it does, get the bounding rectangle associated with the line of text containing the caret. But in practice, I don't need to check all that every time the caret moves. Rather, whenever keyboard focus moves, I can check whether the newly focused element supports the Text pattern. If it doesn't, then I know the Text pattern won't be available as the caret moves around that element, and I don't need to repeat the check for the pattern.
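The idea of doing the expensive check once per focus change, rather than once per caret move, can be shown without any UIA calls at all. In this minimal sketch, the FocusSnapshot and CaretLineTracker types are hypothetical, and the delegate stands in for whatever cross-process query the app really makes.

```csharp
using System;

// Hypothetical snapshot of what was learned about the focused element.
public sealed class FocusSnapshot
{
    public FocusSnapshot(bool supportsTextPattern)
    {
        SupportsTextPattern = supportsTextPattern;
    }

    public bool SupportsTextPattern { get; }
}

public sealed class CaretLineTracker
{
    // Stand-in for the expensive cross-process query about the focused element.
    private readonly Func<FocusSnapshot> _queryFocusedElement;
    private FocusSnapshot _current;

    public CaretLineTracker(Func<FocusSnapshot> queryFocusedElement)
    {
        _queryFocusedElement = queryFocusedElement
            ?? throw new ArgumentNullException(nameof(queryFocusedElement));
    }

    // Call on every focus change. This is the only place the query is made.
    public void OnFocusChanged()
    {
        _current = _queryFocusedElement();
    }

    // Call on every caret move. This uses only the cached snapshot.
    public bool ShouldUpdateUnderline()
    {
        return _current != null && _current.SupportsTextPattern;
    }
}
```

However many times the caret moves within one element, the query runs only once, when focus arrives at that element.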

The code I ended up with is as follows:

 

 

// Cache the UIA element that currently has keyboard focus, so we don't need to retrieve
// it every time the caret moves.
private IUIAutomationElement _elementFocused;

// Call this in response to every change in keyboard focus or caret position.
public void HighlightCurrentTextLineAsAppropriate(bool focusChanged)
{
    // Are we currently highlighting the line containing the caret?
    if (!checkBoxHighlightTextLine.Checked)
    {
        // No, so make sure the window used for highlighting is invisible.
        _highlightForm._formTextLine.Visible = false;

        return;
    }

    // Using a managed wrapper around the Windows UIA API, (rather than the .NET UIA API),
    // I tend to hard-code UIA-related values picked up from UIAutomationClient.h.

    int propertyIdControlType = 30003; // UIA_ControlTypePropertyId
    int patternIdText = 10014; // UIA_TextPatternId

    // Are we here in response to a focus change?
    if (focusChanged)
    {
        // Hide the highlight until we know we can get the data we need.
        _highlightForm._formTextLine.Visible = false;

        // Forget the previously focused element, so that a failure below
        // can't leave us tracking a stale element.
        this._elementFocused = null;

        // Create a cache request so that we access the data that we know we'll
        // need with as few cross-proc calls as possible.
        IUIAutomationCacheRequest cacheRequest = _automation.CreateCacheRequest();
        cacheRequest.AddProperty(propertyIdControlType);
        cacheRequest.AddPattern(patternIdText);

        // Until I figure out the occasional interop exception, wrap this in a try/catch.
        try
        {
            // Get the UIA element currently containing keyboard focus. (Note: If I rearrange some code
            // I could avoid making this call, and instead use the element which originally supplied the
            // UIA FocusChanged event.)
            IUIAutomationElement elementFocusNew = _automation.GetFocusedElementBuildCache(cacheRequest);
            if (elementFocusNew == null)
            {
                // If we failed to get the focused element, give up.
                return;
            }

            // For this first version, only work with Document and Edit controls.
            int controlType = elementFocusNew.CachedControlType;
            if ((controlType != 50030) && // Document
                (controlType != 50004))   // Edit
            {
                Debug.WriteLine("Newly focused element is neither Document nor Edit, so reset.");

                this._elementFocused = null;
            }
            else
            {
                Debug.WriteLine("Newly focused element is either Document or Edit, so use it.");

                this._elementFocused = elementFocusNew;
            }
        }
        catch (Exception ex)
        {
            Debug.WriteLine(ex.Message + " " + ex.StackTrace);
        }
    }

    try
    {
        // Do we know which element contains keyboard focus?
        if (this._elementFocused != null)
        {
            // Does the element support the Text pattern?
            IUIAutomationTextPattern textPattern = this._elementFocused.GetCachedPattern(patternIdText);
            if (textPattern != null)
            {
                // Get the current selection from the element. Even if there is no selection,
                // we'll get data back relating to where the caret currently is.
                IUIAutomationTextRangeArray array = textPattern.GetSelection();
                if ((array != null) && (array.Length > 0))
                {
                    // For this version of the feature, only consider the first selection range
                    // if there are multiple selections in the app.
                    IUIAutomationTextRange range = array.GetElement(0);
                    if (range != null)
                    {
                        // Expand the range to encompass the entire line containing the caret.
                        range.ExpandToEnclosingUnit(TextUnit.TextUnit_Line);

                        // Now get the bounding rect for that line. The array holds
                        // left, top, width and height, in that order.
                        double[] rects = range.GetBoundingRectangles();
                        if ((rects != null) && (rects.Length > 3))
                        {
                            Rectangle rectLineText = new Rectangle(
                                (int)rects[0],
                                (int)rects[1],
                                (int)rects[2],
                                (int)rects[3]);

                            // We now know the bounding rect of interest. Move our highlight window
                            // so that it appears at the bottom edge of the bounding rect.
                            _highlightForm.HighlightCurrentTextLine(rectLineText);
                        }
                    }
                }
            }
        }
    }
    catch (Exception ex)
    {
        Debug.WriteLine(ex.Message + " " + ex.StackTrace);
    }
}

 

 

Perhaps one of the most exciting steps above is the call to ExpandToEnclosingUnit(). This is the place where I get to learn about the line that contains the caret. That function is really handy in other situations too, for example if I want to learn the bounding rectangle of a word or a paragraph, or of contiguous text which has the same formatting. That's pretty useful stuff in a variety of scenarios.
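One thing to watch for when a range covers more than a single line, (say, a paragraph): GetBoundingRectangles() returns one rectangle for each visible line in the range, as a flat array of doubles in groups of four, (left, top, width, height). Here's a minimal sketch of unpacking that array. The BoundingRects class and its method names are hypothetical helpers, not part of UIA or of the app.

```csharp
using System.Collections.Generic;
using System.Drawing;

public static class BoundingRects
{
    // Converts the flat array returned by GetBoundingRectangles() into
    // rectangles. Any incomplete trailing group of values is ignored.
    public static List<Rectangle> FromUiaArray(double[] values)
    {
        var result = new List<Rectangle>();
        if (values == null)
        {
            return result;
        }

        for (int i = 0; i + 3 < values.Length; i += 4)
        {
            result.Add(new Rectangle(
                (int)values[i],       // left
                (int)values[i + 1],   // top
                (int)values[i + 2],   // width
                (int)values[i + 3])); // height
        }

        return result;
    }

    // Returns the smallest rectangle containing all of the supplied
    // rectangles, or Rectangle.Empty if there are none.
    public static Rectangle UnionAll(IReadOnlyList<Rectangle> rects)
    {
        if (rects.Count == 0)
        {
            return Rectangle.Empty;
        }

        Rectangle union = rects[0];
        for (int i = 1; i < rects.Count; i++)
        {
            union = Rectangle.Union(union, rects[i]);
        }

        return union;
    }
}
```

For a single line, the list holds one rectangle, which matches what the code above builds by hand from the first four values.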

I should add that while the feature seems to hold up well enough in some apps, (including WordPad and Notepad,) it's not as reliable as I'd like it to be in other apps, (including Word 2016). I'll bet that's due to my use of the legacy technology I mentioned earlier. Ugh. I really need to find time to move to using only UIA, like I should have done in the first place.

Still, even with all the improvements I should look into, for a first version of the feature, it works well enough to generate the feedback that I need in order to make it as helpful as I can.

 

Figure 2: The line of text containing the caret being underlined in Word 2016.

 

Always keep in mind the accessibility of the app itself

Whenever I'm updating an app's UI, I need to consider the accessibility of the resulting UI. For this new feature, the only update to the UI is to add a check box at a specific place relative to existing UI. Because I'm adding a standard control that's provided by the WinForms framework, I know I'll get a great head start on accessibility. For example, the control will be fully usable via the keyboard, it'll be rendered using appropriate system colors when a high contrast theme is active, and the Narrator screen reader will be able to interact with the control. This is all great stuff, and in fact there were only two things I needed to check.

Focus order

As my customers tab through the UI, the order in which keyboard focus moves through the app must provide the most efficient means for my customers to complete their task. If keyboard focus were to bounce around the UI as the tab key is pressed, that would be at best a really irritating experience that no-one would want to have to deal with, or quite possibly make the app unusable in practice.

While my app's not web-based, the W3C Web Content Accessibility Guidelines criterion "Focus Order" sums up the principle nicely for web UI: "focusable components receive focus in an order that preserves meaning and operability". I want that to be true in any app I build, be the UI HTML, WinForms, WPF, UWP XAML or Win32.

Fortunately, Visual Studio makes it quick 'n' easy for me to make sure I'm delivering an intuitive tab order. All I do in Visual Studio is go to View, Tab Order, and then select each control in the UI, (using either the mouse or keyboard,) in the order I'd like keyboard focus to move through the UI.

The screenshot below shows all the controls in the app UI with an accompanying number shown by each control, indicating the control's position in the tab order.

 

Figure 3: The app form in Visual Studio's design mode, with the tab order shown by the controls in the form.

 

Note that when using the Tab Order feature, it is important to include the static Labels in the logical place in the order, even though keyboard focus doesn't move to the Labels as your customer tabs around. For some types of control, if an accessible name has not been set on the control, then WinForms might try to leverage an accessible name based on the text of a nearby Label. For example, this applies to a TextBox or ComboBox, which don't have a static text label built into the control. In those cases, having the associated Label precede the control in the tab order can result in the control getting the helpful accessible name that your customer needs.

Programmatic Order

Whenever I'm updating UI, I need to consider both the visual representation of the app, and the programmatic representation as exposed through UIA. Both of these representations must be high quality for my customers.

In some situations, the path my customers take when navigating through the UI will be based on the order in which the controls are exposed through the Control view of the UIA tree. The Control view is a view which contains all the controls of interest to my customers, including all interactable controls and static Labels conveying useful information. (So the view might not contain such things as controls used only for graphical layout which are not required to be exposed either visually or programmatically.)

Having added the new CheckBox to the app, I pointed the Inspect SDK tool at the UI to learn where the control was being exposed in the UIA hierarchy. It turned out that the CheckBox was being exposed through UIA as the first child element beneath the app window. So programmatically, it existed before all other elements in the UI. The screenshot below shows the CheckBox is being exposed before all the other elements, which are its siblings in the UIA tree.

Figure 4: Inspect reporting the UIA tree of the app, with the new CheckBox as the first child of the app window.

 

So say a customer using the Narrator screen reader encounters the app window. If they're not familiar with the app, they might choose to switch to use Narrator's Scan mode, to learn about the UI. By doing that, they may press the up and down arrows and move through the controls in the UI, including the controls which can't get keyboard focus. And the navigation path then taken through the controls must be a logical one based on the functionality in the app. Importantly, the path actually taken is impacted by the order of the elements as exposed through UIA. This means the first element they encounter will be the new CheckBox. Then they'll move to the static Label shown visually at the top of the app. And later, they'll move from the control shown visually before the new CheckBox, directly to a control following the new CheckBox. This is not the experience I want to deliver at all.

So to address this, I edit the designer.cs file for the app, and change the order in which the controls are added to the form. After I originally added the new CheckBox to the app, the related designer.cs code was as below. The line of interest is the first Controls.Add() call, which adds the new control called "checkBoxHighlightTextLine".

 

//
// HerbiHocusFocusForm
//
resources.ApplyResources(this, "$this");
this.AutoScaleMode = System.Windows.Forms.AutoScaleMode.Font;
this.Controls.Add(this.checkBoxHighlightTextLine);
this.Controls.Add(this.labelInstructions);
this.Controls.Add(this.checkBoxHighlightFocus);
this.Controls.Add(this.checkBoxCaretAbove);
this.Controls.Add(this.checkBoxCaretBelow);
this.Controls.Add(this.groupBoxCustomise);

 

 

So I grabbed the line adding the new CheckBox to the form, and I moved it to be between the lines which add the controls logically before and after the CheckBox. In the resulting code below, the line of interest now sits between the lines which add checkBoxCaretBelow and groupBoxCustomise.

 

//
// HerbiHocusFocusForm
//
resources.ApplyResources(this, "$this");
this.AutoScaleMode = System.Windows.Forms.AutoScaleMode.Font;
this.Controls.Add(this.labelInstructions);
this.Controls.Add(this.checkBoxHighlightFocus);
this.Controls.Add(this.checkBoxCaretAbove);
this.Controls.Add(this.checkBoxCaretBelow);
this.Controls.Add(this.checkBoxHighlightTextLine);
this.Controls.Add(this.groupBoxCustomise);

 

 

Having done that, if I now point Inspect at the UI, it reports that the CheckBox element is sandwiched between the other controls in the UIA tree in a manner that matches the meaning of the UI.

Figure 5: Inspect reporting the UIA tree of the app, with the new CheckBox exposed in the appropriate place in the UIA tree.
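Because this sort of ordering problem is easy to reintroduce whenever a control is added, it can be worth having a quick check that compares the order in which the elements are exposed with the order you intend. The sketch below is only that comparison. The ElementOrderCheck helper is hypothetical, and how the list of exposed names gets gathered is left open here.

```csharp
using System;
using System.Collections.Generic;

public static class ElementOrderCheck
{
    // Returns the index of the first position at which the two orders
    // differ, or -1 if the orders are identical.
    public static int FirstMismatch(
        IReadOnlyList<string> exposedOrder,
        IReadOnlyList<string> intendedOrder)
    {
        int shared = Math.Min(exposedOrder.Count, intendedOrder.Count);
        for (int i = 0; i < shared; i++)
        {
            if (!string.Equals(exposedOrder[i], intendedOrder[i], StringComparison.Ordinal))
            {
                return i;
            }
        }

        // If one list is longer, the first extra item is the mismatch.
        return exposedOrder.Count == intendedOrder.Count ? -1 : shared;
    }
}
```

With the control names from the designer.cs code above, the original order mismatches the intended order at index 0, and the rearranged order gives -1.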

 

So what's next?

Over the next few weeks I hope to grab a few early hours here and there, and work on some of the points raised by people using the app. These include:

1. Customizable color, size and transparency for the underline. That should be quick to do, so I'll probably work on that first. And as always, I'll make sure the app's tab order and UIA hierarchy are intuitive after updating the UI.

2. I got the feedback, "her last wish is that the focus marking would work even better and would work in the Windows menu etc." This is going to be an interesting one, and I'll need to follow up to learn more about exactly which UI is of interest. For regular app menus, my focus highlight struggles, because the menu can appear on top of my focus highlight. I expect I can address that, by adding an event handler such that I can learn when my highlight isn't top-most, and then moving my highlight back on top when necessary. (And I'd need to make sure I can't get stuck in a loop with other UI which also tries to keep itself on top.) But what I suspect is really the request here, is that the highlight works on the Start menu, and that's not straightforward for me to resolve. For my app to achieve that, it would need UIAccess privileges, and as I understand things, apps downloaded from the Store today can't get that. So the only way I could get that to work would be for me to revert to shipping my app through my own site and installer, and signing it, which I'm really not set up for at the moment. Hmm. I'm not sure what I'll do about this.

3. The app needs to work more reliably in some apps, and in some cases, work at all. One app which I'm particularly interested in is Edge, when caret browsing's turned on. For this to happen, I need to ditch the code I have relating to use of the legacy winevent, and move to only use UIA.

4. This last point is not something I've had feedback on, but is something I'm curious about nonetheless. Sometimes my underline appears further below the line of text than might seem appropriate, and I expect that's due to the paragraph formatting on the line. So I wonder if I can get what Word calls the "Spacing after" for the line of text, and account for that when positioning my underline. Perhaps this would keep the underline at the same distance from the text shown visually on the line, regardless of the spacing after. I really don't know if that's possible, but it'll be interesting to find out.
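Whatever the source of the spacing value turns out to be, shifting the underline by it means converting between units: UIA text attributes report paragraph spacing in points, while window positions are in pixels. A minimal sketch of that conversion follows, with a hypothetical helper name, and with the dpi passed in by the caller.

```csharp
using System;

public static class SpacingConversion
{
    // There are 72 points to an inch.
    public const double PointsPerInch = 72.0;

    // Converts a length in points to the nearest whole number of pixels
    // at the supplied dpi.
    public static int PointsToPixels(double points, double dpi)
    {
        return (int)Math.Round(points * dpi / PointsPerInch);
    }
}
```

Rounding to the nearest pixel here is a choice made for the sketch. Truncating would also work, at the cost of the underline sitting up to a pixel lower.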

 

Summary

It's been a pleasure for me to discuss with the organization which asked for the new underline feature, the human aspects of building software like Herbi HocusFocus. For me, software has always been a means to an end. That end being the impact it has on people's lives. Other feedback I've received from the organization is:

"I also agree with the fact the developers should look at the way they can create tools for people in their community. They can see it as a form of charity work, giving back to their community and they'll see how small tools or adjustments can mean a world of difference to people. Software developers should always wonder how their users will actually use their software and build it with their users in mind. People and usability come before the technology if you ask me. That's why accessibility and a user friendly design should be getting more attention. All users benefit from a good design and from accessibility options and some users even depend on them to be able to use a piece of software. I'm also a strong advocate for user testing before launching a product, most software developers have certain expectations of how their users will interact with their software. Sometimes they can be quite wrong about the way people use and view their software and which steps seem logical to the user."

 

I can't argue with any of that. In fact, after more than sixteen years of building exploratory assistive technology apps like Herbi HocusFocus, only a handful of my apps actually had any impact. And those were the apps where someone had contacted me, specifically asking whether I could build a tool for them, because there didn't seem to be a solution available to them already. I'm very grateful to have been able to learn from all those people, and to get a better understanding of where I can have most impact.

Overall, this exercise has been a reminder for me of how UIA can help devs add some seriously useful functionality to an app, with relatively little work. That doesn't mean to say UIA does everything you want. I once told a dev how UIA doesn't provide a simple way to get a collection of elements that lies within a rectangle. He replied saying "I find that difficult to believe". Well, we live in the world we live in, and there's lots of things that we might like to exist in this world, and they don't. As far as I'm concerned, we try to improve things for the future, and make the most of what's available to us today. And I believe that UIA has a lot to offer us and our customers today.

So please do consider how UIA might help you provide a powerful new feature to help your customers. Even with all my new feature's constraints, the first piece of feedback I got, was "Wow, I'm amazed by your actions!". Anyone who knows me, knows my actions are far from amazing. But I have the support of some very cool technology that can make me look pretty useful at times.

Guy


Comments (2)

  1. I mentioned in the “So what’s next?” section above, the idea of accounting for what Word calls the “Spacing after”, when rendering an underline beneath the current line of text. If I don’t account for that, the underline can appear much further away from the text at the last line of the paragraph. That seems a significant distraction to me.

    So I did some experimenting, and it seems that I can account for that in certain apps. For example, Word 2016 exposes the relevant paragraph attribute through UIA, so when I determine that I’m at the last line in a paragraph, I can get the “Spacing after” and move the underline up as required. Other apps don’t expose the spacing value through UIA, and so I can’t do anything to account for it in those apps. (Interestingly those other apps include WordPad, which does adjust the visual display based on paragraph spacing, despite not exposing the value programmatically.)

    I inserted the UIA-related action shown below, just before I called HighlightCurrentTextLine() in the code above in the “Using UIA to determine the bounding rectangle of the line of text where the caret is” section. This has been a fun exploration into how I can use UIA to access paragraph formatting in a document, and so deliver a more consistent experience for my customers.

    // If the current line of text is the last line in the paragraph,
    // then we may need to shift the underline up to account for the
    // paragraph’s “Spacing after” attribute.

    // In order to determine if this is the last line in the paragraph,
    // we’ll set two text ranges. One text range will represent the
    // current line, and the other text range will represent the current
    // paragraph. We’ll then compare the end points of the two text ranges.
    // If the two end points are the same, then we know the current line
    // is the last line in the paragraph.

    // So next get a second range based on the text range we already have,
    // (that existing range being the one that represents the current line).
    IUIAutomationTextRange rangeParagraph = range.Clone();

    // Now expand our new range to cover the entire paragraph.
    rangeParagraph.ExpandToEnclosingUnit(TextUnit.TextUnit_Paragraph);

    // Now compare the end points of the two text ranges.
    int compareResult = rangeParagraph.CompareEndpoints(
    TextPatternRangeEndpoint.TextPatternRangeEndpoint_End,
    range,
    TextPatternRangeEndpoint.TextPatternRangeEndpoint_End);

    // If the two end points are the same, then we know we may need
    // to adjust the underline to account for the paragraph spacing.
    if (compareResult == 0)
    {
    // Get the current line’s “Spacing after” attribute value.
    // (We could use either the line’s text range or the
    // paragraph’s text range here.)
    const int afterSpacingId = 40042; // UIA_AfterParagraphSpacingAttributeId
    double? afterSpacing = range.GetAttributeValue(afterSpacingId) as double?;
    int underlineAfterSpacingPoints = (afterSpacing == null ? 0 : (int)afterSpacing.Value);

    // Is there any spacing to account for here?
    if (underlineAfterSpacingPoints != 0)
    {
    // Convert the spacing from points to pixels. The vertical dpi
    // was accessed and cached earlier during initialization with
    // the following action:
    //
    // Graphics g = this.CreateGraphics();
    // this._dpiY = g.DpiY;
    // g.Dispose();

    int underlineAfterSpacingPixels = (int)(underlineAfterSpacingPoints * this._dpiY) / 72;

    rectLineText.Offset(0, -underlineAfterSpacingPixels);
    }
    }

    1. And by the way, sorry about the lack of indentation in the code in the blog comments. I tried using either spaces or tabs, and either way the indentation gets stripped when the code is published, and the code becomes difficult to read. This seems somehow ironic given that the code relates to spacing.
