Can UIA help you build a tool for someone you know?


This post describes how you can use the Windows UI Automation API to access links presented by an app, and to programmatically invoke those links. The post encourages you to consider whether this could help you build a tool to help someone you know who finds accessing links to be a challenge.


 

Get creative
 
I was wondering the other day about what sorts of ways someone might want to interact with hyperlinks in a browser. People might want to be informed about the links through a mix of visuals, audio and touch, and that representation would be tuned to suit an individual’s desires. (For example, large high-contrast text or higher volume audio.) And when it comes to invoking a link, that might be done through input mechanisms such as touch, mouse, keyboard, voice, eye gaze, a foot or head switch, or head pointer.

And the person interacting with the link should be given as much time as they need, both to understand all the information about the link and to invoke it. We never want our customers to be stressed about whether our UI will disappear before they can leverage it.


You may feel that a custom tool for interacting with links might be of use to someone you know. Perhaps a friend or family member might find that a list of links presented in some particular way is more practical to work with than the traditional distribution of links all over a web page. Or perhaps you yourself would?


So as I was thinking about this, while there are many input and output ways in which the links might be represented and invoked, there are still two steps that are always required. The first is to get the set of links available, and the second is to actually invoke them. I’m sure people who are far more creative than me will think of all sorts of new and exciting tools for interacting with links, but the tools need to know that the links exist, and allow the links to be invoked.


And as you’d expect, this is where I turn to the Windows UI Automation (UIA) API. I want to access all the links with as little code as possible, and as quickly as possible, and provide a way for the links to be programmatically invoked.


So I built a simple test app to do just that, and the code at the bottom of this post contains all the interesting stuff from that app. For this C# app, I used a managed wrapper around the native Windows UIA API. I could have used the .NET UIA API directly, or if I’d written a C++ app, used the native Windows UIA API directly. (A note on the differences between the two available UIA APIs is at the "I want to build a new AT tool to help someone I know. Should I use the managed .NET UIA API or the native Windows UIA API?" section at the end of Ten questions on programmatic accessibility.)

 
 

Interacting with hyperlinks with UIA


I decided that I’d write a WinForms app which gets a list of links from some window, and then presents the list in its own window. My customer can then select a link in the list, and have that link programmatically invoked.


So in Visual Studio Community 2015, I created a new WinForms project called “Linker”. I then added a ListView to present all the links and a "Close" Button. This means it only takes a couple of minutes to create an app that I can run and close. It sets the stage for the next step, which is to add all the functionality my customer needs.


And this is where I need to make an important usability decision. Exactly where should my new app look for all the links? Should it look for all the links being shown in a particular browser window? Or perhaps the foreground window, which may or may not be a browser. But if it’s the foreground window, how can my custom app be used without it becoming the foreground window? The answers here depend on what the person that you’re building the app for wants. And this is what’s so exciting about building the tool yourself. You can tune it however you want. You can make all sorts of assumptions which you know are valid in your situation. Perhaps you’ll adjust things if other people use the tool in the future, (which would be great,) but to start off with, you build the tool to help the one person you’re building it for.


If I was targeting a specific browser, I’d probably use UIA to find the browser window of interest, and that’d be a direct child element of the UIA root element. But for my new app, I decided that I’d have all the links found in the foreground window. This meant that I’d add a low-level keyboard hook, and respond to a particular key press. I chose F9 for no particular reason. (If this was a shipping app, I’d wonder what action F9 might also trigger in the foreground app.)
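
As a rough sketch of the specific-browser approach mentioned above, the snippet below searches only the direct children of the UIA root element for a window with a given class name. It assumes the same managed wrapper namespace that the app in this post uses. The BrowserWindowFinder class, the FindTopLevelWindow method and the choice of matching on class name are all illustrative assumptions, not something taken from the app. You'd use the Inspect SDK tool to learn what class name (or window name) the browser you care about really exposes.

```csharp
using interop.UIAutomationCore;

// Hypothetical helper, not part of the Linker app in this post.
static class BrowserWindowFinder
{
    // UIA_ClassNamePropertyId, from the SDK's UIAutomationClient.h.
    private const int propertyIdClassName = 30012;

    // Returns the first top-level window whose UIA ClassName property matches
    // the supplied class name, or null if no such window is found.
    public static IUIAutomationElement FindTopLevelWindow(
        IUIAutomation uiAutomation, string className)
    {
        IUIAutomationCondition condition =
            uiAutomation.CreatePropertyCondition(propertyIdClassName, className);

        // Only examine the direct children of the root element (the desktop),
        // rather than walking the whole UIA tree.
        IUIAutomationElement root = uiAutomation.GetRootElement();

        return root.FindFirst(TreeScope.TreeScope_Children, condition);
    }
}
```

The element returned could then take the place of the foreground window element when gathering links. Keeping the search to TreeScope_Children matters here, because a descendants search from the root would visit every element in every app.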


It was the adding of the low-level keyboard hook that was the most time-consuming part of building this app. I pasted in some code from another experimental app of mine, but in the process managed to remove “static” from something. This meant that my keyboard hook worked once, and then would throw an exception. In the end I decided to pay attention to what Visual Studio was telling me, and I saw that some part of my hook was being garbage collected. By making it static again, it worked just fine.


And then came the most fun bit – using UIA to find all the links for me. I’ll not describe how I did that in this section, as all the code is below, and hopefully the comments describe what’s going on.


My first attempt at this seemed to work well. With one cross-proc call I could gather all the data I needed. In my tests, I could add all the links to my list within one second, (and often in less than half a second,) so I felt that perf would be acceptable.


To get everything I needed with one cross-proc call, I made use of UIA’s caching feature. When I gathered all the links, I asked UIA to cache certain data at the same time. This meant I didn’t need to go back to the app showing the links later to ask for the links’ names and URLs, or for a way to invoke them if necessary.


And on the subject of invoking the link, this will be done by my customer later through the use of the UIA Invoke pattern. I add a cached reference to the link’s UIA pattern object as a tag on the relevant list item in my app, and call the Invoke() method of that later if the link is to be invoked. That call to perform the invoke does involve a cross-proc call back to the app showing the links.


The screenshot below shows the new app presenting links found at http://msdn.com/enable. Once the links were presented in the new app, I alt+tabbed to it, and could page through the list to an Office link and invoke it.

 

 


Figure 1: Links found at http://microsoft.com/enable, with the list item for an Office developer link selected.



It’s worth pointing out here that my app makes no attempt to detect changes in the UI showing the links, and that can mean that the list of links becomes stale. If my customer navigates to another page of links, then any attempt to invoke an earlier link still shown in my app will fail. So after navigating to a new page of links, my customer will need to press F9 again to have a new list of links shown. Maybe I could add a UIA event handler to my app to try to detect changes which mean I should automatically repopulate my list of links.
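
For anyone curious what such an event handler might look like, here's a minimal sketch using UIA's StructureChanged event. It is not part of the app in this post. The LinkChangeWatcher class, its callback and the half-second quiet period are all assumptions made for illustration. A busy web page can raise a flood of structure change events, so the sketch waits for things to go quiet before asking for a refresh. Two caveats: Microsoft's guidance suggests registering UIA event handlers from a non-UI thread, which isn't shown here, and the callback would need to re-query the element being watched rather than the foreground window, given that the foreground window may by then be the tool itself.

```csharp
using System;
using System.Windows.Forms;
using interop.UIAutomationCore;

// Hypothetical helper: listens for UIA structure changes beneath one element and,
// once things have been quiet for half a second, runs a callback on the UI thread.
class LinkChangeWatcher : IUIAutomationStructureChangedEventHandler
{
    private readonly Form form;
    private readonly Timer quietTimer = new Timer { Interval = 500 };

    // Construct this on the UI thread, so that the WinForms timer ticks there.
    public LinkChangeWatcher(Form form, Action onLinksMayHaveChanged)
    {
        this.form = form;

        quietTimer.Tick += (timerSender, e) =>
        {
            quietTimer.Stop();
            onLinksMayHaveChanged();
        };
    }

    // Start listening for changes anywhere beneath the supplied element.
    public void Start(IUIAutomation uiAutomation, IUIAutomationElement element)
    {
        uiAutomation.AddStructureChangedEventHandler(
            element, TreeScope.TreeScope_Subtree, null, this);
    }

    // Stop listening. Pass the same element that was passed to Start().
    public void Stop(IUIAutomation uiAutomation, IUIAutomationElement element)
    {
        uiAutomation.RemoveStructureChangedEventHandler(element, this);
        quietTimer.Stop();
    }

    // UIA calls this on a thread other than the UI thread, so hop over to the
    // UI thread before touching the timer.
    public void HandleStructureChangedEvent(
        IUIAutomationElement sender, StructureChangeType changeType, int[] runtimeId)
    {
        if (form.IsHandleCreated && !form.IsDisposed)
        {
            form.BeginInvoke((Action)(() =>
            {
                // Restart the quiet period each time a change arrives.
                quietTimer.Stop();
                quietTimer.Start();
            }));
        }
    }
}
```

Whether the extra complexity is worth it depends on the person using the tool. Pressing F9 again is simple and predictable, and some people might prefer that the list never changes underneath them.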

 
 

But am I looking for the right links?


When I ask for a set of links from the foreground app, I use a UIA condition, and ask for all elements that match that condition. While I could tune a condition to be very specific, in general I want to make it as simple as possible. So all I asked for was the set of elements whose ControlType is Hyperlink, and whose Enabled property is True.


And this raises a critical point. My app will only work with UI that represents itself appropriately through UIA. If UI shows something that looks like a link visually, but it’s just a styled Button, then my app won’t find it. If I wanted to, I could update my app to get all enabled Hyperlinks or Buttons. For my purposes here, I’m going to stick to just looking for Hyperlinks. When you build your app, you might need to tune the conditions used to best suit the needs of the person you’re building it for.
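
If you did want to include Buttons, a sketch of the broader condition might look like the following. It assumes the same managed wrapper used by the app in this post, and the LinkOrButtonConditions class and its method name are made up for illustration. The numeric values are the ones defined in the SDK's UIAutomationClient.h.

```csharp
using interop.UIAutomationCore;

// Hypothetical helper, not part of the Linker app in this post.
static class LinkOrButtonConditions
{
    // Values from the SDK's UIAutomationClient.h.
    private const int propertyIdControlType = 30003;
    private const int propertyIdIsEnabled = 30010;
    private const int controlTypeIdButton = 50000;
    private const int controlTypeIdHyperlink = 50005;

    // Builds a condition matching enabled elements whose ControlType is either
    // Hyperlink or Button.
    public static IUIAutomationCondition CreateEnabledLinkOrButtonCondition(
        IUIAutomation uiAutomation)
    {
        IUIAutomationCondition conditionHyperlink =
            uiAutomation.CreatePropertyCondition(
                propertyIdControlType, controlTypeIdHyperlink);

        IUIAutomationCondition conditionButton =
            uiAutomation.CreatePropertyCondition(
                propertyIdControlType, controlTypeIdButton);

        IUIAutomationCondition conditionEither =
            uiAutomation.CreateOrCondition(conditionHyperlink, conditionButton);

        IUIAutomationCondition conditionEnabled =
            uiAutomation.CreatePropertyCondition(propertyIdIsEnabled, true);

        return uiAutomation.CreateAndCondition(conditionEnabled, conditionEither);
    }
}
```

The catch is that a typical window has plenty of Buttons that have nothing to do with links (toolbar buttons, for example), so a list built this way could get noisy.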


When I ran my app, I immediately got a surprise, thanks to my list being alphabetically sorted. And that is, when I pointed the app to http://microsoft.com, I found a bunch of links with no name at all. This means my customer can’t get a friendly description for the link, and has to try and interpret what the URL means.

 

 


Figure 2: Nameless links found at http://microsoft.com.

 
 

I took a quick look with the Inspect SDK tool at the page, and did find one nameless link which contained a named image. So if I wanted to, if I find a link with no name, I could look for a friendly name on a child element of the link. But for this simple app, I haven’t done that.
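
A minimal sketch of that fallback idea is below. It's deliberately kept free of UIA, so it only shows the decision about which text to present. The LinkNaming class, the ChooseDisplayName method and the "(unnamed link)" placeholder are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the Linker app in this post.
static class LinkNaming
{
    // Picks the friendliest text available for a link: the link's own name,
    // else the first non-empty name found on a child element, else the URL,
    // else a fixed placeholder.
    public static string ChooseDisplayName(
        string linkName, IEnumerable<string> childNames, string url)
    {
        if (!string.IsNullOrWhiteSpace(linkName))
        {
            return linkName;
        }

        foreach (string childName in childNames ?? new string[0])
        {
            if (!string.IsNullOrWhiteSpace(childName))
            {
                return childName;
            }
        }

        return string.IsNullOrWhiteSpace(url) ? "(unnamed link)" : url;
    }
}
```

To feed this from UIA without extra cross-proc calls, one option (untested here) would be to set the cache request's TreeScope to include TreeScope_Children as well as TreeScope_Element, and then read each link's cached children through GetCachedChildren(). That would mean more data being gathered in the one call, so it'd be worth measuring the effect on perf.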


More confusing were the results I found at http://msdn.com/enable. On that page I found some links with no names and no URLs, so it would seem that they’re no use at all to the person using my app.
 
 
Figure 3: Links found at http://msdn.com/enable with no names and no URLs.


So it seemed likely that the basic condition that I was using was not sufficient. I was getting results which don’t help the person using the app. When this sort of thing happens, the first thing I consider is whether the links are being exposed programmatically as being on the screen or not. The links have a property called IsOffscreen, and if that’s True, then the link is saying that it’s not being shown on the screen. I’d deliberately not used that in my condition when I built the app, as I was curious as to whether the off-screen links would be usable. But I now suspected that if some of the links were off screen, they’d not been set up yet with a URL. So I decided to update my app to only list links that are both enabled and declare themselves to be on the screen.
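
For reference, that on-screen variation of the condition might be sketched as follows. It assumes the same managed wrapper as the app in this post, and the OnScreenLinkConditions class and method name are made up for illustration. The value 30022 is UIA_IsOffscreenPropertyId from the SDK's UIAutomationClient.h.

```csharp
using interop.UIAutomationCore;

// Hypothetical helper, not part of the Linker app in this post.
static class OnScreenLinkConditions
{
    // Values from the SDK's UIAutomationClient.h.
    private const int propertyIdControlType = 30003;
    private const int propertyIdIsEnabled = 30010;
    private const int propertyIdIsOffscreen = 30022;
    private const int controlTypeIdHyperlink = 50005;

    // Builds a condition matching Hyperlink elements that are enabled and that
    // do not declare themselves to be off screen.
    public static IUIAutomationCondition CreateEnabledOnScreenLinkCondition(
        IUIAutomation uiAutomation)
    {
        IUIAutomationCondition conditionHyperlink =
            uiAutomation.CreatePropertyCondition(
                propertyIdControlType, controlTypeIdHyperlink);

        IUIAutomationCondition conditionEnabled =
            uiAutomation.CreatePropertyCondition(propertyIdIsEnabled, true);

        IUIAutomationCondition conditionOnScreen =
            uiAutomation.CreatePropertyCondition(propertyIdIsOffscreen, false);

        // Nest two And conditions to combine all three requirements.
        return uiAutomation.CreateAndCondition(
            conditionHyperlink,
            uiAutomation.CreateAndCondition(conditionEnabled, conditionOnScreen));
    }
}
```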


And I soon decided this was a bad idea for the following two reasons:


1. My tool now only listed a small fraction of all the links available, because most of the links on the page are scrolled out of view, and so their IsOffscreen property is True. I don’t want the person using my app to have to scroll a link into view before they can use it.


2. The links with no name and no URL still had no name and no URL when they were in view and they said they were on the screen. So my change hadn’t fixed anything. It turned out the nameless, URL-less links are embedded videos on the page.

So I went back to listing all enabled links, regardless of whether they’re shown on the screen. When I tried invoking those embedded videos through the new app, I found inconsistent results. When the videos were already scrolled into view, the video started playing as expected when I invoked it.  When the videos weren’t already scrolled into view when I invoked them, the video was scrolled into view, but didn’t start playing. Interestingly when I next gathered up the links on the page, those video links had a javascript-related URL.

But the upshot of all this is, the new app will present all links that declare themselves to be enabled, and if that includes a few meaningless ones, then so be it. I really don’t want to force the person using the app to have to repeatedly scroll through a long page before they can access a link of interest.

 
 

So what did deciding to work with the foreground window get me?


While I expect you’ll often be working with a specific browser window, for my test app, I wanted to work with any app that shows links. To test that out, I downloaded a PowerPoint slide deck from http://microsoft.com, opened it up in the PowerPoint 2013 that I have on this machine, and pointed the new app to it.


The app listed the links on the slides exactly as expected, and I could invoke them through the app as required.
 
 
Figure 4: The new app listing links found on a PowerPoint slide.
 
 


Conclusion


Sometimes the constraints of today’s technology might impact what you’re trying to achieve. Perhaps you’d like to use UIA or the Windows Magnification API to help someone in some specific way, and the APIs don’t lend themselves to what you want to do. But I do believe the APIs can help you build some very helpful tools right now.


In fact, today you may have the skills, tools and enthusiasm to build a new and exciting tool, and the real challenge for you is getting the tool in front of someone who’d find it helpful. I’ve spent the last thirteen years trying to build tools to help people access technology in new ways, and most of the time I’ve failed. Many of the things I built, and which I felt might be useful, garnered no interest. But once in a while I get connected with someone who can tell me exactly what would help someone they know. And sometimes, what’s needed is not technically complicated. It can be built with today’s technologies, and the feeling afterwards that it was worth building can be overwhelming.


So if you don’t know of anyone who might be helped by a tool you could build, maybe there are local organizations near you who work with people with certain challenges, and they could help you learn of some opportunities to help.


If I ever get my act together I’ll upload the simple tool I built for this post to GitHub. I know I said six months ago that I’d do that with another app, and still haven’t. But maybe it’ll happen one day. And in the meantime, if I can add some snippets relating to other things you can achieve with UIA, let me know.


Best of luck to you.


Guy
 
 
 
P.S. Just for fun, I pointed the new app that I built to this blog post before I published it. It turns out that there are 119 links on the page. How about that.
 
 
Figure 5: The new app reporting links on this blog page.
 
 
 
P.P.S. Here’s the main C# file for the app I just built. Most of the code is unrelated to UIA, which is good. You want to be able to do some really useful things with UIA, with very little UIA code.
 
 

using System;
using System.Diagnostics;
using System.Runtime.InteropServices;
using System.Windows.Forms;
 
// This namespace is available through the managed wrapper around the native Windows UIA
// API's UIAutomationCore.dll. Depending on what Windows SDK you're using, the managed
//  wrapper can be generated by something like this...
//
// "C:\Program Files (x86)\Microsoft SDKs\Windows\v8.1A\bin\NETFX 4.5.1 Tools\x64\tlbimp.exe"
//   c:\windows\system32\uiautomationcore.dll /out:Interop.UIAutomationCore.dll
//
// Your tlbimp.exe might be in a different folder on your machine. Then add the
// Interop.UIAutomationCore.dll wrapper that's generated to the list of references
// in your app. If you're using the managed .NET UIA API, you don't need a wrapper.
// Similarly, if you're using the native Windows UIA API from a C++ app, you don't
// need a wrapper then either.
 
using interop.UIAutomationCore;
 
namespace Linker
{
    public partial class Linker : Form
    {
        // This first chunk of stuff relates to adding a low-level keyboard hook. This
        // allows the app to take action in response to a specific key press, regardless
        // of what app is in the foreground. Your app might not need a hook like this.
 
        public const uint msgLinkerKeyboardHook = (0x0400 /*WM_USER*/ +
            0x0123 /*Some random number I made up.*/);
        private static Win32Interop.LowLevelKeyboardProc s_procKeyboardHook =
            KeyboardHookCallback;
        private static IntPtr s_keyboardHookID = IntPtr.Zero;
        private static IntPtr s_mainFormHandle = IntPtr.Zero;
 
        // Initialize UIA once, and keep a reference around.
 
        private IUIAutomation uiAutomation;
 
        public Linker()
        {
            InitializeComponent();
        }
 
        private void Linker_Load(object sender, EventArgs e)
        {
            s_mainFormHandle = this.Handle;
 
            // Create the keyboard hook.
 
            using (Process curProcess = Process.GetCurrentProcess())
            {
                using (ProcessModule curModule = curProcess.MainModule)
                {
                    s_keyboardHookID = Win32Interop.SetWindowsHookEx(
                        Win32Interop.WH_KEYBOARD_LL,
                        s_procKeyboardHook,
                        Win32Interop.GetModuleHandle(curModule.ModuleName), 0);
                }
            }
 
            // Get UIA ready to use.
 
            this.uiAutomation = new CUIAutomation();
        }
 
        protected override void OnClosed(EventArgs e)
        {
            // Remove the keyboard hook if we created it earlier.
 
            if (s_keyboardHookID != IntPtr.Zero)
            {
                Win32Interop.UnhookWindowsHookEx(s_keyboardHookID);
            }
 
            base.OnClosed(e);
        }
 
        private void buttonClose_Click(object sender, EventArgs e)
        {
            this.Close();
        }
 
        protected override void WndProc(ref Message m)
        {
            if (m.Msg == msgLinkerKeyboardHook)
            {
                // Get all the links now.
 
                RefreshLinks();
            }
 
            base.WndProc(ref m);
        }
 
        private static IntPtr KeyboardHookCallback(
            int nCode,
            IntPtr wParam,
            IntPtr lParam)
        {
            if ((nCode >= 0) && (wParam == (IntPtr)Win32Interop.WM_KEYDOWN))
            {
                Win32Interop.KbLLHookStruct kbData =
                    (Win32Interop.KbLLHookStruct)Marshal.PtrToStructure(lParam,
                    typeof(Win32Interop.KbLLHookStruct));
 
                bool fProcessed = false;
 
                switch (kbData.vkCode)
                {
                    // If my customer pressed the F9 key, post a message to the main form
                    // and get all the links in the foreground window.
 
                    case Win32Interop.VK_F9:
                    {
                        Win32Interop.PostMessage(
                            s_mainFormHandle,
                            msgLinkerKeyboardHook,
                            (IntPtr)kbData.vkCode, IntPtr.Zero);
 
                        fProcessed = true;
 
                        break;
                    }
                    default:
                    {
                        break;
                    }
                }
 
                if (fProcessed)
                {
                    // Prevent the message being processed further. A shipping app would
                    // consider carefully whether the key press should be allowed to
                    // trigger additional action in the foreground app.
 
                    return (IntPtr)1;
                }
            }
 
            // Pass on the key message.
 
            return Win32Interop.CallNextHookEx(s_keyboardHookID, nCode, wParam, lParam);
        }
 
        // *****************************************************************************
        //
        // THIS IS WHERE THE UIA ACTION STARTS!
        //
        // *****************************************************************************
 
        private void RefreshLinks()
        {
            // Clear the list of links if any are shown in the app at the moment.
 
            listViewLinks.Items.Clear();
 
            // Get the HWND for the foreground app. For Win32 calls use some interop
            // defined at the end of this file.
 
            IntPtr foregroundWindow = Win32Interop.GetForegroundWindow();
 
            // Now get the UIA element representing that foreground window.
 
            IUIAutomationElement element =
                this.uiAutomation.ElementFromHandle(foregroundWindow);
            if (element != null)
            {
                // Define a few local values because the managed wrapper doesn't seem to
                // expose them as I'd like. I've pulled these values from the SDK's
                // UIAutomationClient.h.
 
                int patternIdInvoke = 10000;
                int propertyIdControlType = 30003;
                int propertyIdName = 30005;
                int propertyIdValueValue = 30045;
                int propertyIsEnabledPropertyId = 30010;
                int controlTypeIdHyperlink = 50005;
 
                // Build a condition for the links of interest. For this app, I want all
                // elements that have a ControlType of Hyperlink, and an Enabled state of
                // True.
 
                IUIAutomationCondition conditionHyperlink =
                    this.uiAutomation.CreatePropertyCondition(
                    propertyIdControlType, controlTypeIdHyperlink);
 
                bool fEnabled = true;
 
                IUIAutomationCondition conditionEnabled =
                    this.uiAutomation.CreatePropertyCondition(
                        propertyIsEnabledPropertyId, fEnabled);
 
                IUIAutomationCondition conditionLinksOfInterest =
                    this.uiAutomation.CreateAndCondition(
                    conditionEnabled,
                    conditionHyperlink);
 
                // Now build up a cache request for all the properties and patterns that
                // we want to be cached with the list of links returned. I don't want to
                // incur the perf hit of going back to get all that data later.
 
                IUIAutomationCacheRequest cacheRequest =
                    this.uiAutomation.CreateCacheRequest();
 
                // Typically links expose their URLs through the Value property of the
                // UIA Value pattern.
 
                cacheRequest.AddProperty(propertyIdName);
                cacheRequest.AddProperty(propertyIdValueValue);
                cacheRequest.AddPattern(patternIdInvoke);
 
                // I only want to cache the data for the links, and not for any
                // descendants of the links.
 
                cacheRequest.TreeScope = TreeScope.TreeScope_Element;
 
                // Now make the one and only cross-proc call required, to get the list of
                // links and their names, URLs, and a way to programmatically invoke the
                // links.
 
                IUIAutomationElementArray elementArray = element.FindAllBuildCache(
                    TreeScope.TreeScope_Descendants,
                    conditionLinksOfInterest,
                    cacheRequest);
 
                if (elementArray != null)
                {
                    // Add the details about each link to the list.
 
                    for (int i = 0; i < elementArray.Length; i++)
                    {
                        IUIAutomationElement elementLink = elementArray.GetElement(i);
 
                        ListViewItem item = listViewLinks.Items.Add(
                            elementLink.CachedName);
 
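                        // The cached Value property comes back as an object, so
                        // convert it to a string (empty if no URL is exposed).

                        item.SubItems.Add(
                            (elementLink.GetCachedPropertyValue(propertyIdValueValue)
                                as string) ?? "");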
 
                        // Add a reference to the Invoke pattern for the link, so that we
                        // can invoke it later if necessary.
 
                        item.Tag = elementLink.GetCachedPattern(patternIdInvoke);
                    }
                }
            }
 
            labelLinkCount.Text = "Number of links found: " + listViewLinks.Items.Count;
        }
 
        private void buttonGo_Click(object sender, EventArgs e)
        {
            // My customer wants to invoke a link. Is there a link selected?
 
            if ((listViewLinks.SelectedItems != null) &&
                (listViewLinks.SelectedItems.Count > 0))
            {
                ListViewItem item = listViewLinks.SelectedItems[0];
                if (item != null)
                {
                    // Get the reference to the Invoke pattern for the link that was
                    // added earlier to the list item.
 
                    IUIAutomationInvokePattern invokePattern = item.Tag as
                        IUIAutomationInvokePattern;
                    if (invokePattern != null)
                    {
                        // Now invoke the link! Be ready for exceptions if the link no
                        // longer exists.
 
                        try
                        {
                            invokePattern.Invoke();
                        }
                        catch (Exception ex)
                        {
                            Debug.WriteLine("Failed to invoke element. " + ex.Message);
                        }
                    }
                }
            }
        }
    }
 
    // The below is standard interop to allow calls to Win32 functions, and nothing to do
    // with UIA.
 
    class Win32Interop
    {
        public const int WH_KEYBOARD_LL = 13;
        public const int WM_KEYDOWN = 0x0100;
 
        public const int VK_F9 = 0x78;
 
        [StructLayout(LayoutKind.Sequential)]
        public class KbLLHookStruct
        {
            public int vkCode;
            public int scanCode;
            public int flags;
            public int time;
            public IntPtr dwExtraInfo; // ULONG_PTR, so pointer-sized.
        }
 
        public delegate IntPtr LowLevelKeyboardProc(
            int nCode, IntPtr wParam, IntPtr lParam);
 
        [DllImport("user32.dll", CharSet = CharSet.Auto, SetLastError = true)]
        public static extern IntPtr SetWindowsHookEx(int idHook,
            LowLevelKeyboardProc lpfn, IntPtr hMod, uint dwThreadId);
 
        [DllImport("user32.dll", CharSet = CharSet.Auto, SetLastError = true)]
        [return: MarshalAs(UnmanagedType.Bool)]
        public static extern bool UnhookWindowsHookEx(IntPtr hhk);
 
        [DllImport("user32.dll", CharSet = CharSet.Auto, SetLastError = true)]
        public static extern IntPtr CallNextHookEx(
            IntPtr hhk, int nCode, IntPtr wParam, IntPtr lParam);
 
        [DllImport("user32.dll", CharSet = CharSet.Auto, SetLastError = true)]
        public static extern IntPtr PostMessage(
            IntPtr hWnd, uint Msg, IntPtr wParam, IntPtr lParam);
 
        [DllImport("user32.dll")]
        public static extern IntPtr GetForegroundWindow();
 
        [DllImport("kernel32.dll", CharSet = CharSet.Auto)]
        public static extern IntPtr GetModuleHandle(string lpModuleName);
    }
}

Comments (2)

  1. DBZZ says:

    Guy,

    Shouldn't the last two arguments for the PostMessage DllImport be IntPtrs? LPARAM and WPARAM arguments can be either 32 or 64 bits depending on the platform.

    1. Hi DBZZ, thanks very much for pointing this out. You’re absolutely right, those WPARAM and LPARAM args should be IntPtrs. Whenever I use interop code, I paste it in from somewhere, and clearly I don’t pay as much attention to what I’m pasting in as I should. I expect I copied this from one of my other experimental apps, and so I should fix it in whatever other apps I’ve used it in. Thanks for helping me to improve my apps!

      Guy
