The managed way to retrieve text under the cursor (mouse pointer)


Today's Little Program is a managed version of the text-extraction program from several years ago. It turns out that it's pretty easy in managed code because the accessibility folks sat down and wrote a whole framework for you, known as UI Automation.

(Some people are under the mistaken impression that UI Automation works only for extracting data from applications written in managed code. That is not true. Native code can also be a UI Automation provider. The confusion arises because the name UI Automation is used both for the underlying native technology and for the managed wrappers.)

using System;
using System.Windows;
using System.Windows.Forms;
using System.Windows.Automation;

class Program
{
 static Point MousePos {
  get { var pos = Control.MousePosition;
        return new Point(pos.X, pos.Y); }
 }

 public static void Main()
 {
  for (;;) {
   AutomationElement e = AutomationElement.FromPoint(MousePos);
   if (e != null) {
    Console.WriteLine("Name: {0}",
     e.GetCurrentPropertyValue(AutomationElement.NameProperty));
    Console.WriteLine("Value: {0}",
     e.GetCurrentPropertyValue(ValuePattern.ValueProperty));
    Console.WriteLine();
   }
   System.Threading.Thread.Sleep(1000);
  }
 }
}

We use the FromPoint method to locate the automation element under the current mouse position and print its name and value.
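
The element under the mouse can disappear between the call to FromPoint and the property reads (a tooltip or menu closing, say). Here is a minimal sketch of a more defensive read, using the typed Current accessor for the name. The ElementText class and its Describe method are hypothetical names chosen for illustration, and the code assumes the project references UIAutomationClient and UIAutomationTypes.

```csharp
using System;
using System.Windows.Automation;

static class ElementText
{
 // Hypothetical helper, not part of UI Automation itself.
 // Returns a one-line description, or null if there is no element
 // or the element went away while we were reading it.
 public static string Describe(AutomationElement e)
 {
  if (e == null) return null;
  try {
   // Current.Name is the typed equivalent of reading NameProperty.
   string name = e.Current.Name;
   object value = e.GetCurrentPropertyValue(ValuePattern.ValueProperty);
   return string.Format("Name: {0}, Value: {1}", name, value);
  } catch (ElementNotAvailableException) {
   // The UI element no longer exists.
   return null;
  }
 }
}
```

Inside the loop, this would be called as ElementText.Describe(AutomationElement.FromPoint(MousePos)), printing the result only when it is not null.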

Well that was pretty simple. I may as well do something a little more challenging. Since the feature is known as UI Automation, I'll try automating the Run dialog by programmatically entering some text and then clicking OK.

using System.Windows.Automation;

class Program
{
 static AutomationElement FindById(AutomationElement root, string id)
 {
  return root.FindFirst(TreeScope.Children,
   new PropertyCondition(AutomationElement.AutomationIdProperty, id));
 }

 public static void Main()
 {
  var runDialog = AutomationElement.RootElement.FindFirst(
   TreeScope.Children,
   new PropertyCondition(AutomationElement.NameProperty, "Run"));
  if (runDialog == null) return;

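  // Find the command box and set its text.
  var commandBox = FindById(runDialog, "12298");
  if (commandBox == null) return; // ID not found; the dialog may have changed
  var valuePattern = commandBox.GetCurrentPattern(ValuePattern.Pattern)
                     as ValuePattern;
  valuePattern.SetValue("calc");

  // Find the OK button and click it.
  var okButton = FindById(runDialog, "1");
  if (okButton == null) return;
  var invokePattern = okButton.GetCurrentPattern(InvokePattern.Pattern)
                     as InvokePattern;
  invokePattern.Invoke();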
 }
}

The program starts by looking for a window named Run by performing a children search on the root element for an element whose Name property is equal to "Run".
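
Matching on the name alone could also pick up some unrelated top-level element that happens to be called Run. Here is a minimal sketch of a narrower search, assuming the Run dialog reports its control type as Window (worth confirming with an inspection tool). The RunDialogFinder class and its methods are made-up names for illustration.

```csharp
using System.Windows.Automation;

static class RunDialogFinder
{
 // Builds a condition that matches elements which have the given name
 // AND report themselves as windows.
 public static Condition WindowNamed(string name)
 {
  return new AndCondition(
   new PropertyCondition(AutomationElement.NameProperty, name),
   new PropertyCondition(AutomationElement.ControlTypeProperty,
                         ControlType.Window));
 }

 // Searches only the direct children of the desktop root.
 // Returns null if no matching window is open.
 public static AutomationElement Find(string name)
 {
  return AutomationElement.RootElement.FindFirst(
   TreeScope.Children, WindowNamed(name));
 }
}
```

Note that the name is localized, so a search for "Run" is only expected to work on an English-language system.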

Assuming it finds it, the program looks for a child element whose automation ID is "12298". How did I know that was the automation ID to use? The documentation for UI Automation suggests using a tool like UI Spy to look up the automation IDs.

Mind you, since I am automating something outside my control, I have to accept that the automation ID may change in future versions of Windows. (It's not like they check with me before making changes.) But this is a Little Program, not a production-level program, so that's a limitation I will accept, since I'm the only person who's going to use this program, and if it stops working, I know who to talk to (namely, me).
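
If you wanted to soften the dependency on one specific automation ID, one option is to fall back to a structural search when the ID lookup fails. This is a sketch only: it assumes the command box is the only combo box in the dialog, which is an observation about the Run dialog and not something anybody promises. CommandBoxFinder is a hypothetical name.

```csharp
using System.Windows.Automation;

static class CommandBoxFinder
{
 // Tries the known automation ID first, then falls back to the first
 // combo box anywhere below the dialog. Returns null if neither works.
 public static AutomationElement Find(AutomationElement dialog)
 {
  if (dialog == null) return null;

  var byId = dialog.FindFirst(TreeScope.Children,
   new PropertyCondition(AutomationElement.AutomationIdProperty, "12298"));
  if (byId != null) return byId;

  // Assumption: the dialog contains exactly one combo box.
  return dialog.FindFirst(TreeScope.Descendants,
   new PropertyCondition(AutomationElement.ControlTypeProperty,
                         ControlType.ComboBox));
 }
}
```

The fallback trades one fragile assumption (the ID) for another (the dialog's structure), so it reduces the risk of breakage but does not remove it.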

Anyway, after we find the command box, we ask for its Value pattern. Automation elements can support patterns, which expose additional properties and methods specific to particular uses. In our case, the Value pattern lets us get and set the value of an editable object, so we use the SetValue method to set the text in the Run dialog to calc.
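
Not every element supports the Value pattern, and GetCurrentPattern throws if the pattern is missing. Here is a minimal sketch that uses TryGetCurrentPattern instead and also checks whether the value is read-only before writing. ValueSetter and TrySetValue are hypothetical names.

```csharp
using System.Windows.Automation;

static class ValueSetter
{
 // Returns true if the text was written, false if the element is
 // missing, lacks the Value pattern, or reports itself as read-only.
 public static bool TrySetValue(AutomationElement element, string text)
 {
  if (element == null) return false;

  object pattern;
  if (!element.TryGetCurrentPattern(ValuePattern.Pattern, out pattern))
   return false;

  var valuePattern = (ValuePattern)pattern;
  if (valuePattern.Current.IsReadOnly) return false;

  valuePattern.SetValue(text);
  return true;
 }
}
```

With this helper, the step becomes ValueSetter.TrySetValue(commandBox, "calc"), and the caller can bail out quietly when it returns false.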

Next, we look for the OK button, which UI Spy told me had automation ID 1. We ask for the Invoke pattern on the button and then call the Invoke method. The Invoke pattern is the pattern for objects that do just one thing, and Invoke means "Do that thing that you do."
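
Invoking a disabled button raises an exception, so a slightly more careful version checks the element's enabled state first. Here is a sketch under the same assumptions as the program above. Invoker and TryInvoke are hypothetical names.

```csharp
using System.Windows.Automation;

static class Invoker
{
 // Returns true if Invoke was called, false if the element is missing,
 // disabled, or does not support the Invoke pattern.
 public static bool TryInvoke(AutomationElement element)
 {
  if (element == null || !element.Current.IsEnabled) return false;

  object pattern;
  if (!element.TryGetCurrentPattern(InvokePattern.Pattern, out pattern))
   return false;

  ((InvokePattern)pattern).Invoke();
  return true;
 }
}
```

A true result means only that the request was made. Whether the button's action then succeeds is up to the application being automated.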

Open the Run dialog and run this program. It should programmatically set the command line to calc, then click OK. Hopefully, this will run the Calculator.

Just for fun, here's another program that just dumps the automation properties and patterns for whatever object is under the mouse cursor:

using System;
using System.Windows;
using System.Windows.Forms;
using System.Windows.Automation;

class Program
{
 static Point MousePos {
  get { var pos = Control.MousePosition;
        return new Point(pos.X, pos.Y); }
 }

 public static void Main()
 {
  for (;;) {
   AutomationElement e = AutomationElement.FromPoint(MousePos);
   if (e != null) {
    foreach (var prop in e.GetSupportedProperties()) {
     object o = e.GetCurrentPropertyValue(prop);
     if (o != null) {
      var s = o.ToString();
      if (s != "") {
       var id = o as AutomationIdentifier;
       if (id != null) s = id.ProgrammaticName;
       Console.WriteLine("{0}: {1}", Automation.PropertyName(prop), s);
      }
     }
    }
    foreach (var pattern in e.GetSupportedPatterns()) {
     Console.WriteLine("Pattern: {0}", Automation.PatternName(pattern));
    }
    Console.WriteLine();
   }
   System.Threading.Thread.Sleep(1000);
  }
 }
}

Comments (6)
  1. Dan Bugglin says:

    Huh, didn't realize this stuff existed.  I assumed .NET apps were entirely locked into their own little world unless you used P/Invoke.

    I guess programs like AutoIt wrap the native version of this API… it looks sort of similar to what they do.

  2. John says:

    (Sorry if this is double post, I didn't get a confirm from the blog software)

    We use this heavily in our "Automation Framework" and wrap it in a driver that allows us to easily create mappings and generate tests. Think Microsoft's Test Professional, but the editing of Automated Tests being much easier without the pain of re-recording everything. UI Automation was a huge boon for us, especially because it's one of the few ways to programmatically interact with Windows Presentation Foundation interfaces. This saved countless man hours of work, and in addition was included!

    The only thing I'd include in Raymond's blog is the suggestion to use the "Inspect" tool included in the SDK, which based on my understanding is what they intend you to use. It's almost like a UI Spy, but specifically for UIAutomation (and MSAA to an extent). It does a really nice job of showing you the "Tree" it sees along with just about any other piece of information you'd most likely need for writing UIAutomation "Clients".

    For all of its pros there are some cons, for one I couldn't for the life of me find a book on it, and at the time the MSDN documentation was a little scarce (but I've noticed that it has improved in recent years). We had some experience coming in dealing with automating programs which use the windows 'Common Controls' which was a big help. We also had access to the code which we were automating so were able to add AutomationId's to our projects which is a big help.

    Third Party support seems a little spotty, but it is getting better (I won't mention names as per the blog rules) with some of the more well known control makers out there. All of the "stock" controls have great support already which is a testament to the team responsible for WPF. I do wish there was more documentation (I know wrong blog) on how to implement custom AutomationPeers, our product has a few places where we've had to create custom controls internally and it'd be nice to actually extend a custom AutomationPeer instead of the FrameworkAutomationPeer default (although I appreciate that its there, for 95% of what we do its good enough because it enables you to slap on the AutomationId and get a simple Click and BoundingRectangle).

    The last thing I'd mention (and yes again I know not the right blog, I'd like to get in contact with the author(s) and buy him/her a case of beer) is that I'd like to see how you add AutomationId's to WinForms and MFC type applications. The MSDN Documentation seems to be very WPF-centric which makes sense as It seems the push for this was during the introduction of that technology. For those wondering, in a lot of cases when the Framework can't find a good AutomationId for these windowing libraries it falls back to using the Window Handle, which of course is not reliable.

    As always great blog Raymond, sorry for the length of the comment, but I'm super passionate about this stuff and I LOVE when your blog mentions this type of stuff! Brightens my day right up.

    [Thanks for the comment. It saddens me that UI Automation (and accessibility in general) don't get the visibility they deserve. -Raymond]
  3. Simon says:

    UI spy is dead (msdn.microsoft.com/…/ms727247.aspx), and we can't find it anymore, but Inspect (make sure you get the last version that has the tree view) is better anyway.

  4. Marco says:

    John, you can use Dynamic Annotation API to set automation ids (msdn.microsoft.com/…/dd318060.aspx).

    Check also Michael Bernstein's blog post "Using Dynamic Annotation with Child IDs" (blogs.msdn.com/…/using-dynamic-annotation-with-child-ids.aspx).

  5. Gabe says:

    I tried out Inspect, and left it running minimized for a few days. I must say that it caused several applications to exhibit really strange behavior. It must trigger something that causes programs to think some accessibility mode has been activated.

    It's a shame, because it's a really neat tool that is the kind of thing that's useful to have running.

  6. Alasdair says:

    > It must trigger something that causes programs to think some accessibility mode has been activated.

    Inspect looks like it sets SPI_SETSCREENREADER, which indicates to Windows and other applications that a screenreader is running and (if necessary) they should make themselves more accessible – possibly even turning on and off UIA (or MSAA) support.

    A good example is the View tab in Folder Options in Windows Explorer. Run Inspect, open it up, and you'll see that the checkboxes on that tab now have ON and OFF appended to their labels so that screenreader users can tell if they are checked – clearly the checkboxes don't report that through UIA or MSAA correctly, and there was some reason why they couldn't be made to do so.

    INFO: How Clients and Servers Should Use SPI_SETSCREENREADER and SPI_GETSCREENREADER

    support.microsoft.com/…/180958

    [Those checkboxes were written before MSAA Annotation was invented, so they had to show the ON/OFF state some other way. -Raymond]
