Using UI Automation to explore a map

A few weeks ago I was having coffee with a colleague who’s blind, and I asked him about exploring a map. He said that he’d like to be able to move his finger over a map of the US, and have the name of the state beneath his finger spoken. By doing this, he could hear the names of all the states that surround his home state. So I set to work exploring how I could build a simple Windows 8.1 Store app which might be of interest to my colleague.

 

Figure 1: Basic map of US states, with a speech bubble showing over it.

 

The “State Your Name Please” app

The first thing I needed to do was build an app which presented a map of the US. So in Visual Studio Express 2013, I created a new blank app from the XAML template. I chose XAML, as my previous app was HTML/JS, and I do feel that variety is indeed the spice of life.

I next needed to add a map, so I added a reference to “Bing Maps for C#, C++, or Visual Basic” to my project. For this experiment I had the map created in my main page’s initialization code, but I expect this could all have been defined in the XAML just as easily.

So at the top of MainPage.xaml.cs, I added:

using Bing.Maps;

 

And then in the main page’s constructor, added this:

_map = new Map();
_map.Credentials = "<My app’s credentials for leveraging Bing Maps>"; // placeholder
_map.ShowNavigationBar = false;

// Zoom to show the US.
Bing.Maps.Location northWest = new Bing.Maps.Location(46, -118);
Bing.Maps.Location southEast = new Bing.Maps.Location(32, -76);
_map.SetView( new Bing.Maps.LocationRect(northWest,southEast));

MainGrid.Children.Add(_map);

 

When I then run the app, I see the map I’m after.

Figure 2: Screenshot from the State Your Name Please app, showing a map of the US.

 

The next thing I need to do is react to user input at the map. So I added the following pointer event handlers:

_map.PointerPressedOverride += MapContainer_PointerPressed;
_map.PointerMovedOverride += MapContainer_PointerMovedOverride;

 

In response to either of the pointer pressed or moved events, I take action to try to find a state name associated with the point at which the event occurred.

The map control makes it really easy to find the latitude and longitude associated with a point of interest on the screen. If for some reason the attempt to get the lat/long fails, I simply ignore the event.

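// pt is the pointer position relative to the map, for example
// e.GetCurrentPoint(_map).Position in a pointer event handler.
Location location;

if (!_map.TryPixelToLocation(pt, out location))
{
    return;
}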

 

Now that I have the lat/long, I call a handy REST API to gather some useful information about this lat/long. The GetStateName() function below is my own private function to find the state name amongst all the data returned from the API call.

// Format the coordinates with the invariant culture (System.Globalization)
// so the decimal separator is always "." whatever the device's region.
string uri = string.Format(CultureInfo.InvariantCulture,
    "https://dev.virtualearth.net/REST/v1/Locations/{0},{1}?o=xml&key={2}",
    location.Latitude, location.Longitude, _map.Credentials);

string outputMessage = "No data available.";

try
{
    HttpWebRequest request = WebRequest.Create(uri) as HttpWebRequest;

    // Dispose of the response and reader once the XML has been read.
    using (var response = await request.GetResponseAsync() as HttpWebResponse)
    using (var reader = new StreamReader(response.GetResponseStream()))
    {
        string result = await reader.ReadToEndAsync();
        outputMessage = GetStateName(result);
    }
}
catch (Exception e)
{
    // Leave outputMessage at its default if the request fails.
    Debug.WriteLine(e.Message);
}

 

The app will speak “No data available” whenever the call to the API fails, for example when there’s no internet connection.
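As a rough illustration of what a function like GetStateName() might do, here is a minimal sketch. It is not the app's actual code. It assumes the XML response uses the Bing Maps REST namespace shown below and carries the state in an AdminDistrict element (which may be an abbreviation such as "WA" that needs mapping to a full name before it's spoken). The class name LocationResponseParser is hypothetical.

```csharp
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class LocationResponseParser
{
    // Assumed XML namespace of the REST response.
    private static readonly XNamespace Ns =
        "http://schemas.microsoft.com/search/local/ws/rest/v1";

    // Returns the first non-empty AdminDistrict value in the response,
    // or a fallback message when there is none or the XML cannot be parsed.
    public static string GetStateName(string responseXml)
    {
        const string fallback = "No data available.";

        if (string.IsNullOrWhiteSpace(responseXml))
        {
            return fallback;
        }

        try
        {
            XDocument doc = XDocument.Parse(responseXml);

            // Matches AdminDistrict only, not AdminDistrict2.
            string state = doc.Descendants(Ns + "AdminDistrict")
                              .Select(e => e.Value.Trim())
                              .FirstOrDefault(v => v.Length > 0);

            return state ?? fallback;
        }
        catch (XmlException)
        {
            return fallback;
        }
    }
}
```

Returning the same fallback string for malformed or empty responses keeps the calling code simple, since it can speak whatever comes back.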

Given that the call to the REST API is asynchronous, there’ll be a small delay between the user interacting with the screen and the state name being determined. But in practice this doesn’t hurt the user experience. In fact I’ve added additional flags to prevent a state name being spoken while speech is still in progress. If I didn’t do that, then as a finger is moved around the map, a state name would start to be spoken and then be interrupted with the next state name. The way I’ve set things up, the current state name will be spoken in its entirety, and then the next pointer press or move will trigger the speaking of the next state name.

Having got the name of the state, I then use the Windows 8.1 Speech API to have the name spoken with the default voice on the device.

SpeechSynthesisStream speechStream = await
    _speechSynthesizer.SynthesizeTextToStreamAsync(outputMessage);

SpeechMedia.SetSource(speechStream, speechStream.ContentType);
SpeechMedia.Play();

 

The _speechSynthesizer object is a SpeechSynthesizer I’d created during app initialization, and SpeechMedia is a MediaElement that I’d added to the Main Page.
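The flags that stop one state name interrupting another could be as simple as the sketch below. This is an assumption about how such a guard might look rather than the app's real code, and the SpeechGate name is hypothetical. The idea is to call TryBegin() before synthesizing speech, and to call End() when playback finishes or fails (for example from the MediaElement's MediaEnded and MediaFailed events).

```csharp
public sealed class SpeechGate
{
    private bool _isSpeaking;

    // Returns true if the caller may start speaking now.
    // Returns false if an earlier utterance is still in progress.
    public bool TryBegin()
    {
        if (_isSpeaking)
        {
            return false;
        }

        _isSpeaking = true;
        return true;
    }

    // Call when playback has ended or failed, so the next
    // pointer press or move can trigger speech again.
    public void End()
    {
        _isSpeaking = false;
    }
}
```

The sketch does no locking, on the assumption that the pointer and media event handlers all run on the UI thread.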

So basically, that’s it. The app I wanted to build was achieved with the three simple steps of:

1. Create a new app showing a map.
2. Use the map control and REST API to get the state name associated with a point on the map.
3. Use the Speech API to have the state name spoken.

 

But since I was having so much fun, I updated the app to also output the county name and city name depending on the current zoom level. The county and city names are both available in the response from the REST API call, so it seemed useful to also have those spoken if the user’s zoomed in to a level where county and city names seem of interest.
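Choosing which name to speak for a given zoom level might look something like the following sketch. The threshold values and the PlaceNameSelector name are hypothetical, chosen only for illustration, and would need tuning against the map's actual ZoomLevel values.

```csharp
public static class PlaceNameSelector
{
    // Hypothetical thresholds, not taken from the app.
    public const double CountyZoomThreshold = 7.0;
    public const double CityZoomThreshold = 10.0;

    // Picks the most detailed name that suits the zoom level,
    // falling back to a coarser name when the finer one is missing.
    public static string Select(double zoomLevel, string state, string county, string city)
    {
        if (zoomLevel >= CityZoomThreshold && !string.IsNullOrEmpty(city))
        {
            return city;
        }

        if (zoomLevel >= CountyZoomThreshold && !string.IsNullOrEmpty(county))
        {
            return county;
        }

        return state;
    }
}
```

Falling back to the coarser name means the user still hears something useful when a point has no city or county in the response.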

With the county and city name feature added, the map’s zoom and pan touch gestures became useful, so I needed a mode in which gestures would zoom and pan the map rather than trigger the speaking of a name. I added a related checkbox to the appbar. Depending on the state of the checkbox, my pointer event handlers either take the action described earlier (and mark the event as handled), or take no action at all and so allow the map's default zoom and pan touch gesture functionality.

And one final feature I added was to optionally have a popup appear on the screen near the input point, which contained the state name. I did this because the app seemed not only useful to users who are blind or have low vision, but also to users who find that the multimodal output of both text and audio helps with comprehension.

Figure 3: Screenshot from the State Your Name Please app, visually showing the name of a state.

 

So that completed Version 1 of the State Your Name Please app. I'd found it was straightforward to build a simple app which spoke the names of states as the user explores the map of the US through touch. I was feeling pretty pleased with the app until someone pointed out that it didn’t speak at all when Narrator and touch were used to explore the map.

 

Updating the app to work with Narrator

When Narrator's used with touch to interact with an app, Narrator will handle the user’s input events and then query the app for information relating to where the touch occurred. For example, Narrator will query an element for its name and whether it can be invoked. Because Narrator is being used to control the app, it could easily become confusing if the input events were also handled by the app itself, independently of Narrator. If both Narrator and the app were reacting to input events, that might result in two potentially conflicting user experiences. As such, when Narrator is being used to interact with the State Your Name Please app, the pointer event handlers I’d set up earlier don’t get called.

It’s worth a quick note here to say that Narrator uses the UI Automation (UIA) API to gather information from the app’s UI and to programmatically interact with that UI. The UI framework, (in this case XAML,) will provide some information by default for the app, and that information will be propagated through the UIA API to Narrator. The app will often need to take some additional steps to make sure that the full set of information about its UI is exposed, (for example, adding accessible names for Edit or List controls).

And we’re not just talking about Narrator here. By making your app programmatically accessible, your UI can be leveraged by users of other screen readers, or Windows Speech Recognition, or any creative UIA client apps that can be helpful to your customers.

So given that my app’s always interested in a particular point on the map, when a Narrator user is using touch at the map, how does my app know at what point the touch occurred? We already know the pointer event handlers I set up earlier won’t get called. The answer is that the app can react to a call that UIA makes into the app, when UIA needs to know what object will provide the relevant accessible information following input occurring at a particular point in the app.

Some custom accessible properties of your UI can be set extremely easily in a XAML app, by using the AutomationProperties class. But a more powerful way to customize the accessibility of your UI is to create your own custom AutomationPeer. Many controls in XAML have associated AutomationPeers that provide the accessible functionality of that control. For example, a Button has a ButtonAutomationPeer. So I created my own custom AutomationPeer that would provide the accessible functionality that I wanted for the map.

Note that I couldn’t actually customize the accessible functionality of the map control itself, so I created a new control class that would host the map, and created my new custom AutomationPeer for that new control class.

In XAML:

<local:CustomMapContainer x:Name="MapContainer" />

In code-behind:

class CustomMapContainer : Grid
{
    protected override AutomationPeer OnCreateAutomationPeer()
    {
        return new CustomAutomationPeer(this);
    }
}

class CustomAutomationPeer : FrameworkElementAutomationPeer
{
    CustomMapContainer mapContainer;

    public CustomAutomationPeer(CustomMapContainer owner) : base(owner)
    {
        mapContainer = owner;
    }

    protected override AutomationPeer GetPeerFromPointCore(Point point)
    {
        …
    }
}

 

The GetPeerFromPointCore() function gets called as the Narrator user moves their finger over the map, and in that function, I take similar action to the action I'd taken in the previous version of the app when the pointer was pressed or moved.

 

Where things got interesting

It’s all well and good to try to find the required state name within the GetPeerFromPointCore() function, but that action requires us to call the asynchronous REST API. I really didn’t want to block here waiting for the asynchronous operation to complete, as sometimes that might take a while, and who knows what effect that would have on UIA and the user experience. UIA has plenty of other things to be getting on with, and doesn’t want to be waiting for apps to provide it with data. (In fact, blocked UIA calls will time out after a while, in order to prevent the user experience being ruined by a misbehaving app.)

So the app’s GetPeerFromPointCore() doesn’t wait for the REST API call to return, but simply kicks off the request to get the data. And importantly, the function returns null to UIA here. This is effectively saying that no AutomationPeer exists for the point of interest. As such Narrator won’t speak anything based on that call, and when we later have the REST API response, we can take action to have the appropriate state name output. If GetPeerFromPointCore() did return something here, it would be based on some earlier, stale information, because until the REST API call returns we cannot know the current state name.
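The "start the request, return null straight away, announce later" pattern can be sketched without any UI framework types. The following is an illustrative stand-in rather than the app's real AutomationPeer: the DeferredNameProvider class, its lookup delegate (standing in for the REST call) and its announce delegate (standing in for updating the accessible name and notifying the screen reader) are all hypothetical.

```csharp
using System;
using System.Threading.Tasks;

public sealed class DeferredNameProvider
{
    private readonly Func<Task<string>> _lookup;  // stands in for the REST call
    private readonly Action<string> _announce;    // stands in for notifying the screen reader

    public DeferredNameProvider(Func<Task<string>> lookup, Action<string> announce)
    {
        _lookup = lookup;
        _announce = announce;
    }

    // The most recent name, once a lookup has completed.
    public string CachedName { get; private set; }

    // The in-flight lookup, exposed so callers and tests can observe completion or failure.
    public Task Pending { get; private set; }

    // Mirrors the shape of GetPeerFromPointCore(): kick off the lookup,
    // then return null immediately instead of waiting for the result.
    public object OnHitTest()
    {
        Pending = LookupAndAnnounceAsync();
        return null;
    }

    private async Task LookupAndAnnounceAsync()
    {
        string name = await _lookup();
        CachedName = name;
        _announce(name);
    }
}
```

Because OnHitTest() never awaits the lookup, the caller is never blocked, and nothing stale is reported while the request is still in flight.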

 

A couple more comments on the implementation of GetPeerFromPointCore():

1. I only let it call the REST API if at least one second has passed since its last call to the API. I wanted to avoid the API being called rapidly in a way that’s of no use to the user, so I limited it to once per second.

2. The point supplied to GetPeerFromPointCore() was affected by the current screen scaling, and so behaved differently from the point that my existing lat/long function expected. So before calling my existing function, I had to do this:

protected override AutomationPeer GetPeerFromPointCore(Point point)
{
    // ResolutionScale is a percentage (for example 140), so convert it to a factor.
    double scale = (double)DisplayInformation.GetForCurrentView().ResolutionScale / 100;

    Point adjustedPoint = new Point(point.X / scale, point.Y / scale);

    <Pass adjustedPoint to _map.TryPixelToLocation() >
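The once-per-second limit described in the first point above could be implemented along these lines. This is a minimal sketch with a hypothetical RequestThrottle class, not the app's actual code. The current time is passed in by the caller, which keeps the logic easy to test.

```csharp
using System;

public sealed class RequestThrottle
{
    private readonly TimeSpan _minInterval;
    private DateTimeOffset? _lastRequest;

    public RequestThrottle(TimeSpan minInterval)
    {
        _minInterval = minInterval;
    }

    // Returns true, and records the time, when at least the minimum
    // interval has passed since the last permitted request.
    public bool TryAcquire(DateTimeOffset now)
    {
        if (_lastRequest.HasValue && now - _lastRequest.Value < _minInterval)
        {
            return false;
        }

        _lastRequest = now;
        return true;
    }
}
```

A hit-test handler would call TryAcquire(DateTimeOffset.UtcNow) on a throttle created with TimeSpan.FromSeconds(1), and skip the REST call whenever it returns false.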

 

At some point the REST API call will complete and we’ll have the state name that we need. One option then is to simply set the accessible name of the custom control hosting the map to be the state name. This could be done by setting the AutomationProperties.Name property. What I actually ended up doing was caching the current accessible name for the control, and returning it whenever UIA called my custom AutomationPeer asking for the control’s name. 

protected override string GetNameCore()
{
    return _accessibleName;
}

 

But doing this wasn’t enough to build the Narrator user experience I was after, because simply changing the name of the element that Narrator was already interacting with didn’t get Narrator to speak the new name. Instead I needed to use the UIA LiveRegionChanged event, which specifically asks screen readers to inform users of some change in the UI.

Using LiveRegions is a two-step process. First we need to mark the element as being a LiveRegion. I set the element to be an “Assertive” LiveRegion, which means that screen readers should notify the user as soon as the change in the UI occurs.

 <local:CustomMapContainer x:Name="MapContainer"
    AutomationProperties.LiveSetting="Assertive"/>

 

The other step is to raise the LiveRegionChanged event itself. So once the REST API call completes and I’ve set the accessible name on the custom control hosting the map, I have the associated custom AutomationPeer raise the event.

customAutomationPeer.RaiseAutomationEvent(AutomationEvents.LiveRegionChanged);

 

Through the above two steps, whenever we set a state name to be the accessible name for the control and raise the event, Narrator will speak that name to the user.

 

UIA needs to interact with my custom control, not the map

Given that it’s the AutomationPeer for my custom control that’s doing all the work here, I need to make sure that the user never interacts with the map directly when using Narrator. But if the user's interacting with the map and Narrator’s not running, I want my original pointer event handlers to be called directly. It’s important to me that I don’t have to set up modes in the app, such that there's a mode for when Narrator's running, and a mode for when Narrator's not running. So I decided to always hide the map from hit-testing, (unless the map’s zoom and pan gestures have been specifically enabled by the user,) and move the pointer event handlers up to the custom control hosting the map.

_map.IsHitTestVisible = false;

MapContainer.PointerPressed += MapContainer_PointerPressed;
MapContainer.PointerMoved += MapContainer_PointerMovedOverride;

 

One unanticipated step I needed to take related to Narrator's hit-testing over the custom control hosting the map. I’d derived my custom control from a Grid. That decision apparently affected Narrator’s hit-testing, and I found Narrator could end up interacting with the map rather than my custom control which hosts the map. Presumably this is because in most situations the user wants to interact with the contents of a Grid, not the Grid itself, and so Narrator's decision to ignore the Grid seems fair enough. So to work around this, I updated my custom AutomationPeer to declare that the custom control had a UIA control type of Text.

protected override AutomationControlType GetAutomationControlTypeCore()
{
    return AutomationControlType.Text;
}

 

And on the subject of control types, I also overrode the custom AutomationPeer’s GetLocalizedControlTypeCore() in order to provide a more appropriate string for the custom control. For this experiment I’ve hard-coded an English string in the function despite the fact that the function's purpose is to return a localized string. If I ever ship a non-English version of the app, I’ll localize it properly.

protected override string GetLocalizedControlTypeCore()
{
    return "Explorable map";
}

 

And at one point while working on all this, I did find there were times when input at the app still led to UIA interacting with the map control itself rather than my custom control which hosts the map. So an additional step I took to make sure that didn't happen was to update my custom AutomationPeer such that it reported that the custom control had no child elements at all. 

protected override IList<AutomationPeer> GetChildrenCore()
{
    return null;
}

 

It's possible that I could revisit all the steps I took to get this working and find that I can tidy them up to reduce the amount of work done, but I've not looked into that yet. (A side effect of making all the changes described above is that the Inspect SDK tool reports that the custom control hosting the map has no name. Inspect doesn't react to LiveRegionChanged events.)

 

So what’s with the big Zoom and Pan buttons?

By this stage I have an app which shows a map of the US, and a Narrator user can explore the map through touch and hear the name of the state beneath their finger spoken. But I’ve not done anything to allow Narrator users to zoom or pan the map, and so the functionality to speak county and city names is inaccessible to them. I’d like to investigate whether it’s possible for me to update the app so that the map can be zoomed or panned through Narrator gestures, but for now I just added an option to show zoom and pan buttons down the right side of the app. These can be invoked by Narrator. As it happens, I’ve not taken the time yet to implement the pan buttons correctly, as they don’t pan the appropriate amount based on the current zoom level. If I get feedback that someone would like me to do that, I’ll look into it.

 

Summary

It was pretty straightforward for me to build the app that my colleague described, whereby the name of a state can be spoken as a finger is moved over the map. Getting the app to work with Narrator required a lot of experimenting to see what effect certain changes in the app had on the Narrator user experience. The Inspect SDK tool was essential during development, as that helped me determine whether unanticipated behavior was likely due to how the app UI data was being exposed through UIA, versus how Narrator was reacting to that data.

After a while I was able to provide a sufficient Narrator user experience, and this was done in such a way that when Narrator’s not running, the app continues to speak just fine. So the app has potential to be useful to users who are blind or have low vision, and also users who can find comprehending details shown on a map to be a challenge. In fact many students who are getting familiar with maps for the first time would also find it useful.

Interesting stuff to be sure.

Guy

P.S. If you want to try out the app itself, it's in the Windows Store at State Your Name Please.