My understanding of UI Automation
UI Automation means more than getting a test result
Developing UI Automation is an important, unique, and effective bug-finding process
Executing UI Automation covers test holes that normal testing cannot cover
Section 1: UI Automation means more than getting a test result
UI Automation is an interesting topic. Before discussing the complexity, benefits, coverage, test holes, maintenance costs and other controversial factors involved in UI Automation, I want to look at what it means when a tester can create a successful UI Automation case, because it means more than the test result itself.
How does UI Automation work? How does the textbox get filled, the button get clicked, and the window get opened and closed automatically? It looks like magic, but successful UI Automation is not just about making it work. A successful UI Automation case is, at least:
1. Stable. A passing case should always pass, whether it runs on a powerful x64 machine or a slow x86 VM. It should give the same result when the initial window position changes. It should be able to test a localized build.
2. Performant. UI Automation should run faster than manual execution because it is driven by a machine. A test that sleeps for a fixed time between UI operations is a bad sign.
3. Flexible. The input and expected output should be configurable. If the test runs against a large input, it should still produce the same pass/fail result, and the execution time should grow linearly.
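To make the "no fixed sleep" point concrete, here is a minimal sketch of the usual alternative: poll for a condition with a timeout. The helper name wait_until and its parameters are my own illustration, not part of any framework mentioned here, and the predicate stands in for a real UI check such as "is the Save dialog visible?".

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.05):
    """Poll predicate() until it is truthy or the timeout elapses.

    Returns True if the condition was met, False on timeout.
    (Hypothetical helper, shown only to illustrate the idea.)
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Stand-in for a real UI check: becomes true after about 0.2 seconds.
    start = time.monotonic()
    print(wait_until(lambda: time.monotonic() - start > 0.2, timeout=2.0))
```

On a fast machine the call returns as soon as the condition holds, and on a slow VM it simply waits longer, up to the timeout, so the same case can be both stable and performant.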
Given Visual Studio 2008, how would you create a UI Automation case that opens Notepad, types ABC, clicks the File -> Save menu, and types a filename in the dialog?
If you do not know Windows Messages well, the most direct way is to use SendInput to simulate keyboard and mouse input. It works, but there are problems:
1. You have to calculate the window position. Changes in the initial window position and different resolution/DPI settings will be a nightmare.
2. You have to Sleep between SendInput calls. In Notepad, there is a delay between clicking the File menu item and the submenu appearing. If the test case does not wait, the click on the Save menu item may be missed.
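The position problem can be sketched in a few lines. When SendInput is used with MOUSEEVENTF_ABSOLUTE, the mouse coordinates are normalized to the range 0..65535 across the primary display, so every click has to be converted from "point inside the window" to "normalized screen point". The function names below are hypothetical, the rounding formula is one common choice rather than the only one, and in a real test the window origin would come from something like GetWindowRect. DPI scaling is ignored here.

```python
def to_normalized(x, y, screen_w, screen_h):
    """Map a screen pixel to the 0..65535 range used by absolute mouse input.

    Assumption: pixel 0 maps to 0 and the last pixel maps to 65535.
    """
    nx = round(x * 65535 / (screen_w - 1))
    ny = round(y * 65535 / (screen_h - 1))
    return nx, ny


def window_point_to_normalized(offset, window_origin, screen_size):
    """Convert a point relative to a window into normalized coordinates.

    offset        -- (x, y) of the click target inside the window
    window_origin -- (left, top) of the window on the screen
    screen_size   -- (width, height) of the primary display in pixels
    """
    x = window_origin[0] + offset[0]
    y = window_origin[1] + offset[1]
    return to_normalized(x, y, screen_size[0], screen_size[1])


if __name__ == "__main__":
    # The same in-window offset gives different input coordinates
    # as soon as the window moves or the resolution changes.
    print(window_point_to_normalized((100, 50), (10, 20), (1920, 1080)))
    print(window_point_to_normalized((100, 50), (300, 200), (1280, 720)))
```

The test has to recompute these values every time the window moves or the resolution changes, which is why hard-coded coordinates break so easily.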
Are Windows Messages a silver bullet? They work most of the time, but the new problems are harder to solve:
1. If the target application is .NET WinForms, the HWND tree may change at runtime. If you hide a tree control and show it again, the Win32 HWND for the tree control is likely to change. WinForms also creates some transparent child HWNDs, which may break the automation.
2. There are a lot of traps when working with Windows Messages. Suppose you want to set focus to a textbox in the target application. Just call SetFocus and pass the textbox HWND? No, please read:
"The SetFocus function sets the keyboard focus to the specified window. The window must be attached to the calling thread's message queue."
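3. The SendMessage API returns only when the target application has finished handling the message. However, that does not completely solve the synchronization problem between the tester app and the testee app.
4. What if the target application is WPF based? WPF controls usually do not have their own HWND, so there is no window handle to send the message to.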
Let me stop showing how difficult it is. (It is indeed very difficult.) Microsoft combines SendInput, Windows Messages and the Accessibility support for UI elements to create internal UI Automation tests.
For those who do not know Accessibility support, please check:
Generally speaking, UI Automation is just inter-process communication. As with any other application, inter-process communication raises two questions:
1. How and where does the communication happen? Windows Messages, sockets, RPC, named pipes, memory-mapped files, WCF or even DLL injection? (I will talk about WCF and DLL injection for UI Automation.)
2. Synchronization. How does the test client know that a UI instruction has finished on the target side, so that it can send the next instruction? How does the test client distinguish an initializing window from an initialized window, so that it holds back the button click until the button is visible?
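The synchronization question can be illustrated without any real UI. The sketch below is a toy model, not how any actual framework is implemented: a thread named FakeTarget (hypothetical) plays the application under test, and the test client sends one instruction at a time, blocking until the target signals that it has finished. The queue stands in for whatever channel is really used.

```python
import queue
import threading


class FakeTarget(threading.Thread):
    """Hypothetical stand-in for the application under test."""

    def __init__(self):
        super().__init__(daemon=True)
        self.inbox = queue.Queue()
        self.log = []

    def run(self):
        while True:
            instruction, done = self.inbox.get()
            if instruction is None:      # stop request
                done.set()
                return
            self.log.append(instruction)  # "handle" the UI instruction
            done.set()                    # tell the client it has finished


def send_and_wait(target, instruction, timeout=2.0):
    """Send one instruction and block until the target reports completion."""
    done = threading.Event()
    target.inbox.put((instruction, done))
    if not done.wait(timeout):
        raise TimeoutError(f"target did not finish {instruction!r}")


def run_script(instructions):
    """Run instructions strictly one after another and return what ran."""
    target = FakeTarget()
    target.start()
    for instruction in instructions:
        send_and_wait(target, instruction)
    send_and_wait(target, None)  # ask the fake target to stop
    return target.log


if __name__ == "__main__":
    print(run_script(["open notepad", "type ABC", "click File -> Save"]))
```

The timeout matters as much as the handshake: if the target hangs, the client fails with a clear error instead of blocking forever.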
I can go into more detail later. This is how our UI Automation Framework does it (at least, but not limited to):
1. Most of the UI Automation is DCOM invocation. UI elements expose the IAccessible COM interface. WPF UI elements also support it. Check the following for more info:
2. SendInput, Windows Messages and notifications are widely used for special cases.
3. Window hooks and WinEvents are used to detect UI status changes.
4. WM_NULL, and probing the processor usage and thread state of the target application, are used for synchronization.
Even with the support of a UI Automation Framework, creating a good UI Automation case is still a skilled task. You need to know where to use the raw device input API and where to leverage the COM invocation API. How do you pick the correct synchronization mechanism? How do you probe for the target UI element effectively: by index, ID, or name? Do you start from the root desktop window or from a parent window? What if the target application shows a MessageBox whose parent is the desktop, not the application's main window? What if the target application creates dynamic UI asynchronously?
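Two of these questions, "ID or name?" and "root or parent?", can be shown with a toy element tree. Everything below is hypothetical (the Element class, the find function, the sample names and IDs). It only models the search, not a real accessibility API.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Element:
    name: str
    automation_id: str = ""
    children: List["Element"] = field(default_factory=list)


def walk(root: Element) -> Iterator[Element]:
    """Depth-first traversal of root and all of its descendants."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(reversed(node.children))


def find(root: Element, *, automation_id=None, name=None) -> Optional[Element]:
    """Return the first element under root matching the given criteria."""
    if automation_id is None and name is None:
        raise ValueError("give automation_id, name, or both")
    for node in walk(root):
        if automation_id is not None and node.automation_id != automation_id:
            continue
        if name is not None and node.name != name:
            continue
        return node
    return None


def build_sample_tree():
    """Desktop with one app window and a message box parented to the desktop."""
    save = Element("Save", "menu.save")
    notepad = Element("Notepad", "main", [Element("File", "menu.file", [save])])
    message_box = Element("Error", "msgbox", [Element("OK", "msgbox.ok")])
    desktop = Element("Desktop", "root", [notepad, message_box])
    return desktop, notepad


if __name__ == "__main__":
    desktop, notepad = build_sample_tree()
    print(find(notepad, automation_id="menu.save"))  # found under the app window
    print(find(notepad, name="OK"))                  # None: not under the app
    print(find(desktop, name="OK"))                  # found from the desktop root
```

Searching from the parent window is faster and less ambiguous, but it misses the message box that hangs off the desktop. Searching by ID survives a localized build where the visible name changes.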
If you can design a scalable web application, it means you know a programming language, threading and locks, the HTTP protocol, and probably databases very well. If you can design a good UI Automation case, it means more than the test result you get. It means:
1. You know programming well: the .NET Framework, multithreading, C#...
2. You know the Windows mechanisms (Win32) well.
3. You probably know how COM/DCOM works. There is no mystery in it for you any more.
4. You know how applications fail and you know debugging well. (Debugging UI Automation is considerably harder than debugging your own EXE, and I promise the automation will fail again and again before you know how to work it out.)
5. Last, you know your test target well. I will cover this in coming topics.
When testers are armed with the above skill set, it is quite easy for them to break other people's applications, and it gives them a solid technical base for developing other skills. When you know how to make your UI Automation work, you know how a typical developer works with code, how often incorrect assumptions are made, and why applications fail, so you know how to break them.
If you know WPF Snoop, you have seen how DLL injection can work for UI Automation. In the prototype stage of my current project, I tried to create a new automation engine combining DLL injection and WCF. I inject a DLL into the WPF application, and the DLL hosts a WCF server inside it. That opens a door for me to execute arbitrary code in the target process. Then I can use the WPF TreeWalker and reflection to do whatever I want.