The other day, I listened to a "Windows Presentation Foundation Under the Hood" talk by Leonardo Blanco and Mike Hillberg. Below are my notes from it, excluding the Q&A session.
Components that make up the platform:
1. Element System - the most visible part of the system
- Styles, bindings, controls
- Text layout (PTS = pagination and table services)
2. Property System
- Change notification
3. Input System
- Mouse, keyboard, pen
4. Event System
- Routing of messages
- Class handlers
5. Font System (used implicitly when you have text on a page)
- Metrics (e.g. how wide a sentence is, in pixels, when displayed)
- Cache (reuse bitmaps for better performance)
6. Visual System - manages everything that's drawn on screen
- Video, Audio
- 2D, 3D, Animation
- Text layout (lower level, line at a time)
7. Transport (not exposed)
- Message based
8. Composition System - this is where pixels are actually made
- 2D, 3D
- Hardware, Software
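To make the "change notification" role of the Property System a little more concrete, here is a minimal sketch in Python. It is not how WPF implements it; the names ObservableProperty, Element and listeners are made up for illustration. It only shows the idea that setting a property to a new value notifies whoever subscribed.

```python
class ObservableProperty:
    """Descriptor that notifies listeners when the stored value changes."""

    def __init__(self, default=None):
        self.default = default

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name, self.default)

    def __set__(self, obj, value):
        old = self.__get__(obj)
        if old == value:
            return  # same value: nothing changed, so nobody is notified
        obj.__dict__[self.name] = value
        for callback in obj.listeners:
            callback(obj, self.name, old, value)


class Element:
    # Hypothetical element with a single observable property.
    width = ObservableProperty(default=0)

    def __init__(self):
        self.listeners = []


changes = []
button = Element()
button.listeners.append(lambda obj, name, old, new: changes.append((name, old, new)))

button.width = 100
button.width = 100  # no second notification for an unchanged value
button.width = 120
```

After this runs, changes holds one entry per real change, which is the kind of hook that styles, bindings and animation could build on.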
Consider a simple XAML application that has a list box containing two buttons.
When you compile this app, the XAML is turned into BAML, a pre-parsed binary version of XAML.
Then the BAML, the other (non-UI-definition) .NET code, and the resources (e.g. images) are packaged as an executable.
When you run this executable, the layout engine might introduce other elements in addition to the list box and the two buttons, e.g. a ScrollViewer (which in turn is composed of two buttons and a bar), a StackPanel... The point is that a relatively small number of XAML elements might end up as a much larger number of visual tree elements. During the rendering process, the visual tree is rendered to a bitmap. The rendering process is multithreaded, and can be done remotely or locally (that's where the transport plays a role).
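The growth from a few declared elements to a larger visual tree can be sketched roughly as follows. This is a toy model in Python, not WPF code: the Node class and the expand function are hypothetical, and the "templates" inside expand are heavily simplified stand-ins, not the real default templates.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    children: list = field(default_factory=list)


def expand(element):
    """Turn a declared element into a (made-up) visual subtree."""
    kids = [expand(child) for child in element.children]
    if element.kind == "ListBox":
        # Assumed shape: a border, a scroll viewer, a panel holding the items,
        # and a scroll bar made of two buttons and a thumb.
        panel = Node("StackPanel", kids)
        scroll_bar = Node("ScrollBar", [Node("RepeatButton"), Node("Thumb"), Node("RepeatButton")])
        viewer = Node("ScrollViewer", [panel, scroll_bar])
        return Node("ListBox", [Node("Border", [viewer])])
    if element.kind == "Button":
        return Node("Button", [Node("Border", [Node("ContentPresenter", kids)])])
    return Node(element.kind, kids)


def count(node):
    """Number of nodes in the tree rooted at node."""
    return 1 + sum(count(child) for child in node.children)


declared = Node("ListBox", [Node("Button"), Node("Button")])
visual = expand(declared)
print(count(declared), "declared elements ->", count(visual), "visual tree nodes")
```

In this toy model the 3 declared elements expand to 14 nodes; the exact numbers in a real application depend on the templates in use.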
How is user input processed? When a user clicks a button,
1. User32.dll gets the message
2. WPF converts the message to an input report
3. If the input report is a mouse move, then
- Do structural hit testing on the visuals to see which element was hit
- Then do geometry hit testing against the geometry data (it's not all rectangles... if your element is star-shaped, at this point you'll have a more precise idea of whether the element was hit or missed; if the image is an outline and not a solid object, then the outline itself must be hit for the event to occur)
4. Convert the report into one or more events to be processed by the application
5. Notify all subscribers about the event by building an event route and raising the events, which will likely change the look and feel of the displayed elements.
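The two hit-testing passes and the event route from the steps above can be sketched like this. It is a simplified Python model under my own assumptions, not the WPF API: Visual, hit_test and raise_event are hypothetical names, the structural pass is reduced to a bounding-box check, and only a bubbling route (target up to root) is modeled.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(eq=False)
class Visual:
    name: str
    bounds: tuple  # bounding box as (x0, y0, x1, y1)
    contains: Optional[Callable[[float, float], bool]] = None  # exact geometry; None means the whole box
    children: list = field(default_factory=list)
    parent: Optional["Visual"] = None
    handlers: list = field(default_factory=list)

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def hit_test(visual, x, y):
    x0, y0, x1, y1 = visual.bounds
    if not (x0 <= x <= x1 and y0 <= y <= y1):
        return None  # structural pass: outside the box, skip the whole subtree
    for child in reversed(visual.children):  # assume later children are drawn on top
        hit = hit_test(child, x, y)
        if hit is not None:
            return hit
    # geometry pass: the point must lie inside the actual shape
    if visual.contains is None or visual.contains(x, y):
        return visual
    return None


def raise_event(target, event):
    route = []
    node = target
    while node is not None:  # build the route from the target up to the root
        route.append(node)
        node = node.parent
    for node in route:
        for handler in node.handlers:
            handler(node, event)
    return [node.name for node in route]


window = Visual("window", (0, 0, 200, 200))
circle = window.add(Visual("circle", (50, 50, 150, 150),
                           contains=lambda x, y: (x - 100) ** 2 + (y - 100) ** 2 <= 50 ** 2))

log = []
window.handlers.append(lambda node, event: log.append((node.name, event)))
circle.handlers.append(lambda node, event: log.append((node.name, event)))

center_hit = hit_test(window, 100, 100)  # inside the circle
corner_hit = hit_test(window, 55, 55)    # inside the circle's box, outside its shape
route = raise_event(center_hit, "MouseDown")
```

The point (55, 55) passes the bounding-box check for the circle but fails the geometry check, so the hit falls through to the window, which mirrors the star-shape remark above.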
So, if the same element exists in both the visual layer and the element layer, are you better off using the visual layer for better performance?
Elements map to visuals. Elements are not wrappers on top of visuals. They are visuals by class inheritance. So, performance benefits, if any, will be negligible. It's strongly recommended to use the highest level where the element you need exists; i.e. if it exists in the element layer -- use it.
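The "visuals by class inheritance" point can be illustrated with a short Python mirror of part of the WPF class chain. The class names follow the real hierarchy (Button derives from Control, FrameworkElement, UIElement and Visual, with ButtonBase and ContentControl omitted here), but the bodies and the render method are placeholders I made up.

```python
class Visual:
    def render(self):
        # Placeholder for drawing; every subclass inherits it directly.
        return f"draw {type(self).__name__}"


class UIElement(Visual):
    """Adds input, events and layout on top of Visual."""


class FrameworkElement(UIElement):
    """Adds styles, data binding and so on."""


class Control(FrameworkElement):
    """Adds templating."""


class Button(Control):
    """ButtonBase and ContentControl are left out of this sketch."""


button = Button()
chain = [cls.__name__ for cls in Button.__mro__]
print(isinstance(button, Visual), chain)
```

Because the button is itself a visual, there is no wrapper object to bypass, which is consistent with the advice to stay at the element layer.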