Yukon Engine: CLR Integration I

I thought it might be of interest to some to put a few notes up from the talk
I gave on Yukon development
last week. If nothing else, it's helpful to me to
have as a reminder for the next time I give this talk! I'll split this into several
parts for ease of access, and post them over the course of the week.

In the current release of SQL Server, there are effectively three ways to embed custom
logic into your database:

1.
Writing stored
procedures
, user-defined
functions
and triggers using
Transact-SQL;
2.
Creating
extended stored procedures
using a C-style DLL written to a special API;
3.
Building
your logic into an external component
(perhaps .NET or COM) and ensuring that
all database manipulation occurs via your own component.

Traditionally developers have opted for a mixture of optionsĀ 1 and 3. Option
1 is perfect from the point of view of a database purist - everything that impacts
the integrity of the database can be kept there; nothing can bypass the code, as it's
embedded in the database. However, whilst T-SQL is great for set-based data manipulation,
it doesn't have the same degree of structure and elegance as most component-orientated
languages, offers limited support for string handling and many other constructs, and
doesn't perform awfully well when you're not dealing directly with the database.

Option 3 allows you to build access logic that abstracts data from a relational form
into more business-centric idioms such as customer, purchase order etc., as well as
offloading this work from the database server to other application servers. But because
the data tier is separate from the database itself, it's not always easy to integrate
any semantic validation from here into other data services such as replication or
reporting.

Most people steer away from option 2 (extended stored procedures), because they're
quite fragile due to their in-process nature - a memory leak in one of these can bring
the whole database server down, for instance. They're also rather awkward to write.
The guarded
wording used in this security note
is certainly enough to scare me off recommending
them, at any rate.

The big change in SQL Server "Yukon" is that the Common Language Runtime (CLR) is
hosted directly in the database engine, allowing .NET managed code to be used for
stored procedures, functions or triggers. On the surface, this is the best of all
worlds - your code resides in the database, but can take advantage of the full richness
of the .NET Framework Base Class Libraries. It performs better than T-SQL for computationally-intensive
tasks, can be secured using both the SQL security architecture and .NET Code Access
Security, and allows you to use any .NET language to develop everything from the front-end
user interface to the back-end database logic itself.

Here's a few examples of how you might use this capability to build a richer database:

- Utilise the .NET cryptography classes to encrypt sensitive data in the database for added security;

Use regular expressions in a trigger to validate structured string data such as email  
addresses, phone numbers and postal codes prior to insertion in the database;  
  • Create table-valued functions that combine data from a SQL table with data from an
    external source or other values derived through computation and lookups based on the
    existing data.

On top of all this, you can also create custom user-defined data types and aggregates
using the .NET integration; for example, you could have a custom type that represents
a financial instrument and then use a custom aggregation to ensure that it was rolled
up appropriately for reports.

One thing that struck me in particular about the CLR integration was how Yukon took
advantage of the extended CLR hosting APIs in Whidbey. Traditionally the CLR takes
care of memory management, garbage collection and thread support itself, but when
it runs in Yukon these services are delegated to the database engine. The engine has
a much better understanding of the current system load as a whole and can therefore
manage memory and threads appropriately for the entire execution environment. This
results in a more robust and scalable solution than if these services were left within
the CLR itself.

In part two, I'll talk more about how you take advantage of these extensibility points
using Visual Studio "Whidbey".