Whidbey ADO.NET 2.0. Thoughts on how some new features can be misused.

There are a number of new features in ADO.NET 2.0 that have the _potential to be grossly misused_ ™. Don’t get me wrong, I like the features, some of them are good and some are extremely good. The reason that I am bringing them up is because these features when used in scenarios where they where not intended to be used are guaranteed to be painful.

Based on user feedback we cut the most glaring example, the SqlResultset. This was a really good feature and a highly requested one. Some scenarios absolutely require server side cursors and this was going to expose them in an easy to use fashion, and that is the problem. The SqlResultset exposed server side cursors a little too easily, people would use this for scenarios that absolutely should not use server side cursors. I want to make it clear that there is nothing stopping you from using server side cursors directly through TSQL, we have not disabled this scenario, it is just hard enough to use that it won’t be misused. If this feature had shipped on the beta it would have had my vote for PTBGM ™ (potential to be grossly misused) feature of the release. Fortunately for this blog we still have a number of promising runners up.

I am looking for feedback on which of these features you believe should earn the PTBGM for the 2.0 release, I would be happy to discus any of them at length.

My opinion: I want to make it very clear that this is not an official opinion. These are my personal thoughts on these features

PTBGM (Potential To Be Grossly Misused) features of ADO.NET 2.0

Async Callbacks:

True asynchronous functionality without creating extra threads, that is what async promises and it delivers! True non-blocking behavior, no background threads, true async network I/O this is truly a well designed feature… PTBGM kicks in when you want to use callbacks. Callbacks by definition happen in a separate thread, ADO.NET is not thread safe by design and you really need to know what you are doing before you attempt this. Pablo has a great article on this feature https://msdn.microsoft.com/data/DataAccess/Whidbey/default.aspx?pull=/library/en-us/dnvs05/html/async2.asp. I want to highlight the section on “Keeping WinForms Applications Responsive”  where he mentions that “It might be tempting to use asynchronous command execution to work-around this problem” I would substitute that for “don’t even think about using…”. Callbacks have their _very specific_ place in the grand scheme of things, and winforms are not it IMO.

My opinion: Don’t use callbacks for winforms. Make sure you understand thread safety before using the feature at all, we are lucky in having some really good documentation coming on this feature.

 

Multiple Active Result Sets per connection (MARS):

I don’t really believe this feature will be the winner of the PTBGM, but it is definitely a contender. When connecting with SqlClient to Sql Server 2005 (or with Oledb and the next release of the network libraries) we will now allow multiple active result sets per connection. In ado.net this mostly means that you will now be able to open multiple datareaders per connection as long as you use a separate command for each. The main scenario for this feature would be being able to modify the database as you process a datareader, it also enables running queries inside of the same transaction context and _can_ have a performance benefit in very specific scenarios where the cost of more connections (about 40k memory per connection) becomes a concern. There are a number of ways in which “things can go wrong” ™ however. Forgetting to close a datareader will now stop you from committing a transaction, modifying the state of the connection can give you unexpected results, reading and writing inside of a transaction context may isolate the reader from the changes and writer writer conflicts are non deterministic. Overall fairly minor and contrived concerns, my personal biggest problem with this feature however is that it is hard to squeeze real performance gains out of it, and easy to write hard to maintain code to try it. There are performance implications to using MARS across the board. On the client we run into an issue where creating a new batch is not free, we kind of work around this issue by pooling mars commands but still expensive if you don’t used the pooled functionality. On the network layer there is a cost associated with multiplexing the tds buffer, opening too many batches can be more expensive than opening another connection. On the server all MARS batches run in the same scope, worst case scenario your queries will end up running in the order received.

I really like the current way to use ado.net, open a new connection in each thread and rely on pooling to get performance and ease of development. It is tempting with the advent of mars to switch this model to one where we open one connection and rely on using a different command in each thread, I would not recommend it. Misusing (IMO) MARS to try to get additional performance has the potential of making your code look like a bowl of spaghetti for very unclear performance benefits.

My opinion: Don’t try to use MARS and threading to improve performance.

Fake MARS behavior:

Ok, this is a whole different ballpark. Not only is this well in the running for PTBGM, this has the potential of breaking existing applications. In existing versions of the framework we artificially restricted opening multiple datareaders even when using native providers that supported this feature like Oracle. We have finally lifted this restriction (about time!) and now we rely on the default behavior of the native provider. The problem starts when we run into a native provider where this functionality is faked under the covers by opening a new connection, if you can believe it this is the default behavior for the sqloledb provider when it can’t do real MARS. What this means is that if you use the ado.net 2.0 Oledb provider to connect to Sql Server 2000 (or 2005 with out the latest mdac version) you will now be “able” to open multiple datareaders (i.e. it won’t throw an exception) by opening a non pooled connection under the covers. If you write an Enterprise application in which you don’t realize you are opening new connections you will likely bring the server to its knees during heavy traffic due to the massive number of non pooled connection opens.

Customers have been running into Fake MARs problems with Oledb for a long long time now, up to version 2.0 of the framework it was not possible to run into this problem with managed code, from now on you are free to shoot yourself in the foot, please don’t do it.

My opinion: Don’t use the Oledb managed provider to connect to Sql Server, try the connection factories instead if you need to write provider independent code.

Standard disclaimer: This post is provided “AS IS” and confers no rights. My opinion is not an official endorsement, this are just my personal feelings about these features.

Rambling out.