SQL Databases on File Shares – It’s time to reconsider the scenario.

For those who have been around databases for any length of time, putting a database whose reliability or performance you care about on an SMB (Server Message Block) file share seems like a crazy idea. Recent developments, however, have made SMB-based file shares a very viable platform for production SQL Server databases, with some very interesting advantages.

Historically, the perspective has been:

  • File shares are slow.
  • The connections to the share may be unreliable.
  • The storage behind the file share may be flaky.
  • SMB consumes large amounts of CPU if you can get it running fast enough.

Over the past few years, all of these conditions have changed, and in particular the work which has been done on the 2.2 revision to the SMB protocol has produced some stunning results.

So let’s look at these one by one:

File Shares are slow

There are two components to this one:

The raw speed of Ethernet vs. Fibre Channel, and the speed and efficiency of the SMB protocol.

The transport layer has improved significantly in recent years. Where Ethernet was once orders of magnitude slower than Fibre Channel, this is no longer the case. Current commodity Ethernet runs at up to 10 gigabits per second, with 40 gigabit in testing and on the near horizon. This will put Ethernet on par with Fibre Channel from a bit-rate perspective, and the projections are that the two technologies will leapfrog each other from here on, with neither being a clear leader.

On the protocol front, the original SMB 1.x protocol was chatty, inefficient, and slow. Over the last couple of years, the Windows file server team, while developing the 2.2 version of the protocol, has been using SQL Server with a TPC-C workload as one of its performance benchmarks.

The benchmark configuration was to take a fibre-attached array, connect it to a server, and run TPC-C. Then an identical server was added, connected to the first by a single 1Gb link, and the TPC-C database was run on the new server with the original server functioning as a file server.

When they started, TPC-C over a file server ran at roughly 25% of the speed of direct-attached storage. The team discovered several performance problems in the stack, but one particular bug on the client side made a stunning difference. The current results are that TPC-C running over an SMB link as described above performs at 97% of the speed of direct-attach. That is a remarkable result, and one which is not limited to Windows file servers, since the fix is on the client side.

So now, we have an SMB implementation running at speeds comparable to a fibre-attached array.

Connections to the share may be unreliable

Again, there are multiple parts to this one. One aspect is that the underlying networking hardware has become much more reliable in recent years. Consumers and enterprises alike just wouldn’t put up with flaky network connections these days. The popularity of FCoE (Fibre Channel over Ethernet) is an indicator of how much confidence in Ethernet as a storage transport has grown.

The second aspect again comes back to the work done in the 2.2 version of the SMB protocol. With this version, SMB has a number of resiliency features built in. In the past, if a link were to drop momentarily, the connection would be lost and the file handle broken. With the 2.2 version of the protocol, the link is automatically re-established and the application never sees the event other than as a momentary stall in outstanding IO.

If we take the configuration a step further, the file server itself can be clustered, and it now has the capability to fail over a share from one file server to the other without losing handle state. To clarify, SQL Server running an active workload can have the file server hosting the database files fail over, planned or unplanned, and SQL Server sees only a momentary drop in IO rates.

The storage behind the file server may be flaky or unreliable

While it is always possible to put together an unreliable server, the tools now exist to incorporate very sophisticated reliability features right in the box.  Particularly with the advent of Windows 8 features recently announced, we have a pretty good toolset native in the OS.  We can create pools of storage which can be dynamically expanded.  Pools can be assigned a variety of RAID levels.  Many of the features which were previously only available in Fibre-attached arrays are now available with direct-attached storage on a Windows File Server.  When you add in the capability for failover and scale-out clustering, the reliability becomes very impressive.

SMB consumes large amounts of CPU if you can get it going fast enough

This is actually a painful aspect of Ethernet which has hurt iSCSI as well as SMB and other protocols.

A recent transport development is RDMA (Remote Direct Memory Access), which enables data to flow directly from the network wire into user space without being copied through kernel memory buffers. This produces a huge reduction in CPU utilization at high data rates. How much? I’ve seen an InfiniBand-based SMB connection sustaining > 3 gigaBYTES per second while consuming around 7% CPU, using SQLIO 512K writes as a workload. We’ve seen prototype units performing at twice that rate in the lab.
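To put the 3 gigabytes per second figure in the same units as the link speeds discussed earlier, here is a rough back-of-the-envelope sketch. It assumes decimal gigabytes (10**9 bytes), treats a "512K" write as 512 * 1024 bytes, and ignores all protocol overhead; none of those details are stated above, and the function names are purely illustrative.

```python
# Back-of-the-envelope conversion of the throughput figure quoted above.
# Assumptions (not stated in the text): 1 gigabyte = 10**9 bytes,
# a "512K" write is 512 * 1024 bytes, and protocol overhead is ignored.

GIGABYTE = 10**9
WRITE_SIZE_BYTES = 512 * 1024


def gigabits_per_second(gigabytes_per_second: float) -> float:
    """Convert a byte rate in GB/s into a bit rate in Gbit/s."""
    return gigabytes_per_second * 8


def writes_per_second(gigabytes_per_second: float,
                      write_size_bytes: int = WRITE_SIZE_BYTES) -> float:
    """Number of fixed-size writes per second needed to sustain the byte rate."""
    return gigabytes_per_second * GIGABYTE / write_size_bytes


def equivalent_links(gigabytes_per_second: float, link_gigabits: float) -> float:
    """How many links of the given bit rate the byte rate corresponds to."""
    return gigabits_per_second(gigabytes_per_second) / link_gigabits


if __name__ == "__main__":
    rate = 3.0  # GB/s, the sustained figure quoted above
    print(f"{rate} GB/s = {gigabits_per_second(rate):.0f} Gbit/s")
    print(f"about {writes_per_second(rate):.0f} writes of 512K per second")
    print(f"roughly {equivalent_links(rate, 10.0):.1f}x a 10 gigabit link")
```

Under those assumptions, 3 gigabytes per second works out to 24 gigabits per second, which is more than two 10 gigabit links' worth of traffic.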

Additional benefits

Now that we’ve discussed why the factors which previously were blockers no longer are, let’s discuss some of the additional benefits:

Manageability

Consider the steps required for a DBA to move a database from one server to another:

SAN:

  1. Take database offline/detach
  2. File request to remap LUNs
  3. Meet with storage admin
  4. LUNs are unmapped from original server
  5. LUNs are mapped to new server
  6. LUNs get discovered and mounted on new server
  7. Database is attached to new server
  8. Database is brought online

SMB:

  1. Take database offline/detach
  2. Database is attached to new server using UNC path
  3. Database is brought online

Additionally, you have one set of tools for configuring storage, as opposed to separate tooling for each SAN vendor which you use.
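The SMB steps in the comparison above can be sketched as a small script that only builds the T-SQL text for the detach and the attach. This is a minimal illustration, not a deployment tool: the database name `Sales`, the file server `fileserver`, and the share `sqldata` are hypothetical, and the sketch does not check permissions on the share or whether a given SQL Server version accepts UNC paths for database files. It relies on the standard `sp_detach_db` procedure and `CREATE DATABASE ... FOR ATTACH`.

```python
from typing import Sequence


def sql_literal(value: str) -> str:
    """Quote a string as a T-SQL Unicode literal, doubling embedded quotes."""
    return "N'" + value.replace("'", "''") + "'"


def sql_identifier(name: str) -> str:
    """Bracket-quote an identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def detach_statement(database: str) -> str:
    """T-SQL to run on the ORIGINAL server to detach the database."""
    return f"EXEC sp_detach_db @dbname = {sql_literal(database)};"


def attach_statement(database: str, unc_files: Sequence[str]) -> str:
    """T-SQL to run on the NEW server, pointing at the same files on the share."""
    if not unc_files:
        raise ValueError("at least one database file is required")
    for path in unc_files:
        # A UNC path starts with two backslashes: \\server\share\file
        if not path.startswith("\\\\"):
            raise ValueError(f"not a UNC path: {path}")
    file_specs = ",\n    ".join(
        f"(FILENAME = {sql_literal(path)})" for path in unc_files
    )
    return (
        f"CREATE DATABASE {sql_identifier(database)}\n"
        f"ON  {file_specs}\n"
        "FOR ATTACH;"
    )


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    files = [
        r"\\fileserver\sqldata\Sales.mdf",
        r"\\fileserver\sqldata\Sales_log.ldf",
    ]
    print(detach_statement("Sales"))
    print(attach_statement("Sales", files))
```

Because the files never move, the only things that change between the two servers are which instance runs the detach and which runs the attach; no LUN remapping is involved.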

Cost

Cost is always a concern, and with the capabilities which this platform brings to bear, we can accomplish what previously has required a much more expensive solution, for a fraction of the cost without sacrificing performance, reliability or manageability.

Example

As one example of the whole package, one reference configuration for a SQL deployment had originally been configured with InfiniBand for communications and several small SAN arrays, one per server in a rack. Converting that configuration to a single clustered file server cut the total cost of the solution dramatically: ~$50,000 in Fibre Channel hardware was saved, and the savings from moving from multiple FC arrays to the clustered file server were very substantial. The kicker, though, was that the performance of the solution was better than that of the original configuration, which had bottlenecked on the Storage Processors in the arrays.

So, the overall cost is substantially lower, the required features are delivered, and the performance is improved. 

What’s not to like?