PRB: SQL Backups to a UNC path fail with OS Error 1 (Incorrect Function)

Symptoms
When you try to back up a SQL Server database to a UNC path (\\server\folder\), you get this error:

Msg 3201, Level 16, State 1, Line 1
Cannot open backup device '\\server\folder\db.bak'. Operating system error 1(Incorrect function.).
Msg 3013, Level 16, State 1, Line 1
BACKUP DATABASE is terminating abnormally.

The issue is intermittent: backups work at first and start failing after some time.
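
If you need to spot this failure in saved job output, the device-open line can be picked apart with a regular expression. Below is a minimal Python sketch. The pattern is based only on the message wording quoted above, and parse_backup_error is a name made up for illustration, not part of any SQL Server tooling.

```python
import re

# Pattern for the "Cannot open backup device" line that follows Msg 3201.
# Built from the error text quoted in this post; other builds may word it differently.
DEVICE_ERROR = re.compile(
    r"Cannot open backup device '(?P<device>[^']+)'\. "
    r"Operating system error (?P<code>\d+)\s*\((?P<text>.*?)\)\.\s*$"
)


def parse_backup_error(line):
    """Return (device, os_error_code, os_error_text), or None if the line does not match."""
    match = DEVICE_ERROR.search(line.strip())
    if match is None:
        return None
    return match.group("device"), int(match.group("code")), match.group("text")


sample = ("Cannot open backup device '\\\\server\\folder\\db.bak'. "
          "Operating system error 1(Incorrect function.).")
print(parse_backup_error(sample))
```

Extracting the numeric code matters because the text in parentheses is localized, while the code itself (1 here) is not.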

Troubleshooting
Those of you used to troubleshooting must be thinking of a permission or network issue. But what is the cause of OS Error 1, and what should you do? Read on…
Here are some things we had already ruled out as the cause of the above error:
      1. Permissions on the destination folder are fine.
      2. There is no network/on-the-wire issue.

To rule out the network, I tried backing up to a UNC path on the same machine as SQL Server. I created a shared folder on the server and used its UNC path as the backup destination, so the backup was written to the local disk but addressed through a UNC path. This also failed with the same error, which rules out the network.

I should mention that the BACKUP DATABASE command failed immediately after I ran it, even though the database I was testing with was only 4 MB. Something else was at play here.
I decided to take a Process Monitor trace to see what was happening to the file when OS Error 1 was returned to SQL Server. Looking at the ProcMon logs, I saw this event for the CreateFile API.

Date & Time: 09-02-2011 08:40:17
Event Class: File System
Operation: CreateFile
Result: NOT IMPLEMENTED
Path: \\server\folder\dba.bak
TID: 9128
Duration: 0.0004262
Desired Access: Generic Read/Write
Disposition: Open
Options: No Buffering, Non-Directory File, Open No Recall
Attributes: N
ShareMode: None
AllocationSize: n/a

Here is what the above log tells us:
      1. SQL Server is making a CreateFile API call to create the backup file.
      2. The call is made with the FILE_FLAG_NO_BUFFERING flag, so system caching is not used.
      3. The call failed, and the result was “not implemented”.
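
It may help to see why a “NOT IMPLEMENTED” result in ProcMon shows up in SQL Server as OS error 1. ProcMon reports the kernel status (NTSTATUS), while SQL Server reports the Win32 error code that Windows translates that status into. The following is a small Python sketch of that translation, using a hand-written table that covers only the status from this trace. The constant values are the ones defined in the Windows SDK headers; the table and the describe function are illustrative names, not a Windows API.

```python
# NTSTATUS reported by the file system stack (value from ntstatus.h).
STATUS_NOT_IMPLEMENTED = 0xC0000002
# Win32 error code that this status is translated to (value from winerror.h).
ERROR_INVALID_FUNCTION = 1

# Hand-written and deliberately incomplete: only the status seen in this trace.
NTSTATUS_TO_WIN32 = {STATUS_NOT_IMPLEMENTED: ERROR_INVALID_FUNCTION}
WIN32_MESSAGES = {ERROR_INVALID_FUNCTION: "Incorrect function."}


def describe(ntstatus):
    """Format an NTSTATUS as the OS error text it surfaces as, if it is in the table."""
    code = NTSTATUS_TO_WIN32.get(ntstatus)
    if code is None:
        return "NTSTATUS 0x%08X: not in this sketch's table" % ntstatus
    return "Operating system error %d(%s)" % (code, WIN32_MESSAGES[code])


print(describe(STATUS_NOT_IMPLEMENTED))
```

On a Windows machine the real translation is done by RtlNtStatusToDosError in ntdll; the table here only stands in for it so the sketch runs anywhere.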

This was starting to look like an issue outside SQL Server. I looked at the Windows event logs and noticed that, at the same time as the backup failure, this event was written to the System event log.

Log Name: System
Source: nlemsql64
Date: 02-02-2011 22:41:43
Event ID: 1180
Task Category: None
Level: Warning
Keywords: Classic
User: N/A
Computer: TDAB-SQLDA-CRA.p-tdbfg.com
Description:
The description for Event ID 1180 from source nlemsql64 cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

A Bing search for the event source “nlemsql64” helped me find that this source belongs to software called “NetLib® Encryptionizer® for SQL Server”. Reading the product documentation, I learned that it performs server-side data encryption for Microsoft SQL Server databases. This is done by an encryption filter driver located at C:\Windows\System32\Drivers\nlemsql64.sys.

Now, is it possible that this filter driver is hooking on to SQL Server Backup calls and causing this issue? I went back to the process monitor trace and looked at the call stack for the failure event. For those who aren’t familiar with Process Monitor for File System events, you can right-click on the event -> Properties -> Stack. This gives you the thread stack using Public Symbols.

Here is what I saw for this issue:

[Image: Process Monitor stack view for the failing CreateFile event]

As I have highlighted in the above picture, nlemsql64.sys (the filter driver) is indeed hooking onto the SQL Server call. This immediately raised a red flag, since such activity isn’t supported in SQL Server. This is what we call a “detour”, and you can read more about it here:

The use of third-party detours or similar techniques is not supported in SQL Server
https://support.microsoft.com/kb/920925

Having come this far, we tested the backup on another machine that did not have this software installed. As expected, the backup completed successfully. Since we had isolated the issue to machines with the software and its filter driver installed, my customer contacted NetLib Support. Below is a snippet of their response.
{
When the Encryptionizer drivers query information about a database to determine whether it is encrypted or not, they issue a IoQueryFileInformation API. It is rare but not impossible for the API to return an end-of-file status, or a not-implemented status (e.g., for a named pipe). These are normal status returns, but Encryptionizer was erroneously reporting them as errors. This was causing other areas of the system, that rely on such error reporting, to believe there was an actual error and fail.

So in other words, Netlib was saying there was an error when in fact there was not. The driver has been changed to keep the normal status internally for its own purposes, but does not report the normal status returns as errors. This is not a major change to the drivers - the changes have been made and are currently in testing.
}

Conclusion
If you are experiencing similar issues and also have NetLib Encryptionizer installed, please contact NetLib and make sure the driver update is applied once it is released. As shown above, this is not a SQL Server issue. I’m posting this in the hope that it saves you the time you might otherwise spend troubleshooting this from a SQL Server perspective.

P.S. I will update this blog post once I find out the specific build of the NetLib driver that contains the fix.

UPDATE: 22 MARCH
The new driver version for nlemsql64.sys which has the fix is 2008.401.27.0. You need to contact NetLib support to obtain this driver as it is currently not part of the full release cycle, and it is most likely available only on request.
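
If you have several servers to check, comparing the installed driver version against the fixed build is easy to get wrong with plain string comparison. Here is a minimal Python sketch of a numeric comparison. It assumes you have already read the file version of nlemsql64.sys yourself (for example from the file's Properties dialog); has_fix is a made-up helper name, and the older version numbers in the example are invented purely for illustration.

```python
# Build that contains the fix, as stated in the update above.
FIXED_VERSION = "2008.401.27.0"


def parse_version(text):
    """Turn a dotted version string such as '2008.401.27.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_fix(installed_version):
    """True if the installed driver version is at or above the fixed build."""
    return parse_version(installed_version) >= parse_version(FIXED_VERSION)


# The last two version numbers are hypothetical, used only to exercise the comparison.
for version in ("2008.401.27.0", "2008.401.26.0", "2008.401.9.0"):
    print(version, has_fix(version))
```

Comparing the parts as integers is the point of the sketch: as strings, "2008.401.9.0" would sort after "2008.401.27.0" and be wrongly reported as fixed.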

As always, stay tuned for more. Cheers!

Regards,
Sudarshan Narasimhan
Technical Lead, Microsoft SQL Server CSS