SQL Server Best Practices: File Layouts (Revisited)

In a previous "SQL Server Best Practice" post, I made reference to file layouts and where various file types should go. I got a comment there that I felt deserved another post.

The question was:

"...7 individual drives?  If this is correct do you also recommend drive redundancy like raid 1 or 5?  Are there recommendations for someone with a smaller server as far as how to group say logs and backups on the one set of drives, data on another, etc. if I don’t have more than possibly 2 or 3 separate spindles?"

This brings up a good point. If you have a single server with only one or two internal drives, what then? And what about the opposite extreme, where you have a SAN setup somewhere else? Let's take a look at each of these situations.

If you're on a small system with only one or two drives, then odds are the application has a small load, and you can get away with either not separating the files or at least splitting log and data files. I want to stress that if you've under-sized your hardware, you're going to suffer performance issues - add the drives and do it right. But if in fact you are on a small server that really doesn't have that heavy of a load, then I would recommend separating the files in the order I posted them in the other entry. At a minimum, I try and separate log and data, and backups and data. And I always recommend RAID for any production server - it's just something you should consider part of the framework of buying any server hardware for any purpose.

Now suppose you're using a SAN. You might even have a SAN administrator at your firm that only allots space, not drives. In that case you do the best you can. But if you can get a face-to-face with that person, explain how SQL Server works with files, and why more spindles are better, and how the file separation helps. For your part, you should understand the basics of how RAID works, and you can find that on http://raid.com. As a general rule, RAID 1 is good for sequential write operations (like the log file does) and RAID 5 is an acceptable tradeoff between space usage and heavy read patterns for data. Of course, your mileage will vary, so it's best to understand how SQL Server works, and then work with your SAN administrator or vendor to set the placement strategy for the files.

Why all this focus on the file placement? Well, after the network card, the Disk Subsystem is one of the slowest in the system. A good strategy for separating these files can dramatically affect your performance. There are some great techniques to find disk usage, latency and so on, and in the future I'll point some of these out. For now, if you know of something that has helped you in the past with this learning curve, feel free to post a comment here.

Comments (4)
  1. mike says:

    HI I just inherited a sql server 2008 containing an oltp vldb, the database is 1 terabyte with one/only default filegroup. Since I have had no experience with vldbs I have frantically been reading up them as much as I can. The server was setup as 10 terabyte D drive on the SAN; the database is on that drive. Everything I have been reading about for VLDB says it should have been setup on with multiple drives. My question is since it's on the SAN and so if I create multiple filegroups on the same D drive will there be any advantage or should I have a meeting with SAN people and talk about breaking that 1 10 tera drive drive into perhaps 10 …1 teraybyte drives. Like I mentioned my experience with VLDB and SAN is limited so not sure what questions I should be asking etc. Thanks Mike

  2. BuckWoody says:

    Hello Mike – recommend you read the entries from Paul Randal (SQL Skills) and Denny Cherry. They have a lot of good info on VLDB's.

    There are so many variables on the decisions, not the least of which is the type of SAN you have, that you may want to invest the money to have one of them come out and take a peek at the system.

  3. brad says:

    hi Buck

    I am looking for some suggestions.We have a very large table.. let' call it custorder table.

    Most of the reporting queries running against it, are for the current year(2011)

    We want to partition this table against 5 separate physical hard disks using the partition key choosen. One suggestion is to use the 'order date' breaking the data up by calender year and the other is the 'customer id'.

    The reasoning behind partitioning the table using the 'order date' , so when querying for 2011 data just the latest partition would be searched all other partition would not be (partition elimnation).

    The other idea is to partition by 'customer id' (keeping in mind as I said before most reporting queries are for the current year) all the separate partition(on separate hard disks) would be invoked to search resulting faster performance.

    Also which partition key would provide better performance on new record inserts( primary key being customer-id and order-date)

    I am looking for advice or suggestions.

    thanks in advance,


Comments are closed.

Skip to main content