Why the obsession with random I/O within the context of SSAS?

by Denny Lee

As many of you know from the various blogs, whitepapers, and conferences from SQLCAT, there is a big obsession or compulsion toward I/O by various members of SQLCAT. If you’re not already familiar with this topic, definitely reference Mike Ruthruff’s whitepaper Predeployment I/O Best Practices.

But as could be seen from the SSAS Maestros session in Redmond this week (for more information, check out What is the SSAS Maestros?), there is even an obsession for IOps even within the context of Analysis Services as noted in the papers below

 

So why is the obsession?

As noted in the SSAS Maestros course, if you think I’m bad, ping my counterpart Thomas Kejser who is even more obsessed and all-knowing about IOps. The reason for this focus is because for many queries within Analysis Services (especially for enterprise scale cubes), there are a lot of threads that hit the storage engine. This means that the threads will be hitting disk – i.e. the disk I/O subsystem. And even if the each individual thread causes only sequential I/O, the cumulative effect of all of those threads is random I/O. And as noted in the below graphic - which is from Michael Anderson’s excellent Scalable Shared Database Part 5 post – both the IOps and MBps is substantially slower when the system is random instead of sequential.

 

So what can we do about it?

As noted by the our posts Analysis Services Distinct Count Optimization Using Solid State Devices as well as the paper: REAL Practices: Performance Scaling Microsoft SQL Server 2008 Analysis Services at Microsoft adCenter, the solution is to utilize NAND devices which have superior random IOps performance.

Note, you can still get great random IOps performance using regular spinning media provided you have enough spindles for your LUNs and you stripe and/or short-stroke the disks.

Or…you can make sure you buddy up with your friendly systems engineer whom can take care of this for you!

Enjoy!