A customer said that their program's I/O pattern is to open a file and then every so often write about 100KB of data into the file. They are currently using the
FILE_ flags to open a file, and they wanted to know what else they could do to make their writes go even faster.
Um, for one thing, you stop passing those two flags!
Those two flags in combination basically mean "Give me the slowest possible I/O performance!" because they force all I/O to go through to the physical media right away.
FILE_ flag will be a big help. This allows the hardware disk cache to do its normal job of completing the I/O immediately and performing the physical I/O lazily (perhaps in an optimized order based on subsequent writes). A 100KB write is a small enough write that your I/O time on rotational media will be dominated by the seek time. It'll take five to ten milliseconds to move the head into position and only one millisecond to write out the data. You're wasting 80% or more of your time just preparing for the write.
Much better would be to issue the I/O without the
FILE_ flag so that the entire 100KB I/O request goes into the hard drive on-board cache. (It will fit quite easily, since the on-board cache for today's hard drives will be 8 megabytes or larger.) Your
WriteFile will complete immediately, and the commit to physical storage will occur while your program is busy doing computation.
If the writes truly are sporadic (as the customer claims), the I/O buffer will be flushed out by the time the next round of application I/O begins.
FILE_ flag will also help, because that allows the operating system disk cache to get involved. If the application reads back from the file, the read can be satisfied from the disk cache, avoiding the physical I/O entirely.
As a side note, the
FILE_ flag is largely ineffective nowadays, because SATA drivers ignore the flush request. The file system doesn't know that the driver is lying to it, so it will still do all the work on the assumption that the write-through request worked, even though we know that the extra work is ultimately pointless.
For example, NTFS will issue metadata writes with a flush to ensure that the data on the physical media is consistent. But if the driver is ignoring flush requests, all this extra work accomplishes nothing aside from wasting I/O bandwidth. Even worse, NTFS thinks that the data on the drive is physically consistent, but it isn't. The result is that a poorly-timed power outage (or device removal) can result in metadata corruption that takes a
chkdsk to repair.
Now, it may be that the customer's program is using the
FILE_ flags for a specific purpose unrelated to performance, so you can't just go walking in and ripping them out without understanding why they were there. But if they added the flags thinking that it would make the program run faster, then they were operating under a false assumption.