How do I access a file without updating its last-access time?


The first problem with discussing file last-access time is agreeing what you mean by a file's last-access time.

The file system folks have one definition of the file last-access time, namely the time the file was most recently opened and either read from or written to. This is the value retrieved by functions like Get­File­Attributes­Ex, Get­File­Time, and Find­First­File.

The problem with this definition is that it doesn't match the intuitive definition of last-access time, which is "the last time I accessed the file," emphasis on the I. In fact, the intuitive definition of access is more specific: It's "the last time I opened, modified, printed, or otherwise performed some sort of purposeful action on the file."

This discrepancy between the file system definition and the intuitive definition means that a lot of operations trigger a file system access but shouldn't count as an access from the user interface point of view. Here are some examples:

Whenever some shell extension violates this rule, the shell team gets a bug report from some customer saying, "The last-access time shown in Explorer is wrong. A document which hasn't been accessed in months shows a last-access time of today. After closer investigation, we found that the last-access time updates whenever we [insert seemingly-innocuous operation here]."

If you're writing a program which needs to access the file contents but you do not want to update the last-access time, you can use the Set­File­Time function with the special value 0xFFFFFFFF in both members of the FILETIME structure passed as the last-access time. This magic value means "do not change the last-access time even though I'm accessing the file."

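// Ask the file system not to update the last-access time for I/O done
// through hFile. The handle must have FILE_WRITE_ATTRIBUTES access.
BOOL DoNotUpdateLastAccessTime(HANDLE hFile)
{
 // Both halves 0xFFFFFFFF: the special "leave unchanged" value.
 static const FILETIME ftLeaveUnchanged = { 0xFFFFFFFF, 0xFFFFFFFF };
 // NULL for the creation and last-write times leaves them alone.
 return SetFileTime(hFile, NULL, &ftLeaveUnchanged, NULL);
}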

As the documentation notes, you have to call this function immediately after opening the file.
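
One way to see why a pair of 0xFFFFFFFF values makes a convenient sentinel: joined into a single 64-bit quantity, the two halves are all ones, which reads as -1 when the time is treated as a signed 64-bit integer (a commenter below notes that this is how the native layer sees it). The following is a minimal sketch in portable C that calls no Windows API. FileTimeParts is a hypothetical stand-in that assumes the same layout as FILETIME (two 32-bit halves, low part first), and both helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for FILETIME: assumed to be two 32-bit halves,
   low part first, matching the initializer { 0xFFFFFFFF, 0xFFFFFFFF }. */
typedef struct {
    uint32_t low;
    uint32_t high;
} FileTimeParts;

/* Join the two halves into one signed 64-bit value.
   The final cast assumes a two's-complement platform, which covers
   every mainstream compiler. */
int64_t filetime_to_int64(const FileTimeParts *ft)
{
    assert(ft != NULL);
    uint64_t bits = ((uint64_t)ft->high << 32) | ft->low;
    return (int64_t)bits;
}

/* Nonzero if the value is the "leave unchanged" sentinel, i.e. -1. */
int is_leave_unchanged(const FileTimeParts *ft)
{
    return filetime_to_int64(ft) == -1;
}
```

With these helpers, is_leave_unchanged is true for a value built from { 0xFFFFFFFF, 0xFFFFFFFF } and false for an ordinary timestamp, so the sentinel cannot be confused with a real time.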

Going back to that linked comment: The reason why viewing the Summary tab causes the last-access time to be updated is that the Summary tab retrieves its information by calling Stg­Open­Storage, and there's no way to tell that function, "Hey, when you open the file in order to see if it has any document properties, do that Do­Not­Update­Last­Access­Time thing so you don't update the last access time."

Bonus chatter: Starting in Windows Vista, maintaining the last-access time is disabled by default. In practice, this means that the number of bugs related to altering the last-access time accidentally will multiply unchecked, because the mechanism for detecting the error is disabled by default.

Comments (29)
  1. John says:

    At a higher level, why does "last access time" even exist?  What is its usefulness?

  2. Oliver says:

    @John – Have you never used your browser history?

  3. NB says:

    @Oliver: How does last access time compare with browser history? Is there some Windows component that can generate a file use history based on it?

    Personally I like the change made in Vista, never cared for last access time, last modified time is good enough.

  4. Steve says:

    @John: Surely you can imagine a use? One thing that springs to mind is to archive files that aren't used often.

  5. Joshua says:

    Last access time is really only useful for compressing or deleting old files.

    I've seen a program or two that depended on a UNIX nuance where writing to the file didn't update access time, but such behavior is rather fragile.

  6. Porter says:

    It is common on linux systems using SSD to disable updating last accessed time to reduce number of writes.

  7. Skyborne says:

    I think the original case for last-access-time was for mbox-formatted mail files.  If the mail file was modified after being accessed, then it has unread mail.

    Linux also added relatime a while ago, because we like complicated things.

  8. voo says:

    @Skyborne But for that use case the last-modified timestamp is MUCH more useful – something which is true in general I think (at least I always remove the accessed column in Windows and that for ages).

    The part about "remove old files that weren't used for a long time" seems like the most realistic use case, as otherwise this would lead to some problems. Was probably a much larger deal in the good ole days. Though I think the reason it was included was because other FS already had it(?) and those probably had it because in the bigger picture it seems like a rather innocuous feature (well I remember Raymond's article about why it isn't) that's rather easy to implement if you already include last modified stamps.

    @Porter Yep one should really save those writecycles, I always say "People you all know that 1PT of writes on a 160gb large SSD won't last you for more than 30-40 years – think about the future"!

  9. Some Guy says:

    Don't forget about the "I know I accessed the file today but the last access time is six months ago" bug reports. After all, the only people who complained about the original problem are the people who check that – and you can't please everyone.

  10. Jeff says:

    voo: That would be true if SSDs had built-in wear leveling that was any good. They don't, but since everyone assumes they do, in the real world an SSD lasts a few months on average.

  11. dave says:

    Suppose you wanted to roll out to cheaper storage files that weren't being accessed… it would be useful to know when the files were last accessed.

    In retrospect, though, 'last-read time' might have been a better(*) choice.  All those metadata-y things should be covered by the change-time on the file, which the file system maintains (for Posix, no doubt) but the Windows API neglects to expose.

    (*) this still has the problem that reading data streams that are used for properties, a la Explorer, will update the last-read time.  And I'm unfond of breaking symmetry by making it the last-unnamed-stream-read-time.

  12. voo says:

    @Jeff Yeah it's not as if modern controllers had a WA <1.5 for sequential writes (and you can triple that all you want for random stuff). You don't happen to have anything to back those claims up, like say some screenshots of a modern SSD (no indilinx controller or something please) that wore out its write cycles under normal usage?  Would be more than interesting – otherwise I'll just assume it's FUD you made up on the spot, because math and tests say otherwise. See this (http://www.xtremesystems.org/…/showthread.php) – sequential writes with at least 1/3 static data. Hard to say how good modern drives will fare, since so far only one small SSD with crap controller has died and that after 550TB of writes.

    Now we know that the old Intel g2 controller has a WA <3 for average desktop use cases (easy to do the math for that and most people get about 2 still), let's just assume that they didn't improve that at all and let's double it again for the presumably not perfect static wear leveling and well – put another factor of two in there just for good measure. So the 25nm 40gb Intel drive would probably only do a few hundred TBs of writes.

  13. Dave says:

    I'm wondering how the magic SetFileTime() call is handled for this.  Is there some kernel flag set that says "if the next call after CreateFile() is SetFileTime(), disable updates"?  Is it a timed thing, CreateFile() queues a timestamp update that can get removed again if a magic SetFileTime() arrives within a certain time?  (I'm guessing it's the latter, since last-update granularity is pretty loose, it's only guaranteed to be accurate to within an hour).  It just seems more like a task for dwFlagsAndAttributes rather than a somewhat iffy race-condition call to SetFileTime().

    [Merely creating the handle does not trigger an update of write or access. You have to actually write/read. Therefore, when the first read/write operation actually takes place, the file system says "Oh wait, I won't update the write/access time." (Note: I'm guessing.) -Raymond]
  14. Nick Lowe says:

    The unsigned FILETIME maps at the NT Native API to the signed LARGE_INTEGER. 0xFFFFFFFF is therefore -1.

    The documentation for FILE_BASIC_INFORMATION states:

    "The file system updates the values of the LastAccessTime, LastWriteTime, and ChangeTime members as appropriate after an I/O operation is performed on a file. However, a driver or application can request that the file system not update one or more of these members for I/O operations that are performed on the caller's file handle by setting the appropriate members to -1. The caller can set one, all, or any other combination of these three members to -1. Only the members that are set to -1 will be unaffected by I/O operations on the file handle; the other members will be updated as appropriate."

    msdn.microsoft.com/…/ff545762(v=vs.85).aspx

    Trivia: This was implemented this way starting with Windows 2000 as a shortcut since many applications previously had to query the times, do their work, and then set the original times back.

  15. Nick Lowe says:

    It is also worth pointing out that the handle must have FILE_WRITE_ATTRIBUTES granted to it to prevent the system from making updates to LastAccessTime, LastWriteTime or ChangeTime.

  16. Ben Voigt says:

    @Jeff: Yes, I find it hard to understand why the lifetime calculation is always expressed as (capacity * rewrites) / (write load).  It should be `(freespace * rewrites) / (write load)`.  Which in most deployments is 1/20th the time, and is starting to be rather concerning.

  17. Worf says:

    It is wrong to assume that an SSD only wear-levels free space. Because once you write every sector, there is no more free space. SSDs are block devices – you tell it to write sector N with this chunk of data, and to read sector M. Now, fancy OSes add a feature called TRIM that tells the SSD "this sector's contents are invalidated".

    The reason for this is a flash device can only be erased in big chunks – say 256kiB. If I need to use the block as it's time in the cycle for wear levelling, I need to move the good data, and ignore the dirty data (when you rewrite a sector, the old sector is marked invalid and the new data put in a new spot). TRIM just tells an SSD that the sector is invalidated so when the controller is writing, it's not having to move data the OS doesn't care about.

    Sure, with TRIM it's free, but there are OSes that don't support TRIM. And the OS is free to just rewrite a sector without TRIMming it first.

  18. SmittyBoy says:

    "you have to call this function immediately after opening the file."

    Why isn't that a flag on the OpenFile/CreateFile etc.?  Especially considering that it's the OS that is providing my App with a timeslice.

    [Yeah, that's what Create­File needs. Three more flags! -Raymond]
  19. Skyborne says:

    @voo: if you have N xterms open, each with a shell keeping their own "I notified the user of new mail when last-modified was X", then you can get up to N "new mail" notifications for 1 new message.  With last-access time, you can get 1 notification, read the message, and the other N-1 shells can tell you've read it, even though it wasn't through them.

    Likewise, it lets ye olde xbiff work: it can put its "new messages" flag back down once you've read them from your MUA.  Otherwise, it would never know since it doesn't read mail itself.

  20. Skyborne says:

    In fact, now that I think about it, the shell would *never* know if you read the mail, if it didn't have atime.  It's done in the MUA (out-of-process again), and you don't *have* to write the file to read the mail.

  21. Jakob says:

    Perhaps someone can settle a bet for me on this. My understanding is that "file" in the phrase "the time the file was most recently opened and either read from or written to" specifically refers to a data stream of a file (on an NTFS volume). A colleague of mine insists that "file" refers to the entire collection of attributes associated with those data streams, specifically including the attributes contained in the MFT. His thought is that reading or writing to any of the attributes (say, writing to the file name by calling MoveFileExW()) will update the access time. I maintain that only calls that read or write to the data streams (e.g. ReadFile() and WriteFile()) will do so. Empirical testing would appear to be on my side. However, he insists that my testing methods are flawed because I am not accounting for the fact that the NTFS driver caches access time updates for up to an hour before writing them to disk.

  22. Nick Lowe says:

    You would need an individual flag for LastAccessTime, LastWriteTime and ChangeTime to specify that they should remain unchanged when I/O operations occur that would otherwise trigger an update.

    The property would also be immutable on the handle without another mechanism being implemented to change the behaviour later.

    (CreateFile would require two extra flags as ChangeTime is not exposed to Win32.)

    Specifying any of those flags would also have to require that FILE_WRITE_ATTRIBUTES be present in the desired access rights and it also be granted.

    The implemented behaviour is pretty useful, comprehensive and simple:

    When set to -1 (0xFFFFFFFF, 0xFFFFFFFF), I/O operations that would usually trigger an update will not update the respective file time on the file system.

    When set to 0, the file time is left unchanged and subsequent I/O operations are able to update as necessary.

    When set to another value, the file time is updated to the value provided and subsequent I/O operations are able to update as necessary.

  23. Worf says:

    @Jakob: Here's a puzzle piece. blogs.msdn.com/…/10195932.aspx

  24. Nick Lowe says:

    @Medinoc – It is false as it should state that it needs to be called before I/O operations are carried out on the handle that would otherwise trigger an update.

    It definitely only needs to be called once though.

  25. Medinoc says:

    Well, what I know for sure is that when I tried "before", it didn't work, and when I tried "after", it did.

  26. 640k says:

    There should be time stamps for when metadata was last changed/accessed ;)

  27. David Walker says:

    Last access time is exactly the order in which you want your files arranged (sorted) on a disk that has different performance characteristics on different parts of the disk (such as most traditional hard drives where access times are better on the outer cylinders).  Some devices don't have those performance characteristics, but without access to the last access time, you can't implement this.

    For this purpose, access time should include all user access but exclude access due to a full-disk backup or a full-disk virus scan.  :-)

  28. 640k says:

    image viewer in windows doesn't update last modified time when modifying the image data. why?
