Why does creating a shortcut to a file change its last-modified time… sometimes?


A customer observed that the last-modified timestamp on a file would sometimes change even though nobody modified the file, or at least nobody consciously took any steps to modify it. In particular, they found that simply double-clicking the file in Explorer was enough to trigger the file modification.

It took a while to puzzle out, but here's what's going on:

When you double-click a file in Explorer, Explorer adds it to the Recent Items list. Internally, this is done by creating a shortcut to the item. The nice thing about a shortcut is that it knows how to track its target. That way, if you move an item, then try to open it from the Recent Items list, the shortcut tracking code will try to find where you moved it to. You moved the file. The shortcut still works. Magic.

Shortcut target tracking magic is accomplished with the assistance of object identifiers, and object identifiers, as we saw earlier, are created on demand the moment somebody first asks for one.

And that's where the file modification is coming from. If the file is freshly created, it won't have an object identifier. When you create a shortcut to it (which happens implicitly when it is added to the Recent Items list), that triggers the creation of an object identifier, which in turn updates the last-modified time on the file.
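
If you want to watch this happen, here is a rough sketch that asks the file system for a file's object identifier and compares the last-write time before and after. It is not the code Explorer runs. The helper names are made up for illustration, and the sketch assumes that read access is enough for both control codes and that the volume supports object identifiers at all. No expected output is shown, because the result depends on whether the file already had an identifier.

```cpp
#ifdef _WIN32  // Windows-only sketch: object IDs are an NTFS feature
#include <windows.h>
#include <winioctl.h>  // FSCTL_GET_OBJECT_ID, FSCTL_CREATE_OR_GET_OBJECT_ID
#include <cstdio>

// Hypothetical helper: issue one of the object-ID control codes on a file
// and report whether the call succeeded.
static bool QueryObjectId(HANDLE file, DWORD controlCode)
{
    FILE_OBJECTID_BUFFER id = {};
    DWORD bytesReturned = 0;
    return DeviceIoControl(file, controlCode, nullptr, 0,
                           &id, sizeof(id), &bytesReturned, nullptr) != FALSE;
}

// Hypothetical helper: report whether asking for an object ID moved the
// last-write time. Returns false if the file cannot be opened or queried.
bool ReportObjectIdEffect(PCWSTR path)
{
    // Assumption: read access is sufficient for both control codes.
    HANDLE file = CreateFileW(path, GENERIC_READ,
                              FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                              nullptr, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return false;

    FILETIME before = {}, after = {};
    bool ok = GetFileTime(file, nullptr, nullptr, &before) != FALSE;

    // FSCTL_GET_OBJECT_ID only reads: it fails if no ID exists yet.
    bool hadId = ok && QueryObjectId(file, FSCTL_GET_OBJECT_ID);

    // FSCTL_CREATE_OR_GET_OBJECT_ID creates the ID on demand.
    ok = ok && QueryObjectId(file, FSCTL_CREATE_OR_GET_OBJECT_ID);
    ok = ok && GetFileTime(file, nullptr, nullptr, &after) != FALSE;
    CloseHandle(file);

    if (ok) {
        std::printf("had object ID: %s, last-write time changed: %s\n",
                    hadId ? "yes" : "no",
                    CompareFileTime(&before, &after) != 0 ? "yes" : "no");
    }
    return ok;
}
#endif
```

Run it against a freshly created file and then a second time against the same file. If the explanation above holds, only the first run should report a change.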

Frustratingly, the LinkResolveIgnoreLinkInfo and NoResolveTrack policies do not prevent the creation of object identifiers. Those policies control whether the tracking information is used during the resolve process, but they don't control whether the tracking information is obtained during shortcut creation. (Who knows, maybe you're creating the shortcut to be used on a machine where those policies are not in effect.) To suppress collecting the volume information and object identifier at shortcut creation time, you need to pass the SLDF_FORCE_NO_LINKINFO and SLDF_FORCE_NO_LINKTRACK flags to the IShellLinkDataList::SetFlags method when you create the shortcut.
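
Here is a minimal sketch of what that might look like. The function name CreateUntrackedShortcut is made up for illustration, error handling is kept to the bare minimum, and the caller is assumed to have initialized COM already. Setting the flags before calling SetPath is a precaution in this sketch, not a documented requirement. The #pragma comment lines are for the Microsoft compiler; other toolchains need the libraries passed to the linker.

```cpp
#ifdef _WIN32  // Windows-only sketch: the shell link API exists nowhere else
#include <windows.h>
#include <shobjidl.h>  // IShellLinkW
#include <shlobj.h>    // IShellLinkDataList, SLDF_* flags
#pragma comment(lib, "ole32.lib")
#pragma comment(lib, "uuid.lib")

// Hypothetical helper: create a shortcut at linkPath that points to
// targetPath, asking the shell not to collect volume information or an
// object identifier. Assumes COM is initialized on the calling thread.
HRESULT CreateUntrackedShortcut(PCWSTR targetPath, PCWSTR linkPath)
{
    IShellLinkW* link = nullptr;
    HRESULT hr = CoCreateInstance(CLSID_ShellLink, nullptr, CLSCTX_INPROC_SERVER,
                                  IID_PPV_ARGS(&link));
    if (FAILED(hr)) return hr;

    // Add the two "force no" flags to whatever flags the link already has.
    IShellLinkDataList* dataList = nullptr;
    hr = link->QueryInterface(IID_PPV_ARGS(&dataList));
    if (SUCCEEDED(hr)) {
        DWORD flags = 0;
        hr = dataList->GetFlags(&flags);
        if (SUCCEEDED(hr)) {
            hr = dataList->SetFlags(flags | SLDF_FORCE_NO_LINKINFO |
                                    SLDF_FORCE_NO_LINKTRACK);
        }
        dataList->Release();
    }

    // Assumption: setting the target after the flags keeps the tracking
    // information from being gathered in the first place.
    if (SUCCEEDED(hr)) hr = link->SetPath(targetPath);

    // Write the .lnk file to disk.
    if (SUCCEEDED(hr)) {
        IPersistFile* file = nullptr;
        hr = link->QueryInterface(IID_PPV_ARGS(&file));
        if (SUCCEEDED(hr)) {
            hr = file->Save(linkPath, TRUE);
            file->Release();
        }
    }

    link->Release();
    return hr;
}
#endif
```

Note that this only helps for shortcuts your own code creates. It does nothing about the shortcuts Explorer creates on your behalf when it adds an item to the Recent Items list.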

Comments (15)
  1. Alex Grigoriev says:

    What permission mask is necessary to create an obj id? Would it be WRITE_EA? What if an unprivileged user creates the shortcut? Will it be tracked?

  2. @Joshua, this should not be a big deal, because it does not make random changes, but sets the last modified time to the current time. Unless you have files from the "future", it will never set the time to an earlier one, so it only causes extra notifications with "empty" changes for backup-like software.

  3. 640k says:

    A backup program should compute a hash on its own to determine if a file has changed. Not some random file system metadata. No need to back up the content of a file if the content hasn't been modified since the last backup.

  4. Joseph Koss says:

    @640K:

    Comparing hashes can only prove that a file was modified.. they can never prove that it wasn't.

    These spurious changes to the last modified time-stamp are false positives.

    Hashing is open to false negatives.

    You cannot combine these two things to remove the possibility of error.

  5. Cheong says:

    Will there, at some point in the future, be a switch in the "fsutil behavior" command that allows us to set "create object id for every file on file creation" for NTFS volumes, so that in this case we'll get more consistent results?

  6. Joshua says:

    *shudder*

    I wonder if this is what's causing intermittent failures in the document cache my product has.

    This is the second time Raymond blogged about something that chewed up the modification time. I wonder if it should be treated as unreliable.

  7. Alan Malloy says:

    @sukru-t Suppose I compile app.c to produce app.exe. I modify app.c, and then remember I want a shortcut: I create a shortcut to app.exe somewhere. Now I run make, and…app.exe is newer than app.c, so no compilation occurs?

  8. Mich says:

    lnk-files are broken by design. Every additional file system feature added to compensate for it will fail.

  9. Simon says:

    @Joseph Koss:

    Comparing hashes can only prove that a file was modified.. they can never prove that it wasn't.

    Sure.  But what you can do is reduce the chances to a level where you can have hashed as many files as there are stars in the known universe and still have a collision probability comfortably under a trillionth of a trillionth of a percent. (That's using a 256-bit hash).  

    At some point you have to stop worrying about collisions, and when the chance of a collision before the heat-death of the universe gets comfortably under that of the developer being struck by lightning while simultaneously winning the lottery before s/he can commit the hashing code, you're probably past it.

    (This is assuming that no-one's broken your hash algorithm and is deliberately trying to generate a collision, of course.  But while that's a worry for a cryptographer, for a backup program, I don't see a reason to lose sleep over it).

  10. Joseph Koss says:

    @Simon:

    At some point you have to stop worrying about collisions

    You can back up the file that the metadata claims was changed on the grounds that it really might have been, as well as the file that the hash knows was changed on the grounds that it definitely was changed. If you do both these things, you most definitely do not /have/ to accept the risk of collisions.

  11. Joshua says:

    @sukru-t: It's a cache, not the master. Bumping the timestamp to higher than the master by some external operation breaks the cache manager.

  12. Adam Rosenfield says:

    @640k: Do you really want a backup program to read *every* file on disk in order to compute a hash to test if the file has changed?  That requires an order of magnitude more disk I/O than just checking the modification times on every file, which only requires reading all of the directories on disk.

    You could only hash the file if its modification time has changed, but that's just making the program more complicated without much benefit.  You'd have to read in the file twice if it changed (once to hash it, once to copy it to the backup volume), though that would almost certainly be cached by the OS in memory, assuming it's not ginormous.

  13. Evan says:

    @Alan Malloy: "Suppose I compile app.c to produce app.exe. I modify app.c, and then remember I want a shortcut: I create a shortcut to app.exe somewhere. Now I run make, and…app.exe is newer than app.c, so no compilation occurs?"

    My somewhat-trolling answer is "stop using make and get a better build system." :-)

    @Adam: "You could only hash the file if its modification time has changed, but that's just making the program more complicated without much benefit."

    On the contrary, I feel that this is *very* reasonable behavior for a backup system. "False positives" aren't the only reason that an mtime may change but the file wouldn't. So either you already have the backup system backing up stuff it doesn't need to, in which case a few more because of this shortcut issue is likely to be the least of your concerns, or you have to have something that looks at the file contents anyway.

    (Not to say that it couldn't cause other problems, e.g. the aforementioned 'make' answer.)

  14. Nick Ross says:

    @640k:

    There's already a meta data flag in the directory for backup programs. It's called the archive flag.

  15. 640k says:

    The archive attribute is broken by design. Even in single tasking DOS.
