How do I find the original name of a hard link?


A customer asked, “Given a hardlink name, is it possible to get the original file name used to create it in the first place?”

Recall that hard links create an alternate name for a file. Once that alternate name is created, there is no way to tell which is the original name and which is the new name. The new name does not have a “link back to the original”; both names are links to the same underlying file content. This is an old topic, so I won’t go into further detail, though the question does illustrate that many people continue to misunderstand what hard links are.

Anyway, once you figure out what the customer is actually asking, you can give a meaningful answer: “Given the path to a file, how can I get all the names by which the file can be accessed?” The answer is FindFirstFileNameW.

Note that the names returned by the FindFirstFileNameW family of functions are relative to the volume mount point. To convert a returned name to a full path, you need to append it to the mount point. Something like this:

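// Build with UNICODE defined: the generic-text calls below are given wide strings.
#include <windows.h>
#include <shlwapi.h>  // PathAppend; link with shlwapi.lib
#include <stdio.h>    // _putws
#include <strsafe.h>  // StringCchCopy

typedef void (*ENUMERATEDNAMEPROC)(__in PCWSTR);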

void ProcessOneName(
    __in PCWSTR pszVolumeRoot,
    __in PCWSTR pszLink,
    __in ENUMERATEDNAMEPROC pfnCallback)
{
 wchar_t szFile[MAX_PATH];
 if (SUCCEEDED(StringCchCopy(szFile, ARRAYSIZE(szFile), pszVolumeRoot)) &&
     PathAppend(szFile, pszLink)) {
  pfnCallback(szFile);
 }
}

void EnumerateAllNames(
    __in PCWSTR pszFileName,
    __in ENUMERATEDNAMEPROC pfnCallback)
{
 // Supporting paths longer than MAX_PATH left as an exercise
 wchar_t szVolumeRoot[MAX_PATH];
 if (GetVolumePathName(pszFileName, szVolumeRoot, ARRAYSIZE(szVolumeRoot))) {
  wchar_t szLink[MAX_PATH];
  DWORD cchLink = ARRAYSIZE(szLink);
  HANDLE hFind = FindFirstFileNameW(pszFileName, 0, &cchLink, szLink);
  if (hFind != INVALID_HANDLE_VALUE) {
   ProcessOneName(szVolumeRoot, szLink, pfnCallback);
   while (cchLink = ARRAYSIZE(szLink),
          FindNextFileNameW(hFind, &cchLink, szLink)) {
    ProcessOneName(szVolumeRoot, szLink, pfnCallback);
   }
   FindClose(hFind);
  }
 }
}

// for demonstration purposes, we just print the name
void PrintEachFoundName(__in PCWSTR pszFile)
{
 _putws(pszFile);
}

int __cdecl wmain(int argc, wchar_t **argv)
{
 for (int i = 1; i < argc; i++) {
  EnumerateAllNames(argv[i], PrintEachFoundName);
 }
 return 0;
}
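
The "append it to the mount point" step can also be sketched without shlwapi and without a fixed MAX_PATH buffer, which is one way to start on the long-path exercise. This is only a minimal sketch: `JoinVolumeRootAndLink` is a hypothetical helper, not part of the sample above and not a Windows API, and it assumes the root and the link name use backslashes as separators. It does nothing beyond making sure exactly one backslash sits between the two pieces.

```c
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of the original sample): joins a volume
 * root such as L"C:\\" with a volume-relative name such as
 * L"\\Windows\\notepad.exe", leaving exactly one backslash between them.
 * Returns 1 on success, 0 if cchOut characters are not enough.
 */
int JoinVolumeRootAndLink(const wchar_t *pszVolumeRoot,
                          const wchar_t *pszLink,
                          wchar_t *pszOut, size_t cchOut)
{
 size_t cchRoot = wcslen(pszVolumeRoot);
 size_t cchLink;

 /* Drop trailing backslashes from the root... */
 while (cchRoot > 0 && pszVolumeRoot[cchRoot - 1] == L'\\') {
  cchRoot--;
 }
 /* ...and leading backslashes from the link name. */
 while (*pszLink == L'\\') {
  pszLink++;
 }
 cchLink = wcslen(pszLink);

 /* Room for root + one separator + link + terminating null. */
 if (cchRoot + 1 + cchLink + 1 > cchOut) {
  return 0;
 }
 wmemcpy(pszOut, pszVolumeRoot, cchRoot);
 pszOut[cchRoot] = L'\\';
 wmemcpy(pszOut + cchRoot + 1, pszLink, cchLink + 1); /* copies the null too */
 return 1;
}
```

Because the caller supplies the buffer and its size, the output can be a heap allocation of whatever length is needed rather than a MAX_PATH array. The helper does not try to reproduce everything PathAppend does (it performs no canonicalization, for example), so treat it as an illustration of the joining step only.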

Update: Minor errors corrected, as noted by acq and Adrian.

Comments (35)
  1. acq says:

    Raymond, you don't pass PrintEachFoundName to EnumerateAllNames but it's obvious that was an idea, so maybe it's not worth spending time correcting it.

  2. Adam Rosenfield says:

    Thanks for the tip, I didn't know it was possible to enumerate all of the hard links to a file (even if it's Vista+ only).  I also find it interesting that there's no ANSI version of this function.

    But aside from backup programs, what are the possible use cases for wanting to enumerate the hard links to a given file?

  3. henke37 says:

    Also, What if the original name of the file is deleted? Hard links are fun that way, since they continue to work after that event.

  4. Bob says:

    SysInternals just released a new utility called FindLinks that finds all of the hard links for a given file.

    technet.microsoft.com/…/hh290814

  5. David Walker says:

    I realize that the original question included a misperception, but if the create time is stored with the file name (and not the contents), then you could find all names pointing to a file, and then get the oldest one.  That would MAYBE be the "original" name used to create the file (unless the original name used to create the file was later deleted).  Further proof that the question is incorrect.

    Of course, if that metadata is stored with the file, and there's only one create time, then even this hack won't work.  

  6. Ivo says:

    @David – yes, I think all metadata is stored with the file. I was having a related problem before – if you create a hard link to a read-only file, you can't delete the hard link (because all links to the file are "read-only"). You can of course remove the read-only attribute but then you mess up the file. That's what Explorer will do. If you create a hard link to a read-only file and then delete it from Explorer, the file becomes writable.

    So the solution is to look for another link to the file (as Raymond described) and use it to restore the attribute after you delete your hard link.

  7. Joshua says:

    @Ivo: You don't have to use FindFirstFileNameW for that: stackoverflow.com/…/delete-link-to-file-without-clearing-readonly-bit

  8. Joe Dietz says:

    Some ten years ago I found myself implementing a file system for Windows 2000.  Having come from a UNIX background I didn't find NTFS features like hardlinks unusual at all, but I did find it a bit unusual that there were no win32 APIs that used them….  Something that most users of Windows don't know or have just forgotten is that one of the original design goals of the NT kernel was to be able to support any number of existing operating system interfaces through runtime 'subsystems', such as UNIX (posix, Interix/SFU etc.), OS2 and win32.  Win32 is the only one that ever got used and I think the only one shipped post w2k3.  Anyways, a side effect of this design goal is that the file system model has the ability to implement the semantics of file systems from all of these environments, which also helps make the NT kernel file system interfaces 2-3X more complex than the UNIX VFS….  So UNIX things like hardlinks and symlinks (well, the latter isn't a native capability of NTFS, but it is approximated via reparse points) have existed forever, but win32 only recently bothered with them.

  9. J says:

    "@David – yes, I think all metadata is stored with the file. I was having a related problem before – if you create a hard link to a read-only file, you can't delete the hard link (because all links to the file are "read-only"). You can of course remove the read-only attribute but then you mess up the file. That's what Explorer will do. If you create a hard link to a read-only file and then delete it from Explorer, the file becomes writable.

    So the solution is to look for another link to the file (as Raymond described) and use it to restore the attribute after you delete your hard link."

    If that behavior is true, it's unfortunate.  The ability to add or remove links should be tied to the writeability of the containing directory, not the file itself.

  10. Adrian says:

    Isn't it necessary to call FindClose(hFind) after the enumeration?  What kinds of objects get leaked when you forget to do that?

  11. Timothy Byrd says:

    Hey Raymond – thanks for posting that. (I don't have an immediate use, but it's in the "good to know" category.)

    I re-read the old post and it reminded me why my personal rules for using hard links are "must take enough disk space that I care" and "must be files that I pretty much am never going to change the contents of".

  12. Billy O'Neal says:

    @J: No, it should not. If you want the item to have a different set of attributes to the file then you don't use a hard link, you should use a symbolic link. A hard link is just another file name for the same file. A symbolic link is actually the link.

  13. David Walker says:

    @Joe:  Implementing your own file system?  That is brave.

  14. Petr Minar says:

    So which file name is returned from ReadDirectoryChangesW when file with multiple names is changed?

    [First half of the answer coming August 12; second half on December 26. -Raymond]
  15. yuhong2 says:

    "I also find it interesting that there's no ANSI version of this function."

    blogs.msdn.com/…/476213.aspx

    blogs.msdn.com/…/867880.aspx

  16. Joshua says:

    [First half of the answer coming August 12; second half on December 26. -Raymond]

    Now that's what I call a long queue.

  17. Adam Rosenfield says:

    @Joshua: No, that's actually pretty soon.  Word on the street is that Raymond's queue is at least 2-3 years long.  A 5 month wait is paltry by comparison.

  18. Gabe says:

    Joshua: He doesn't have a queue — he has a time machine!

  19. Neil says:

    In edge cases it might be possible to find the original hard name of a link by comparing its ACL to the folders of all the hard links.

    @Ivo: It gets worse when the file is open, as you then can't delete any of its links, although I guess MOVEFILE_DELAY_UNTIL_REBOOT still works.

  20. Random832 says:

    "The ability to add or remove links should be tied to the writeability of the containing directory, not the file itself."

    Alternately, if the ability to remove links is to be tied to the file (which is not an entirely unreasonable stance – this is the problem the sticky bit solves on unix, though in that case it's tied to file ownership, which creating links already is), the ability to add links should also be.

  21. J says:

    "@J: No, it should not. If you want the item to have a different set of attributes to the file then you don't use a hard link, you should use a symbolic link. A hard link is just another file name for the same file. A symbolic link is actually the link."

    But the name is really a property of the directory, not the file itself.  So creating a name in a directory is something that should only involve permissions in the target directory.  The current "solution" is obviously wrong: being able to create a link, but not being able to remove it is an unnecessary asymmetry.

    That's not a hole either.  If you can't write to the file in the source directory, you won't be able to write to it in the target directory either (and the same with reading from the file).

    "Alternately, if the ability to remove links is to be tied to the file (which is not an entirely unreasonable stance – this is the problem the sticky bit solves on unix, though in that case it's tied to file ownership, which creating links already is), the ability to add links should also be."

    I agree with this as an alternate solution.  Windows could have had a permission called "create link".  If you have that permission on a file, you can create (and remove!) hardlinks to that file.  No asymmetry.  And furthermore, you aren't conflating two unrelated operations in the same permission bit.  Being able to write data to a file is very different from creating or removing a link to/from it.

  22. David Walker says:

    @J: The name is a property of the directory?  I don't think of it that way.  The name has historically been considered part of the file (logically, if not implementation-wise).

  23. Pls fix says:

    Pls fix Explorer to handle hard links correctly so Winsxs size is correct.

  24. David Walker says:

    @pls fix:  Complaining to Raymond won't help.  Besides, he already addressed this.  What is the "correct" size of a folder that has links to files that actually live in other folders?  There are several answers that could be considered "correct".  

    Some of the answers suffer from overcounting (that is, the total size of each folder, when added together, may exceed the capacity of the drive).  Should backup programs use the total size when estimating how much backup space will be required to back up the whole disk?

    There is not a "correct" answer.  The reported size of the Winsxs folder is, in fact, the correct amount of space you'll need if you copy all of the file data that is pointed to by filenames that are in the Winsxs folder.

  25. Gabe says:

    J: In Unix, a filename is considered part of a directory. A file may have any number of names, including zero. Anybody can add or remove a file's name at any time so long as they have appropriate permissions for the given directory. If you want to know what names a given file has, you have to exhaustively search every directory in the filesystem. If you want to know the complete path to a given file (say you want to know the fully-qualified name of the current directory), it's not a trivial operation. Locking a file is often largely useless because somebody can just unlink the file you want to lock and create a new one in its place.

    In Windows, the name is considered part of a file. A file with no name makes about as much sense as a file with no size, so any file must have at least one name. This means that it's easy to ask for all the names a file has or find out the full path name of a file. Most significantly, you can lock a file and know that the file is genuinely locked; a read-only file is immutable.

  26. Anonymous says:

    @Bob

    I am finding this FindLinks tool quite useless.  Such a tool is already built into the OS:

    To list links:

    fsutil hardlink list C:\windows\notepad.exe

    To query the file index:

    fsutil file queryfileid C:\windows\notepad.exe

  27. 640k says:

    @Adam Rosenfield: But aside from backup programs, what are the possible use cases for wanting to enumerate the hard links to a given file?

    Most common case is probably when deleting an open file.

    @David Walker: There is not a "correct" answer.  The reported size of the Winsxs folder is, in fact, the correct amount of space you'll need if you copy all of the file data that is pointed to by filenames that are in the Winsxs folder.

    This is obviously not true, because the files in Winsxs take a different amount of space depending on whether you copy them to NTFS or FAT.

  28. David Walker says:

    @640k: What I said WAS true, modulo any sector overhead.  It also may depend on whether you are copying to a different folder on the same disk, or to a different disk.  Copying onto a different disk or to tape will copy all of the data from the WinSXS folder.  Copying on to the same disk might just copy the hard links — I'm not sure about that.  So the size reported by Explorer for WinSXS *is* correct.

  29. Pls fix says:

    @David Walker, the correct size is and always will be the size excluding the redundant hard links and all other sorts of redundant junction points. Just freaking count them only once. The size including the hard links is the total gross size, which is incorrect and is what Explorer currently shows.

    [I find your confidence in the use of the word "always" refreshingly idealistic. -Raymond]
  30. Joshua says:

    I'd assume that Pls fix is also from a Unix background, but has forgotten that most windows tools (including Explorer's folder copy) fail to gather hard links across folder copies.

    I have a tool that correctly does the copy and another tool that correctly determines the size. See http://www.cygwin.com.

  31. Gabe says:

    LR: Another situation you need to consider is "How much additional free space will my disk have if I delete this folder?"

    Of course you probably don't want to delete WinSxS but it's very similar to "How much space will this folder use on my backup tapes, assuming the rest of the disk is already backed up?" if your backup program knows about hard links.

  32. LR says:

    @Gabe,

    right. But because of differential backups, the size that a backup will take is a really complicated matter (where hardlinks only play a very minor part, in my opinion).

    Every time you delete something that shares a large number of files via hardlinks with other directories, I would think that you have some special case. How often do you delete /bin, /usr/bin, /lib etc., having first checked the size of these directories with the Linux counterpart of Explorer?

    For the most usual cases (copying something to USB or network or another volume), the size displayed in Explorer is right.

  33. LR says:

    @Pls fix, Joshua:

    The meaning of "correct folder size" depends entirely on what you want to get from this value. The only situation in which a simple sum over all file sizes is incorrect occurs when all three of these conditions are true:

    - You want to copy the directory, and the copy tool is taking hardlinks into account by recreating the situation (references to the same file) at the target directory.

    - You are not copying to a FAT(32) volume or some sort of network folders. Note that USB sticks use FAT(32) most of the time. (Yes, you can reformat them to NTFS. I know this, and have done this myself.)

    - The directory (including its subdirectories, if you are planning to perform a recursive folder copy) contains two or more references to the same file. (References shared with folders *outside* of the actual source directory must not get special handling, of course. These files must be copied to the target as usual.)

    Only all three conditions together make the simple sum invalid, but how can Explorer know what you are trying to do?

  34. David Walker says:

    @pls fix:  If you want a copy of everything that's referenced in a folder (including everything that is the ultimate target of hard links), then you DO, in fact, need to copy the contents of things that are hard-linked.  Your obstinance doesn't change this.  You might be copying across a network, or to an external disk that will be detached from the computer or network and carried somewhere else.  We can agree to disagree.  

  35. David Walker says:

    @LR: The simple sum over all DIRECTORY sizes, which is easy to get from Explorer, will often overestimate the total amount of space required.  The reported size of WinSXS includes files that are hard-linked from inside WinSXS, but the files exist elsewhere.  Those file sizes are also counted in the "size" of the other directories.
