Got conflicts or missing files? Read this (and don't uninstall for now)

A few users have reported either files missing when they view Live Mesh folders on their computer (but appearing as expected on their Live Desktop) or their computer showing all files but reporting many more conflicts than expected (and their Live Desktop showing duplicate copies of any file with a conflict.) We’ve spent the weekend investigating this, and we have a few details below. But the key news is this:

  • Live Mesh is not deleting or losing files. Even if they don’t all appear in Live Mesh folders on your computer, they are all present in a hidden state, as well as on your Live Desktop if the folder is synchronized there.
  • Uninstalling the Live Mesh software from a PC will delete the files that are in a hidden state on that PC. If the folder is not synchronized on any other computer or your Live Desktop, then uninstalling will result in the hidden files no longer being available.
  • If you stopped synchronizing a folder on your PC, then re-synchronized it, that folder is likely to generate some number of false conflicts.

We're testing a fix for this now and will deploy it as soon as we're confident it’s ready. In the meantime, if you have a Live Mesh folder that appears to be missing files (that is, synchronization appears to be complete but expected files are still not present) or one that has an unexpectedly large number of conflicts, in order to prevent data loss, do not uninstall the Live Mesh software. In addition, do not re-synchronize any folders you had previously stopped synchronizing.

 

Why is this happening? A technical explanation

 

In the interest of being open with you, our beloved and valued technology preview testers, we’re happy to provide more detail on what’s causing this behavior. The key to a good synchronization algorithm is understanding the history of all the files in a folder. You want to make sure that any client, even one that hasn’t synchronized in a good long while, can get an accurate list of what changed and what needs to be brought up to date. Part of a complete folder history is keeping track of files or folders that have been deleted, so that if a client comes online and tries to sync a file that doesn’t exist in the up-to-date folder list, we can tell whether it’s a new file or a file someone deleted a while back but is still on this particular client because the file deletion was never synchronized to it. We use the industry jargon tombstone for these deleted files.

 

What we discovered recently is an interesting edge case around tombstones. Live Mesh performs its synchronization in batches of roughly 50 files at a time. The client asks for the first 50 changes, processes those, then asks for the next 50, and so on. If, through sheer chance, the next batch of 50 items is made up entirely of tombstones—that is, it’s a list of 50 files that have all been deleted since this particular client last synchronized—the storage server gets a little confused. It returns that list of tombstones and then concludes that synchronization has ended. If there are additional file changes, they never get sent to the client. So this can lead to two patterns:

  • If you’re synchronizing a folder for the first time, and that folder has this pattern of 50 tombstones in a row, then you get only a partial sync. Any files the server returned before this block of 50 tombstones show up just fine, but any files after that won’t be synchronized to the client.
  • If you have first stopped and then restarted synchronization for a particular folder, it gets a little more complicated. In this scenario, when synchronization resumes, the client ought to just quickly see which of the files have changed and get everything up to date. But if a client hits this 50-tombstone-in-a-row bug, the Live Mesh service ends up confused. The client might have tens or even hundreds of files that you and I know are up to date and part of the folder, but the server failed to tell the client about them. So the client has no choice but to conclude that you have a bunch of new files, and that’s what it tells the server. The server, of course, recognizes that files with those names already exist, and so it mistakenly thinks the client has just created a bunch of new files that unfortunately have the exact same names as files already in the folder. Result: sync conflicts, and potentially lots of them.

Now, conflicts are an inevitable part of any synchronization system, and so both the server and client are built to understand what a conflict is and to store any conflicting files in a separateholding area, where they remain until the user decides how to resolve the conflicts. While they are in the holding area, the client software might not display the files as being part of the folder, while the Live Desktop instead represents the conflict by showing both files (yes, this is something we are already working on improving and making consistent ;). So the files are all in your mesh; it’s just not obvious where they are or how to access them.

 

The one catch is that the client holding area is stored in the Live Mesh folder for temporary application files. This directory is something we remove when the Live Mesh software is uninstalled (also something we are already working on improving—either warning you that uninstalling will remove any files in a conflicted state or just copying all conflicts somewhere else where they won’t get removed on uninstall). So if you somehow end up in a state where the folder is present only on one client, uninstalling the Live Mesh software from that client may remove all files in a conflicted state. That’s bad, we know it’s bad, and that’s why we suggest you don’t uninstall until we fix this issue.

 

Thanks as always for your continued participation and feedback. Ferreting out issues like this is exactly what our technical preview is designed to do, and so while it’s sometimes a bit painful, we appreciate your help and your patience as we work to get the issue fixed.

 

Technorati Tags: Live Mesh