Just to remind you, I’m a tester on the BCL team (my testing responsibilities include the IO feature area). I wanted to share some pain that we’ve gone through in investigating directory modification failures, in the hope that I can spare you similar experiences.
It’s a common testing pattern in managed code to create a directory, do some work and then delete it. We noticed problems with this pattern on some machines, especially when these operations run in a loop. Tests were failing on these machines for the following APIs:
The problem turned out to be isolated to machines running content indexing services such as mssearch.exe. We have also uncovered cases in the past where anti-virus programs caused similar problems. These services take a handle to a directory just when the test wants to modify it, so the operation can’t complete (the OS won’t let you perform those operations while another process has an open handle to the directory), and hence the failure. Most of the affected APIs involve a delete operation, so the reason the OS prevents the operation is clear.
Directory.Exists() is affected for a different reason: it can still return true after a successful Directory.Delete(). We call the Win32 API RemoveDirectory to delete the directory, and that call only marks the directory for deletion; the directory is not actually removed until the last handle to it is closed. Our implementation of Directory.Exists() doesn’t currently have a way of distinguishing directories that are marked for deletion from normal directories, so it returns true for both. This is a rare race condition, but it can turn up from time to time in tests.
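To make the Directory.Exists() race concrete, here is a minimal sketch of a test helper that deletes a directory and then waits for the deletion to become visible. The names DirectoryCleanup and DeleteAndWait, the timeout and the 50 ms polling interval are all assumptions made for illustration; they are not part of the BCL.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical test helper, not a BCL type.
static class DirectoryCleanup
{
    // Deletes the directory, then polls until Directory.Exists reports false
    // or the timeout elapses. Returns true once the directory is gone.
    public static bool DeleteAndWait(string path, TimeSpan timeout)
    {
        Directory.Delete(path, recursive: true);

        DateTime deadline = DateTime.UtcNow + timeout;
        while (Directory.Exists(path))
        {
            // The directory may only be marked for deletion because another
            // process (an indexer, a virus scanner) still holds a handle to it.
            if (DateTime.UtcNow >= deadline)
                return false;
            Thread.Sleep(50); // arbitrary polling interval
        }
        return true;
    }
}
```

A test would call DirectoryCleanup.DeleteAndWait(path, TimeSpan.FromSeconds(5)) instead of asserting on Directory.Exists() immediately after the delete.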
Unfortunately, there are no perfect solutions to this problem:
- Directory doesn’t have an equivalent of the FileOptions.DeleteOnClose enum value, which can be passed to a FileStream constructor we added in V2.0. If it did, tests could create directories without worrying about cleaning up afterwards. It doesn’t look like the Win32 API, CreateDirectory, has an option to specify this.
- There is an enum value in FileAttributes called NotContentIndexed that can be set on a directory, but there is still a race between creating the directory and setting this attribute.
- The test can create a directory with ACLs that deny access to other users (a capability we added in V2.0, check out the beta!) and then follow the pattern. But this seems to be a bit of overkill just to be a good cleanup citizen on the machine. Also, it’s unlikely that most of our users would follow this pattern, and we strive as much as possible to replicate our user scenarios in our mainline testing.
- Disable all content indexing services on a machine before a test run (good for testing, but not representative of a real user scenario).
- Follow a fail-retry pattern where, if any of the above APIs fails in a situation where it’s expected to succeed, we simply call it again.
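The first two options in the list can be sketched as follows. This is only an illustration: ScratchResources and both method names are hypothetical, and the buffer size of 4096 is an arbitrary choice. The comments mark where the race described above remains.

```csharp
using System;
using System.IO;

// Hypothetical helpers for illustration, not BCL types.
static class ScratchResources
{
    // Files can ask the OS to delete them when the handle is closed.
    // There is no corresponding option for directories.
    public static FileStream CreateSelfDeletingFile(string path)
    {
        return new FileStream(path, FileMode.CreateNew, FileAccess.ReadWrite,
                              FileShare.None, 4096, FileOptions.DeleteOnClose);
    }

    // Creates a directory and then opts it out of content indexing.
    // The two steps are not atomic, so an indexing service can still open
    // the directory after it is created and before the attribute is set.
    public static DirectoryInfo CreateUnindexedDirectory(string path)
    {
        DirectoryInfo dir = Directory.CreateDirectory(path);
        dir.Attributes |= FileAttributes.NotContentIndexed;
        return dir;
    }
}
```

Wrapping the stream returned by CreateSelfDeletingFile in a using block removes the file when the block exits, which is the cleanup behavior a directory equivalent would have offered.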
In the end, we opted for the last solution for our tests. It has the best chance of achieving our objectives in a reliable fashion. We could have changed the APIs instead, but that would impose a performance cost on every user on every call, even though the issue doesn’t occur the vast majority of the time. We like avoiding that.
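A fail-retry wrapper along these lines might look like the sketch below. It is not our actual test code: RetryHelper, the attempt count, the delay and the choice of which exceptions to retry on (IOException and UnauthorizedAccessException) are all assumptions for illustration.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical test helper, not a BCL type.
static class RetryHelper
{
    // Runs the operation, retrying after a delay when it fails with an
    // IOException or UnauthorizedAccessException. The exception from the
    // final attempt is allowed to propagate to the caller.
    public static void Retry(Action operation, int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1)
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                operation();
                return; // success
            }
            catch (Exception ex) when (
                (ex is IOException || ex is UnauthorizedAccessException)
                && attempt < maxAttempts)
            {
                // Give the other process a chance to release its handle.
                Thread.Sleep(delay);
            }
        }
    }
}
```

A test would then wrap the fragile call, for example RetryHelper.Retry(() => Directory.Delete(path, true), 5, TimeSpan.FromMilliseconds(100)). Because the retry lives in the test rather than in the API, only the callers that actually hit the problem pay for it.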
Any feedback or questions you have are welcome.