How can I tell whether a COM pointer to a remote object is still valid?


A customer asked the rather suspicious question, "How do I check whether a pointer is valid in another process?"

This question should make your head boggle with bewilderment. First of all, we've moved beyond Crash­Program­Randomly to Crash­Some­Other­Program­Randomly. Second of all, what the heck are you doing with a pointer in another process? You can't do anything with it!

After some back-and-forth,¹ we managed to tease the real question out of the customer: How can I tell whether a COM pointer to a remote object is still valid?

The easy answer is "Don't worry. COM will take care of it." Just call the method on the object. If the remote object is not valid, you will get an error back, like RPC_E_DISCONNECTED or RPC_E_SERVER_DIED or RPC_E_SERVER_DIED_DNE or HRESULT_FROM_WIN32(RPC_S_SERVER_UNAVAILABLE). When you get an error like that, you'll know that the remote object is no longer valid, and you can respond accordingly.
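To make "respond accordingly" a bit more concrete, here is a minimal sketch of a helper that recognizes the four errors named above. The helper names IsServerGoneError, MakeHr, and HresultFromWin32 are made up for this illustration. The constants are redefined locally only so that the sketch compiles without the Windows headers; in a real program they come from winerror.h, and HresultFromWin32 merely mimics what HRESULT_FROM_WIN32 does for positive error codes. The list is not exhaustive, just the codes mentioned in this article.

```cpp
#include <cstdint>

// Stand-ins for the definitions in <winerror.h>, repeated here only so the
// sketch compiles on any platform.
using HRESULT = std::int32_t;

constexpr HRESULT MakeHr(std::uint32_t bits) { return static_cast<HRESULT>(bits); }

constexpr HRESULT S_OK                  = 0;
constexpr HRESULT E_INVALIDARG          = MakeHr(0x80070057u);
constexpr HRESULT RPC_E_SERVER_DIED     = MakeHr(0x80010007u);
constexpr HRESULT RPC_E_SERVER_DIED_DNE = MakeHr(0x80010012u);
constexpr HRESULT RPC_E_DISCONNECTED    = MakeHr(0x80010108u);
constexpr std::uint32_t RPC_S_SERVER_UNAVAILABLE = 1722;

// Hypothetical helper: wraps a positive Win32 error code in FACILITY_WIN32,
// the way HRESULT_FROM_WIN32 does for such codes.
constexpr HRESULT HresultFromWin32(std::uint32_t code) {
    return code == 0 ? S_OK : MakeHr((code & 0xFFFFu) | 0x80070000u);
}

// Hypothetical helper: true if the HRESULT is one of the "remote object is
// gone" errors listed in the article. Other failures (a bad parameter, say)
// do not mean the server died and are reported as false.
constexpr bool IsServerGoneError(HRESULT hr) {
    return hr == RPC_E_DISCONNECTED ||
           hr == RPC_E_SERVER_DIED ||
           hr == RPC_E_SERVER_DIED_DNE ||
           hr == HresultFromWin32(RPC_S_SERVER_UNAVAILABLE);
}
```

The idea is that you call the method you wanted to call anyway, and if the result satisfies IsServerGoneError, you release the proxy and do whatever recovery makes sense for your program.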

What if you want your program to be a little proactive and prune dead remote objects instead of just noticing that they're dead the next time you want to use them?

Some people "solve" this problem by performing a Query­Interface on a newly-generated interface ID. Since the IID has never been seen before, COM cannot consult its cache of previously-queried interfaces and must remote the call, at which point the death of the remote object will be detected. (The second rule for implementing Query­Interface exists in part so that COM can optimize Query­Interface of remote objects.) The problem with this technique is that by subverting the cache, you also end up polluting it. Each time you generate a new IID and do a dummy Query­Interface on it, you add another dummy entry to the Query­Interface cache. This wastes memory keeping track of interfaces that nobody will ever ask for again, and may even push out information about interfaces that your program actually uses!

The COM folks tell me that your program should just accept the fact that the other process can go away at any time. Instead of making some sort of decision based on whether the other process is still there (since a response of "yeah, it's still here" could be wrong by the time you act on it), you should just call the method and accept that it may fail because the other process vanished while you weren't looking.
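Here is one way that advice might look in a server that keeps a list of client callbacks: don't ask whether each client is alive, just make the call and drop the callbacks that report the remote side is gone. This is only a sketch under assumptions. Callback, NotifyAll, and IsServerGone are hypothetical names, a std::function stands in for a method call on a COM proxy, and the error constants are redefined locally so the code compiles without the Windows headers.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Stand-ins for <winerror.h> so the sketch compiles anywhere.
using HRESULT = std::int32_t;
constexpr HRESULT S_OK               = 0;
constexpr HRESULT E_INVALIDARG       = static_cast<HRESULT>(0x80070057u);
constexpr HRESULT RPC_E_DISCONNECTED = static_cast<HRESULT>(0x80010108u);
constexpr HRESULT RPC_E_SERVER_DIED  = static_cast<HRESULT>(0x80010007u);

// Hypothetical: a callable standing in for a method on a client's COM proxy.
using Callback = std::function<HRESULT(int)>;

// Hypothetical and deliberately incomplete list of "the other side is gone".
inline bool IsServerGone(HRESULT hr) {
    return hr == RPC_E_DISCONNECTED || hr == RPC_E_SERVER_DIED;
}

// Calls every callback with the value. A callback whose call fails because
// the remote side vanished is erased; any other result leaves it in place.
// Returns how many callbacks were pruned.
inline std::size_t NotifyAll(std::vector<Callback>& callbacks, int value) {
    std::size_t pruned = 0;
    for (auto it = callbacks.begin(); it != callbacks.end();) {
        HRESULT hr = (*it)(value);
        if (IsServerGone(hr)) {
            it = callbacks.erase(it);  // found out by trying, not by asking
            ++pruned;
        } else {
            ++it;
        }
    }
    return pruned;
}
```

Notice that there is no "is it still there?" check anywhere. The only evidence the code acts on is the result of the call it was going to make anyway, so there is no window in which a stale answer can mislead it.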

Footnote

¹ The customer first explained that their server process created an object and gave a pointer to that object to the client. The client then registered a callback object with the server, and the server wanted to check that the client object was still valid before invoking any methods on it. When asked, "Why not just use COM?" the customer replied, "We are using COM. We create the object on the server via Co­Create­Instance, then register the client object via a method on our interface."

The customer was under the impression that when a COM pointer refers to an object in another process, you just get that pointer from the other process.

If you think about it, this makes no sense at all. How could any of your method calls work? You call pRemote­Object->AddRef() and the compiler is going to dereference the pRemote­Object pointer, and then crash because the pointer would refer to memory in another process. I guess the customer was under the impression that some magic voodoo happens so that the CPU knows that "Oh wait, this pointer really belongs to another process, let me go fetch the memory from that other process. Okay, and now you want to call a function pointer in another process? Okay, um, let me magically merge the two processes together so the remote code running in that other process can access the objects in your process." Or something.

When you have a COM pointer to an object in another process, the pointer that you have is a proxy which accepts method calls and marshals the call to the real object somewhere else.
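A toy model may help make the proxy idea concrete. To be clear, this is not how COM implements its proxies. ICounter, Counter, and CounterProxy are invented for illustration, and a std::weak_ptr stands in for the RPC channel that a real proxy would marshal the call through. The point is only that the caller holds an ordinary local object, and that local object is what discovers the real one is gone.

```cpp
#include <cstdint>
#include <memory>
#include <utility>

// Stand-ins for <winerror.h> so the sketch compiles anywhere.
using HRESULT = std::int32_t;
constexpr HRESULT S_OK               = 0;
constexpr HRESULT RPC_E_DISCONNECTED = static_cast<HRESULT>(0x80010108u);

// Hypothetical interface, playing the role of a COM interface.
struct ICounter {
    virtual ~ICounter() = default;
    virtual HRESULT Increment(int* newValue) = 0;
};

// The real object, which in the COM case lives in the other process.
struct Counter : ICounter {
    int value = 0;
    HRESULT Increment(int* newValue) override {
        *newValue = ++value;
        return S_OK;
    }
};

// The proxy is a perfectly valid local object. The weak_ptr models the
// connection to the real object: once the "server" lets go of the object,
// the proxy can no longer reach it and reports a failure instead of crashing.
class CounterProxy : public ICounter {
public:
    explicit CounterProxy(std::weak_ptr<ICounter> remote)
        : remote_(std::move(remote)) {}

    HRESULT Increment(int* newValue) override {
        // A real proxy would marshal the arguments and send them across
        // the process boundary; this toy just forwards the call.
        if (auto target = remote_.lock()) {
            return target->Increment(newValue);
        }
        return RPC_E_DISCONNECTED;
    }

private:
    std::weak_ptr<ICounter> remote_;
};
```

In this model, the client's pointer to the CounterProxy never becomes invalid. What changes is the answer you get when you call through it, which is why "just call the method and look at the error" is the way to find out.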

Comments (27)
  1. pcooper says:

    While certainly those who understand what the CPU is doing know that it's the COM system that needs to do some kind of proxying, I don't think one should be surprised that when COM masks the complexity so that calling a remote object is just like calling a local object, people start thinking that when they use COM, calling a remote object is just like calling a local object.

  2. MikeCaron says:

    @pcooper: Normally, it is (mostly) like calling a method on a local object. You can't tell whether a local pointer is good or not either.

    It shouldn't require too much of a mental stretch to figure out that just passing a pointer to another process is a terrible idea.

  3. Adam Rosenfield says:

    Maybe the customer thought that COM was implemented in shared memory?  It seems plausible at first, but once you realize all COM objects have vtable pointers and that the shared memory would have to be mapped at the same virtual addresses in all processes, you can see that it's far from trivial to get working in shared memory.

    [If the customer had thought that COM was implemented in shared memory, then the question would have been "How can I tell whether a pointer is valid in my process?" -Raymond]
  4. Z.T. says:

    Not understanding that RPC marshals calls is like thinking that an ORM stores pointers in the database.

  5. Ben says:

    If your application needs to know as an application feature that the object is still there, that needs to be an application-level task handled at the application level, by the application.

    For example, if you have a stock price monitor, which gets updates through some sort of callback interface, you want to know if the remote application has crashed, or forgotten you, and/or the network is down.

    That's important information in your scenario – there is a difference between "nobody has published a new price for the last 5 minutes" (which could easily happen) and "the router has severed your network connection five minutes ago without telling you" (in which case you need to reconnect).

    So if you want to know, at an application level, that the application is still there, you need to implement something at the application level, such as AreYouThere(){return S_OK;} or GetHealthStatus(__out PLONG plStatus){…} Or you can do it the other way with a "NothingHappenedRecently()" call.

    And if you want to know within 5 seconds if the network is down or the server crashed, ask every five seconds. There is no other way.

    That goes for TCP/IP, named pipes, DDE, database applications, and everything else.

  6. Joshua says:

    That "magic voodoo" might be shared memory, but if COM were shared memory well…

  7. jader3rd says:

    I've seen a lot of people on development forums ask questions which amount to that they assume the computer is doing magic voodoo.

  8. Porter says:

    Seeing as he wanted to know when the remote object acting as a client died, this is handled by DCOM, as it will drop a reference to your callback object if the client died. If you have one callback object per client, the problem is solved.

  9. alegr1 says:

    That's just in: some programmers are (f-in) retarged. News at 11.

  10. OCD says:

    The problem with COM is that it's at that level of complexity which makes it possible to use without understanding properly, but still fairly difficult to properly understand. Many topics fall into this range, (for example C vs a higher level language), and the circumstances often dictate that individuals with limited understanding are trying to produce something to a deadline. This leads to those maintenance nightmarish scenarios in all sorts of engineering disciplines (probably none more so than software). This is why I am a strong advocate of the higher level languages. Not because they are better in every case, but because it helps people choose the difficult tools only when they must.

  11. cheong00 says:

    I can understand why the customer sometimes wants that answer.

    Take InternetExplorer, for example. After a call to Navigate(), you put in a loop to check that IE is no longer busy (it's VBScript, so no event checking here). After you have checked that the browser has not gone to some error page, you use HTMLDocument to try to access an object to attempt an automated login. Unfortunately, at that time the browser is still not ready and an error is thrown. So you attempt to start another browser object and retry with a backup address.

    No big deal here, except that when COM throws the error and indicates that the object underneath has been disconnected, the IE browser window has in fact successfully reached the login page and sits there waiting. If you set IE to be invisible to do the automation, you'll see lots of iexplore.exe processes stack up in the background because you never have the chance to call InternetExplorer.Quit() to close them.

    Many times in cases like this, I'd also want to know whether the object being automated is still alive, or whether there is any way to attempt to reconnect to the object.

  12. caf says:

    I think the COM people did the right thing here – if they'd provided the inherently racy function, then many people would simply write programs with an inherent race condition.

  13. TC says:

    Why does the customer's ability to program COM, imply that he should understand lower-level details like: processes don't all share the same memory? I can drive a car, without understanding how the engine actually works …

  14. Jonathan S says:

    @TC: It's usually not a big problem that the customer doesn't understand the lower-level details.  Except in the case where the customer makes wrong assumptions about them and then asks questions based on those assumptions.

    If the customer had asked "How can my program tell if the remote COM object is alive?" this would've been a pretty boring blog post.

    Same thing with your car engine analogy.  Not a problem until you start asking how to adjust your carburetor to keep the starter working (and it turns out your battery is going flat).

  15. cheong00 says:

    @TC: Point is well noted. However, my major complaint is that sometimes when the COM object dies, the actual object being automated is still alive, and there's no way to know whether the caller should do the cleanup or not, and/or how to do the cleanup.

  16. Simon Buchan says:

    I was surprised by the number of RPC error codes here, so I checked and there are a *bunch* of codes that you could get that cover the various causes for "the object you had previously isn't there any more", which sounds like a pain to cover accurately in a "IsAServerDownError(HRESULT)" method, since you probably don't want to keep killing and restarting the server when you are just giving a bad parameter. In particular, I would never have thought to check for HRESULT_FROM_WIN32() codes.

  17. Richard Russell says:

    I just know I'm going to be banished to pedant's corner, but there *are* things you can do with a pointer in another process.  For example you could use it in a call to CreateRemoteThread.

  18. Kevin says:

    "Richard Russell 16 Nov 2011 2:57 PM #

    I just know I'm going to be banished to pedant's corner, but there *are* things you can do with a pointer in another process.  For example you could use it in a call to CreateRemoteThread."

    You're not actually accessing that pointer though, just passing it to what is essentially a marshaling function.

  19. TC says:

    OT but:

    @cheong00: After a call to Navigate(), *do not* wait until IE.BUSY is false. Busy can change state several times before the document has fully loaded. That explains the errors you're getting. Wait instead until IE.READYSTATE = 4.

  20. TC says:

    Request indulgence for final OT to cheong00 :-)

    You *can* hook up to and use IE events in VBScript. Eg:

    set ie = wscript.createobject ("InternetExplorer.Application", "blah_")

    ie.visible = true

    msgbox "READY"

    ie.navigate2 "http://www.google.com"

    wscript.quit

    SUB blah_BeforeNavigate2 (pDisp, pURL, pFlags, pTargetFrameName, pPostData, pHeaders, pCancel)

    msgbox "GOING TO " & pURL

    END SUB

  21. TC says:

    @Jonathan: I take your points. But there's a difference between actively making assumptions, and just not knowing how things work. Raymond seems to believe that the customer *should know* that processes don't share the same memory. I still say: why should they necessarily know that? Cheers :-)

    [Um, because if processes shared memory, then you wouldn't need remoting in the first place: It would all be local! -Raymond]
  22. Crescens2k says:

    @TC: It isn't just about COM in this case; the address space affects Win32 programming in general. When you get to the stage where you can create other processes, you start coming across terms like virtual address space. If you look into memory management, you will find even more references there, even a whole bunch of functions to allocate pages in the address space. What's more, COM explicitly calls out the difference by using the terms in-process server, local server and remote server. So it would be more surprising to get to the point of using COM in Win32 without being aware that processes do not share address spaces.

    Since desktop and server operating systems on x86/x64 architectures run in protected mode these days, this is basic knowledge if you want to program for the platform.

  23. heterodox says:

    I really want to define the error code RPC_E_SERVER_DIED_DNR now.

  24. DWalker says:

    I'm amazed that people want to check whether an object exists, and then do something based on that information.  As you say, the object may not exist by the time you try to do something to it.  There are no guarantees that it will exist one millisecond from now.  

    Which, I suppose, means that you can never assume anything is true by the time you want to act on some knowledge.

  25. Ben says:

    @DWalker:

    Right! Like people checking the ACL to find out if they can access the file.

    What if someone has it open? What if you are prevented by a virus scanner or by mandatory access control? What if the ACL changed?

    If you want to know if you can open a file, call CreateFile.

  26. TC says:

    @all: good comments on COM/win32/memory management, thanks.

    @cheong00: I can probably tell you how. Email me on ch.20.keen4some@spamgourmet.com

  27. cheong00 says:

    @TC: No thanks. I've resolved to save the HWND value to kill it later if COM object connection is lost later.
