Debugging an InvalidCastException


First, obviously, find the two types for which the cast failed and verify that they are the same type or otherwise castable.


Next, if the type was just deserialized, also verify that its assembly successfully loaded in the target appdomain.


If everything seems fine, check to see if the assemblies for those two types are loaded from different locations and in the same appdomain. (The actual cast is done in just one appdomain, even if the exception happens when passing a type between two appdomains.) Even if the bits of those assemblies are totally identical, if they are loaded from different paths, they will be considered different, so their types will be considered different. (See Comparing Already-Loaded Assemblies.)
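
As a rough illustration of the checks so far, the sketch below reports where each of the two types came from and whether the runtime considers them castable. The class and method names (CastDiagnostics, Describe, Explain) are made up for this example; only Type.AssemblyQualifiedName, Type.IsAssignableFrom(), Assembly.IsDynamic and Assembly.Location are real framework members.

```csharp
using System;
using System.Reflection;

// Hypothetical helper for this post; not part of the framework.
public static class CastDiagnostics
{
    // Describes a type by its assembly-qualified name and the path its
    // assembly was loaded from (empty for assemblies loaded from a byte[]).
    public static string Describe(Type type)
    {
        Assembly assembly = type.Assembly;
        string location = assembly.IsDynamic ? "(dynamic)" : assembly.Location;
        return type.AssemblyQualifiedName + " @ " + location;
    }

    // Builds a short report on whether 'value' can be cast to 'target'.
    public static string Explain(object value, Type target)
    {
        Type actual = value.GetType();
        bool castable = target.IsAssignableFrom(actual);
        return "actual:   " + Describe(actual) + Environment.NewLine +
               "target:   " + Describe(target) + Environment.NewLine +
               "castable: " + castable;
    }
}
```

Calling CastDiagnostics.Explain(obj, typeof(ITarget)) just before the failing cast (ITarget standing in for your own type) should show two identical names with different paths when the problem is the one described above.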


A quick way to check for that is to examine the loaded module window of a debugger to see if that assembly was loaded multiple times. If it was, break on module loads to get the callstack for the unexpected load. If that’s inconvenient, try getting the Fusion log.
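
If no debugger is handy, something like the following sketch can approximate the module window from code. It assumes it runs inside the appdomain where the cast fails, and DuplicateAssemblyFinder is a name invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical helper for this post; not part of the framework.
public static class DuplicateAssemblyFinder
{
    // Returns one line per loaded copy of every assembly whose display name
    // appears more than once in the given appdomain.
    public static List<string> FindDuplicates(AppDomain domain)
    {
        return domain.GetAssemblies()
            .Where(a => !a.IsDynamic)            // emitted assemblies have no file
            .GroupBy(a => a.FullName)
            .Where(g => g.Count() > 1)
            .SelectMany(g => g.Select(a =>
                a.FullName + " <- " +
                (a.Location.Length == 0 ? "(no path, loaded from bytes?)" : a.Location)))
            .ToList();
    }
}
```

Printing the result of DuplicateAssemblyFinder.FindDuplicates(AppDomain.CurrentDomain) lists the same display name with each path it was loaded from. It only sees the appdomain it is given, so it does not replace the Fusion log for finding out who requested the extra load.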


Usually, the problem is that:



  1. The assembly is available in the GAC (or the ApplicationBase) and loaded there by static reference (something was compiled against that assembly).
  2. It has also been loaded dynamically by path, from another path (LoadFrom(), LoadFile(), etc.).
  3. Then, the code tries to cast the type from (2) to the corresponding type from (1).

To fix this, once you find the offending caller, you will need to either cause the two types to be loaded from the same assembly from the exact same path, or avoid doing the cast. To decide between the assemblies at paths (1) and (2), see Choosing a Binding Context. Usually, I recommend using (1) – see Switching to the Load Context for help with implementing that.
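
One possible way to move a by-path load over to option (1) is sketched below, assuming the assembly is also reachable by display name from the GAC or the ApplicationBase. LoadContextHelper and LoadByName are hypothetical names; AssemblyName.GetAssemblyName() and Assembly.Load(string) are the real calls.

```csharp
using System;
using System.Reflection;

// Hypothetical helper for this post; not part of the framework.
public static class LoadContextHelper
{
    // Reads the display name from the file at 'path' without loading it,
    // then binds by that name so the normal Load-context rules decide which
    // copy is used. Throws if the assembly cannot be found by name.
    public static Assembly LoadByName(string path)
    {
        AssemblyName name = AssemblyName.GetAssemblyName(path);
        // Pass only the display string so the file path is not used as a fallback.
        return Assembly.Load(name.FullName);
    }
}
```

The call fails with a FileNotFoundException when the assembly is not available by name, which is a useful signal that the deployment, rather than the cast, is what needs fixing.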

Comments (11)

  1. Royston Shufflebotham says:

    I’ve got an InvalidCastException situation after doing an AppDomain.CreateInstanceFrom, and wonder whether you or any of your readers can shed some light, Suzanne?

    I have a controlling assembly which is trying to do some FxCop work. To keep FxCop happy, I need to execute some of my code which calls FxCop inside an AppDomain with an ApplicationBase pointing to the top of the FxCop binary tree. However, my controlling assembly doesn’t (and can’t) live anywhere inside that tree, so I can’t use any private path probing stuff to find my controlling assembly in the new appdomain.

    So, I have an FxCopRunner class (extending MarshalByRefObject) implementing a basic IFxCopRunner interface in my controlling assembly, and I’m doing this to get things off the ground, but I can’t get any casts to work:

    AppDomainSetup ads = new AppDomainSetup();

    ads.ApplicationBase = fxCopBinTree;

    AppDomain ad = AppDomain.CreateDomain( "Remote Domain", AppDomain.CurrentDomain.Evidence, ads );

    // Create a remoted instance of my FxCopRunner class in the newly-created AppDomain

    System.Runtime.Remoting.ObjectHandle oh = ad.CreateInstanceFrom( Assembly.GetExecutingAssembly().Location, typeof(FxCopRunner).FullName );

    Console.Out.WriteLine( oh.Unwrap().ToString() ); // Works, and correctly calls the remote FxCopRunner’s implementation of ToString()

    object fxcopRunner1 = (IFxCopRunner)oh.Unwrap(); // Fails with InvalidCastException

    object fxcopRunner2 = (FxCopRunner)oh.Unwrap(); // Fails with InvalidCastException

    Any ideas on how to get these casts to work? I’d very much appreciate any pointers you can give, as I get the feeling I’m missing something quite fundamental here…

    (There’s no playing around with the GAC here: none of the assemblies, including the FxCop ones, are in the GAC, nor do I want them to be. This is automated test code which will not go into any production systems.)

    Cheers,

    (and thanks for a *very* useful and high signal-to-noise blog!)

    Royston.

  2. Hi Suzanne,

    I am working on a plugin for VS .Net 2003 and at some point I needed to create an AppDomain.

    I ran into the "invalid cast" problem.

    I checked that the second AppDomain loaded the target assembly from the same location that the first AppDomain loaded it.

    In fact, the modules debug window in VS shows no new assemblies loaded after the CreateInstanceAndUnwrap(…) call.

    So it is reasonably safe to say that both AppDomains are using the same assembly.

    I still got the cast exception.

    I think I have a work around. The first load of the assembly is no longer needed, so the invalid cast exception *should not* happen.

    I hope.

    I just thought that this might be useful info.

  3. The original workaround did not work.

    I decided that the easiest way in this case is to use a WeakReference across AppDomains.

    Is there a specific reason why VS forces the "invalid cast" exception?

  4. Suzanne says:

    José: The Unwrap() in CreateInstanceAndUnwrap() may be what’s loading the assembly. So, it is not good enough to see that nothing is loaded after that call has finished. Try getting the Fusion log (see original blog entry).

    Another load of the assembly in the calling appdomain is only relevant if the Unwrap()’d type’s assembly is cast to it. So, that may be why removing another load did not fix this for you.

    Using a simply-named assembly will not solve this problem if it loaded from two different paths. I recommend that you keep it strongly-named, and avoid the use of LoadFrom()/LoadFile()/Load(byte[]) instead.

    Can’t blame this one on VS. 🙂 This is due to the design of remoting – Assembly does not extend MarshalByRefObject, so assemblies are reloaded when passed between appdomain boundaries. Imagine the case where appdomains are on different machines. In that case, it is not convenient or performant to pass the entire file to the remote machine. So, the assembly display name is sent instead, and it is reloaded there. (Having the appdomains on the same machine is not considered a special case.) Additionally, different appdomains have different binding policies which affect what is allowed to be loaded there. So, automatically using an assembly from another appdomain without doing a new bind may not be correct.

  5. José Cornado says:

    Thanks a lot!!

    I got the logic working without a performance hit

    I will poke around the binding policies when I have a fresher head.

    Again thanks for your help!

  6. Thanks to Suzanne Cook for this one. "First, obviously, find the two types for which the cast failed,

  7. Andrew Miadowicz says:

    Suzanne,

    Your blog has helped shed some light on a rather gnarly problem I and a couple fellow developers have been dealing with recently, but it doesn’t quite solve the issue.  I wonder if you could help answer a couple of specific questions if you still have the time to reply to this blog.

    In the project we’re working on our code runs in a CLR that is hosted (via COM Interop) in a C++ executable.  Our main assembly gets loaded via the magic of the COM registry from the appropriate directory which is _different_ than the directory where the C++ executable resides.  The additional .Net assemblies get loaded based on early bound references (at least for the time being) by the built-in CLR mechanism.  Now, to make things interesting the code so loaded needs to communicate via remoting with other managed code in a separate process.  The assemblies in both processes are early bound to a couple of dlls, one of which in particular contains definitions of shared interfaces.

    The problem we see manifests itself in the form of the InvalidCastException, when we attempt to call into one of the remote objects via one of the shared interfaces.  What’s intriguing, we can successfully create the remote object via the Activator class and cast it to one of the shared interfaces.  We can also call to the remote object to obtain some simple properties (such as strings).  However, as soon as we make a call to a method/property that returns another shared interface from the same shared assembly, we get the aforementioned exception.

    After searching online for a little while we found that someone else reported a similar problem specifically in the case where managed assemblies were hosted in an unmanaged process, and he mentioned that if all the managed dlls were located in the same directory as the original executable, the problem went away.  We tried the same hack and in fact the exception disappeared.  We also verified that the remoting works fine if the two communicating processes are entirely managed.

    Unfortunately we do not have the option of copying our dlls into the path of the unmanaged executable, so we need to understand where the problem really lies and figure out a good solution.  Could you help?

    Our current hypothesis is that the issue is with the shared dlls getting loaded from different paths.  This theory only works, however, if the path the CLR considers when checking if an assembly is already loaded is in fact _relative_ to the AppDomain’s base.  Is that the case or does the CLR compare absolute paths?  If so, the reasoning would go as follows.  Since in .Net by default the referenced dlls are loaded from the path of the executable (and if built in Visual Studio with "copy local" set for all references, all referenced assemblies end up in this path) the relative path for all assemblies is "".  If a .Net assembly, however, is loaded into a process via COM interop, it may come from any directory whatsoever based on what’s in the registry.  In this case additional assemblies referenced by the main assembly could come from that same directory, but their relative path with respect to the base path (the location of the executable) would definitely NOT be "".  Consequently the types loaded from the dll shared between our two processes would not be compatible and casts would fail.  Are we on the right track here?  Is there some way to circumvent the problem?

    For completeness, all our assemblies are signed, but none is deployed to GAC.  We also played with the location of our dlls, and placed them all in the same absolute path to make sure that their absolute paths looked exactly the same in both processes, and still saw the same InvalidCastException.

    I would really appreciate if you could offer any additional insight.

  8. ferherra says:

    I created MyCustomSqlMembershipProvider.  I extended the functionality, adding some extra methods that I need.  To use them, I’m trying to cast the object as follows:

    MyCustomSqlMembershipProvider myProvider = (MyCustomSqlMembershipProvider)Membership.Provider;

    But I get the following error:

    Unable to cast object of type ‘MyProject.App_Code.MyCustomSqlMembershipProvider’ to type ‘MyProject.App_Code.MyCustomSqlMembershipProvider’.

    I created a MyCustomSqlMembershipProvider object and compared the Type with the one of Membership.Provider, and they have different Assemblies.  MyCustomProvider is inside App_Code folder of my project.  Here is my web config entry for the membership provider:

    <membership defaultProvider="MyMembershipProvider" >

    <providers>

       <clear/>

       <add name="MyMembershipProvider"

                type="MyProject.App_Code.MyCustomSqlMembershipProvider, __code"

                connectionStringName="MyConnectionString"/>

    </providers>

    </membership>

    My questions are: how do I make the cast?  Why does VS2005 assign different Assembly names to each object?

    Note: [Right now I’m calling the method this way… but I guess that is not the idea behind the provider model:

    Type type = Membership.Provider.GetType();

    object myMembershipProvider = Activator.CreateInstance(type);

    object[] values = new object[] { Request.UserHostAddress };

    bool validUser = (bool) type.InvokeMember("ValidateInNetworkUser", System.Reflection.BindingFlags.InvokeMethod, null, myMembershipProvider, values);]

  9. Omer Ganot says:

    Hi,

    I’m facing a similar trouble to the different scenarios specified above:

    I wrote a generic SNMP agent infrastructure which implements the Windows SNMP extension agent.

    This infrastructure is composed of a native C++ DLL which actually implements the extension agent – “Native”,

    and a C++/CLI API dll – “API”, which is combined from a native class so it can be loaded by “Native” and a managed (ref) class.

    When the SNMP service launches (snmp.exe in System32), it loads “Native” , which loads the native class in “API”.

    Then, the managed class in “API” searches for an assembly which complies to the file convention “*SNMPAgent.dll”.

    Then, it runs the following code to create an instance of this specific SNMP agent, by reflection:

    // Load assembly

    MDAgentAssembly = Assembly::LoadFrom(AgentFile);

    // Find the class that implement the IMDSNMPAgent interface

    for each (Type^ type in MDAgentAssembly->GetExportedTypes())

    {

    if (type->GetInterface("IMDSNMPAgent") != nullptr)

    {

    mMDAgentType = type;

    // Create an instance of the agent

    Object^ temp= Activator::CreateInstance(type);

    mMDAgent = (IMDSNMPAgent^)temp;  <=== EXCEPTION OCCURS HERE

    // Verify agent instance was created

    if (MDAgent != nullptr)

    {

    mMDAgent = MDAgent;

    Debug::WriteLine("INFO: ManagedAPI::LoadMDSNMPAgent. Successfully loaded and instantiated MDSNPAgent");

    LoadedAgent = true;

    break;

    }

    } // foreach Type

    I have two different paths – X & Y. On each equal copies of “Native” and “API” are installed.

    On path X I have the specific SNMP agent XSNMPAgent, and on path Y I have the specific SNMP agent YSNMPAgent.

    Both implement the IMDSNMPAgent interface.

    If I configure the SNMP service (through the registry) to load “Native” on only one of the paths, it works just fine.

    If I configure the SNMP service to load “Native” both on path X and on path Y (serially), the first agent is loaded

    perfectly but the other fails and pops an InvalidCastException (happens in the line marked above):

    Exp:System.InvalidCastException: Unable to cast object of type ‘GeneralTools.GenericSNMPAgent.YSNMPAgent’ to type ‘GeneralTools.GenericSNMPAgent.IMDSNMPAgent’.

    Needless to say, YSNMPAgent does implement IMDSNMPAgent.

    Moreover, even if I switch the load order of X and Y, the second agent (this time X) fails.

    Notes and attempts

    ==================

    1) The relevant code for loading “API” is:

    // API.dll path find arithmetics

    char APIPath[_MAX_PATH], AgentPath[_MAX_PATH], Log[_MAX_PATH];

    char drive[_MAX_PATH], dir[_MAX_PATH], fname[_MAX_PATH], ext[_MAX_PATH];

    HANDLE hProcess = OpenProcess(  PROCESS_QUERY_INFORMATION |

    PROCESS_VM_READ,

    FALSE, GetCurrentProcessId() );

    HMODULE module;

    void *caller = _ReturnAddress();

    GetModuleHandleEx(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS, (LPCSTR)caller, &module);

    DWORD PathLen = GetModuleFileNameEx(hProcess, module, AgentPath, _MAX_PATH);

    /*CloseHandle(module);

    CloseHandle(hProcess);*/

    if (_splitpath_s(AgentPath, drive, _MAX_PATH, dir, _MAX_PATH, fname, _MAX_PATH, ext, _MAX_PATH) != 0)

    {

    OutputDebugString("ERROR: GenericSNMPAgent::SnmpExtensionInit. Failed to split DLL path");

    return FALSE;

    }

    sprintf_s(APIPath, "%s%s%s", drive, dir, GENERIC_SNMP_AGENT_API_NAME);

    sprintf_s(Log, "INFO: Loading %s", APIPath);

    OutputDebugString(Log);

    //sprintf_s(AgentPath, “%s%s”, drive, dir);

    //sprintf_s(Log, “INFO: SetDllDirectory %s”, AgentPath);

    //OutputDebugString(Log);

    //if (!SetDllDirectory(AgentPath))

    //OutputDebugString(“ERROR: Failed on SetDllDirectory”);

    mGenericAPIDLL = LoadLibrary(APIPath);

    2) API indeed loads with the second agent and tries to load the specific SNMP agent, but the following (erroneous) fusion log is received:

    *** Assembly Binder Log Entry  (16/05/2007 @ 15:17:03) ***

    The operation failed.

    Bind result: hr = 0x80070002. The system cannot find the file specified.

    Assembly manager loaded from:  C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\mscorwks.dll

    Running under executable  C:\WINDOWS\System32\snmp.exe

    — A detailed error log follows.

    === Pre-bind state information ===

    LOG: User = NT AUTHORITY\SYSTEM

    LOG: DisplayName = GenericSNMPAgentAPI, Version=0.0.0.0, Culture=neutral, PublicKeyToken=e7ae14dcac16fb18

    (Fully-specified)

    LOG: Appbase = file:///C:/WINDOWS/System32/

    LOG: Initial PrivatePath = NULL

    LOG: Dynamic Base = NULL

    LOG: Cache Base = NULL

    LOG: AppName = snmp.exe

    Calling assembly : (Unknown).

    ===

    LOG: This bind starts in default load context.

    LOG: No application configuration file found.

    LOG: Using machine configuration file from C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\config\machine.config.

    LOG: The same bind was seen before, and was failed with hr = 0x80070002.

    ERR: Unrecoverable error occurred during pre-download check (hr = 0x80070002).

    3) Both XSNMPAgent and YSNMPAgent fail to load with the same exception if “API” is put in the GAC. (Though there is no binding error for “API” this time.)

    4) Tried separating into two different application domains; both failed.

    5) All dll’s (except “Native”) are strongly-named.

    6) Tried LoadLibraryEx; didn’t help.

    7) Skipping the “IMDSNMPAgent” cast and using reflection method invocation worked only partially (it pops an ArgumentException for some of the transferred parameters).

    8) If I debug and watch the contents of _Object^ temp_ (before the cast), I do see the various data members of the created YSNMPAgent instance, but again, the cast fails.

    9) This is frustrating 🙁

    Please help… For a supporting diagram, please refer to: http://www.imagestation.com/7952793/3917319103

    Thanks!

  10. jdk99 says:

    Hi Suzanne & fellow readers,

    I ran into this issue using the Load(byte[]) context.  As mentioned in the article, each time the assembly was needed to resolve a type (for deserialization, in my case) a new Assembly instance was being created.  This took a few hours to finally figure out because the debugger was telling me that the assemblies were equal, even if the runtime asserts were failing.

    Anyway, because I needed to cast objects correctly without getting this error, I have overridden this default behavior in the AssemblyResolve handler for my AppDomain by maintaining a Dictionary<string, Assembly>.  When being asked to resolve an assembly, I first check in my dictionary to see if the AssemblyFullName has already been loaded by name, and if it has, return that previously built instance.  (I’m careful to use the full name of the loaded assembly rather than the event argument’s request, in case they would ever be different)

    This solved my immediate problem beautifully, but I’m concerned that the default behavior is the way it is for a good reason!  What kind of mayhem am I creating for myself by overriding this behavior and returning the previously loaded Assembly instance upon subsequent calls?

    Thanks,

    J
