How fast is interop code?


How fast is interop code? If you’re in one kind of code and you’re calling another, what is the cost of the interop?


 


For example, .Net code can call native C++ code (like Windows APIs) and vice versa. The same goes for Foxpro and C++ code. .Net code is often referred to as Managed code because much is managed for the programmer, such as memory allocation. That leaves C++ code to be called “Unmanaged”. An easy way to interop with C++ code is to use COM (Component Object Model, or sometimes ActiveX) as glue. Whether it’s COM calling .Net or vice versa, the managed boundary is traversed twice: there and back. The same is true of Fox code calling COM code.


 


Fox code calling .Net code (e.g. a COM object written in Visual Basic; see “A Visual Basic COM object is simple to create, call and debug from Excel”) will have both Fox to COM and COM to .Net interop.


 


I want to measure raw interop performance, so I want to remove memory allocation and Unicode/String marshalling issues from the tests. I want to have a loop on one side call a very fast method on the other, so that most of the execution time is in the interop, not the loop or the method call. I want to use in-process, same thread calls, so remote procedure calls/marshalling are not being measured.
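To make the measurement idea concrete, here is a minimal C++ sketch of the kind of loop I have in mind: call a trivial function many times, keep a running total, and time the whole loop. The names NextInt, TimeLoop and LoopResult are made up for this illustration and are not part of the control built below. An optimizing compiler may inline a call like this, so treat it as a shape for the test rather than a measurement of anything in particular.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for the "very fast method on the other side".
static long g_counter = 0;
long NextInt() { return ++g_counter; }

struct LoopResult {
    std::int64_t sum;   // running total, so the calls do real work
    double seconds;     // wall-clock time for the whole loop
};

// Call fn nTimes through a function pointer and time the loop.
LoopResult TimeLoop(long (*fn)(), long nTimes) {
    using Clock = std::chrono::steady_clock;
    std::int64_t sum = 0;
    const auto start = Clock::now();
    for (long i = 0; i < nTimes; ++i) {
        sum += fn();
    }
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    return {sum, elapsed.count()};
}
```

Dividing the elapsed seconds by nTimes gives a rough per-call cost; the sum is only there so the loop has a result that can be checked.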


 


We’ll create a native C++ method that just returns consecutive integers. A simple loop in the .Net or Fox client that calls this method and keeps a running total would be a good perf test.


 


Start with this sample ActiveX control code: Create an ActiveX control using ATL that you can use from Fox, Excel, VB6, VB.Net. You don’t need the events and methods from that sample, just the control itself.


 


(If you’re using VS2008, in the ATL Project wizard, select DLL and just choose “Finish”. When adding a method in Class View, make sure to choose the ITestCtrl Interface (defined in MyCtrl.IDL), not the ITestCtrl VCCodeStruct defined in MyCtrl_i.h. Similarly, if you’re adding an event, make sure to choose the _ITestCtrlEvents interface under MyCtrlLib in Class View. Also, you need to run the “Implement ConnectionPoint Wizard” and change the call to “Fire_MyEvent”, see http://msdn.microsoft.com/en-us/library/9h7xedd1.aspx)


 


When COM code is called from VB.Net or FoxPro, the calls are not quite direct: COM is used for creating and initializing the object, and some parameter/return value massaging is required per call. After that, the call is a straight virtual function (vtable) call, either to IDispatch (late bound) or to the object’s own IUnknown-derived interface (early bound). In other words, the performance will be slower than a direct PInvoke or DECLARE DLL call.
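The difference between the two kinds of call can be sketched in portable C++ without any COM headers. This is a greatly simplified simulation, not the real IDispatch: the interface names ICounter and IDispatchLike, the Counter class and the CallByName helper are all hypothetical, and the real IDispatch also packs arguments into VARIANTs, which is not shown here.

```cpp
#include <cassert>
#include <map>
#include <string>

// Early bound: the caller knows the interface layout at compile time,
// so a call is one indirect jump through the vtable.
struct ICounter {
    virtual long RetInt() = 0;
    virtual ~ICounter() = default;
};

// Late bound (IDispatch-like, simplified): the caller only knows the
// method name, so it must look up an id and then invoke by id.
struct IDispatchLike {
    virtual bool GetIdOfName(const std::string& name, int* id) = 0;
    virtual bool Invoke(int id, long* result) = 0;
    virtual ~IDispatchLike() = default;
};

constexpr int kRetIntId = 1;  // hypothetical dispatch id for RetInt

class Counter : public ICounter, public IDispatchLike {
public:
    long RetInt() override { return ++value_; }

    bool GetIdOfName(const std::string& name, int* id) override {
        const auto it = ids_.find(name);
        if (it == ids_.end()) return false;
        *id = it->second;
        return true;
    }

    bool Invoke(int id, long* result) override {
        if (id != kRetIntId) return false;
        *result = RetInt();
        return true;
    }

private:
    long value_ = 0;
    std::map<std::string, int> ids_{{"RetInt", kRetIntId}};
};

// What a late-bound caller does on every call unless it caches the id:
// a name lookup, then an invoke by id, then the actual method.
bool CallByName(IDispatchLike& obj, const std::string& name, long* result) {
    int id = 0;
    return obj.GetIdOfName(name, &id) && obj.Invoke(id, result);
}
```

An early-bound client would hold an ICounter reference and call RetInt directly, while a late-bound client would go through CallByName(obj, "RetInt", &value) and pay for the lookup each time.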


 


Let’s add a simple method RetInt with no parameters that just returns an int. Add a method to our COM Control by right clicking on the ITestCtrl interface in Class View and choosing Add->Method to start the “Add Method Wizard”


 


By convention, COM interface methods return HRESULTs, so to return a value an additional parameter is added, marked with the RetVal attribute, and passed by reference. So make the Method Name “RetInt”, the Parameter type “LONG *”, and the Parameter Name “RetVal”. Check the Retval checkbox, then choose “Add” to add the parameter to the method.
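Here is a small portable sketch of that pattern: the status travels in the return value and the result travels through an out pointer, and a client-side wrapper turns that back into an ordinary return value. The names Hresult, kOk, kPointerError, RetIntRaw and RetIntWrapped are stand-ins invented for this example so it compiles without the Windows headers; the wrapper is only my approximation of what a language binding does with a retval parameter.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Portable stand-ins for the Windows SDK definitions (assumptions for
// this sketch; on Windows you would use HRESULT, S_OK and E_POINTER).
using Hresult = std::int32_t;
constexpr Hresult kOk = 0;
constexpr Hresult kPointerError = -2147467261;  // bit pattern 0x80004003

// COM-style method: status in the return value, result via out pointer.
Hresult RetIntRaw(long* retVal) {
    static long counter = 0;
    if (retVal == nullptr) return kPointerError;
    *retVal = ++counter;   // just return consecutive integers
    return kOk;
}

// Roughly what a client-side wrapper does with a retval parameter:
// surface the out parameter as the return value, and turn a failure
// code (negative value) into an error.
long RetIntWrapped() {
    long value = 0;
    const Hresult hr = RetIntRaw(&value);
    if (hr < 0) throw std::runtime_error("RetInt failed");
    return value;
}
```

This is why the Fox and VB clients later in the post can write a plain RetInt() call even though the C++ signature takes a LONG* parameter.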


 


Add another method DoSum similarly. This method will run with no interop, so we have a baseline for comparison. (It runs the loop multiple times because it goes so much faster, but the timing measurement divides out the multiple runs.)


 


The resulting code is added to TestCtrl.CPP. Add the implementation:


 


static LONG g_Int = 0;

STDMETHODIMP CTestCtrl::RetInt(LONG* RetVal)
{
      *RetVal = ++g_Int;      // just return consecutive integers
      return S_OK;
}

// DoSum will calculate the value with no interop whatsoever
STDMETHODIMP CTestCtrl::DoSum(LONG nTimes,LONG nInternalLoopCount, DOUBLE* Retval)
{
      LONGLONG nSum = 0;      // initialized in case nInternalLoopCount is 0
      for (LONG j = 0 ; j < nInternalLoopCount ; j++) // this code runs so fast we have to do it multiple times
      {
            nSum = 0;
            for (LONG i = 1 ; i <= nTimes ; i++)
            {
                  nSum += i;
            }
      }
      *Retval = (DOUBLE)nSum;
      return S_OK;
}

// RetIntStatic can be called directly via PInvoke or Declare Dll
extern "C" HRESULT __declspec(dllexport) WINAPI RetIntStatic(LONG *RetVal)
{
      *RetVal = ++g_Int;
      return S_OK;
}


 


 


You can add more methods, like a way to reset g_Int to get more accurate results, but I don’t really care about the results, just how long it takes to get them.


 


Of course, you’ll want to run perf tests using optimized Release builds, so you’re not including debug asserts, etc. A really smart optimizing compiler would remove the loops in DoSum altogether!
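To see what “removing the loops” could look like, compare the loop with its closed form. The sketch below is an illustration with made-up function names (SumByLoop, SumClosedForm, SumByLoopNoElide); it is not code from the control. The volatile accumulator is one common way to keep a compiler from collapsing a benchmark loop, at the cost of forcing a memory access per iteration.

```cpp
#include <cassert>
#include <cstdint>

// The loop DoSum runs: add 1..n into a 64-bit total.
std::int64_t SumByLoop(std::int32_t n) {
    std::int64_t sum = 0;
    for (std::int64_t i = 1; i <= n; ++i) sum += i;
    return sum;
}

// What the loop is mathematically equivalent to: n * (n + 1) / 2.
std::int64_t SumClosedForm(std::int32_t n) {
    return n <= 0 ? 0 : static_cast<std::int64_t>(n) * (n + 1LL) / 2;
}

// A volatile accumulator forces a real load and store per iteration,
// so the loop cannot be folded away by the optimizer.
std::int64_t SumByLoopNoElide(std::int32_t n) {
    volatile std::int64_t sum = 0;
    for (std::int64_t i = 1; i <= n; ++i) sum = sum + i;
    return sum;
}
```

With the nLoops value of 1000000 used in the tests below, both forms give 500000500000, which is also why the running total needs 64 bits.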


 


If you have Foxpro, try running this Fox code. Notice that DoLoop can take either the Form or the Control as a parameter. There’s a RetInt method on each.


 


 


CLEAR ALL
CLEAR
MODIFY COMMAND PROGRAM() NOWAIT

_screen.FontName="Courier New"      && Make font monospace, not proportional
SET DECIMALS TO 6
g_Int=0
ox=CREATEOBJECT("MyForm")
*ox.visible=1
nLoops=1000000
nInternalLoopCnt=1000
ns=SECONDS()
zObj=ox.oc  && use temp var so we don't deref ox.oc in loop
r = zObj.DoSum(nLoops,nInternalLoopCnt)
?"Internal DoSum ",r,(SECONDS()-ns)/nInternalLoopCnt

?DoLoop(ox,nLoops,   "With  No Interop" )

?DoLoop(ox.oc,nLoops,"With COM Interop")

*Use early binding:
oy=CREATEOBJECTEx("MyCtrl.TestCtrl","","")
?DoLoop(oy,nLoops,"With COM Interop Early Bound")

*Try Declare DLL: like PInvoke
DECLARE  integer _RetIntStatic@4 IN "d:\dev\vc\myctrl\release\myctrl.dll" as RetIntStatic integer @ Retval

?DoLoopStatic(nLoops,"With DeclareDLL interop")


 


 


 


      FUNCTION   DoLoop(zObj as object,nTimes as Integer, sDesc as String) as String
            LOCAL nSum
            nSum=0
            ns=SECONDS()
            FOR i = 1 TO nTimes
                  nSum = nSum + zObj.RetInt()
            ENDFOR
            RETURN sDesc+" Sum= "+TRANSFORM(nSum) + " "+ TRANSFORM(SECONDS()-ns)
      ENDFUNC

      FUNCTION   DoLoopStatic(nTimes as Integer, sDesc as String) as String
            LOCAL nSum
            nSum=0
            nRetval=0
            ns=SECONDS()
            FOR i = 1 TO nTimes
                  RetIntStatic(@nRetval)
                  nSum = nSum + nRetval
            ENDFOR
            RETURN sDesc+" Sum= "+TRANSFORM(nSum) + " "+ TRANSFORM(SECONDS()-ns)
      ENDFUNC


 


DEFINE CLASS MyForm as Form
      ADD OBJECT OC as olecontrol WITH ;
            oleClass="MyCtrl.TestCtrl",;
            height=200,width=300
      left=200
      AllowOutput=.f.
      PROCEDURE RetInt as Integer
            g_Int = g_Int+1
            RETURN g_Int
      ENDPROC
ENDDEFINE


 


 


 


 


The DoSum call (Fox and VB) was consistent as expected: they both execute in about the same time because there is only one interop call.


 


I consistently saw the COM Interop loop taking about 50% longer than the non interop loop. This makes sense. The code that calls the COM object has to deal with all sorts of parameter types, marshalling, etc. The non interop did the entire calculation within Fox code.


 


The DoSum method, which has its own internal loop to do the calculation and does NO interop of any kind inside that loop, runs roughly 2000 times faster. That implies that each iteration of the interop loop executes about 2000 times more instructions.


 


Now I want to run a similar test using VB.Net. Let’s add a new project to the ActiveX control project from above.


 


Choose the Solution Explorer, right click on the solution, choose Add New Project, VB->Windows Forms Application. I put my VB Project within the folder of the TestCtrl project.


 


Right click on the project, and choose “Set As Startup Project” so hitting F5 will start this project.


If you’re on a 64 bit OS, then make sure you target x86 (Project->Properties->Compile->Advanced Compile Options->Target CPU->x86).


 


 


Add the ActiveX control to your toolbox: Right click on the toolbox, choose Choose Items, then on the COM Components tab select the TestCtrl class.


Now drag the control from the toolbox onto the form. Double click on the form and paste in this code:


 


Public Class Form1
    'Note the path: "..\..\..\Release\MyCtrl.dll"
    <Runtime.InteropServices.DllImport( _
            "..\..\..\Release\MyCtrl.dll", _
           CallingConvention:=Runtime.InteropServices.CallingConvention.Winapi, _
           EntryPoint:="_RetIntStatic@4")> _
    Friend Shared Function RetIntStatic(ByRef RetVal As Integer) As Integer

    End Function


 


 


    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Dim nLoops = 1000000
        Dim nInternalLoopCnt = 1000

        Dim sStopWatch = Stopwatch.StartNew
        Dim r = Me.AxTestCtrl1.DoSum(nLoops, nInternalLoopCnt)
        Console.WriteLine("Internal DoSum Native=" + r.ToString + " " + (sStopWatch.ElapsedMilliseconds / 1000 / nInternalLoopCnt).ToString)

        sStopWatch = Stopwatch.StartNew
        r = Me.DoSum(nLoops, nInternalLoopCnt)
        Console.WriteLine("Internal DoSum .Net  =" + r.ToString + " " + (sStopWatch.ElapsedMilliseconds / 1000 / nInternalLoopCnt).ToString)

        Console.WriteLine(DoLoop(Me, nLoops, "With Late bound No interop, calling local VB.Net method"))
        Console.WriteLine(DoLoop(Me.AxTestCtrl1, nLoops, "With Late bound COM interop"))

        Console.WriteLine(DoLoopEarlyForm(Me, nLoops, "With Early bound No interop, calling local VB.Net method"))
        Console.WriteLine(DoLoopEarlyCtrl(Me.AxTestCtrl1, nLoops, "With Early bound COM interop"))

        Console.WriteLine(DoLoopPInvoke(nLoops, "With PInvoke interop"))

        End
    End Sub


 


    Function DoLoop(ByVal zObj As Object, ByVal nTimes As Integer, ByVal sDesc As String) As String
        Dim nSum = 0L
        Dim sStopWatch = Stopwatch.StartNew
        For i = 1 To nTimes
            nSum += zObj.RetInt     'late bound call on Object: requires Option Strict Off
        Next
        Return sDesc + " Sum = " + nSum.ToString + " " + (sStopWatch.ElapsedMilliseconds / 1000).ToString()
    End Function

    Function DoLoopEarlyForm(ByVal zObj As Form1, ByVal nTimes As Integer, ByVal sDesc As String) As String
        Dim nSum = 0L
        Dim sStopWatch = Stopwatch.StartNew
        For i = 1 To nTimes
            nSum += zObj.RetInt
        Next
        Return sDesc + " Sum = " + nSum.ToString + " " + (sStopWatch.ElapsedMilliseconds / 1000).ToString()
    End Function

    Function DoLoopEarlyCtrl(ByVal zObj As AxMyCtrlLib.AxTestCtrl, ByVal nTimes As Integer, ByVal sDesc As String) As String
        Dim nSum = 0L   'L for Long so it doesn't overflow 32 bits
        Dim sStopWatch = Stopwatch.StartNew
        For i = 1 To nTimes
            nSum += zObj.RetInt
        Next
        Return sDesc + " Sum = " + nSum.ToString + " " + (sStopWatch.ElapsedMilliseconds / 1000).ToString()
    End Function

    Function DoLoopPInvoke(ByVal nTimes As Integer, ByVal sDesc As String) As String
        Dim nSum = 0L   'L for Long so it doesn't overflow 32 bits
        Dim sStopWatch = Stopwatch.StartNew
        For i = 1 To nTimes
            Dim RetVal = 0
            RetIntStatic(RetVal)
            nSum += RetVal
        Next
        Return sDesc + " Sum = " + nSum.ToString + " " + (sStopWatch.ElapsedMilliseconds / 1000).ToString()
    End Function


 


    Private Shared g_Int As Long
    Public Function RetInt() As Long
        g_Int += 1
        Return g_Int
    End Function

    Function DoSum(ByVal nTimes As Long, ByVal nInternalLoopCount As Long) As Double
        Dim nSum As Long
        For j = 1 To nInternalLoopCount ' calculated multiple times because it's fast
            nSum = 0
            For i = 1 To nTimes
                nSum += i
            Next

        Next
        Return nSum
    End Function
End Class


 


Run the code with the Output Window visible. Here, the VB code with interop ran maybe 40% slower than the VB code without it, and several times slower than the Fox code. I realized that this was because of the late bound calls the VB code makes. The VB DoLoop method takes zObj as an Object, and I invoke the RetInt method on it. That means the VB runtime late binder code is called to reflect on the object and see if it has a RetInt method that can be called. Both the Form and the control have a method with this name. The late binding was code that I didn’t want to measure, so I added some strongly typed methods that force the calls to be early bound direct calls, which were much faster. For the non-interop code doing the entire calculation within VB, the late bound version was around 1000 times slower than the early bound, due to the late binder code. For the interop case, the late bound version was about 30 times slower than the early bound.


 


(Comparing .Net speed with native, the DoSum call (with no interop at all) in .Net was almost 3 times slower than Native, but that’s expected too:  native code runs faster than managed.)


 


These early bound calls are several times faster than the Fox code too: they don’t have to do any parameter packing/checking.


 


However, even when the Fox code uses early binding, Fox still has to do a lot of parameter translation between Fox types and COM types.


 


The Fox and VB calls via PInvoke/Declare DLL were the fastest of all. They have to do the least parameter translation/packing/checking. This makes sense: the method call is declared to have N parameters of certain types, so less work needs to be done.
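Both the Fox DECLARE and the VB DllImport above name the entry point as _RetIntStatic@4 rather than RetIntStatic. That comes from how 32-bit x86 builds decorate an extern "C" function that uses the __stdcall convention (which is what WINAPI expands to there): a leading underscore, the name, an @ sign, and the number of bytes of arguments. RetIntStatic takes one pointer, which is 4 bytes on x86. The helper below is only a hypothetical illustration of that rule (StdcallDecoratedName is my own name for it), and the decoration does not apply to 64-bit builds.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Build the 32-bit __stdcall decorated name for an extern "C" function:
// leading underscore, the name, '@', then the argument bytes in decimal.
std::string StdcallDecoratedName(const std::string& name, std::size_t argBytes) {
    return "_" + name + "@" + std::to_string(argBytes);
}
```

For example, StdcallDecoratedName("RetIntStatic", 4) gives the entry point string used in the Fox and VB declarations.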


 


 


 


Using ILDasm to see the IL for the VB RetInt method, you can see that there isn’t much code. The Fox code for RetInt, however, causes much more code to run.


 


 


.method public instance int64  RetInt() cil managed
{
  // Code size       24 (0x18)
  .maxstack  2
  .locals init ([0] int64 RetInt)
  IL_0000:  nop
  IL_0001:  ldsfld     int64 WindowsApplication1.Form1::g_Int
  IL_0006:  ldc.i4.1
  IL_0007:  conv.i8
  IL_0008:  add.ovf
  IL_0009:  stsfld     int64 WindowsApplication1.Form1::g_Int
  IL_000e:  ldsfld     int64 WindowsApplication1.Form1::g_Int
  IL_0013:  stloc.0
  IL_0014:  br.s       IL_0016
  IL_0016:  ldloc.0
  IL_0017:  ret
} // end of method Form1::RetInt


 


 


 


Or use the debugger to see the native code in DoSum (cdq is Convert Doubleword to Quadword: it sign-extends EAX into EDX:EAX):


       LONGLONG nSum=0;
       for (LONG i = 1 ; i <= nTimes ; i++)
       {
692B1DF1 8B C1            mov         eax,ecx
692B1DF3 99               cdq
692B1DF4 03 D8            add         ebx,eax
692B1DF6 13 EA            adc         ebp,edx
692B1DF8 8D 41 01         lea         eax,[ecx+1]
692B1DFB 99               cdq
692B1DFC 03 F0            add         esi,eax
692B1DFE 8B 44 24 20      mov         eax,dword ptr [esp+20h]
692B1E02 13 FA            adc         edi,edx
692B1E04 83 C1 02         add         ecx,2
692B1E07 48               dec         eax
692B1E08 3B C8            cmp         ecx,eax
692B1E0A 7E E5            jle         CTestCtrl::DoSum+21h (692B1DF1h)
692B1E0C 3B 4C 24 20      cmp         ecx,dword ptr [esp+20h]
692B1E10 7F 0B            jg          CTestCtrl::DoSum+4Dh (692B1E1Dh)
       {
              nSum += i;
692B1E12 8B C1            mov         eax,ecx
692B1E14 99               cdq
692B1E15 89 44 24 10      mov         dword ptr [esp+10h],eax
692B1E19 89 54 24 14      mov         dword ptr [esp+14h],edx
       }


 


 


 


This (optimized) code sums 32 bit values to a 64 bit running sum, so you can see instructions like “ADC”, which is AddWithCarry.
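The cdq / add / adc sequence can be mimicked in C++ by keeping the 64 bit total in two 32 bit halves. This is just a sketch of what those three instructions accomplish together, with hypothetical names (Sum64, AddInt32, ToInt64); it is not how you would write the sum in practice, since the compiler does this for you.

```cpp
#include <cassert>
#include <cstdint>

// A 64-bit running total kept as two 32-bit halves, like a register pair.
struct Sum64 {
    std::uint32_t lo = 0;
    std::uint32_t hi = 0;
};

// Add a signed 32-bit value to the 64-bit total.
void AddInt32(Sum64& total, std::int32_t value) {
    const std::uint32_t valueLo = static_cast<std::uint32_t>(value);
    // cdq: replicate the sign bit of the value into the upper 32 bits.
    const std::uint32_t valueHi = value < 0 ? 0xFFFFFFFFu : 0u;
    // add: low halves; the result wraps if there is a carry out.
    const std::uint32_t newLo = total.lo + valueLo;
    const std::uint32_t carry = newLo < valueLo ? 1u : 0u;
    total.lo = newLo;
    // adc: high halves plus the carry from the low add.
    total.hi = total.hi + valueHi + carry;
}

// Reassemble the halves into one 64-bit value for checking.
std::int64_t ToInt64(const Sum64& total) {
    return static_cast<std::int64_t>(
        (static_cast<std::uint64_t>(total.hi) << 32) | total.lo);
}
```

Summing 1 through 100000 this way gives 5000050000, which no longer fits in 32 bits, so the high half ends up as 1.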


 


As an exercise, on 64 bit, create code like DoSum that natively handles 64 bit ints (or modify this code to use just 32 bits). You’ll see that the loop is trivial.


Hint: make sure you have the 64 bit tools installed.


 


 

Comments (22)

  1. Jerzy says:

    But, where is the conclusion

    How fast is interop code?

    Still do not know

  2. Dataland says:

    Thanks for the interesting and well thought-out post.  Overall I agree with your objective and methodology on measuring pure interop overhead.  But, string marshalling and memory allocations are very real costs of most interop code I’ve seen, so I think in all fairness they should be included, to some degree, in a real-world discussion/measurement of interop performance.  Again, thanks for the analysis and post!

  3. MSbassSinger says:

    I am unclear as to why you would use late binding in VB.  Except for plugin scenarios where the VB code is not supposed to know what components it is loading (usually only a defined plugin interface), any good VB programmer always early-binds a COM DLL or OCX.

    Did I miss something here?  If you want to compare speed, wouldn’t you compare early bound objects in VB?

  4. Bhavesh Vaghela says:

    Nice Article. Good Comparison.

    Once I was working on a project for which ran into performance related issues due to parameter checking/packing because I was passing too many parameters and moreover I was calling MFC COM code from VB6 which was taking too much. So I converted my whole VB6 ActiveX  to MFC ActiveX project to overcome the performance issue. And you wont believe the application was 73 times faster.

  5. Paul Sanders says:

    Do you have any actual figures?  Some timings on a reasonably fast modern processor would suit me, compared with a simple method call in entirely unmanaged code (and indeed with a method call in entirely managed code).

    I have to say though, that if I was designing a ‘mixed mode’ program, I would try to minimise the number of transitions on general principle.  The rationale for doing things in unmanaged code is generally for performance reasons, is it not, so it would be a shame to throw that all away in overheads.

    Paul Sanders

    http://www.alpinesoft.co.uk

  6. RexNFX says:

    So the moral of the story is to reduce the number of calls to interop. It sure would be cool to see some examples of that strategy. Good article though. Gets the right idea across.

  7. Steve Floyd says:

    Excellent build-up, but there was no crescendo.

    Where are your conclusions (timings)?

  8. SednaY says:

    If the point is to prove it’s better to minimize interop and COM calls, I think you’ve made it.  What does it mean in practical terms though?  If I make one interop call to VFP from ASP.net and VFP does 100 calculations within itself (using the built-in db/cursor engine) and returns the result, couldn’t this still be faster than doing the same sort of routine in managed code using a SQl backend??  in that case interop might still be the way to go?

  10. Dmitri says:

    I think there are no figures published because when you register VS, agree to license terms that prohibit to one to use the product for any kind of benchmarking or performance-related comparisons of the Microsft technologies. Read the license terms 🙂

  11. DLetcher says:

    If you are using C++ and going from managed to unmanaged code I would avoid COM.  That just adds an extra layer of marshalling.  The VC++ compiler in VisualStudio 2005/2008 has efficient C++ .Net Interop code generation.  

    Though, if you are using higher level languages using COM might be the only way to go.

  12. Ted says:

    Any chance MS will release a code generator that automatically generates a C++ interop wrapper for a win32 call?  This C++ wrapper could be included in a C# project for example.

    Generally, MS could make it a goal to have successive parts of the win32 api replaced with .NET equivalents over the next 2 years.

    Specialized APIs, such as device driver ones, are excluded.

  13. Marc says:

    If you are concerned about performance, why use .NET in the first place ?

  14. Calvin_Hsia says:
  15. Ramses Reinoso says:

    Hello Calvin,

    I’ve been a VFP programmer for at least 13 years. Thanks for being there and sharing your ideas. My question is, which language are you most comfortable with? C#, C++, VB, etc. (to build desktop applications, of course).  And if you were thinking of replacing VFP (in my case my only programming language), which one would you suggest shifting to?

    Thanks,

    Ramses Reinoso

  16. superjimbob says:

    Of course interop will have greater impact with the number of parameters also.

    So when looking for speed, sometimes it pays to serialise parameters to an XML string and only marshall the one string parameter ;o)

  18. When running VB.Net or C# code, often it’s useful to call native code. One way is using PInvoke. (for

  19. Tracy says:

    Interesting.  Maybe you have an idea on a problem I am experiencing.  I’ve searched high and low to no avail 🙁  I am successfully interacting with a .net dll via com interop. However, all of the calls are processed synchronously. I really need VFP processing to continue and not wait for a response. Is there a way to call a method in a .net dll via com interop asynchronously? I want to make the call to the method but continue on with the next line in vfp code regardless of the response or whether or not a response is returned. I do need to act on the response though in some cases. Most of my required actions are handled via events which works fine, but there are a couple of cases where I actually need to a return value after calling the method. In most cases though I need to just continue on with the next line of code without waiting for a response.  Any ideas?

  20. Calvin_Hsia says:

    See this post for ideas to get started:

    Create a .Net UserControl that calls a web service that acts as an ActiveX control to use in Excel, VB6, Foxpro

    http://blogs.msdn.com/calvin_hsia/archive/2006/07/14/665830.aspx