See how often your code runs and how much time it takes

Last post I showed how to create your own CLR profiler.

This time, we’ll add a little code to run whenever a managed method is entered and exited. We’ll do this by signing up for the  Enter and Leave function hooks by calling ICorProfilerInfo2->SetEnterLeaveFunctionHooks2 . This will allow us to count how many times a function is called and roughly measure how much time is spent in the function.

The ClrProfiler.cpp code below is similar to last post. I’ve added some code that logs when a method is entered and left.

To run this code, follow the instructions from last post, then replace the CLRProfiler.cpp code with the code below. The ClrLauncher code is unchanged.

This new version of the CLRProfiler.cpp is almost the same as last time’s: the modifications are surrounded by /*<Modified>*//*</Modified>*/

You’ll note that before and after a managed method body is executed, the CLR will call the functions I specify. My code is called with no prolog or epilog code, so the _declspec(naked) needs to be used and any registers used must be saved/restored.

“MyFunctionEnter” and “MyFunctionLeave” just push the parameters, save the registers, then call MyFunctionEnter2, which does the work of logging the call and the amount of time spent in the call. The time calculation is very basic. Upon entry, GetTickCount() is called and the value recorded. Upon exit, the difference between the current tick count and the recorded value is accumulated.

You’ll notice that when you run your code under this profiler, performance is quite a bit slower. Build & run the Release configuration for an improvement

When you exit the profiled app, 2 files are opened in Excel. “LogOutputFunctionMap” and “LogOutputClassMap”

The Class map is the same as from last time. The FunctionMap shows the entrance and exit of each function.

Once Excel opens, hit Alt-N T (to insert a table) then hit Enter to use the entire data set as a table. Click on the down arrow in a column header to sort, filter or search.

Here’s a sample of the Method Enter/Leave and timing data from Excel sorted by longest Ticks.

clip_image001[4]

The enter and leave numbers mostly match, which makes sense. Where they don’t match: the wndproc was entered 1 more time than exited because it processed the “Quit” message.

The GetRect was called 100,301 times, but only took about 660 milliseconds, but the tmr_Tick method took 1724 msecs.

The time reported in the function is inclusive: it includes time spent in called functions.

If you set the environment variables, then launch Visual Studio under this profiler:

set COR_ENABLE_PROFILING=1

set COR_PROFILER={EA6ED0D7-6196-457D-9A2D-B00F1CC3EA8E}

set COR_PROFILER_PATH=c:\users\calvinh\documents\visual studio 2013\Projects\ClrLauncher\Debug\ClrProfiler.dll

VS will be *very* slow because every managed method entry/exit is logged. Also, the name of the executable, like “devenv” or “ClrLauncher” is appended to the file name because VS can launch multiple child processes, which will inherit the environment variables and also be profiled.

When I load a C# WPF app, I get multiple pairs of Profile output for each of “csWPFApplication1.vshost”, “devenv”, “Microsoft.Alm.Shared.Remoting.RemoteContainer”, “IntelliTrace”

<Code>

 // ClrProfiler.cpp : Defines the exported functions for the DLL application.
// File->New Project->VC++->Win32 Project "ClrProfiler"
// in the Win32 Application Wizard, choose application type: Dll, Empty Project.
// Right click on Solution Explroer->Source Files->Add New Item->C++ File. "ClrProfiler.cpp"
//


#include "atlsafe.h"
#include "atlcom.h"

#include <cor.h>
#include <corprof.h>

#include "string"
#include "unordered_map"
#include "map"
#include "iostream"
#include "fstream"

#include "shlobj.h"
#include "shellapi.h"

using namespace std;

//using namespace mscorlib;

#define MAXCLASSNAMELEN 900

#define PASTE2(x,y) x##y
#define PASTE(x,y) PASTE2(x,y)
#define __WFUNCDNAME__ PASTE(L, __FUNCDNAME__)

#define PROF_NOT_IMP(methodName, ...) \
  STDMETHOD(methodName) (__VA_ARGS__) \
{ \
if (m_fLoggingOn)  LogOutput(__WFUNCDNAME__);  \
  return S_OK;  \
} \

// let's define a Guid that represents our profiler
// VS->Tools->Create GUID
#include "initguid.h"
// {EA6ED0D7-6196-457D-9A2D-B00F1CC3EA8E}
DEFINE_GUID(CLSID_MyProfiler,
  0xea6ed0d7, 0x6196, 0x457d, 0x9a, 0x2d, 0xb0, 0xf, 0x1c, 0xc3, 0xea, 0x8e);

// a class that holds information about CLR objects
class ClrClassStats {
public:
  wstring className; // name of class
  int nInstances; // # of instances ever created
  long nSize; // total size (some instances vary in size, like arrays and string)
};
CComQIPtr<ICorProfilerInfo2 > g_pCorProfilerInfo; // can be null


/*<Modified>*/
// Sections of code marked "Modified" are modified from https://blogs.msdn.com/b/calvin_hsia/archive/2014/01/30/10495751.aspx 
struct less_wstringCaseInsensitive
{
  bool operator() (const std::wstring &s1, const std::wstring &s2) const
  {
    return _wcsicmp(s1.c_str(), s2.c_str()) < 0;
  }
};

class FunctionCallInfo
{
public:
  wstring ModuleName;
  wstring ClassName;
  wstring MethodName;
  wstring fullname() const
  {
    return ModuleName + L"!" + ClassName + L"." + MethodName;
  }
  int nEnter;
  int nLeave;
  int TicksAtEntry;
  int TicksTotal;
};

unordered_map<FunctionID, FunctionCallInfo> g_MapFunctionCalls;
CComAutoCriticalSection g_csectFunctionMap;


void __stdcall MyFunctionEnter2(
  /*[in]*/FunctionID functionID,
  /*[in]*/UINT_PTR clientData,
  /*[in]*/COR_PRF_FRAME_INFO func,
  /*[in]*/COR_PRF_FUNCTION_ARGUMENT_INFO   *argumentInfo)
{
  FunctionCallInfo *pFunctionCallInfo = nullptr;
  g_csectFunctionMap.Lock();
  auto res = g_MapFunctionCalls.find(functionID);
  if (res != g_MapFunctionCalls.end())
  {
    pFunctionCallInfo = &res->second;
  }
  else
  {
    ClassID classID;
    ModuleID moduleID;
    mdToken mdtokenMethod;
    HRESULT hr = S_OK;
    CComPtr<IMetaDataImport> pIMetaDataImport = 0;
    hr = g_pCorProfilerInfo->GetFunctionInfo2(functionID, 0, &classID, &moduleID, &mdtokenMethod, 0, 0, 0);
    hr = g_pCorProfilerInfo->GetModuleMetaData(moduleID, ofRead, IID_IMetaDataImport, (LPUNKNOWN *)&pIMetaDataImport);
    mdToken mdTokClass = 0;
    WCHAR szClassName[MAXCLASSNAMELEN] = { 0 };
    WCHAR szMethodName[MAXCLASSNAMELEN] = { 0 };
    WCHAR szModulename[MAX_PATH] = { 0 };
    WCHAR szSourceLine[MAX_PATH] = { 0 };
    hr = pIMetaDataImport->GetMethodProps(mdtokenMethod, &mdTokClass, szMethodName, _countof(szMethodName), NULL, NULL, NULL, NULL, NULL, NULL);
    mdToken tkExtends = NULL;
    hr = pIMetaDataImport->GetTypeDefProps(
      mdTokClass,
      szClassName,
      _countof(szClassName),
      0,
      0,
      &tkExtends);
    LPCBYTE pBaseLoadAddress;
    AssemblyID assemblyId;
    ULONG cchName = 0;
    hr = g_pCorProfilerInfo->GetModuleInfo(moduleID, &pBaseLoadAddress, _countof(szModulename), &cchName, szModulename, &assemblyId);

    WCHAR szFname[_MAX_FNAME];
    WCHAR szExt[_MAX_EXT]; // includes "."
    _wsplitpath_s(szModulename, NULL, 0, NULL, 0, szFname, _MAX_FNAME, szExt, _MAX_EXT);
    _snwprintf_s(szModulename, _TRUNCATE, L"%s%s", szFname, szExt);
    FunctionCallInfo callInfo = {
      szModulename,
      szClassName,
      szMethodName
      // rest of members auto zeroed
    };
    auto res = g_MapFunctionCalls.insert(pair<FunctionID, FunctionCallInfo>(functionID, callInfo));
    pFunctionCallInfo = &res.first->second;
  }
  pFunctionCallInfo->nEnter++;
  pFunctionCallInfo->TicksAtEntry = GetTickCount(); // won't work for recursion, nesting etc.
  g_csectFunctionMap.Unlock();
}

void __stdcall MyFunctionLeave2(
  /*[in]*/FunctionID             functionID,
  /*[in]*/UINT_PTR              clientData,
  /*[in]*/COR_PRF_FRAME_INFO         func,
  /*[in]*/COR_PRF_FUNCTION_ARGUMENT_INFO   *argumentInfo)
{
  FunctionCallInfo *pFunctionCallInfo = nullptr;
  g_csectFunctionMap.Lock();
  auto res = g_MapFunctionCalls.find(functionID);
  if (res != g_MapFunctionCalls.end())
  {
    pFunctionCallInfo = &res->second;
    pFunctionCallInfo->nLeave++;
    pFunctionCallInfo->TicksTotal += GetTickCount() - pFunctionCallInfo->TicksAtEntry;
  }
  else
  {
    auto x = 2;
  }
  g_csectFunctionMap.Unlock();
}

_declspec(naked) void __stdcall MyFunctionEnter(
  FunctionID funcId,
  UINT_PTR clientData,
  COR_PRF_FRAME_INFO func,
  COR_PRF_FUNCTION_ARGUMENT_INFO *argumentInfo)
{
  _asm {
    push ebp
      mov ebp, esp
      pushad
      mov   eax, [ebp + 0x14]   //argumentInfo
      push  eax
      mov   ecx, [ebp + 0x10]   //func
      push  ecx
      mov   edx, [ebp + 0x0C]   //clientData
      push  edx
      mov   eax, [ebp + 0x08]   //funcID
      push  eax
      call MyFunctionEnter2
      popad
      pop ebp
      ret 16
  }
}
_declspec(naked) void __stdcall MyFunctionLeave(
  FunctionID funcId,
  UINT_PTR clientData,
  COR_PRF_FRAME_INFO func,
  COR_PRF_FUNCTION_ARGUMENT_RANGE *retvalRange
  )
{
  _asm {
    push ebp
      mov ebp, esp
      pushad
      mov   eax, [ebp + 0x14]   //argumentInfo
      push  eax
      mov   ecx, [ebp + 0x10]   //func
      push  ecx
      mov   edx, [ebp + 0x0C]   //clientData
      push  edx
      mov   eax, [ebp + 0x08]   //funcID
      push  eax
      call MyFunctionLeave2
      popad
      pop ebp
      ret 16
  }
}
_declspec(naked) void __stdcall MyFunctionTailCall(
  FunctionID funcId,
  UINT_PTR clientData,
  COR_PRF_FRAME_INFO func)
{
  _asm {
    //push ebp
    //mov ebp, esp

    //pop ebp
    ret 12
  }
}
/*</Modified>*/

// declare a COM profiler that implements an interface ICorProfilerCallback3
class MyProfiler :
  public ICorProfilerCallback3,
  public CComObjectRootEx<CComSingleThreadModel>,
  public CComCoClass<MyProfiler, &CLSID_MyProfiler>
{
public:
  BEGIN_COM_MAP(MyProfiler)
    COM_INTERFACE_ENTRY_IID(CLSID_MyProfiler, MyProfiler)
    COM_INTERFACE_ENTRY(ICorProfilerCallback2)
    COM_INTERFACE_ENTRY(ICorProfilerCallback3)
  END_COM_MAP()
  DECLARE_NOT_AGGREGATABLE(MyProfiler)
  DECLARE_NO_REGISTRY()

  MyProfiler()
  {
    LogOutput(L"MyProfiler Constructor");
  }
  bool m_fLoggingOn;
  void LogOutput(LPCWSTR wszFormat, ...)
  {
    if (IsDebuggerPresent())
    {
      SYSTEMTIME st;
      GetLocalTime(&st);
      WCHAR buf[1000];
      swprintf_s(buf, L"%2d/%02d/%02d %2d:%02d:%02d:%03d thrd=%d ", st.wMonth, st.wDay, st.wYear - 2000, st.wHour,
        st.wMinute, st.wSecond, st.wMilliseconds, GetCurrentThreadId());
      OutputDebugStringW(buf);
      va_list insertionArgs;
      va_start(insertionArgs, wszFormat);
      _vsnwprintf_s(buf, _countof(buf), wszFormat, insertionArgs);
      va_end(insertionArgs);
      OutputDebugStringW(buf);
      OutputDebugStringW(L"\r\n");
    }
  }

  // ICorProfilerCallback2 >>

  // STARTUP/SHUTDOWN EVENTS
  STDMETHOD(Initialize)(IUnknown *pICorProfilerInfoUnk)
  {
    m_fLoggingOn = true;
    // tell clr which events we want to be called for
    DWORD dwEventMask = COR_PRF_ENABLE_STACK_SNAPSHOT
      | COR_PRF_ENABLE_OBJECT_ALLOCATED
      | COR_PRF_MONITOR_GC //GarbageCollectionStarted, GarbageCollectionFinished, MovedReferences, SurvivingReferences, ObjectReferences, ObjectsAllocatedByClass, RootReferences, HandleCreated, HandleDestroyed, and FinalizeableObjectQueued callbacks.
      | COR_PRF_MONITOR_OBJECT_ALLOCATED // Object
      /*<Modified>*/ | COR_PRF_MONITOR_ENTERLEAVE // method enter/leave
      /*</Modified>*/
      | COR_PRF_MONITOR_CLASS_LOADS // ClassLoad and ClassUnload 
      | COR_PRF_MONITOR_MODULE_LOADS // ModuleLoad, ModuleUnload, and ModuleAttachedToAssembly callbacks.
      | COR_PRF_MONITOR_ASSEMBLY_LOADS // AssemblyLoad and AssemblyUnload callbacks
      | COR_PRF_MONITOR_APPDOMAIN_LOADS // ModuleLoad, ModuleUnload, and ModuleAttachedToAssembly callbacks.
      | COR_PRF_MONITOR_SUSPENDS //Controls the RuntimeSuspend, RuntimeResume, RuntimeThreadSuspended, and RuntimeThreadResumed callbacks.
      | COR_PRF_MONITOR_THREADS // Controls the ThreadCreated, ThreadDestroyed, ThreadAssignedToOSThread, and ThreadNameChanged callbacks
      ;
    g_pCorProfilerInfo = pICorProfilerInfoUnk;
    g_pCorProfilerInfo->SetEventMask(dwEventMask);
    /*<Modified>*/g_pCorProfilerInfo->SetEnterLeaveFunctionHooks2(MyFunctionEnter, MyFunctionLeave, MyFunctionTailCall);
    /*</Modified>*/
    return S_OK;
  }
  STDMETHOD(ObjectAllocated)(ObjectID objectID, ClassID classID)
  { // gets called whenever a managed obj is created
    HRESULT hr = S_OK;
    _csectClassMap.Lock();
    ClrClassStats *pClrObjStats = nullptr;
    // beware of threading!
    auto res = _ClassDictionary.find(classID);
    if (res != _ClassDictionary.end())
    {// already exists
      pClrObjStats = &res->second;
    }
    else
    { // not found. figure out the name of the obj
      bool fIsArrayClass = false;
      CorElementType elemType;
      ClassID arrayClassId;
      ULONG arrayRank;
      hr = g_pCorProfilerInfo->IsArrayClass(
        classID,
        &elemType,
        &arrayClassId,
        &arrayRank);
      if (hr == S_OK)
      {
        if (elemType == ELEMENT_TYPE_CLASS)
        {
          fIsArrayClass = true;
          classID = arrayClassId;
        }
        else
        {
          hr = E_FAIL;  // todo
        }
      }
      if (hr != E_FAIL)// (S_FALSE if not array)
      {
        ModuleID moduleID = 0;
        mdTypeDef tdToken;
        ClassID classIdParent;
        ClassID typeArgs[20]; //generics
        ULONG32 cNumTypeArgs;
        cNumTypeArgs = _countof(typeArgs);
        hr = g_pCorProfilerInfo->GetClassIDInfo2(
          classID,
          &moduleID,
          &tdToken,
          &classIdParent,
          cNumTypeArgs,
          &cNumTypeArgs,
          typeArgs);
        if (hr == S_OK && moduleID != 0)
        {
          CComPtr<IMetaDataImport> pIMetaDataImport;
          if (g_pCorProfilerInfo->GetModuleMetaData(
            moduleID,
            ofRead,
            IID_IMetaDataImport,
            (LPUNKNOWN *)&pIMetaDataImport) == S_OK)
          {
            WCHAR wszClassName[MAXCLASSNAMELEN] = { 0 };
            mdToken tkExtends = NULL;
            hr = pIMetaDataImport->GetTypeDefProps(
              tdToken,
              wszClassName,
              _countof(wszClassName),
              0,
              0,
              &tkExtends);
            if (hr == S_OK)
            {
              ClrClassStats classStat =
              {
                wszClassName //assign WCHAR to wstring
              };
              if (fIsArrayClass)
              {
                classStat.className.append(L"[]");
              }
              if (m_fLoggingOn)
              {
                LogOutput(classStat.className.c_str());
              }
              auto res = _ClassDictionary.insert(
                pair<ClassID, ClrClassStats>(
                classID,
                classStat)
                );
              pClrObjStats = &res.first->second;
            }
          }
        }
      }
    }
    if (pClrObjStats != nullptr)
    {
      ULONG ulObjSize = 0;
      g_pCorProfilerInfo->GetObjectSize(objectID, &ulObjSize);
      pClrObjStats->nSize += ulObjSize;
      pClrObjStats->nInstances += 1;
      if (pClrObjStats->className == L"WindowsFormsApplication1.Paddle")
      {
        auto x = "Put a breakpoint here";
      }
    }
    _csectClassMap.Unlock();
    return S_OK;
  }

  STDMETHOD(Shutdown)()
  {
    wchar_t desktopFolderName[MAX_PATH];
    SHGetFolderPath(
      NULL, //hWnd, 
      CSIDL_DESKTOP,
      NULL, //token
      0, //dwFlags
      desktopFolderName
      ); // C:\Users\calvinh\Desktop
    /*<Modified>*/
    // sort by name
    map<wstring, FunctionCallInfo, less_wstringCaseInsensitive> mapFuncbyName;
    for (const auto &obj : g_MapFunctionCalls)
    {
      mapFuncbyName.insert(pair<wstring, FunctionCallInfo>(
        obj.second.fullname(),
        obj.second));
    }
    wchar_t exeName[MAX_PATH];
    wchar_t exeNameFull[MAX_PATH];
    GetModuleFileName(NULL, exeNameFull, MAX_PATH); // like C:\PROGRAM FILES\MICROSOFT VISUAL STUDIO 11.0\COMMON7\IDE\devenv.exe
    // now get just the base name of the process
    _wsplitpath_s(exeNameFull,
      /*Drive=*/NULL, 0,
      /*Dir=*/ NULL, 0,
      /*Name=*/exeName, MAX_PATH,
      /*Ext=*/NULL, 0);

    wstring outFileNameFunctionMap(desktopFolderName);
    outFileNameFunctionMap += L"\\LogOutputFunctionMap";
    outFileNameFunctionMap += exeName;
    outFileNameFunctionMap += L".csv";
    {
      wofstream outFile(outFileNameFunctionMap); //dtor will close file
      outFile << "Class Name, Enter, Leave, Ticks" << endl;
      for (const auto &obj : mapFuncbyName)
      {
        LogOutput(L"\"%s\",  %d, %d, %d",
          obj.first.c_str(),
          obj.second.nEnter,
          obj.second.nLeave,
          obj.second.TicksTotal
          );
        outFile
          << L"\""
          //surround name with quotes because could have a ","
          // and we're using CSV (CommaSeparatedValue)
          // ex:
          // mscorlib.dll!System.Collections.Generic.Dictionary`2.System.Collections.Generic.IDictionary<TKey,TValue>.get_Keys
          << obj.first.c_str()
          << L"\""
          << L","
          << obj.second.nEnter
          << L","
          << obj.second.nLeave
          << L","
          << obj.second.TicksTotal
          << endl;
      }
    }
    // now start file with Excel
    ShellExecute(
      0,  // hwnd
      0, // lpOperation
      outFileNameFunctionMap.c_str(),
      nullptr, //lpParameters
      nullptr, // lpDirectory
      0 // nShowCmd
      );

    // sort by name
    map<wstring, ClrClassStats> mapClassbyName;
    for (const auto &obj : _ClassDictionary)
    {
      mapClassbyName.insert(pair<wstring, ClrClassStats>(obj.second.className, obj.second));
    }
    wstring outFileNameClassMap(desktopFolderName);
    outFileNameClassMap += L"\\LogOutputClassMap";
    outFileNameClassMap += exeName;
    outFileNameClassMap += L".csv";
    {
      wofstream outFile(outFileNameClassMap); //dtor will close file
      outFile << "Class Name, NumInstances, TotSize" << endl;
      for (const auto &obj : mapClassbyName)
      {
        LogOutput(L"\"%s\",   %d,  %d", obj.first.c_str(), obj.second.nInstances, obj.second.nSize);
        outFile
          << L"\""
          << obj.first.c_str()
          << L"\""
          << L","
          << obj.second.nInstances
          << L","
          << obj.second.nSize
          << endl;
      }
    }
    // now start file with Excel
    ShellExecute(
      0,  // hwnd
      0, // lpOperation
      outFileNameClassMap.c_str(),
      nullptr, //lpParameters
      nullptr, // lpDirectory
      0 // nShowCmd
      );
    /*</Modified>*/

    return S_OK;
  }

  // APPLICATION DOMAIN EVENTS
  PROF_NOT_IMP(AppDomainCreationStarted, AppDomainID appDomainId);
  PROF_NOT_IMP(AppDomainCreationFinished, AppDomainID appDomainId, HRESULT hr);
  PROF_NOT_IMP(AppDomainShutdownStarted, AppDomainID);
  PROF_NOT_IMP(AppDomainShutdownFinished, AppDomainID appDomainId, HRESULT hr);

  // ASSEMBLY EVENTS
  PROF_NOT_IMP(AssemblyLoadStarted, AssemblyID);
  PROF_NOT_IMP(AssemblyLoadFinished, AssemblyID, HRESULT);
  PROF_NOT_IMP(AssemblyUnloadStarted, AssemblyID);
  PROF_NOT_IMP(AssemblyUnloadFinished, AssemblyID assemblyID, HRESULT hr);

  // MODULE EVENTS
  PROF_NOT_IMP(ModuleLoadStarted, ModuleID);
  PROF_NOT_IMP(ModuleLoadFinished, ModuleID moduleID, HRESULT hr);
  PROF_NOT_IMP(ModuleUnloadStarted, ModuleID moduleId);
  PROF_NOT_IMP(ModuleUnloadFinished, ModuleID, HRESULT);
  PROF_NOT_IMP(ModuleAttachedToAssembly, ModuleID moduleID, AssemblyID assemblyID);

  // CLASS EVENTS
  PROF_NOT_IMP(ClassLoadStarted, ClassID classId);
  PROF_NOT_IMP(ClassLoadFinished, ClassID classId, HRESULT hr);
  PROF_NOT_IMP(ClassUnloadStarted, ClassID classId);
  PROF_NOT_IMP(ClassUnloadFinished, ClassID, HRESULT);
  PROF_NOT_IMP(FunctionUnloadStarted, FunctionID);

  // JIT EVENTS
  PROF_NOT_IMP(JITCompilationStarted, FunctionID functionID, BOOL fIsSafeToBlock);
  PROF_NOT_IMP(JITCompilationFinished, FunctionID functionID, HRESULT hrStatus, BOOL fIsSafeToBlock);
  PROF_NOT_IMP(JITCachedFunctionSearchStarted, FunctionID functionId, BOOL *pbUseCachedFunction);
  PROF_NOT_IMP(JITCachedFunctionSearchFinished, FunctionID, COR_PRF_JIT_CACHE);
  PROF_NOT_IMP(JITFunctionPitched, FunctionID);
  PROF_NOT_IMP(JITInlining, FunctionID, FunctionID, BOOL*);

  // THREAD EVENTS
  PROF_NOT_IMP(ThreadCreated, ThreadID);
  PROF_NOT_IMP(ThreadDestroyed, ThreadID);
  PROF_NOT_IMP(ThreadAssignedToOSThread, ThreadID, DWORD);

  // REMOTING EVENTS
  // Client-side events
  PROF_NOT_IMP(RemotingClientInvocationStarted);
  PROF_NOT_IMP(RemotingClientSendingMessage, GUID*, BOOL);
  PROF_NOT_IMP(RemotingClientReceivingReply, GUID*, BOOL);
  PROF_NOT_IMP(RemotingClientInvocationFinished);
  // Server-side events
  PROF_NOT_IMP(RemotingServerReceivingMessage, GUID*, BOOL);
  PROF_NOT_IMP(RemotingServerInvocationStarted);
  PROF_NOT_IMP(RemotingServerInvocationReturned);
  PROF_NOT_IMP(RemotingServerSendingReply, GUID*, BOOL);

  // CONTEXT EVENTS
  PROF_NOT_IMP(UnmanagedToManagedTransition, FunctionID, COR_PRF_TRANSITION_REASON);
  PROF_NOT_IMP(ManagedToUnmanagedTransition, FunctionID, COR_PRF_TRANSITION_REASON);

  // SUSPENSION EVENTS
  PROF_NOT_IMP(RuntimeSuspendStarted, COR_PRF_SUSPEND_REASON);
  PROF_NOT_IMP(RuntimeSuspendFinished);
  PROF_NOT_IMP(RuntimeSuspendAborted);
  PROF_NOT_IMP(RuntimeResumeStarted);
  PROF_NOT_IMP(RuntimeResumeFinished);
  PROF_NOT_IMP(RuntimeThreadSuspended, ThreadID);
  PROF_NOT_IMP(RuntimeThreadResumed, ThreadID);

  // GC EVENTS
  PROF_NOT_IMP(MovedReferences, ULONG cmovedObjectIDRanges, ObjectID oldObjectIDRangeStart[], ObjectID newObjectIDRangeStart[], ULONG cObjectIDRangeLength[]);

  PROF_NOT_IMP(ObjectsAllocatedByClass, ULONG classCount, ClassID classIDs[], ULONG objects[]);
  PROF_NOT_IMP(ObjectReferences, ObjectID objectID, ClassID classID, ULONG cObjectRefs, ObjectID objectRefIDs[]);
  PROF_NOT_IMP(RootReferences, ULONG cRootRefs, ObjectID rootRefIDs[]);

  // Exception creation
  PROF_NOT_IMP(ExceptionThrown, ObjectID);

  // Exception Caught
  PROF_NOT_IMP(ExceptionCatcherEnter, FunctionID, ObjectID);
  PROF_NOT_IMP(ExceptionCatcherLeave);

  // Search phase
  PROF_NOT_IMP(ExceptionSearchFunctionEnter, FunctionID);
  PROF_NOT_IMP(ExceptionSearchFunctionLeave);
  PROF_NOT_IMP(ExceptionSearchFilterEnter, FunctionID);
  PROF_NOT_IMP(ExceptionSearchFilterLeave);
  PROF_NOT_IMP(ExceptionSearchCatcherFound, FunctionID);

  // Unwind phase
  PROF_NOT_IMP(ExceptionUnwindFunctionEnter, FunctionID);
  PROF_NOT_IMP(ExceptionUnwindFunctionLeave);
  PROF_NOT_IMP(ExceptionUnwindFinallyEnter, FunctionID);
  PROF_NOT_IMP(ExceptionUnwindFinallyLeave);

  PROF_NOT_IMP(ExceptionCLRCatcherFound); // Deprecated in .Net 2.0
  PROF_NOT_IMP(ExceptionCLRCatcherExecute); // Deprecated in .Net 2.0

  PROF_NOT_IMP(ExceptionOSHandlerEnter, FunctionID); // Not implemented
  PROF_NOT_IMP(ExceptionOSHandlerLeave, FunctionID); // Not implemented

  // IID_ICorProfilerCallback2 EVENTS
  PROF_NOT_IMP(ThreadNameChanged, ThreadID threadId, ULONG cchName, __in WCHAR name[]);
  PROF_NOT_IMP(GarbageCollectionStarted, int cGenerations, BOOL generationCollected[], COR_PRF_GC_REASON reason);
  PROF_NOT_IMP(SurvivingReferences, ULONG cSurvivingObjectIDRanges, ObjectID objectIDRangeStart[], ULONG cObjectIDRangeLength[]);
  PROF_NOT_IMP(GarbageCollectionFinished);
  PROF_NOT_IMP(FinalizeableObjectQueued, DWORD finalizerFlags, ObjectID objectID);
  PROF_NOT_IMP(RootReferences2, ULONG cRootRefs, ObjectID rootRefIds[], COR_PRF_GC_ROOT_KIND rootKinds[], COR_PRF_GC_ROOT_FLAGS rootFlags[], UINT_PTR rootIds[]);
  PROF_NOT_IMP(HandleCreated, GCHandleID handleId, ObjectID initialObjectId);
  PROF_NOT_IMP(HandleDestroyed, GCHandleID handleId);

  // COM CLASSIC VTable
  PROF_NOT_IMP(COMClassicVTableCreated, ClassID wrappedClassID, REFGUID implementedIID, void *pVTable, ULONG cSlots);

  PROF_NOT_IMP(COMClassicVTableDestroyed, ClassID wrappedClassID, REFGUID implementedIID, void *pVTable);
  // ICorProfilerCallback2 <<

  PROF_NOT_IMP(InitializeForAttach, IUnknown* punk, void* data, UINT datasize);
  PROF_NOT_IMP(ProfilerAttachComplete);
  PROF_NOT_IMP(ProfilerDetachSucceeded);
protected:
  //key = ClassId: one entry per unique ClassId
  unordered_map<ClassID, ClrClassStats> _ClassDictionary;
  CComAutoCriticalSection _csectClassMap;
};

OBJECT_ENTRY_AUTO(CLSID_MyProfiler, MyProfiler)

// define a class that represents this module
class CClrProfModule : public ATL::CAtlDllModuleT< CClrProfModule >
{
#if _DEBUG
public:
  CClrProfModule()
  {
    int x = 0; // set a bpt here
  }
  ~CClrProfModule()
  {
    int x = 0; // set a bpt here
  }
#endif _DEBUG
};

// instantiate a static instance of this class on module load
CClrProfModule _AtlModule;
// this gets called by CLR due to env var settings
STDAPI DllGetClassObject(__in REFCLSID rclsid, __in REFIID riid, __deref_out LPVOID FAR* ppv)
{
  HRESULT hr = E_FAIL;
  hr = AtlComModuleGetClassObject(&_AtlComModule, rclsid, riid, ppv);
  //  hr= CComModule::GetClassObject();
  return hr;
}
//tell the linker to export the function
#pragma comment(linker, "/EXPORT:DllGetClassObject=_DllGetClassObject@12,PRIVATE")

</Code>