Type.Missing, C#, and Word

Recently there was a bit of a ruckus about the correct way to talk to the Word object model in C# when it comes to missing arguments. If you've ever used the Word PIAs (Primary Interop Assemblies) with C#, you will be familiar with the coding practice below. This slightly modified example comes from the MSDN VSTO 1.0 documentation--it shows how to spell check a string using the Word object model in C#:

internal void SpellCheckString()
{
    string str = "Speling erors here.";
    object ignoreUpperCase = true;
    object missingType = Type.Missing;

    bool blnSpell = ThisApplication.CheckSpelling(str,
        ref missingType, ref ignoreUpperCase, ref missingType,
        ref missingType, ref missingType, ref missingType,
        ref missingType, ref missingType, ref missingType,
        ref missingType, ref missingType, ref missingType);
    MessageBox.Show(blnSpell.ToString(), "False if Errors, True if OK");
}

The first thing that probably comes to mind if you're a VB.NET programmer and you've never seen code written against Word in C# is “Why is this so verbose?”

VB.NET does some special things for you when there are optional arguments in a method, so the VB version of this looks like this: 

Friend Sub SpellCheckString()
    Dim str As String = "Speling erors here."
    Dim blnSpell As Boolean = _
        ThisApplication.CheckSpelling(str, , True)
    MessageBox.Show(blnSpell.ToString, "False if Errors, True if OK")
End Sub

In VB.NET you don't have to worry about passing a value for each optional argument--the language handles this for you. You can even use commas as shown above to skip one particular argument you don't want to specify--in this case we didn't want to specify a custom dictionary, but we did want to set IgnoreUppercase, so we omitted the custom dictionary argument by leaving it out between the commas.

The first thing that probably comes to mind if you're a C# programmer and you've never seen code written against Word in C# is “Why is all that stuff passed by reference?”

The first thing to understand is that when you are talking to Word methods, you are talking to the Word object model through interop. The PIA (Primary Interop Assembly) is the vehicle through which you talk to the unmanaged Word object model from managed code.

If you were to examine the IDL definition for “CheckSpelling” generated from Word's COM Type Library you would see something like this:

[id(0x00000144), helpcontext(0x09700144)]
HRESULT CheckSpelling(
    [in] BSTR Word,
    [in, optional] VARIANT* CustomDictionary,
    [in, optional] VARIANT* IgnoreUppercase,
    [in, optional] VARIANT* MainDictionary,
    [in, optional] VARIANT* CustomDictionary2,
    [in, optional] VARIANT* CustomDictionary3,
    [in, optional] VARIANT* CustomDictionary4,
    [in, optional] VARIANT* CustomDictionary5,
    [in, optional] VARIANT* CustomDictionary6,
    [in, optional] VARIANT* CustomDictionary7,
    [in, optional] VARIANT* CustomDictionary8,
    [in, optional] VARIANT* CustomDictionary9,
    [in, optional] VARIANT* CustomDictionary10,
    [out, retval] VARIANT_BOOL* prop);

Note that any parameter that is marked as optional--meaning you can omit the value and Word will pick a reasonable default value or ignore that option--is declared as a pointer to a VARIANT in Word (Excel doesn't typically use a pointer to a VARIANT for optional parameters, so you don't have this by ref issue for most of Excel). When the PIA is generated, the IL for the method ends up looking like this:

.method public hidebysig newslot abstract virtual
        instance bool CheckSpelling([in] string marshal( bstr) Word,
            [in][opt] object& marshal( struct) CustomDictionary,
            [in][opt] object& marshal( struct) IgnoreUppercase,
            [in][opt] object& marshal( struct) MainDictionary,
            [in][opt] object& marshal( struct) CustomDictionary2,
            [in][opt] object& marshal( struct) CustomDictionary3,
            [in][opt] object& marshal( struct) CustomDictionary4,
            [in][opt] object& marshal( struct) CustomDictionary5,
            [in][opt] object& marshal( struct) CustomDictionary6,
            [in][opt] object& marshal( struct) CustomDictionary7,
            [in][opt] object& marshal( struct) CustomDictionary8,
            [in][opt] object& marshal( struct) CustomDictionary9,
            [in][opt] object& marshal( struct) CustomDictionary10) runtime managed internalcall
{
    .custom instance void [mscorlib]System.Runtime.InteropServices.DispIdAttribute::.ctor(int32) = ( 01 00 44 01 00 00 00 00 )
} // end of method _Application::CheckSpelling

Or, what you see in the C# intellisense looks like this:

bool _Application.CheckSpelling(string Word,
    ref object CustomDictionary,
    ref object IgnoreUppercase,
    ref object MainDictionary,
    ref object CustomDictionary2,
    ref object CustomDictionary3,
    ref object CustomDictionary4,
    ref object CustomDictionary5,
    ref object CustomDictionary6,
    ref object CustomDictionary7,
    ref object CustomDictionary8,
    ref object CustomDictionary9,
    ref object CustomDictionary10)

So the upshot of all this is that any optional argument in Word has to be passed by ref from C# and has to be declared as an object. Even though you'd like to strongly type IgnoreUppercase as a bool in the CheckSpelling example, you can't. You have to type it as an object or you'll get a compile error. This ends up being a little confusing because you can strongly type the first argument--the string you want to check. That's because in the CheckSpelling method, the “Word” argument (the string you are spell checking) is not an optional argument. Therefore, it is strongly typed and not passed by reference.
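To see the typing rule in isolation, here is a minimal sketch that needs no reference to Word. CheckSpellingLike is a hypothetical managed stand-in that only mirrors the shape of the first three CheckSpelling parameters; it is not part of the Word PIA, and its body is invented purely for illustration.

```csharp
using System;

class RefObjectDemo
{
    // Hypothetical stand-in (not a Word API): same parameter shape as the
    // first three parameters of CheckSpelling as C# sees them.
    static bool CheckSpellingLike(string word,
        ref object customDictionary,
        ref object ignoreUppercase)
    {
        // Report whether the caller supplied a real bool for ignoreUppercase.
        return ignoreUppercase is bool;
    }

    static void Main()
    {
        object missingType = Type.Missing;

        // bool typed = true;
        // CheckSpellingLike("Speling", ref missingType, ref typed);
        // The call above does not compile: a ref argument must match the
        // parameter type exactly, so "ref bool" cannot stand in for "ref object".

        object ignoreUpperCase = true;   // box the bool into an object variable
        bool sawBool = CheckSpellingLike("Speling", ref missingType, ref ignoreUpperCase);
        Console.WriteLine(sawBool);      // True
    }
}
```

The boxed bool still carries its value, so nothing is lost by declaring the variable as object; the only cost is the extra declaration.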

So this all brings us back to Type.Missing. 

The way you specify in C# that you want to omit an argument because it's optional (after all, who really wants to specify 10 custom dictionaries?) is you pass an object by reference which you have set to Type.Missing. In our example, we just declared one variable called missingType and passed it in 11 times.
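As a small aside on what that variable actually holds, the sketch below (class name is illustrative only) shows that Type.Missing is a static field of type object that refers to System.Reflection.Missing.Value. Every variable you set to Type.Missing therefore refers to the same instance, which is why a reference comparison against Type.Missing works.

```csharp
using System;
using System.Reflection;

class MissingDemo
{
    static void Main()
    {
        object missingType = Type.Missing;

        // Type.Missing holds Missing.Value, the single instance of
        // System.Reflection.Missing.
        Console.WriteLine(object.ReferenceEquals(missingType, Missing.Value)); // True
        Console.WriteLine(missingType.GetType().FullName); // System.Reflection.Missing
    }
}
```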

Now when you pass objects by reference to managed functions, you do that because the managed function is telling you that it might change the value of the object you passed in. So it might seem bad to you that we are passing one object set to Type.Missing to all the parameters of CheckSpelling that we don't care about.

After all, imagine you have a function called DoStuff (shown below) that takes two parameters by ref. If you set the first parameter to true, it will do something happy. If you set the second parameter to true, it will delete an important file. But if you pass in Type.Missing to both parameters, it won't do anything--or so you thought.

Because you are passing by ref, what if the code evaluating the first parameter changes it from Type.Missing to true as a side-effect? Now, when the code executes later in the function to look at the second parameter, it will see the second parameter is now true because you passed the same instance to both parameters:

using System;

namespace ConsoleApplication1
{
    class Class1
    {
        [STAThread]
        static void Main(string[] args)
        {
            // One variable is passed for both by ref parameters
            object missingType = Type.Missing;
            DoStuff(ref missingType, ref missingType);
        }

        static void DoStuff(ref object DoSomethingHappy, ref object DeleteImportantFile)
        {
            if (DoSomethingHappy == Type.Missing)
            {
                // Don't do something happy but set DoSomethingHappy to true
                DoSomethingHappy = true;
            }

            // Both parameters refer to the same variable, so this is now true
            if (DeleteImportantFile == Type.Missing)
            {
                // Don't do anything
            }
            else if (((bool)DeleteImportantFile) == true)
            {
                // Do It
                System.Diagnostics.Debug.Assert(false, "About to delete an important file");
                // Verbatim string: without the @, "\v" would be read as a vertical tab
                System.IO.File.Delete(@"c:\veryimportantfile.txt");
            }
        }
    }
}

You could fix this by declaring an object for each by ref parameter, as shown below.

static void Main(string[] args)
{
    object missingType1 = Type.Missing;
    object missingType2 = Type.Missing;
    DoStuff(ref missingType1, ref missingType2);
}
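If you want to watch the aliasing happen without putting a file at risk, here is a minimal self-contained sketch. SecondStillMissing is a hypothetical helper made up for this illustration: it overwrites its first parameter and then reports whether its second parameter still holds Type.Missing.

```csharp
using System;

class AliasDemo
{
    // Hypothetical helper: overwrites the first argument, then reports
    // what the second argument looks like afterwards.
    static bool SecondStillMissing(ref object first, ref object second)
    {
        first = true;                    // side effect on the first parameter
        return second == Type.Missing;   // reference comparison against the singleton
    }

    static void Main()
    {
        object shared = Type.Missing;
        // Same variable for both parameters: the write to "first" shows through "second".
        Console.WriteLine(SecondStillMissing(ref shared, ref shared)); // False

        object a = Type.Missing;
        object b = Type.Missing;
        // Separate variables: "second" is unaffected by the write to "first".
        Console.WriteLine(SecondStillMissing(ref a, ref b));           // True
    }
}
```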

So you might guess that you need to rewrite the first example, the call to CheckSpelling, to declare missingType1 through missingType11 because of the possibility that Word might go and change one of the by ref parameters on you, so that you are no longer passing Type.Missing but something else like “true” that may cause unintended side effects...

WRONG!

Remember that Word is an unmanaged object and you are talking to it through interop. The interop layer realizes that you are passing Type.Missing to an optional argument on a COM object. Word expects a missing optional argument to be a VARIANT of type VT_ERROR set to DISP_E_PARAMNOTFOUND. So interop obliges: instead of passing a reference to your missingType object, the interop layer passes a VARIANT of type VT_ERROR set to DISP_E_PARAMNOTFOUND. Your missingType object that you passed by reference is safe because it never really got passed directly into Word. It is impossible for Word to mess with your variable, even though you look at the syntax of the call and think it would be possible because it is passed by ref.

So the initial CheckSpelling code is completely correct. Your missingType variable is safe--it won't be changed on you by Word even though you pass it by ref.

But remember this is sort of a special case that only applies when talking through interop to an unmanaged object model that has optional arguments. Don't let this Word special case make you sloppy with other managed methods that you pass values to by ref. When talking to managed methods, you have to be careful when passing by ref because the managed method can change the variable you pass in as shown in the DoStuff example.