Changes in OCX Save Behavior for Office 2007 Documents

Summary

As part of the change to the default file formats in Microsoft Office 2007, ActiveX controls (OCX) saved by Office applications are asked to save themselves using IPersistPropertyBag, where the property bag is a text-only storage medium. This allows Office to persist the control's data as XML markup in the document, per the guidelines for the Open XML format. Office-supplied controls (Forms^3 controls, implemented in FM20.DLL) work fine with this method, but some custom (third-party) controls may not. Specifically, any MFC-built control that does not anticipate this use may encounter problems saving in the new Office 2007 (Open XML) file formats. Saving to the Office 97-2003 binary file formats is not affected.

Status

The change in save behavior is intentional. It allows for better compliance with the Open XML format specification. However, the change may cause compatibility problems for existing controls that do not support persistence through IPersistPropertyBag in text-compatible formats.

More Information

Before Office 2007, ActiveX controls persisted in Office files were saved through the IPersistStream interface (or IPersistStorage if the control did not support IPersistStream) as binary data inside the Office 97-2003 binary file format. Most OCX controls save without problems using this method.

Because of the change to the Open XML file format, Office 2007 asks each control to try to save itself in a non-binary format (if the control allows it). Office does this through the IPersistPropertyBag interface. If the control supports the interface, Office asks the control to save its properties in a special property bag that accepts only VARIANT types that can be converted to text (to be rendered inline as XML). Each time the control saves a property to the property bag, Office converts the VARIANT property value to a BSTR (using the VariantChangeType API) and uses the string representation as the XML field value. If the VARIANT property value does not convert to a BSTR, an error is returned to the control to let it know the property cannot be saved.

The control can then ignore the error (leaving that property unsaved), alter the value so it can be represented as text (for example, by encoding binary data), or return the error to the host application that is attempting to save the control. If it returns the error, the host sees that the control cannot be saved as pure XML text and calls back using IPersistStream to save the control as binary (just as earlier Office versions did). The binary data is then saved as a separate part in the Open XML file, instead of as inline XML.
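The save sequence described above can be sketched in portable C++. This is only a simulation under stated assumptions, not Office or COM code: the sketch namespace and the Value, TextBag, Control, SaveResult, and HostSave names are all hypothetical stand-ins for the VARIANT, the text-only property bag, the control's IPersistPropertyBag::Save and IPersistStream::Save implementations, and the host's save logic.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

namespace sketch {

// Stand-in for a VARIANT. convertsToText is false for values that model
// binary data, arrays, or other types that cannot become a string.
struct Value {
    bool convertsToText;
    std::string text;
};

// Stand-in for the text-only property bag: Write fails for non-text values.
class TextBag {
public:
    bool Write(const std::string& name, const Value& value) {
        if (!value.convertsToText) return false;  // models the error returned to the control
        properties_[name] = value.text;
        return true;
    }
    const std::map<std::string, std::string>& properties() const { return properties_; }
private:
    std::map<std::string, std::string> properties_;
};

// Stand-in for a control: one callback per persistence interface.
struct Control {
    std::function<bool(TextBag&)> saveToBag;                    // models IPersistPropertyBag::Save
    std::function<std::vector<unsigned char>()> saveToStream;   // models IPersistStream::Save
};

struct SaveResult {
    bool inlineXml = false;                          // true: saved as inline XML properties
    std::map<std::string, std::string> properties;   // filled when inlineXml is true
    std::vector<unsigned char> binaryPart;           // filled when the host fell back to binary
};

// Models the host: try the text property bag first, and fall back to the
// binary stream if the control is missing the interface or reports failure.
SaveResult HostSave(const Control& control) {
    SaveResult result;
    TextBag bag;
    if (control.saveToBag && control.saveToBag(bag)) {
        result.inlineXml = true;
        result.properties = bag.properties();
    } else {
        result.binaryPart = control.saveToStream();
    }
    return result;
}

}  // namespace sketch
```

In this model, a control that swallows a failed Write and still reports success ends up with inlineXml set and the failed property absent, which corresponds to the "ignore the error" choice described above.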

The VARIANT data types that are supported for conversion by Office are as follows:

Data Types             VT Types
Integers               VT_I1, VT_I2, VT_I4, VT_I8, VT_UI1, VT_UI2, VT_UI4, VT_UI8, VT_INT, VT_UINT
Floating Point         VT_R4, VT_R8, VT_DECIMAL, VT_CY
Strings                VT_BSTR
Dates                  VT_DATE
Boolean                VT_BOOL
Font/Picture Object    VT_UNKNOWN (Note: this applies only to StdFont and StdPicture OLE objects)

Arrays (VT_ARRAY), user-defined structures (VT_RECORD), and binary data (VT_BLOB) are not supported because they do not convert easily to an XML property field. If the control attempts to save an unsupported data type, it receives an error (DISP_E_BADVARTYPE or E_NOINTERFACE) on the call to IPropertyBag::Write. The control can then decide whether to handle the property differently or to fail the IPersistPropertyBag::Save call so that Office saves the control as binary instead.
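As a rough illustration of the list above, a control could pre-check a value's type before writing it to the property bag. The sketch below is standalone C++ rather than Windows SDK code: the VT_* numbers repeat the values defined in the SDK headers so that it compiles anywhere, while the sketch namespace and the IsTextBagFriendly helper are hypothetical names introduced for this example.

```cpp
#include <cstdint>

namespace sketch {

// Numeric VARTYPE values as defined in the Windows SDK, repeated here so the
// sketch compiles without Windows headers.
enum : std::uint16_t {
    VT_I2 = 2, VT_I4 = 3, VT_R4 = 4, VT_R8 = 5, VT_CY = 6, VT_DATE = 7,
    VT_BSTR = 8, VT_BOOL = 11, VT_UNKNOWN = 13, VT_DECIMAL = 14,
    VT_I1 = 16, VT_UI1 = 17, VT_UI2 = 18, VT_UI4 = 19, VT_I8 = 20,
    VT_UI8 = 21, VT_INT = 22, VT_UINT = 23, VT_RECORD = 36, VT_BLOB = 65,
    VT_ARRAY = 0x2000
};

// Hypothetical helper: returns true if a value of type vt is on the list of
// types that Office converts to text. isFontOrPicture tells the helper
// whether a VT_UNKNOWN value is known to be a StdFont or StdPicture object.
bool IsTextBagFriendly(std::uint16_t vt, bool isFontOrPicture = false) {
    if (vt & VT_ARRAY) return false;  // arrays are never accepted
    switch (vt) {
    case VT_I1: case VT_I2: case VT_I4: case VT_I8:
    case VT_UI1: case VT_UI2: case VT_UI4: case VT_UI8:
    case VT_INT: case VT_UINT:
    case VT_R4: case VT_R8: case VT_DECIMAL: case VT_CY:
    case VT_BSTR: case VT_DATE: case VT_BOOL:
        return true;
    case VT_UNKNOWN:
        return isFontOrPicture;       // only fonts and pictures are accepted
    default:
        return false;                 // VT_RECORD, VT_BLOB, and anything not listed
    }
}

}  // namespace sketch
```

A check like this only mirrors the list; the result of the actual IPropertyBag::Write call remains the authoritative answer.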

Unfortunately, this design change can cause problems for some existing controls. Specifically, MFC-built controls support IPersistPropertyBag externally for use with Property Pages in a host IDE design-time environment. The default MFC implementation assumes that only the public control properties should be exchanged in the property bag, and does not instruct developers to include the internal properties that would allow a control to be fully saved in this manner. In addition, MFC does not throw an exception if a property bag write call fails (it just returns TRUE or FALSE, leaving it to the developer to check each PX exchange call to see whether the property could be saved to the bag). MFC-built controls may therefore be more susceptible to compatibility problems, because IPersistPropertyBag may be implemented for the control even if the control designer did not intend to use it for control persistence.

Developers building MFC controls should check the BOOL return value from the PX_* exchange functions they call in DoPropExchange to ensure that each value was written or read correctly. The most common PX_* functions (Bool, String, Double, Long, Short, Color, Currency, Float, etc.) work fine with Open XML files. However, the following PX_* functions are known to have problems when used with the Office 2007 property bag save:

MFC Method

Problem

PX_Blob

Binary data is not allowed in the XML property bag, so this call returns FALSE when saving in Office. If you ignore the error, the property will not be saved and will be missing at load time. You should either encode the data into a string representation, or return an error to the caller to indicate that the control cannot be saved without it.

PX_IUnknown

The XML property bag does not save objects. The two exceptions are the StdFont object (PX_Font) and the StdPicture object (PX_Picture), which can be saved. The call returns FALSE during save if the IUnknown pointer refers to an object that is not a Font or Picture.

PX_DataPath

The DataPath property saves as a string (VT_BSTR), and therefore works correctly as far as the actual save is concerned. However, when the data is read back into the control, there is a known problem with how MFC performs the data binding from inside Word and Excel. The bind can encounter an error, which causes the data exchange to fail silently without opening the data source. The error stems from a combination of MFC's use of IBindHost to do the data binding and the code in Word and Excel that implements IBindHost for the document containing the control. Word and Excel expect the bind operation to be relative to the loaded document, whereas the MFC control may be using it for data operations that have no connection to the document. This can result in an error that causes MFC to fail the data bind without reporting anything to the caller of PX_DataPath.

Update (August 2008): The problem with the IBindHost call has been addressed by two separate hotfixes. See KB 956064 for the Word 2007 fix and KB 956511 for the Excel 2007 fix. Both allow the bind to continue even if it is not relative to the loaded document, and both are scheduled to be released in Office 2007 SP2.

PX_VBXFontConvert

This function does not apply to content saved in Office files, so it should not be a problem in itself. However, it will return FALSE if it is called in Office 2007.
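The advice to check the result of every exchange call can be illustrated with a small portable model. This is a sketch under stated assumptions, not MFC code: Exchange, ExchangeString, ExchangeBlob, ControlState, SaveUnchecked, and SaveChecked are hypothetical stand-ins. In a real MFC control the same checks would wrap the PX_* calls, and how the failure is then reported to the host is left to the control author.

```cpp
#include <map>
#include <string>

namespace sketch {

// Hypothetical stand-in for the property exchange object handed to a control.
struct Exchange {
    std::map<std::string, std::string> saved;
    bool textOnly = true;  // models the Office 2007 XML property bag
};

// Stand-ins for PX_String and PX_Blob. Like the MFC functions, they report
// failure through the return value and never throw.
bool ExchangeString(Exchange& px, const std::string& name, const std::string& value) {
    px.saved[name] = value;
    return true;
}

bool ExchangeBlob(Exchange& px, const std::string& name, const std::string& bytes) {
    if (px.textOnly) return false;  // binary data is rejected by a text-only bag
    px.saved[name] = bytes;
    return true;
}

struct ControlState {
    std::string caption;
    std::string imageBytes;
};

// Pattern to avoid: results are discarded, so the failed blob write goes
// unnoticed and the property is simply missing at load time.
bool SaveUnchecked(Exchange& px, const ControlState& state) {
    ExchangeString(px, "Caption", state.caption);
    ExchangeBlob(px, "Image", state.imageBytes);
    return true;
}

// Pattern suggested above: test every result and report failure, so the
// caller knows the control could not be saved completely as text.
bool SaveChecked(Exchange& px, const ControlState& state) {
    if (!ExchangeString(px, "Caption", state.caption)) return false;
    if (!ExchangeBlob(px, "Image", state.imageBytes)) return false;
    return true;
}

}  // namespace sketch
```

With a text-only Exchange, SaveUnchecked reports success while silently dropping the Image property, whereas SaveChecked reports the failure.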

 

Internet Explorer also uses IPersistPropertyBag for control persistence, so limitations that apply when saving a control in HTML could apply to Office 2007 as well. If the control has known problems loading from an HTML web page, those same problems could occur if the control is saved in an Open XML format file.

While Microsoft regrets that some controls may fail to save properly in Office 2007 because of the design change, we believe that offering the ability to save control data as pure XML is worth the pain. We encourage developers to offer XML-friendly control persistence, and suggest they test their implementations of IPersistPropertyBag::Save and IPersistPropertyBag::Load to ensure that all the data needed to restore the control on load is saved.

In addition to the problems mentioned above, Microsoft has found a bug in the implementation of the property bag for Office 2007. String data that is saved in the property bag may be accidentally modified by the XML encoding process, which removes special characters such as extra white space, carriage returns, tabs, and certain control characters reserved by XML itself. A fix for this problem is being investigated for Office 2007 SP2. See this post for more details:

     OCX Persistent Strings May Be Altered When Saved in Office 2007 Open Office XML Format

To avoid problems with multi-line string data that you may want to save, we encourage you to encode the data before saving it in the property bag, using a standardized encoding technique similar to HTML encoding, or base-64 encoding if the data may include binary bytes. This ensures the data can be saved in any Office build (patched or not) without loss.
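A minimal sketch of the base-64 option follows, written as portable C++ with no Windows dependencies. The sketch namespace and the Base64Encode and Base64Decode names are hypothetical helpers for illustration; a production control might prefer an existing, tested encoder. The encoded output uses only letters, digits, '+', '/', and '=', so the white space and control characters in the original data are not exposed to the XML encoder.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Encodes arbitrary bytes as standard base-64 text with '=' padding.
std::string Base64Encode(const std::vector<unsigned char>& data) {
    static const char kAlphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve((data.size() + 2) / 3 * 4);
    for (std::size_t i = 0; i < data.size(); i += 3) {
        const std::size_t remaining = data.size() - i;
        // Pack up to three bytes into one 24-bit chunk.
        unsigned long chunk = static_cast<unsigned long>(data[i]) << 16;
        if (remaining > 1) chunk |= static_cast<unsigned long>(data[i + 1]) << 8;
        if (remaining > 2) chunk |= data[i + 2];
        // Emit four 6-bit symbols, padding when fewer than three bytes remain.
        out += kAlphabet[(chunk >> 18) & 0x3F];
        out += kAlphabet[(chunk >> 12) & 0x3F];
        out += remaining > 1 ? kAlphabet[(chunk >> 6) & 0x3F] : '=';
        out += remaining > 2 ? kAlphabet[chunk & 0x3F] : '=';
    }
    return out;
}

// Maps one base-64 symbol to its 6-bit value, or -1 if it is not a symbol.
int DecodeSymbol(char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// Decodes base-64 text back to bytes. Padding and any character outside the
// alphabet are skipped, so stray white space added to the text is tolerated.
std::vector<unsigned char> Base64Decode(const std::string& text) {
    std::vector<unsigned char> out;
    unsigned long buffer = 0;
    int bits = 0;
    for (char c : text) {
        const int value = DecodeSymbol(c);
        if (value < 0) continue;
        buffer = (buffer << 6) | static_cast<unsigned long>(value);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<unsigned char>((buffer >> bits) & 0xFF));
        }
    }
    return out;
}

}  // namespace sketch
```

A control using this approach would write the encoded string to the property bag as an ordinary string property on save, and decode it after reading the property back on load.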