Overview of Protected Office Open XML Documents

 

Suppose your application needs to programmatically create password “protected” Office Open XML (OOXML) documents, or to convert a batch of unprotected OOXML documents to password protected ones.  The following information outlines some aspects to consider when implementing this requirement.

 

If you need to review the OOXML specification format, you may do so here, Introducing the Office (2007) Open XML File Formats.  The Microsoft ECMA-376 Implementer notes are located here, www.documentinteropinitiative.org

 

The Microsoft Compound File Binary File Format (MS-CFB) specification may be found here, MS-CFB.  Also, you may review the Office binary file specifications for XLS, DOC and PPT here, Microsoft Office File Format Documents

 

When you create a password protected document in the Office User Interface (UI), it becomes a Microsoft Compound File Binary File Format (MS-CFB) document (not a ZIP package), though the file extension does not change (e.g., xlsx, docx, pptx).  As a result, the UI protected document still opens within the Office UI, but you cannot change its file extension to .zip and browse it with Explorer.  To programmatically parse the resulting protected binary document, refer to the CFB and Office Binary File Specification links above.  You may find more information on Office document encryption here, RC4 CryptoAPI Encryption Password Verification, and here, 2007 Microsoft Office System Document Encryption

 

This is the header for a password protected document created in the Excel UI (CFB document):

0000h: D0 CF 11 E0 A1 B1 1A E1 00 00 00 00 00 00 00 00 ÐÏ.à¡±.á........

 

This is the header for an unprotected document created in the Excel UI (ZIP package):

0000h: 50 4B 03 04 14 00 06 00 08 00 00 00 21 00 9E 6F PK..........!.žo

(Note: PK are the initials of Phil Katz, the original author of the PKZIP format.)
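
The two headers above can be told apart by their leading bytes. The following is a minimal Python sketch, not part of any Office tooling; the names `classify_header` and `classify_file` are hypothetical helpers chosen for illustration. It only checks the signature bytes shown in the dumps above and does not validate the rest of the file.

```python
# Signature bytes taken from the two header dumps above.
CFB_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # Compound File Binary (protected in the UI)
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP package (unprotected OOXML)


def classify_header(header: bytes) -> str:
    """Return 'cfb', 'zip', or 'unknown' based on the leading signature bytes."""
    if header.startswith(CFB_MAGIC):
        return "cfb"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def classify_file(path: str) -> str:
    """Read the first 8 bytes of a file and classify its container format."""
    with open(path, "rb") as f:
        return classify_header(f.read(8))
```

A batch conversion tool could call `classify_file` on each input and skip, or route separately, any file reported as `cfb`, since such a file cannot be opened as a ZIP package.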

To programmatically create or modify OOXML documents you may use the System.IO.Packaging namespace in .NET; an example is detailed here, Microsoft Knowledge Base Article 931866.  Or, you may use the Open XML SDK, and there are several examples here, Open XML File Format Code Snippets.  However, at this point in time the Open XML SDK does not provide functionality to open or create files with Compound File protection (CFB document), as detailed above.  In other words, once the document is no longer a ZIP package it can no longer be opened with the Open XML SDK.  We are looking into ways of improving the SDK to support such scenarios.

 

The example and details below are specific to Excel documents.

 

To programmatically password protect Excel documents you will need to use the WorkbookProtection Class (for Word documents this is the DocumentProtection Class; refer to the Open XML SDK), and/or add the workbookProtection element *before* the bookViews element and *after* the workbookPr element. For example:

<?xml version="1.0" encoding="utf-8" standalone="yes" ?>

<x:workbook xmlns:x="http://schemas.openxmlformats.org/spreadsheetml/2006/main">

<x:fileVersion appName="xl" lastEdited="4" lowestEdited="4" rupBuild="4505" />

<x:workbookPr defaultThemeVersion="124226" />

<x:workbookProtection workbookPassword="xsd:hexBinary data" lockStructure="1" lockWindows="1" />

<x:bookViews>

<x:workbookView xWindow="600" yWindow="525" windowWidth="17895" windowHeight="4560" activeTab="1" />

</x:bookViews>

<x:sheets>

<x:sheet name="My Data" sheetId="1" r:id="rId1" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" />

<x:sheet name="Chart" sheetId="3" r:id="rId2" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" />

</x:sheets>

<x:calcPr calcId="125725" />

</x:workbook>
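
As a rough illustration of the element placement described above, here is a minimal Python sketch that uses only the standard library rather than the Open XML SDK. The function names `add_workbook_protection` and `protect_package` are hypothetical, and the `password_hash` argument is assumed to be an already computed xsd:hexBinary value (computing it is not shown). Note that ElementTree may rewrite or drop namespace prefix declarations found in real workbooks, so treat this as a sketch of the ordering rule rather than a production tool.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
REL_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
# Keep the same prefixes as the sample above when serializing.
ET.register_namespace("x", MAIN_NS)
ET.register_namespace("r", REL_NS)


def add_workbook_protection(workbook_xml: bytes, password_hash: str) -> bytes:
    """Insert workbookProtection after workbookPr and before bookViews."""
    root = ET.fromstring(workbook_xml)
    tag = f"{{{MAIN_NS}}}workbookProtection"
    existing = root.find(tag)
    if existing is not None:
        root.remove(existing)  # replace any earlier protection element
    protection = ET.Element(tag, {
        "workbookPassword": password_hash,  # assumed precomputed hex value
        "lockStructure": "1",
        "lockWindows": "1",
    })
    names = [child.tag for child in root]
    pr_tag = f"{{{MAIN_NS}}}workbookPr"
    views_tag = f"{{{MAIN_NS}}}bookViews"
    if pr_tag in names:
        index = names.index(pr_tag) + 1   # directly after workbookPr
    elif views_tag in names:
        index = names.index(views_tag)    # directly before bookViews
    else:
        raise ValueError("neither workbookPr nor bookViews found")
    root.insert(index, protection)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def protect_package(package: bytes, password_hash: str) -> bytes:
    """Copy a ZIP package, rewriting only xl/workbook.xml."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package)) as zin, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "xl/workbook.xml":
                data = add_workbook_protection(data, password_hash)
            zout.writestr(item.filename, data)
    return out.getvalue()
```

For a batch conversion, `protect_package` could be applied to the bytes of each unprotected .xlsx file and the result written back to disk.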

For more information about the WorkbookProtection Class refer to MSDN, WorkbookProtection Class

* Also note that the workbookPassword attribute is of type xsd:hexBinary; refer to WorkbookPassword Property

 

Note that after you implement the document protection mechanism programmatically, the document is not considered secure: the password is stored only as a hash value in unencrypted XML in the OOXML document structure, and it can fairly easily be obtained and/or removed by editing the “workbook.xml” file under the “xl” folder (or the “settings.xml” file for Word, under the “word” folder) in the ZIP package.  By comparison, a Compound File Binary protected document is considered more secure, since the document contents are encrypted and stored in a stream within the CFB file.
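
To see how exposed this protection data is, the following minimal Python sketch reads the workbookProtection attributes straight out of the ZIP package using only the standard library. The function name `read_workbook_protection` is a hypothetical helper, and the sketch assumes the workbook part lives at the usual xl/workbook.xml path.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import Dict, Optional

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def read_workbook_protection(package: bytes) -> Optional[Dict[str, str]]:
    """Return the workbookProtection attributes, or None if the element is absent."""
    with zipfile.ZipFile(io.BytesIO(package)) as z:
        # Assumes the conventional part name for the workbook.
        root = ET.fromstring(z.read("xl/workbook.xml"))
    element = root.find(f"{{{MAIN_NS}}}workbookProtection")
    return None if element is None else dict(element.attrib)
```

Because no key or password is needed to read these attributes, this kind of protection is best treated as a guard against accidental edits rather than as a security boundary.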

 

For more information regarding ECMA-376 document encryption approaches, refer to Standard ECMA-376 Office Open XML File Formats, as mentioned in [MS-OFFCRYPTO] section 1.3.3.