Open XML with Word App - Can we do that?

A common perception among developers is that the Office App JavaScript API is not as powerful or as flexible as the earlier VSTO-based add-in solutions. That is true to a large extent, but it is not the whole story. The new Office API is not meant to replace existing Interop-based solutions but to supplement them. An Office App makes more sense when we look at the security, scalability and flexibility that come along with it.

We all, however, do agree that Open XML is a pretty neat way to manipulate Office documents. It offers nearly all the advantages of Interop solutions, and then some. In this post I am going to show you a way to get the full power of the Open XML SDK while using Word Apps.

The new Office API provides a way to get the entire document without any user interaction. The getFileAsync method can return the document either as plain text or in OOXML format. Getting data as plain text is simple, and we don’t like doing simple things. Hence we will get the data in OOXML format, which is returned as a byte array.

One thing to note here is that getFileAsync hands the document back in slices of at most 4 MB each. The code below reads only the first slice, so it handles documents of up to 4 MB, but that should be sufficient for most purposes.
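If you do run into a larger document, the File object that getFileAsync returns exposes a sliceCount property, so the slices can be read one after another and stitched together. Here is a minimal sketch of that idea. The names readAllSlices and done are my own and not part of the Office API, and I am assuming the "compressed" file type, where each slice's data is an array of byte values.

```javascript
// Sketch: read every slice of an Office File object and join the bytes.
// `file` is the value handed to the getFileAsync callback.
// `done` is a hypothetical Node-style callback: done(error, bytes).
function readAllSlices(file, done) {
    var bytes = [];

    function readSlice(index) {
        // All slices read: hand the combined byte array to the caller.
        if (index >= file.sliceCount) {
            done(null, bytes);
            return;
        }
        file.getSliceAsync(index, function (sliceResult) {
            if (sliceResult.status !== "succeeded") {
                done(sliceResult.error || new Error("Could not read slice " + index));
                return;
            }
            // For a "compressed" file, data is an array of byte values.
            var data = sliceResult.value.data;
            for (var i = 0; i < data.length; i++) {
                bytes.push(data[i]);
            }
            // Slices are requested strictly in order, one at a time.
            readSlice(index + 1);
        });
    }

    readSlice(0);
}
```

The caller would still be responsible for calling closeAsync on the file inside done, whether or not an error came back.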

Now that we have the byte array, notice that the Office API also provides a handy method to convert it to a Base-64 encoded string, which we can then POST to any service of our choosing.

    function GetDocument() {
        $('#Loading').show();
        var documentText = null;
        // "compressed" asks for the document in OOXML format, delivered as byte-array slices
        Office.context.document.getFileAsync("compressed", function (result) {
            if (result.status == "succeeded") {
                // Get the file
                var myFile = result.value;
                myFile.getSliceAsync(0, function (resultSlice) {
                    // Close the file only after the slice request has completed
                    myFile.closeAsync();
                    // Check the status of the slice request, not of the outer getFileAsync call
                    if (resultSlice.status == "succeeded") {
                        // We got the file slice. Now we will encode the data and post it to a service
                        documentText = OSF.OUtil.encodeBase64(resultSlice.value.data);
                        $.ajax({
                            type: "POST",
                            url: "../DocumentService.svc/ParseDocument",
                            data: JSON.stringify(documentText),
                            contentType: "application/json; charset=utf-8",
                            dataType: "json",
                            success: function (data) {
                                $('#Loading').hide();
                                $('#Content').show();
                                $('#Content').append(data);
                            },
                            // jQuery calls "error" on a failed request; a "failure" option is ignored
                            error: function () {
                                $('#Loading').hide();
                                $('#Content').show();
                                $('#Content').append("Sorry an error occurred!");
                            }
                        });
                    }
                });
            }
        });
    }
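A word on OSF.OUtil.encodeBase64: as far as I can tell it is a helper that ships inside office.js and is not part of the documented API surface, so it could change without notice. If you would rather not depend on it, Base-64 encoding a byte array is easy enough to write by hand. The sketch below is a plain implementation of standard Base-64 with padding. The name bytesToBase64 is mine, and I have not checked that it matches the Office helper in every edge case.

```javascript
// Standard Base-64 alphabet (RFC 4648).
var BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: encode an array of byte values (0-255) as a Base-64 string.
function bytesToBase64(bytes) {
    var out = [];
    // Every group of 3 input bytes becomes 4 output characters.
    for (var i = 0; i < bytes.length; i += 3) {
        var hasB1 = i + 1 < bytes.length;
        var hasB2 = i + 2 < bytes.length;
        var b0 = bytes[i];
        var b1 = hasB1 ? bytes[i + 1] : 0;
        var b2 = hasB2 ? bytes[i + 2] : 0;

        out.push(BASE64_CHARS.charAt(b0 >> 2));
        out.push(BASE64_CHARS.charAt(((b0 & 0x03) << 4) | (b1 >> 4)));
        // Missing trailing bytes are marked with "=" padding.
        out.push(hasB1 ? BASE64_CHARS.charAt(((b1 & 0x0f) << 2) | (b2 >> 6)) : "=");
        out.push(hasB2 ? BASE64_CHARS.charAt(b2 & 0x3f) : "=");
    }
    return out.join("");
}
```

With this in place, the encoding line in GetDocument could read documentText = bytesToBase64(resultSlice.value.data); and the rest of the function stays the same.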

For this example, we are also going to create a secured RESTful WCF service with a single method called ParseDocument that accepts a string parameter. We will creatively name this service DocumentService.svc.

Now we will POST the document, which we got earlier as a Base-64 encoded string, to this service. On the service side, we will extract the byte array from the string. From this byte array we can easily create an Open XML WordprocessingDocument object. Having a document object opens up a world of possibilities for us to manipulate the document. In this example I am just returning the document’s contents in HTML format back to the app, but you can be more imaginative than that.

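    // Needs: System, System.IO, System.Xml.Linq, DocumentFormat.OpenXml.Packaging,
    // and OpenXmlPowerTools (for HtmlConverter and HtmlConverterSettings)
    // Named ParseDocument so that it matches the URL the app posts to
    public string ParseDocument(string openXml)
    {
        // Turn the Base-64 string back into the raw bytes of the document
        byte[] byteArray = Convert.FromBase64String(openXml);
        using (var memoryStream = new MemoryStream())
        {
            memoryStream.Write(byteArray, 0, byteArray.Length);
            // We got our document on server. Yay!
            using (var wordDoc = WordprocessingDocument.Open(memoryStream, true))
            {
                var settings = new HtmlConverterSettings()
                {
                    PageTitle = "Document Analysis"
                };

                // Convert the document body to an XHTML element tree
                XElement html = HtmlConverter.ConvertToHtml(wordDoc, settings);
                return html.ToString();
            }
        }
    }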
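WordprocessingDocument.Open expects a valid Open XML package, and such a package is a ZIP archive under the hood. So if you want to catch an obviously bad payload before it ever reaches the service, a cheap guard on the app side is to check for the ZIP signature at the start of the byte array. This is only a sketch: hasZipSignature is a name I made up, and passing the check does not prove the document is valid, only that it starts the way a ZIP file does.

```javascript
// A ZIP local file header starts with the bytes "PK", 0x03, 0x04.
var ZIP_SIGNATURE = [0x50, 0x4b, 0x03, 0x04];

// Hypothetical guard: true when the byte array begins with the ZIP signature.
function hasZipSignature(bytes) {
    if (!bytes || bytes.length < ZIP_SIGNATURE.length) {
        return false;
    }
    for (var i = 0; i < ZIP_SIGNATURE.length; i++) {
        if (bytes[i] !== ZIP_SIGNATURE[i]) {
            return false;
        }
    }
    return true;
}
```

In the app, this could be called on resultSlice.value.data for the first slice, skipping the POST and showing the error message when it returns false.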
  

 Demo Solution

So the next time someone asks you whether we can use Open XML with the new Word API, you know what to say!

P.S. - I have attached the working solution. You will need to launch Visual Studio as administrator to run the sample.

OpenXMLDemo.zip