Follow up on questions around PDF support in Office "12" (part 1 of many)


Sorry for not posting any replies to the PDF questions sooner; it’s been pretty busy. I have a list of all the questions and I’ll try to write up responses to everything over the next week or two. Today, there are a couple of the earlier questions that I wanted to address.


Do we support tagged PDF (for accessibility)?


A number of folks asked this question. While we do not support all possible tags (I don’t know of many applications that do), we do support a number of them. We support tags that assist with text flow, ALT text, and Unicode text in every application except InfoPath, Access, and OneNote (at this time). This is an area we would love to get more feedback on. Are there other structures you would like to see tagged?


If we do support tagged PDF, what kind of semantics will Office 12 require the author to use (e.g. headings, styles)?


We are supporting basic tags that do not require special authoring; they are generated automatically by the application from the document data. Specifically, though, we are not supporting Heading tags at this time. However, we would be interested in any feedback people may have in this area.


There is of course a challenge here as we start trying to support tagging for other types of structures. Everything is full fidelity for the view, so what you see in Office is what you’ll see in the PDF. Beyond the visual output, though, there is additional functionality, like tagging, that is really useful in a number of cases. In Word, for instance, we have some structural features that would be nice to represent in the PDF with tags. The problem comes with more complex layouts, where text that reads continuously in the document is actually disconnected in the PDF output. This is a challenge with any print-based format. Like I said though, we’re definitely open to suggestions and would love feedback on what type of support you’d like to see here.


Why is this built directly into Office as opposed to a print driver in Vista?


I can’t speak for what the Vista team plans to do, but I want to be clear that this is actually much more powerful than just going through a printer driver. Folks had mentioned that the Mac OS had built-in support for PDF, but that is through printer drivers [Correction: looks like I was wrong here, see first comment]. Because this is a “native” solution where we are working with application data that is upstream from where a printer driver solution begins, we are starting with much more complete information about the file. That allows us to do a better job tagging documents and implementing interactive features like internal and external hyperlinks. Additionally, transparency and gradient quality will be much better.


Is this publish only? Or will we support opening PDFs as well?


This is a publish-only, one-way operation. We are neither shipping a special viewer nor doing any work to make PDF files readable by the Office applications.


What choices were made with font embedding/subsetting/outlining?


As a rule, we embed and subset embeddable fonts (when permitted). We are not doing font outlining in the general case.
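
For readers wondering what "when permitted" usually refers to: TrueType and OpenType fonts carry an fsType field in their OS/2 table that states the font's embedding permissions. The sketch below shows one way an exporter could turn that field into an embed/subset decision. It is an illustration under stated assumptions, not a description of how Office makes the decision; the function name and the return values are hypothetical, and only the bit values come from the OpenType specification.

```python
# Illustrative sketch only: interprets the OpenType OS/2 fsType embedding bits.
# This is NOT Office's implementation; embedding_plan and its return values
# are hypothetical names chosen for this example.

RESTRICTED = 0x0002     # restricted license: font must not be embedded
PREVIEW_PRINT = 0x0004  # may be embedded for preview and print
EDITABLE = 0x0008       # may be embedded in editable documents
NO_SUBSETTING = 0x0100  # font may not be subsetted before embedding
BITMAP_ONLY = 0x0200    # only bitmaps in the font may be embedded


def embedding_plan(fs_type: int) -> str:
    """Return 'skip', 'embed-full', or 'embed-subset' for an fsType value."""
    # In older fonts several permission bits can be set at once, and the
    # least restrictive one applies, so only skip when RESTRICTED stands alone.
    if (fs_type & RESTRICTED) and not (fs_type & (PREVIEW_PRINT | EDITABLE)):
        return "skip"
    # Assumption for this sketch: outline embedding is what we want, so a
    # bitmap-only font is treated as not embeddable.
    if fs_type & BITMAP_ONLY:
        return "skip"
    if fs_type & NO_SUBSETTING:
        return "embed-full"
    return "embed-subset"


if __name__ == "__main__":
    for value in (0x0000, 0x0002, 0x0008, 0x0104, 0x0200):
        print(f"fsType=0x{value:04X}: {embedding_plan(value)}")
```

A font with fsType 0 (installable embedding) falls through to the subset case, which matches the "embed and subset" rule of thumb above.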


Where’s the innovation? 🙂


Well, as I said in the comments, this particular feature is about responding to customer demand. We’ve had a ton of people ask us for this support, and so we’re providing it. I don’t really care who already has this functionality; it has nothing to do with why we did this. You’re seeing all kinds of press around this because it’s something that a lot of people have been asking for and a lot of people care about. There are tons of great blog entries out there from people who are excited that we are building this directly into the product. I don’t think anyone has claimed that this particular feature is about innovation (unlike something like the new user interface). It was a lot of work to build though. It’s natively supported in Word, Excel, PowerPoint, Access, Publisher, OneNote, Visio, and InfoPath, which is a really big undertaking.


OK, sorry I haven’t had a chance to answer all the other questions yet. It’s really busy right now as we’re trying to get stuff tightened down for the first Beta coming out in the next couple of months. I’m also trying to pull together a couple of posts dealing with the questions around OpenDocument, and that’s taking some time as well. Not to mention, I have a lot more fun talking about scenarios and things you can do with the formats, so I really want to get back onto those topics. I’ll try to get to the other questions around PDF though in the next couple of days.


-Brian

Comments (24)

  1. Rosyna says:

    OS X’s support for PDF is not through any printer driver. OS X’s native format is PDF. All of its drawing is based on "Display PDF". Its coordinate system is identical to PDF’s as well.

    You can also choose to save to PDF instead of printing to a printer in the print panel. I think that’s where the confusion you’re having stems from.

  2. BrianJones says:

    Thanks for the clarification Rosyna. I updated the post just to mention that I was wrong there and for folks to look at your comment for more clarification.

    -Brian

  3. anon says:

    "Word, Excel, PowerPoint, Access, Publisher, OneNote, Visio, and InfoPath, which is a really big undertaking. You can’t build new functionality like that."

    If you are using the GDI output, then I wonder what makes you think it’s hard to get PDF supported in all of those applications. After all, GDI is application-independent.

    If you are outputting a native XPS with each application, converted then to PDF, then it’s indeed a totally different story. I wouldn’t bet much on that.

  4. scot says:

    Brian,

    I am no fanboy of Microsoft but I have to say this is really big, and thank you!! This goes a long way toward cross-platform viewing and long-term storage of documents.

  5. BrianJones says:

    I thought it would be obvious, but I guess not. Don’t use profanity in your comments. If you do they’ll be deleted.

    Here’s what the original post said (cleaned-up):

    "What about mentioning you were wrong on anything you said about OpenDocument and correcting your FUD-infested articles? We don’t give a **** about PDF support in Office Bloatware Edition v12, we want to know why you correct obvious mistakes and leave obvious lies unchanged."

    As I said, please don’t use profanity. In addition, try not to use the term "we" without letting me know who the "we" is that you are referring to. If you read a lot of the news, blogs, comments, my e-mail, etc. you would see that a lot of people care about the PDF support. In order to better understand, it would be nice to know who "we" is.

    I haven’t lied on anything I said around OpenDocument. There were some unclear subtleties around licensing and IP that I was asking about. A few days after my post they updated the site and now it looks like things are a bit more clear.

    Also, please try to keep the comments targeted at the subject of the post. Thanks!

    -Brian

  6. Rosyna says:

    About the fonts thing. Are you just going to embed the characters needed in the font to render the PDF? Are you going to skip common ones like Arial (and the others that are embedded in PS printers)?

    And the one question sounds like Office won’t be able to open the PDF files it just created. Is that correct?

  7. yugal joshi says:

    Well it seems that you have "listened" to customers’ queries only when your purported Open standard XML based Office 12 was rejected by MA Commonwealth. Which means that though there was not even a hint of this feature until recently, its sudden advent meant that you had this functionality ready for quite some time but did not give it to the customers because it was not costing you money. But now as you need to support an open document format like PDF you were quick to add it to Office. This is hypocrisy masquerading as customer benefit.

  8. Craig Ringer says:

    That’s great news about font embedding, since that behaviour should be suitable for customer submissions from Publisher, etc for prepress. It’ll finally be possible to handle Publisher docs at work 🙂

    Brian: Thanks very much for your patience with all this, and thanks for the informative update.

    Rosyna: That’s an important question, and I’d also be interested in knowing. If the "standard 13" fonts are omitted by default, in my view it’s really important that there be an option like "embed all fonts" to include them. The "standard 13" aren’t as uniform as would be nice, and embedding them is necessary for prepress use.

    I also think it’d be good to know if the PDF support will output text with no colour component in /DeviceGray, not /DeviceRGB. This is important when accepting jobs for prepress, such as from Publisher.

    Also, if not all fonts can be embedded (eg due to licensing limitations), it’d be great if the user was presented with a warning about this. This will save a lot of frustration for some users.

  9. Mark Nethercott says:

    When you talk about limited tagging and excluding headings from tagging, does that mean that the PDF output will or won’t have the ability to build what Adobe call a table of Bookmarks? This is the ‘table of contents’ that sits in the Adobe bookmarks panel and allows user navigation.

  10. Rick says:

    Having PDF support native in Office would be great but, if it’s to be a real alternative to buying the full Acrobat, it has to produce PDFs which are at least as accessible as the current PDFMaker plug-in.

    While some of the tags may well be expendable as you suggest, heading tags surely are an important omission – the headings in a document outline its basic structure, provide a quick overview of its content, and allow automatic bookmarking as mentioned by Mark.

    I doubt Adobe will be too forthcoming to help you with this one, so why not approach the vendors of accessibility software for their advice on which tags it is important to support. Get the backing of organisations such as the RNIB.

    Knowing such consultation had taken place would be the best way to give me confidence that the Office PDF feature was the one to use.

  11. FARfetched says:

    First, while I’m no fan of Office in general, I have to say this will be a nice feature. OK, OK, I’m biased: I’m thoroughly cheesed at Adobe for dropping MacOS support for FrameMaker & seeing Adobe in for some pain is gratifying. 🙂

    Questions…

    First, will there be support for bookmarks and links (inside and outside the PDF)? You said "it’s for publishing only" — if that means *print* publishing, obviously you don’t need those features. But a well-designed PDF can be pretty interactive. Note that "tagged PDF" is a bit of a different animal than what I’m talking about here; tagged PDF can be re-flowed on the fly for use with PDA-size screens.

    Second, what kind of configuration options do you expect to give users? Adobe Distiller, for example, can downsample bitmaps to make a smaller PDF. If you plan to support bookmarks, you’ll probably have to allow for specifying the deepest heading level to "pick up" and perhaps even allow for paragraph formats other than Heading1..9.

    Statements:

    OpenOffice’s PDF export is incredibly quick. At least on NeoOffice (the OSX-native version), it’s about twice as fast as saving to OO native format. It’s not that important, but at least it’s impressive. 😉

    What’s more important to me than PDF export (partly because, as a Mac user, I already have it) is whether Office 12 (or whatever the MacOS version will be called) has cleared up stability and corruption issues. As a technical writer, I’ve lost enough work to Word application crashes and corruption for unknown reasons that I’m loath to take it seriously as a work tool. I’ve also received some seriously gnarly Word files from co-workers that OpenOffice was able to deal with better than Word itself.

  12. Joe Clark says:

    Rick, there is no reason whatsoever to assume that Adobe might not be "too forthcoming" with help for Microsoft to get tagging right. The PDF/UA working group was just on a conference call this week where I mentioned the upcoming PDF support, and I immediately posted to the mailing list about it with a stern warning that we have to guide MS right now on this.

    The record shows that Adobe is as helpful as it can be (with only a few developers working on accessibility) with third parties, including me.

    I’ve met the person at RNIB who is studying office (small-O) formats and PDF and am not convinced he could be guiding Microsoft on the *structure* of its tagged PDFs. User interface, yes.

    BTW, PDF/UA has a Weblog.

    http://pdf-ua.blogspot.com

  13. Neil says:

    Will Office 12 PDF generation support pdfmark for embedded objects? If so, will the mechanism for inserting custom PDF elements still be the print field or will there be some new mechanism?

  14. BrianJones says:

    Hey folks, Cyndy Wessling is now blogging on all the details of the PDF support coming in Office ’12’: http://blogs.msdn.com/cyndy_wessling/archive/2005/10/08/478419.aspx

    -Brian

  15. Darts says:

    Will there be support for bookmarks and links?

  16. BrianJones says:

    Hi Darts,

    Yes, there will be support for bookmarks and links.

    -Brian
