Google support for Open XML formats


I noticed this last week but forgot to blog about it. If you do a Google search and a result is an Open XML file (.docx, .xlsx, or .pptx), Google lets you view it in the browser using its own rendering technology. All three formats are supported, and the results are pretty rich (surprisingly, richer than what they provide for ODF files).


Take a look for yourself. I did a search for .xlsx files:
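
If you want to reproduce that kind of search yourself, here is a minimal Python sketch that builds a query URL using the `filetype:` operator. The function name `filetype_search_url` is made up for illustration, and the sketch only constructs the URL; it does not fetch or parse any results.

```python
from urllib.parse import urlencode


def filetype_search_url(extension: str, terms: str = "") -> str:
    """Build a Google search URL restricted to one file extension.

    Hypothetical helper for illustration: it only assembles the query string.
    """
    # The filetype: operator limits results to files with the given extension.
    query = f"{terms} filetype:{extension}".strip()
    # urlencode percent-encodes the colon and turns spaces into '+'.
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(filetype_search_url("xlsx"))
```

Calling `filetype_search_url("docx")` gives the equivalent query for Word documents.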



If you click on the “View as HTML” link, they transform it into HTML and give you a rich view:



And if you open it in Excel, you’ll see that the preview was actually pretty accurate:



-Brian

Comments (19)

  1. Oliver says:

    So when Google turn up at a TC meeting and comment about the spec being something that they can’t implement are they representing themselves or the corporation?

    Stephen pointed out a while ago that IBM have also implemented the spec in Lotus Quickr 8, although I’ve yet to take the time to look at how far they have got. The product documentation states that you can import OpenXML files into Lotus Quickr without the need for Office on the PC doing the importing.

    http://notes2self.net/archive/2007/08/07/lotus-quickr-8-amp-openxml.aspx

  2. hAl says:

    Interesting feature.

    It looks like only XLSX files are supported that way and not DOCX files yet.

    Also, it seems that a Google search with filetype:docx is still missing a lot of files that I can find working links to using the Live search contains:docx search parameter.

    Anyway, Brian, why does Google support the filetype:docx search parameter whilst your own Live search still does not?

  3. Well, you are right that it is a heck of a lot better than the ODF spreadsheet support, which seems to be nothing but grabbing the text out of the document.  I’m betting that Rob Weir doesn’t blog about this feature.

  4. Christian says:

    Does anybody know why so few docx files are found by Google? I’m pro Office Open XML, but Rob Weir used Google results as an argument against the widespread use of the new file format, so I think the cause should be examined!

  5. A says:

    I think there might be a lot of users who are afraid to switch to Office 2007 because of the new UI. Those are also the same type of people who would be afraid to download the compatibility pack for 2003, fearing that it might "mess up their machine".

    These people don’t care about the format itself, they just don’t want to change the way they work.  For example, I had to search for the fields option several times before I could remember where they were in the Ribbon.  Some users just don’t want to have to deal with that.

    Those who are not afraid of upgrading are aware that there are people out there who are, so they are still saving their files in the 2003 format for the benefit of the others. I know in my office there are a handful of us who are using 2007, but since the bulk of us aren’t, we save our docx files as doc to share them.

    This will change, but there is a lot of inertia in companies ("why invest time and money for the new Office when the one we’ve got is working fine?"; you’ll probably find a lot of companies running Office 2000) and among home users ("you mean I’ve got to relearn everything?").

    I suspect that is just one of many reasons why we aren’t seeing that many files out there.  I’m sure Rob would love to say it’s because of the OOXML to ODF converter, and the lack of docx is because everyone’s converting 😉

  6. Doug Mahugh says:

    Google’s support for the Open XML formats continues to improve. They recently started identifying DOCX,

  7. Brian,

    The weird thing is that the result count has now dropped to 3380 when I search.

    I also found that WebSphere Portal can convert Office 2007 documents to HTML automatically:

    http://publib.boulder.ibm.com/infocenter/wpdoc/v6r0/index.jsp?topic=/com.ibm.wp.zos.doc/wpf/dcs_info.html

    DB2 Content Manager V8.4 supports Open XML – http://www-1.ibm.com/support/docview.wss?uid=swg21288972

    Maybe you can list these as Open XML apps as well 😉 At this rate, IBM will be hogging the list.

    Well it looks like the following orgs are using xlsx:

    Bayer

    http://www.stockholders-newsletter-q3-06.bayer.com/en/bayer_regions_Q3_06.xlsx

    State of West Virginia

    wvde.state.wv.us/materials/STEP7Allocation2007-08.xlsx

    Washington State Board of Education

    http://www.sbe.wa.gov/GraduationRequirementsDatabase1.xlsx

    State of Idaho

    http://www.pte.state.id.us/Forms_Publications/Agriculture/FFA/American%20Honorary%20Teacher.xlsx

    Kentucky Board of Emergency Services

    kbems.kctcs.edu/Senate%20bill%2066%2008%20Apps%20Received.xlsx

    University of North Carolina

    ils.unc.edu/courses/2008_spring/inls261_001/materials/task04.spreadsheets/Formulas_practice.xlsx

    University of Oregon

    https://scholarsbank.uoregon.edu/dspace/bitstream/1794/4956/1/patron-1.xlsx

    East Hampshire Council (in the UK and within Lotus Notes, for extra comedy value)

    http://www.easthants.gov.uk/allservices.nsf/0/F35E89A101CBF23F802573D00033AEA2/$File/Summary+June+06.xlsx

    There were what looked like a bunch of other universities, government departments etc around the world too, but I got bored of looking.

    It will be interesting to see what this looks like in a year’s time.

    I certainly know some of our users are using xlsx, mainly for >65K row support.

    Gareth

  8. Dave S. says:

    @A – One of the things users expect in a compatible application is not that the formats are compatible but that the interface is.

    I’ve been trying Office 2007 for a while and find so many of the commonly used features (for me) carefully hidden in places that are not intuitive. Intuitive really means ‘already familiar with,’ either by direct experience or a migration of one experience to another.

    In my case, that would be 6 wordprocessors, 5 spreadsheets, 5 CAD systems, 5 operating systems – being not intuitive means not using any previous convention.

    Used to be – Insert Break. Now it is Page Layout Break. Except Page breaks don’t really affect the page layout, just which page or how many pages the information is on.

    Yes, there’s a training cost. Imagine a company with 50,000 employees taking just 8 hours apiece at, say, $12 an hour average: that is $4.8 million in wages. Is that an amount you would spend for those employees to do exactly the same job in exactly the same way after also paying for all new software licences?

    I don’t know how many people use MS Office software, but this little interface change is going to cost the US a large amount of money. It’s difficult to imagine a productivity payback to offset it.

  9. Dave S. says:

    Interesting –

    Opera flags http://www.meerlytalking.com as blacklisted. Never had Opera put up a notice like that before.

    Probably their ISP supplies unscrupulous people with web access.

    I can’t see the big deal though. The most complex formula in that sheet is what was available in Lotus 1-2-3 circa 1986; it doesn’t even use named ranges, which were available then.

    Aren’t the precalc’d values stored with the cell numbers anyway?

    One still cannot rename the columns so that a formula such as =sum(Huckabee) can be written with "Huckabee" clearly at the top of the column. Sure one can use row one, place names in the columns (as is done here) but then everyone has to be careful not to rename the "Huckabee" range, or change the row 1 entry to "Paul."

    A look at the sophistication of the .xlsx sheets out there suggests that most users posting their stuff could have done quite well with 1-2-3. I’m a bit hesitant to open the Russian and Chinese sourced files to see what they are like.
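
On the question of precalculated values: in SpreadsheetML, a formula cell can carry both the formula text (an `f` element) and the last computed value (a `v` element), which is what lets a viewer show results without a calculation engine. The Python sketch below illustrates this with a hand-written worksheet part zipped in memory. The sample data, the helper names, and the stripped-down package (no content types or relationships, so Excel would not open it) are assumptions for illustration, and nothing here claims to be how Google's viewer works.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}
SHEET_PART = "xl/worksheets/sheet1.xml"

# Hand-written sample sheet (assumed data): A3 holds a formula and its cached result.
SHEET_XML = """<?xml version="1.0" encoding="UTF-8"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="1"><c r="A1"><v>1</v></c></row>
    <row r="2"><c r="A2"><v>2</v></c></row>
    <row r="3"><c r="A3"><f>SUM(A1:A2)</f><v>3</v></c></row>
  </sheetData>
</worksheet>
"""


def build_package() -> bytes:
    """Zip the sample worksheet part in memory (not a complete .xlsx file)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(SHEET_PART, SHEET_XML)
    return buf.getvalue()


def cells_with_cached_values(package: bytes, part: str = SHEET_PART):
    """Return (cell reference, formula or None, stored value or None) per cell."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read(part))
    cells = []
    for c in root.iterfind(".//m:c", NS):
        f = c.find("m:f", NS)
        v = c.find("m:v", NS)
        cells.append((
            c.get("r"),
            f.text if f is not None else None,
            v.text if v is not None else None,
        ))
    return cells


if __name__ == "__main__":
    for ref, formula, value in cells_with_cached_values(build_package()):
        print(ref, formula, value)
```

A reader that trusts the stored `v` values can display A3 as 3 without ever evaluating `SUM(A1:A2)`.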

  10. JasonG says:

    I found searching for more advanced xlsx documents to be illuminating. Some of the more advanced formatting options are not supported in the HTML view, but many formulas are. I checked out "Cell Conditions" and "ANSI Character Set".

    Of interest is the differing off-by-one results between http://librayeung.com/excelwrong.xlsx and http://209.85.165.104/search?q=cache:jWY5XVnAWVIJ:librayeung.com/excelwrong.xlsx+filetype:xlsx&hl=en&ct=clnk&cd=3&gl=us

    Kudos to the Google team! I wonder when we’ll see compatibility in Google Docs and Spreadsheets?

  11. wmigda says:

    Yeah, that is something: displaying simple spreadsheet data, great success! Do you honestly think that displaying this spreadsheet is the hardest thing a developer could face when attempting to implement at least a tiny part of the over-6000-page-specification non-ISO ECMA-pay-per-validation OOXML format? I bet they (Google) will stop right there or soon after, since OOXML as it is defined now is a developer’s worst nightmare (or they have at hand a sizeable bunch of spare cheap monkeys willing to dig in and improve OOXML support in Google).

  12. I have read a number of blog posts and articles recently that demonstrate the groundswell of support

  14. The older I get, the harder it is for me to escape the conclusion that the world is full of hypocrisy. Sometimes it even

  16. A while ago I told you about the new Office Open XML document format and its path through the

  18. With the ballot resolution later this month, the temperature is high around the issues on Open XML. 
