MS Filter Pack released!


I’m pleased to announce that after months of blood, sweat and toil, the MS Filter Pack is finally available! The package can be downloaded from:


 http://www.microsoft.com/downloads/details.aspx?FamilyId=60C92A37-719C-4077-B5C6-CAC34F4227CC&displaylang=en


Contents:


The filter pack includes the following IFilters:


·         Metro (.docx, .docm, .pptx, .pptm, .xlsx, .xlsm, .xlsb)


·         Zip (.zip)


·         OneNote (.one)


·         Visio (.vdx, .vsd, .vss, .vst, .vsx, .vtx)


Supported Products:


·         SPS2003, MOSS2007, Search Server 2008, Search Server 2008 Express


·         WSSv3


·         Exchange 2007


·         SQL 2005, SQL 2008


·         Windows Desktop Search 3.01, WDS 4


Overview:


·         The Filter Pack installs the above IFilters on the machine


·         Each IFilter is registered with Windows Indexing Service


·         Each product above has a corresponding KB to describe how to register the filters


Q&A:


“I noticed <product X> is not listed as a supported product, why is it not included?”


          When we created the project plan we came up with the list of Microsoft Search products that we would be supporting.  During the project lifecycle we’ve tested to ensure that the Filter Pack works properly with each of these products.  We will work to determine if any new Search products can be supported in the future.


 


“Is the Filter Pack localized for <language y>?”


          The Filter Pack will be localized into 36 languages (listed below). It has been handed off for localization, and details will be posted as they become available. At the time of release (12/18), the Filter Pack is available in en-us only.

          Fully Localized SKU Languages    Language Pack Languages
          Arabic                           Bulgarian
          Brazilian                        Croatian
          Chinese (SC)                     Estonian
          Chinese (TC)                     Hindi
          Czech                            Latvian
          Danish                           Lithuanian
          Dutch                            Romanian
          English                          Serbian (Latin)
          Finnish                          Slovak
          French                           Slovenian
          German                           Ukrainian
          Greek
          Hebrew
          Hungarian
          Italian
          Japanese
          Korean
          Norwegian (Bokmal)
          Polish
          Portuguese
          Russian
          Spanish
          Swedish
          Thai
          Turkish


“Is the Filter Pack available for x64/x86?”


          The Filter Pack will be available in both x64 and x86 – there are two separate downloads (same location).


 


“What about the TIFF/MODI IFilter?”


          Unfortunately, at the time of release, the TIFF filter is not shipped with the Filter Pack. We do understand how important the issue is for our customers and will be working on providing an alternative solution.


Comments (66)


  3. MikeH says:

    Hi!

    Nice to see the support for Office 2007 file formats, but how about XPS?

    You mentioned Metro, but in my understanding Metro also includes XPS, so is it already working?

    Regards

    Mike

  4. debh says:

    Mike, XPS is a Windows component and has to be downloaded separately. The filter comes with the XPS Viewer. As a matter of policy, we do not ship Windows components in Office products.

    HTH,

    Deb.

  5. MikeH says:

    Deb, in http://blogs.msdn.com/ifilter/archive/2007/03/24/indexing-xps-documents-with-moss-2007.aspx#5424346 you said the filter registration is on your radar, so I thought it is still not possible to use XPS inside of WSS 3.0?!?

    What do I have to do to use XPS on WSS 3.0? http://blogs.msdn.com/ifilter/archive/2007/03/24/indexing-xps-documents-with-moss-2007.aspx describes only the way for MOSS 2007.

    Just to speculate to get it done: I install the XPS Essential Pack, then I follow the instruction in http://support.microsoft.com/?id=946338 and use .XPS and {1E4CEC13-76BD-4ce2-8372-711CB6F10FD1}??? If yes, any problems with foreign languages?

    Thanks for your help!

    Mike

  6. debh says:

    Mike, this should work. As for localization issues, our loc team is still testing the Filter pack for localized languages.

    Off the top of my head, I don’t see any problems with registration for foreign languages. If you find any, please let us know and we’ll communicate it to the localization team.

    regards,

    Deb.

  7. David Bailey says:

    Is this filter pack supposed to do Full Text searches?  We have it installed and it doesn’t return results for information contained within these filetypes.

  8. Jamson says:

    I also encountered the same problem. Who can tell me how to solve it? Thanks!

  9. debh says:

    Are you referring to SQL full text searches?

  10. Sudeep says:

    Have any of you been able to get the onenote documents to appear in MOSS searches?

  11. eric says:

    We are running SQL 2005 in 32-bit mode on a Windows 2003 x64 server. I see in the event log that the Filter Pack was installed, but even after restarting the server, the file types are not listed.

    select document_type, path from sys.fulltext_document_types

    The Fulltext Indexes uses language "neutral." Could that be my problem? Or is it the x32 SQL on a x64 Platform?

  12. David Bailey says:

    I’ve got the Adobe PDF filter installed and MOSS returns text within the document. Is the Filter Pack supposed to do the same with these document types?

  13. debh says:

    David, the Filter Pack does not include a PDF filter. It’s only available from Adobe and Foxit.

    BTW, did you get the SQL full text search to work ?

    Thnx,

    Deb.

  14. debh says:

    Eric, we don’t support mixed mode installations.  The Filter Pack version must match the OS architecture.

    Hence: 32-bit SQL, 64-bit Filter Pack, 64-bit OS won’t work (as attempted below).

    HTH,

    Deb.
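Editor's note: the rule in Deb's reply can be written down as a small check. This is only a sketch of the rule as stated in this thread, not an official support matrix. The function name and parameters are made up for illustration, and it assumes the host application (SQL Server here) must share the architecture as well, which is how Eric's failing setup reads.

```python
def is_supported_combination(os_bits, filter_pack_bits, host_bits):
    """Return True when the OS, the Filter Pack, and the host application
    (for example SQL Server) all share one architecture.

    Assumption: this encodes the "no mixed mode" rule from the thread,
    nothing more.
    """
    return os_bits == filter_pack_bits == host_bits


# Eric's setup: 32-bit SQL Server, 64-bit Filter Pack, 64-bit Windows.
print(is_supported_combination(os_bits=64, filter_pack_bits=64, host_bits=32))
```

For Eric's configuration the check returns False. A fully 64-bit or fully 32-bit stack returns True.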

  15. David Bailey says:

    I guess I stated my question wrong.  I was wondering if MOSS 2007 does full text searching on these types of documents (docx, vsd [which the crawler states the wrong IFilter is installed], zip, etc.) because the crawler is not picking up the content in these files.  Does MOSS do a full text search on the document types listed above?

  16. debh says:

    David, MOSS does do a full text search for docx and vsd files. There was a slight mistake in the Filter Pack KB: the location given for one of the registry keys was wrong. Can you please give it a shot after reading the "Errata to Filter Pack KB" published on this site?

  17. David Bailey says:

    It appears the correction is only for a one note type and doesn’t mention anything about vsd or docx types.

  18. deb says:

    David, the first correction applies to all the filters. You should set the values under this reg key:

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Setup\ContentIndexCommon\Filters\Extension

    The CLSIDs of all the filters except OneNote are correct.

    HTH,

    Deb.

  19. David Bailey says:

    Deb, Here is the response from the crawl logs after running the errata.

    Crawled (The filtering process could not process this item. This might be because you do not have the latest file filter for this type of item. Install the corresponding filter and retry your crawl. )

  20. David Bailey says:

    Deb, I re-ran my crawl and the full text search on vsd files works! What’s the story on zip files? Thanks for the help.

  21. Wolfgang says:

    On my test system I can confirm that .docx files contained in a .zip file are filtered. Search results do show up in a MOSS 2007 in this case. However: .doc and .txt files included in a .zip archive are not filtered. I also used "ifilttst /i test.zip /v 3 /d" and examined the resulting .dmp file. This confirmed that only the .docx file was accessed. Any hint? (.txt and .doc files themselves are filtered, of course)

  22. Allan Pedersen says:

    I’m configuring a SharePoint search site that should support indexing of TIFF documents. Unfortunately, this is no longer supported in MOSS 2007.

    Does anyone have any experience with TIFF IFilters?

  23. Deb says:

    Currently there is a problem filtering Office 97-2003 files within the Zip package. We’re looking into the issue.

    There are no OOB TIFF filters in MOSS 2007. We’re evaluating some solutions presently. I’ll post an update once we have an ETA.

  24. Andy says:

    TIFF filter is important as this gets some OCR-like features.

  25. Troy says:

    Does the iFIlter for OneNote work on a x64 system (Windows 2008/MOSS2008)?  It does not appear to.

  26. Query123 says:

    Is this included in Vista SP1 or Office 2007 SP1? Thanks.

  27. Deb says:

    No, the Filter Pack is not a Windows or Office component and hence has to be downloaded separately.

    Regarding the earlier post: the OneNote filter should work on an x64 system.

  28. Bill says:

    Is there any news on the alternative solution for the TIFF filter which was not shipped with the Filter Pack? You suggested last August this would be shipped in the Pack. Then, in December, you said you were working on this alternative, but there seems to be no further information over the last four months. Just the barest bones of a roadmap would be helpful.

  29. FrankHi says:

    This question had been asked a month ago already. My customer urgently needs this. Is there ANYTHING except "sorry again" I can tell them?

    Thx

    Frank

  30. Grobiii says:

    Hello, I have two systems, one x64 and one x86. I installed the MS IFilter pack on both. On x86 everything works fine, but I can’t search for vsd files on the x64 system. The crawl log shows no errors, it just says "*.vsd file crawled", but the search gives no results. 🙁

    any known workaround?

  31. Nick says:

    Another hand in the air for TIFF OCR content indexing, or ANY idea of timescales.

    Also, Acrobat iFilter (Acrobat 8.1.2) does not work out of the box, I have compiled instructions here (from various sources):

    http://nickwhite.spaces.live.com/blog/cns!94355F53A65D0989!734.entry

  32. Leon says:

    It is now May 2008 and I still haven’t found any localized version of these filters. I wonder why we don’t use docx and Office 2007…

  33. Rob Polachek says:

    This may have been answered previously so I’m sorry if it is redundant but I’m having trouble extracting text from a .docx file with the x64 filter pack and FiltDump. This works fine on a x32 machine and yes, the x64 machine is completely 64 bit (Win Server 2003, SQL 2005)

    I get the following error: Error 0x80004005 loading IFilter

    Anyone else have a similar issue? Thanks

  34. Captaris TIFF iFilter for Microsoft SharePoint Server 2007, Microsoft Search Server Express and Microsoft Search Server 2008.

    see

    http://www.captaris.com/tiff_ifilter/index.html

    Best Regards

    Charly


  36. tjk says:

    How will Office documents of previous version be handled? This pack does not explicitly list .doc, .xls as supported. For example the text extracted from a .docx is fine, but from a .doc is not – words are split (‘wit h’ instead of ‘with’) for no reason (no break of any kind exists).

    I believe, that before the filter pack, IFilters for office documents were part of a Windows installation. Have those been removed when I installed the filter pack?

    Thanks.

    tjk 🙂

  37. Ralph says:

    Post Infrastructure update and hotfixes, attempting to crawl VSD files and getting that error:

    Crawled (The filtering process could not process this item. This might be because you do not have the latest file filter for this type of item. Install the corresponding filter and retry your crawl. )

    (By the way, it is a "success message" rather than an error. Can you finally, once and for all, FIX that?? It’s So Sloppy!)

    This is 64-bit Windows Server 2003 R2, multi-processor (2-16 depending on the server in the farm). I see the latest download date is Oct 08, which suggests you fixed something … or did you break something instead?


  39. UJ says:

    Does anybody know which additional properties are supported by Filter Pack 2007 that were not present in Filter Pack 2003? Is there any documentation available for Filter Pack 2007?

  40. deb says:

    AFAIK, there was only one filter pack, which was released in 2007. Are you referring to Office 2003 ifilters ?

  41. UJ says:

    Typo in my earlier question, yes deb I was referring to office 2003 ifilters. I need to know what all additional properties are supported in Filter Pack 2007 which were not part of Office 2003 ifilters.

  42. deb says:

    UJ, the Filter pack ships the ifilter for office 2007 documents. It does not ship the office 2003 ifilter as this is shipped with the OS.

    I don’t believe we have a detailed list of properties that can be handled by both the filters – it’s usually whatever properties are supported by the document format. I’ll ask around to see if I can find an official list.

  43. UJ says:

    Hi Deb, any luck in getting list of properties/keys for each IFilter in Filter Pack 2007?

  44. Paul Morton says:

    It seems that this filter pack does not dump the contents of embedded objects (i.e. an Excel sheet embedded in the body of a Word document). I believe that there is a version available for SharePoint which will also filter the embedded content. Can you point me in the right direction?

  45. Jim says:

    I have tested this on a Vista 64 bit system and on XP 32 bit system.  IFilter does not work with .docx files.  

    So, is there a fix?  

  46. Jim says:

    Hello?  Microsoft?  Can you tell me if IFilters for .DOCX files work on Vista x64?  Is this BLOG dead, IFilters project forgotten about, or just embarrassingly does not work?  

  47. Josh says:

    Hi, Jim. Sorry it took a while to get back to your comment. All of the filters should be working on Vista. Can you be more specific with what you’re seeing? All of our testing shows the filters are working correctly. Have you installed the December CU, listed above?

  48. Jim says:

    Well, I now believe it is my problem. Yesterday I ran Citeknet.IFilterExplorer.x64 on my system and did not see DOCX listed. I ran the tool on an associate’s Vista system and DOCX was listed as processed by offfiltx.dll. Two others with Vista 64 also had the DLL. I searched for that DLL on my system and could not find it! I am on another project for the next few days, so I’ve not been able to explore this further. Is there a secret to installing the DLL if I copy it from another system? Do you think this is my problem?

    Our IT dept says my system has all of the current updates.

    In appreciation for your response – THANKS!

  49. deb says:

    Jim, copying the DLL should be fine provided the registry entries from your last installation were not corrupted. The safe thing to do would be to reinstall the Filter Pack. Make sure the IFilter Explorer shows docx as a listed type.

  50. Jim says:

    Hello Deb!

    I reinstalled MS Office 2007. It still did not correct my problem. I installed office2007sp2-kb953195-fullfile-en-us.exe and that did not help. I had someone else run my test and it failed processing DOCX files the same as it did on my system. All I want to do is extract the text content from various MS Office documents, as well as others, from within a C# program. The Office documents are the shortcoming.

    I am trying the following "Code Project" example as a starting point. It works fine for .DOC files but fails for .DOCX files. Can you lead me to a means of successfully using MS-supplied IFilters to pull the text from .DOCX files from a C# program? (I’m running Vista 64, but this needs to run on a wide range of PCs.)

    Thanks,

    Jim

    http://www.codeproject.com/KB/cs/IFilterReader.aspx?fid=1532771&df=90&mpp=25&noise=3&sort=Position&view=Quick
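Editor's note: a .docx file is a ZIP package whose main text lives in word/document.xml, so the text can also be read without going through an IFilter at all. The Python sketch below shows that swapped-in approach. It is not the IFilter route Jim asks about and not what the Filter Pack does internally. The function names are made up for illustration, and the sample package built here holds only the one part the reader needs, so it is not a file Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(source):
    """Return the text of each paragraph in a .docx (path or file-like object)."""
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across one or more <w:t> runs.
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return paragraphs


def make_sample_docx(paragraph_texts):
    """Build a minimal in-memory package for demonstration only."""
    body = "".join(
        "<w:p><w:r><w:t>{}</w:t></w:r></w:p>".format(text)
        for text in paragraph_texts
    )
    xml = (
        '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main"><w:body>{}</w:body></w:document>'
    ).format(body)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


print(docx_paragraphs(make_sample_docx(["Hello", "Filter Pack"])))
```

This ignores headers, footers, footnotes, and document properties, which live in other parts of the package.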

  51. Deb says:

    Jim, can you please try the following:

    1. Check the directory D:\Program Files\Common Files\Microsoft Shared\Filters and ensure offfiltx.dll is there.

    2. Use FileMon and RegMon with a "filter" set on offfiltx.dll and see if there are any errors.

    3. Use filtdump, which ships with the Windows SDK, to filter the Word files and send us the error code.

    thanks,

    Deb.

  52. Jim says:

    Deb,

    Sorry, the only thing I can do is verify the directory location for offfiltx.dll. It’s in:

         C:\Program Files\Common Files\Microsoft Shared\Filters

    Beyond that, I am running Vista 64, and neither FileMon nor RegMon works in that environment. Each issues a warning to use Process Monitor and then exits. Process Monitor does not have a filter for DLLs, so I filtered on the path of the DLL and the test program. I see no references to offfiltx in the trace output. The non-DOCX test cases work and do extract text from DOC files and PDFs.

    I checked with a co-worker who has the Windows SDK v6 installed, but he can’t find anything by the name of filtdump.

    I’m stuck?

  53. Deb says:

    Jim, just wanted to clarify whether you can find pptx and xlsx files when you search in Vista x64 ?

    BTW, do you have an email we can reach you at ?

    -Deb.

  54. Jim says:

    Good morning Deb,

    I installed the SDK for Vista.  No filtdump was included.  I did ask another co-employee not running Vista and he found it on his system and he said he could see the text of his documents with various attributes.  So it would seem the big problem here is Vista!

    Yes I do have email.  If you could, can you remove it before posting this note.

    j g h  at  m e t a f i l e  dot  c o m

  55. Jim says:

    Deb, More info –

    The source of these programs were from a non Vista system (XP 32 bit).  I ran them on my 64 bit Vista  system.

    I ran FiltDump  on a docx file.  The result is: "Error 0x80004005 loading IFilter"

    I ran FiltReg and it did NOT list .docx at all.  It did list .doc files as processed by "Microsoft Office Filter (%systemroot%\system32\OffFilt.dll)".

    Again,

    Thanks

  56. Jim says:

    I guess I should briefly describe what I have found to close loose ends.

    The IFilter loaded on Vista may very well work OK with MS applications that are matched for the OS and platform (32/64).  My issue surfaces when I attempt to process a DOCX file from a 32 bit application on a 64 bit OS.  My limited understanding of this (I mean limited) is that when the new IFilters (which handle DOCX and other X files) are installed, the system register entries are not setup to handle them when the 32 bit to 64 bit crossover lookup takes place.  The cross links are missing in the registry.

    I would like to thank Josh of MS for his very informative and skilled help in resolving this situation. And Deb for hanging in and supporting this blog.
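Editor's note: Jim's description matches how 64-bit Windows keeps separate COM registrations for 32-bit and 64-bit processes. A 32-bit process resolves classes under HKEY_CLASSES_ROOT\Wow6432Node\CLSID, so a filter registered only in the 64-bit view is invisible to it. The Python sketch below illustrates that lookup rule. The function name and the placeholder CLSID are hypothetical, and it only computes the key path rather than reading the registry.

```python
def com_class_key(clsid, process_bits, os_bits):
    """Return the HKEY_CLASSES_ROOT subkey a process consults for a COM class."""
    if process_bits not in (32, 64) or os_bits not in (32, 64):
        raise ValueError("bitness must be 32 or 64")
    if process_bits > os_bits:
        raise ValueError("a 64-bit process cannot run on a 32-bit OS")
    if os_bits == 64 and process_bits == 32:
        # WOW64 redirects 32-bit COM lookups to the Wow6432Node view.
        return "Wow6432Node\\CLSID\\" + clsid
    return "CLSID\\" + clsid


# Hypothetical placeholder, not the CLSID of any real filter.
PLACEHOLDER_CLSID = "{00000000-0000-0000-0000-000000000000}"

print(com_class_key(PLACEHOLDER_CLSID, process_bits=32, os_bits=64))
print(com_class_key(PLACEHOLDER_CLSID, process_bits=64, os_bits=64))
```

The two calls return different keys. That is why a 32-bit program can fail to load a filter that 64-bit tools on the same machine load without trouble.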

  57. nabeel says:

    I face the same problem with SQL Server 2008 full text search with .docx, I can’t find text inside those files, it works for .doc my operation system is windows vista 64, any fix to this problem.

    Thanks,

    Nabeel.

  58. jteitel says:

    Nabeel,

    Could you give some more details as to your problem? I assume you’ve installed the Filter Pack. Have you followed the instructions for registering the filters with SQL Search? They can be found at http://support.microsoft.com/default.aspx?scid=kb;en-us;945934

  59. stabilo says:

    I’m running into similar problems to the last few posters with .docx not being indexed by SQL Server 2005 FullText Search.

    I’m running Vista Enterprise SP1.

    I installed the Microsoft Filter Pack from here:

    http://www.microsoft.com/downloads/details.aspx?FamilyId=60C92A37-719C-4077-B5C6-CAC34F4227CC&displaylang=en

    I then registered it with SQL Server 2005 to work with Full Text Search, using these instructions:

    http://support.microsoft.com/default.aspx?scid=kb;en-us;945934

    I did the same with the PDF IFilter from Adobe.

    The end result is that the Office 2007 file formats such as .docx are not indexed by FTS. But no such problem with PDF files!

    If I run the query:

    select document_type, path from sys.fulltext_document_types where document_type = '.docx'

    I get:

    C:\Program Files\Common Files\Microsoft Shared\Filters\offfiltx.dll

    So any ideas why this should continue to be failing? Reading this thread and seeing the same problem reported over a year ago with arguably no solution does not seem very promising.

  60. Anshul says:

    Hi i am unable to search/filter some content in my .xlsx file.

    I had latest version of offfiltx dll registered with me.

  61. nabeel says:

    Yes I Installed the IFilter and follow the instruction in two operating systems Windows 7 and Windows Vista but no luck to find content in .docx or any office 2007 files, the FTS is working fine with pdf,txt,and office 2003 files.

  62. wa says:

    same problem here…cannot read .docx files

  63. VV says:

    On Windows Server 2008 (64-bit), I am failing to create an instance of the Office 2007 Filter for Office Documents (CoCreateInstance against offfiltx.dll)

    After reading the above posts, I tried the FiltDump.exe (downloaded the Windows Search 3.X SDK) against a Word 2007 document (.docx) and got the following error:

    Error 0x80030002 loading IFilter

    Is there anything else I need to do/install to get the filtdump.exe to work?

    Could someone shed some light here?

    Thanking you in advance.

    Best Regards,

    VV

  64. Above MCSE says:

    The 32bit Office 2010 Filter Packs will not install on Windows 7 x64. This is a problem for 32bit software that needs an IFilter interface to access the plain text for .docx files and others. Since the default Windows 7 x64 supports 32/64 for .doc files, why is this a problem now?