Recent IFilter implementation and deployment questions.


Please post your questions and comments about using, implementing, and deploying IFilters to work with Microsoft Search products here. If necessary, discussion topics of broader interest will be split into separate threads.

Comments (102)

  1. debh says:

    The search daemon that loads the filter is a 64-bit process in 64-bit SharePoint. In general, a 64-bit process cannot load a 32-bit COM DLL.

    The same is applicable for 64 bit WDS(Windows Desktop Search).

    On a similar note, if you’re writing custom IFilters as plugins for SharePoint/WDS, please compile them into 64-bit DLLs so that they can be loaded by the 64-bit search daemons.
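
A quick way to check whether a given IFilter DLL is 32-bit or 64-bit is to read the Machine field in its PE header. The following is a minimal, portable C++ sketch of that check. The names `DllBitness`, `bitnessFromBytes`, and `bitnessOfFile` are made up for illustration and are not part of any Microsoft API, and the sketch only distinguishes x86 and x64 images, treating everything else as unknown.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

enum class DllBitness { Unknown, Bits32, Bits64 };

// Read an n-byte little-endian unsigned value starting at offset off.
// The caller must make sure off + n does not run past the buffer.
static std::uint32_t readLE(const std::vector<unsigned char>& b,
                            std::size_t off, int n) {
    std::uint32_t v = 0;
    for (int i = 0; i < n; ++i)
        v |= static_cast<std::uint32_t>(b[off + i]) << (8 * i);
    return v;
}

// Classify a PE image from its raw bytes.
DllBitness bitnessFromBytes(const std::vector<unsigned char>& b) {
    // DOS header: "MZ" magic, with the PE header offset stored at 0x3C.
    if (b.size() < 0x40 || b[0] != 'M' || b[1] != 'Z')
        return DllBitness::Unknown;
    const std::size_t peOff = readLE(b, 0x3C, 4);
    // Need the 4-byte "PE\0\0" signature plus the 2-byte Machine field.
    if (peOff > b.size() || b.size() - peOff < 6)
        return DllBitness::Unknown;
    if (b[peOff] != 'P' || b[peOff + 1] != 'E' ||
        b[peOff + 2] != 0 || b[peOff + 3] != 0)
        return DllBitness::Unknown;
    switch (readLE(b, peOff + 4, 2)) {
        case 0x014C: return DllBitness::Bits32;   // IMAGE_FILE_MACHINE_I386
        case 0x8664: return DllBitness::Bits64;   // IMAGE_FILE_MACHINE_AMD64
        default:     return DllBitness::Unknown;  // other architectures
    }
}

// Convenience wrapper that reads the whole file into memory first.
DllBitness bitnessOfFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return DllBitness::Unknown;
    std::vector<unsigned char> bytes((std::istreambuf_iterator<char>(in)),
                                     std::istreambuf_iterator<char>());
    return bitnessFromBytes(bytes);
}
```

For example, if `bitnessOfFile("C:\\filters\\myfilter.dll")` (a hypothetical path) returns `DllBitness::Bits32`, that DLL would not be expected to load into a 64-bit search daemon.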

  2. alogan says:

    Hey,

    instead of downloading all separate iFilters from different websites is there a pack that installs the most popular iFilters automagically?

    – Angus

    ————————————————————————————–

    We plan to release a filter pack that will include a wide array of Microsoft and third-party filters such as Visio and OneNote. The release ETA is around June/July 2007. Any filter included in the Filter Pack has to pass intense security and fuzz testing. If you’re writing a proprietary filter and want it included in our filter pack, please let me know and I can provide the details.

    Deb.

  3. Andrew Jones says:

    Is there a tool that will allow me to easily manage the filters that are in use? I’m running Vista and trying to work out why PDF files are not being indexed. I assume there is some bug in the Adobe iFilter and have tried to use a different iFilter, but it is tricky to work out which iFilters are being used.

     ———————————————————————————————————–

    There can be two reasons why the PDF files are not indexed.

    1. The PDF filter is corrupt. In this case, use the ifilttst.exe utility to filter the PDF files and look at the dump. The utility is located here:

    http://www.microsoft.com/downloads/details.aspx?FamilyID=9d467a69-57ff-4ae7-96ee-b18c4790cffd&DisplayLang=en

    The documentation can be found on msdn:

    http://msdn2.microsoft.com/en-gb/library/ms692580.aspx

    2. If the filter is functioning properly, then the registry entries might be incorrect. You can use Regmon to check whether the right registry keys are accessed, and Filemon to check whether the search service actually loads the PDF filter DLL.

    By the way, which version of the PDF filter are you using?

    -Deb

  4. MarkR says:

    > In general, a 64 bit process cannot load a

    > 32 bit COM dll.

    Is this also true for Vista 64bit? I’ve installed Acrobat Reader 8.0 which comes with an IFilter but Vista’s indexing service claims that the ‘Registered IFilter is not found’ for PDF files. I am wondering if this is a 64/32bit compatibility issue.

  5. debh says:

    Yes, the version 8 reader comes with a 32-bit IFilter, which cannot be loaded by the 64-bit search service in 64-bit Vista.

  6. MarkR says:

    Thank you very much for the confirmation Deb.

  7. Marcovanschagen says:

    Currently I am planning a new version of our 2005 version DWG iFilter. This is to support the newer 2007 DWG file format, and to address questions about its operation with SQL 2005 and MOSS (SharePoint) 2007.

    If you are interested in information on this product, please visit http://www.cadcompany.nl/ifilter

    As I am inexperienced with iFilter development, I have many questions to find answers for.

    In my preparations, Deb Haldar has provided me with crucial information to help me get on the right track. I would like to share this information on his blog to help make this the "One stop shop" for IFilter-related issues. A separate thread will probably be created to track the development cycle of an iFilter from scratch.

    The existing 2005 version iFilter project is coded mostly in C++, in VC 6.0.

    Some questions I’d like to discuss:

    – Should we use C++ or move to .NET, and why?

    – What is required for coding a proper iFilter?

    – Testing for multithreading compatibility

    – Registration with SharePoint and SQL 2005

    I would like to start sharing my information soon, and I am interested in your comments.

    Marco van Schagen

  8. sudipto says:

    I have some documents created on a Solaris platform with StarOffice. I want to index them from SharePoint 2003 and was wondering if Microsoft or Sun has an IFilter for StarOffice?

  9. debh says:

    Currently neither Microsoft nor Sun Microsystems has an IFilter for StarOffice/OpenOffice documents. However, you might be able to find third-party vendors making this IFilter. I came across one here:

    http://www.ifiltershop.com/staroffice-openoffice-ifilter.html

    Let me know if this serves your purpose.

  10. Jim Welch says:

    Thanks for the guidance regarding the lack of an available 64-bit PDF IFilter from Adobe.  Very helpful.

    Another workaround for MOSS web farm deployments that has not been mentioned is to deploy a 32-bit index server while keeping the rest of the SharePoint farm 64-bit.  It gives you the performance benefits at the database and web front-end tiers, the two most likely points for bottlenecks, until it is ok to upgrade the index server to 64-bit.

  11. Brian says:

    Deb et al.,

    Do you know if there will be a 64-bit ifilter for Adobe PDFs for the 64-bit Vista editions?

  12. debh says:

    Brian,

    So far Adobe has not made an announcement as to whether they will be releasing a 64-bit version of the PDF filter. The issue is discussed at length in this thread:

    http://blogs.msdn.com/ifilter/archive/2007/01/08/microsoft-s-strategy-for-dealing-with-32-bit-search-binaries-within-64-bit-servers.aspx

    Deb.

  13. David Watt says:

    Hello all,

    Is there a place where I can get just the Office 2007 iFilter, to install it on an MS Index Server deployment?  I’d rather not install all of Office to get just the iFilter.

    Thanks in advance,

    Dave

  14. debh says:

    Dave, to the best of my knowledge there is currently no such download. However, we’re planning a separate downloadable package of MS filters sometime in the middle of 2007.

    Currently, if you want to index Office 2007 documents, you’d need to install MOSS 2007 or the Office 2007 client.

  15. David Watt says:

    Thanks, Deb.  Not the answer I was hoping for, but I’m looking forward to the package of iFilters, when it arrives.

    Dave

  16. Ryan Waltz says:

    How do you deploy an x86 version of Index Service on a 64-bit platform running WSS 3.0? I would like to provide backward compatibility for iFilter for pdf documents.

  17. debh says:

    Ryan, I don’t think one can deploy a 32 bit version of indexing service on a 64 bit windows machine.

  18. kert says:

    ::Is there a tool that will allow me to easily manage the filters that are in use.

    Yes, see IFilter explorer at citeknet.com

  19. hjlee says:

    I have to encrypt office files when I upload them to MOSS.

    So, I can’t search my encrypted office files!

    I want to decrypt them.

    For text files, I made a decrypting filter and replaced the text filter with it. It works well.

    But for Office files, I can decrypt them but I can’t extract text from them, because my filter doesn’t know the Office format!

    I can’t find a way to use the original offfilt.dll in my new filter.

  20. debh says:

    Lee, this is an interesting situation. I’d recommend that you try the following:

    1. Make a managed wrapper for the IFilter Interface.

    2. In the wrapper, decrypt the file, save it and then pass it to the offfilt for indexing the normal content. That is, invoke the services of offfilt dll on the decrypted file.

    You can find an example of a managed wrapper for the IFilter interface at: http://www.pinvoke.net/default.aspx/Interfaces/IFilter.html

    Please note that creating temp files is a hack and is not recommended. However, as far as I know, if someone encrypts certain files, then it should not be searchable anyways – the guiding philosophy being "Whatever is not visible to the naked eye, is not returned in search results".

    But then again, I’m sure you have a valid reason to want it implemented this way:)

  21. hjlee says:

    Thank you Deb Haldar.

    I made a wrapper filter and it works well with desktop search.

    To do that, I changed the registry because the encrypted files still have the same extensions.

    I also changed some registry entries for MOSS.

    But with MOSS, I still can’t invoke my new filter.

    MOSS still invokes the old DLL (offfiltx.dll) or just gives me an error saying that filtering failed and that I might need to install a new filter.

    I have spent a long time on this, and it is urgent.

    Please help me.

    WHAT SHOULD I CHANGE IN THE REGISTRY TO INVOKE MY NEW FILTER? IN DETAIL, PLEASE.

  22. debh says:

    Lee, the most likely reason the new filter is not invoked is that the CLSID of the actual offfiltx is still present in the registry. Here’s what I’d try:

    1. Note down the CLSID of your IFilter wrapper.

    2. Change the following registry key to hold the CLSID of your new filter:

    [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.ext]

           Default REG_MULTI_SZ = IFilter CLSID

    3. If you still get an error in filtering, use Regmon and Filemon to determine which filter binaries were accessed (your wrapper or the original offfiltx.dll) and what sequence of registry calls was made to pick up the filter.

    As a matter of curiosity, which company are you making this customization for? It might be interesting to think about providing an OOB way to do this:)

  23. hjlee says:

    Thank you for your really fast reply!

    But, I already did both things – changing registry, and checking through regmon.

    It’s the first time for me to use regmon.

    And, the log is so long it’s not easy to check it.

    But currently, my wrapper writes a log when it is accessed, and it writes nothing.

    Anyway, with Regmon I can see the following results when I filter for only offfilt.dll and my wrapper (decfilt.dll):

    mssdmn.exe --> QueryValue --> HKLM\Software\CLSID\{MYCLSID}\InprocServer32\[default] --> SUCCESS --> "decfilt.dll"

    and

    mssdmn.exe --> QueryValue --> HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\DllNXOptions\decfilt.dll --> NOT FOUND.

    I am trying to handle xls and doc documents.

    I’m not sure I can tell you what company I am making this for. When I can, I’ll tell you 😉

    Can you give me any more hint with my information?

  24. debh says:

    Lee, it seems like your filter is trying to write non data segments in memory. A simple workaround (hack) might be to disable the XD/NX bit in the BIOS and disable DEP in general, or just disable DEP for the mssdmn process as follows:

    1. Open Explorer

    2. Right click on "My Computer"

    3. Select "Properties"

    4. Select the "Advanced" tab

    5. In the "Performance" frame, click the "Settings" button

    6. Select the "Data Execution Prevention" tab

    7. Click the "Add…" button

    8. Find the daemon’s executable – "C:\Program Files\…\12\bin\mssdmn.exe" and click "Open"

    9. Reboot and go back to being productive

  25. hjlee says:

    Thank you very much Deb Haldar.

    Now it looks like it invokes my DLL.

    Then I hit an assert.

    It’s from atlcomcli.h:

    _NoAddRefReleaseOnCComPtr<T>* operator->() const throw()

    {

    ATLASSERT(p!=NULL);

    return (_NoAddRefReleaseOnCComPtr<T>*)p;

    }

    When I search through desktop search, the assert does not fire.

    I am so sorry for bothering you. But I have spent too long trying to fix this, and I have made no progress.

    I’m new to Microsoft programming, and it’s very tough for me.

    If you can, please help me a little more.

    And I have another question. Is it possible to build a 64-bit IFilter on a 32-bit machine?

  26. debh says:

    Lee, AFAIK this usually happens when you try to uninitialize COM while an interface still has a non-zero reference count. Please check that your code actually decrements the ref count to zero before you call CoUninitialize().

    Also using the ref count debugging preprocessor directive <_ATL_DEBUG_INTERFACES> added to stdafx.h before atlbase.h should help in isolating the issue.

    Yes, you can compile a 64-bit version of an IFilter on a 32-bit machine by setting the compiler flags in VS to target 64-bit.

  27. hjlee says:

    Thank you again.

    I found that it happens because my filter fails when it calls LoadIFilter to invoke offfilt.dll internally.

    The return value of LoadIFilter is just ‘E_FAIL’.

    Can’t I use LoadIFilter in my filter?

    I also cannot use fopen, fprintf, and so on.

    Is anything special needed to use files in ATL?

    I am coding in C++, but I’m afraid I code in C style because I’m very familiar with C and the Unix environment.

    Thank you for your time and concern.

  28. debh says:

    Lee, it’s difficult to debug without seeing the code. However, you can try the following option:

    1. Use LoadLibrary to get a handle to the module.

    2. Use GetProcAddress( hMod, "DllGetClassObject" ) to retrieve the function ptr.

    3. Declare a class factory and use the function pointer from step 2 to retrieve the class factory interface pointer.

    4. Call CreateInstance on the class factory to instantiate the filter.

    AFAIK, Indexing Service uses LoadIFilter, but I’m not sure about other search products.

    Let me know if this works for you. May I also suggest that you contact Microsoft product support for MOSS 2007. These folks are really good at troubleshooting problems like this, and total confidentiality about your implementation is guaranteed.

  29. hjlee says:

    Thank you for all your help, Deb.

    Finally I fixed it.

    It was because my decrypter couldn’t get the key address, and that made things messy.

    (In fact, I’m still wondering why it can’t process a key path like ‘c:\dir\keyfile’. The decrypter gets the path that way and it worked well with desktop search. So I moved the key to the windows/system32 folder.)

    It was very helpful to use the debugger in the way described in the other post on this blog.

    Anyway, now I only have the 64-bit problem.

    Thank you very much for your help.

  30. debh says:

    You’re welcome Lee! Glad to know you found the info on the blog helpful :)

  31. Charan says:

    I need some information on iFilters.

    Are they tightly bound to WDS, or are they a general mechanism that I can use without any Windows Search product as an intermediary?

    In general:

    Is it possible to develop my own search engine, use the already available iFilters in it, and leave Windows Search products out entirely?

  32. debh says:

    IFilters are not tightly coupled to WDS or to MS Search products in general.

    As for developing your own search engine using IFilters, the answer is yes. But please keep in mind that IFilters form only a small part of the whole indexing pipeline.

    However, I’ve seen third party vendors using IFilters to write custom applications to index documents from within their applications.

  33. Charan says:

    Thanks Deb for your comments.

    I have another question.

    Do we need a legal copy of MS Office to use the Office iFilters?

    Also, can I have some links that describe the methods the Office iFilter offers its clients for processing Office documents?

    thanks,

    Charan

  34. debh says:

    Charan, as of now the only way to get the Office 2007 IFilter is by buying either MOSS 2007 or Office 2007. Thus, the answer is yes, you’d need a copy of either of the two above mentioned products.

    However, we do have a plan to release the Office filter as a separate (and free!) downloadable package sometime around the middle of this year.

    The methods exposed by offfiltx.dll are the ones described in the IFilter interface documentation on MSDN. This should be sufficient for extracting data and building an index.

    regards,

    Deb.

  35. Charan says:

    Thanks Deb for the update.

    Deb, I went through the iFilter methods.

    I don’t see a method that gives a preview of, say, a page in a document.

    Is such a method available?

    What I want to do, precisely, is the following:

    1. Extract text from all the pages of a document (e.g. MS Word).

    2. Search for a string in the retrieved text.

    3. If I find a hit on page "i", highlight that string on that page and show a preview of the page with the highlighted string.

    Are there any methods available to do such things?

    Thanks a lot for your time ,

    Charan

  36. debh says:

    Charan, you can achieve step 1 with the Office IFilter. You’d need your own indexer to do step 2. Step 3 is more complicated and IFilters cannot do this for you. You’d need to write your own plugin for that. But this is an extremely interesting concept – we’d definitely like to hear more about it:)
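
Steps 2 and 3 of Charan's list have to live in your own code. As a rough sketch only, assuming you already have the text of each page as a separate string (the IFilter interface hands back chunks of text rather than pages, so producing per-page strings is itself an assumption), a case-sensitive search with plain text markers might look like this. `PageHit`, `highlight`, `findHits`, and the `[[` / `]]` markers are illustrative names and are not part of any IFilter API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct PageHit {
    std::size_t page;     // 1-based page number where the term was found
    std::string preview;  // page text with every match wrapped in markers
};

// Return a copy of text with each occurrence of term wrapped in open/close.
std::string highlight(const std::string& text, const std::string& term,
                      const std::string& open = "[[",
                      const std::string& close = "]]") {
    if (term.empty()) return text;
    std::string out;
    std::size_t pos = 0;
    for (;;) {
        const std::size_t hit = text.find(term, pos);
        if (hit == std::string::npos) {
            out.append(text, pos, std::string::npos);  // copy the tail
            break;
        }
        out.append(text, pos, hit - pos);  // text before the match
        out += open;
        out += term;
        out += close;
        pos = hit + term.size();           // continue after the match
    }
    return out;
}

// Scan every page and collect a highlighted preview for each page with a hit.
std::vector<PageHit> findHits(const std::vector<std::string>& pages,
                              const std::string& term) {
    std::vector<PageHit> hits;
    if (term.empty()) return hits;
    for (std::size_t i = 0; i < pages.size(); ++i) {
        if (pages[i].find(term) != std::string::npos)
            hits.push_back({i + 1, highlight(pages[i], term)});
    }
    return hits;
}
```

A real previewer would render the original page layout rather than plain text, which is why Deb describes step 3 as needing a separate plugin.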

  37. debh says:

    This is a recent question I received by mail, which I could not respond to because the sender’s email server just hated me:( I’ll answer it here:)

    ———————————————–

    From: jerome.schulist@hotmail.com [jerome.schulist@hotmail.com]

    Sent: Thursday, April 12, 2007 9:50 PM

    To: Deb Haldar

    Subject: (Filter Central) : iFilter for Office 2007

    I have searched all over for iFilter implementations for the Office 2007 formats with little success.  The only information I have found mentions how SharePoint 2007 includes them by default.  I am specifically trying to add Office 2007 filter support to SQL Server 2005.  Are the Office 2007 filters available anywhere as a patch/install for SQL Server 2005?  Thanks and regards,

    Jerry

    <Deb>

    Jerry, the Office 2007 filters ship with SharePoint or the Office 2007 client. We have an initiative in progress to release them as a separate downloadable package that can be consumed from other search products like SQL. The ETA is around the middle of this year.

    regards,

    Deb.

  38. Charan says:

    I wish there was some provision for displaying a preview of a page with the highlighted text :-)

    (Microsoft should consider this and add some interfaces to the iFilters to support previewing :-) )

    Anyway, thanks for your time, Deb.

  39. Heath says:

    Is there a 64bit version IFilter for Visio available?

    Thanks,

    -Heath

  40. debh says:

    Heath, we’re in the process of releasing a 64 bit version in the Filter pack.

    cheers,

    Deb.

  41. Steve Meredith says:

    How can one index .tif and .mdi files on Vista? The Office 12 MODI IFilters don’t work on Vista.

  42. debh says:

    Steve, unfortunately there’s no way to index TIFF files on Vista :(

  43. Sharad (gs.com) says:

    Hi Deb,

    Either my searching skills are not mature enough or I’m in a unique situation (unlikely!)…

    I haven’t been able to find the correct documentation or reference on how to get RTF documents indexed in MOSS! Is it supported out of the box, or do we need an iFilter? Where can we get it: from MS, or from a third party?

    Any pointers are very much appreciated.

    — Sharad

  44. debh says:

    Hi Sharad,

    RTF files are indexable OOB. Please navigate to:

    Shared Services Administration: SharedServices1 > Search Settings > File Types > New File Type and add <rtf> as a new file type to be indexed by MOSS.

    cheers,

    Deb.    

  45. Sharad says:

    That’s interesting Deb.

    In our case, we are being told that we have to buy a third-party filter for this basic format!

    I’ll confirm and get back. Thanks.

    — Sharad

  46. Adam says:

    I had heard a rumor that MS was going to release an update for MOSS 2007 that includes a TIF IFilter.  Is this true and if so how and when can I get it?

    Thanks,

    Adam

  47. gupta-ashish-k@hotmail.com says:

    Hi Deb,

    I am trying to index RTF documents in MOSS 2007. I added the <rtf> file type as you said and then ran the indexing service successfully, but search still does not find RTF docs.

    I created an RTF doc using MS Word, wrote some English and non-English text (numbers, special characters), and saved it as "*.rtf".

    Any help would be much appreciated.

    Thanks.

    Ashish Gupta

  48. Bala says:

    Hi deb,

    I am trying to search PDF files with MOSS search, but I am not able to get those files on the results page. What should be done to get PDF files listed on the search results page?

    Any help would be much appreciated.

    Thanks.

    Bala

  49. debh says:

    Hello Bala,

    The easiest way would be to add the pdf extension to the list of crawled file types and then install the Foxit PDF filter. Please follow the steps listed here:

    http://blogs.msdn.com/ifilter/archive/2007/05/10/long-awaited-64-bit-pdf-ifilter-finally-available.aspx

    Note that you do not need to change the CLSID anymore, as the latest Foxit installer takes care of it.

    Thanks,

    Deb.

  50. debh says:

    Hi Deb,

    I am trying to index RTF documents in MOSS 2007. I added the <rtf> file type as you said and then ran the indexing service successfully, but search still does not find RTF docs.

    I created an RTF doc using MS Word, wrote some English and non-English text (numbers, special characters), and saved it as "*.rtf".

    Any help would be much appreciated.

    Thanks.

    Ashish Gupta

    =========================================

    Ashish, did you recycle the search service after you added the rtf extension?

    -Deb.

  51. debh says:

    Well, you know what they say, Adam – "Don’t believe in rumours!" :)

    On a more serious note, we’re planning to release the TIFF filter with the filter pack, the release date of which is still TBD.

    cheers,

    Deb.

  52. Rolf says:

    Deb, you know that the unavailability of a TIF and MDI filter, just for the "Microsoft Document Imaging" part of Office, keeps us from upgrading to MOSS 2007. How can MS discontinue the capability to index its own Office formats? We need this, and I cannot understand the decision to postpone this functionality so far out.

    Rolf

  53. Eric says:

    I saw some comments here about a release of iFilters for various Microsoft formats somewhere in the middle of 2007… Is there any progress on this ?

  54. Eric says:

    I forgot to ask: will that release contain iFilter for the XML format of Word 2003 (WordML)?

  55. debh says:

    Yes, the tentative timeline for the release is towards the end of this year.

    I don’t think it’ll have a filter for WordML. But I’ll double check.

  56. Eric says:

    We actually trusted Microsoft when they announced that they would broadly support XML with the release of Office 2003. So we built a solution based on the WordML format… and we are unable to index those files… The workaround would be to install Office 2003 on the servers where the SQL indexing occurs… which is, in my opinion, nonsense. So I’m very worried about the content of the iFilter package. It looks like Microsoft is not supporting previous versions that much anymore… In the past, we never had to worry about backward compatibility… Now I am starting to fear even for the previous release… while Office 2007 was released only a few months ago…

    I looked around on the net, and I think we are not the only ones asking for this… It looks like Microsoft wants to forget about its previous release quickly and put all its support behind the Open XML format.

    I really do hope that the kit will contain the iFilters for at least the previous version of Office… Otherwise it would mean that the only unsupported XML version would be 2003, since previous binary formats were already supported…

    Please consider our demand…

    Thanks

  57. Guy Wiggins says:

    Deb, when the IFilter for TIF comes out, will it be able to OCR TIF images as well as you could in SharePoint 2003? Are there any third-party TIF iFilters that you know of? Is the release of the Filter Pack still planned for the end of the year? Thanks

  58. Anton parrish says:

    Do you know of any Corel WordPerfect iFilters for 64-Bit MOSS 2007 deployment?  

  59. Jason says:

    Any update on when the iFilter pack will be released?

  60. Frank Albrecht says:

    Hi Deb!

    I’ve got the same problems when indexing .rtf files as Ashish has.

    I added the rtf file type, restarted both search services, and reset IIS (sometimes a miraculous task ;))

    Unfortunately it didn’t change anything. rtf-documents are still not indexed.

    Do you have any further ideas about that?

    Is a solution for this problem included in SP1 for WSS and/or MOSS, which MS has just released?

    Thanks in advance!

    Regards

    Frank

  61. Simon Sabin says:

    Office 2007 iFilters are now available

  62. Cherie Warren says:

    When will the Tiff IFilter be available?  Is a 64-bit tiff ifilter available now?

  63. Shai Shapira says:

    I hope I’m not missing something obvious, but how do I make the MOSS search look in plain text files which are not in .txt extension (or any other extension that’s indexed by default)?

    I added the extension I need (.cs) to the list of extensions searched, reset the IIS and indexed the files, but the search only looks in the file names, no search in the actual file content is done. Am I missing something?

  64. Richard says:

    Hi Deb,

    I corrected the registry key according to your blog. Citeknet also shows that the registration is good. The crawl log also says the VSD is crawled. But why can’t it still crawl the body of the Visio file? The following is the search result summary.

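    Thanks a lot!

    "C:\Program Files\Microsoft Office\Visio11\2052\"BASFLO_M.VST chengxul Microsoft Visio ASB TC010497202052 Process : Drag this shape onto the drawing page. : 11 : 1FBE7366-0000-0000-8E40-00608CF305B2 Predefined process : After dragging it onto the drawing page, you can add a specific process, such as a subroutine or module.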

  65. debh says:

    Hi Richard,

    Can you please send us the vst file and we can have a look at it ?

    Thanks,

    Deb.

  66. Richard says:

    I have installed Adobe IFilter 6.0 on SharePoint Services 3.0 with Acrobat Reader 8.1.1.  I made all the necessary changes in the registry and still cannot search PDF files.  Any suggestions?

  67. Jim says:

    WDS does not seem to be able to index the contents of my Visio VSD files.  The properties appear to get indexed but not the contents.  In advanced options I specified that the contents of VSD files be indexed.  Any suggestions?

  68. debh says:

    Jim, can you send me the file (or a similar file) that you’re trying to index ?

    Thanks,

    Deb.

  69. Jim says:

    Hey Deb,

    I think this might not be a file issue.  I initially had trouble getting this working but after reinstalls of the filter pack, WDS and reboots (in various orders) it worked.  Because of these initial problems I wanted to make sure I had a repeatable process that I could share with my co-workers … so I un-installed WDS and the filter pack and have since been unable to get it to work.  I think this is more an issue with the filters not being properly registered with WDS.  However, I will happily send you an example file, just not sure how to do that.  My email is james.nowak@ge.com.  Thanks for your help.

    -Jim

  70. Deb says:

    Jim, did you take a look at the KB errata published on this blog? Ideally you’d install WDS and then install filter pack.

    Also, you can send me the problematic file at debh@microsoft.com – My team does not make the visio filter, we’re just a shipping vehicle for this filter. Nevertheless, I’d like to have a look if the issue is affecting Microsoft customers.

    Thanks,

    Deb.

  71. Fabian says:

    Hello,

    is there an ifilter for .sql (sql query files)? Or could it be possible to use the .txt Filter for this type?

  72. Deb says:

    AFAIK, there is no ifilter for .sql (sql query files) – it might be possible to use tquery.dll to filter these files. You can try following the registration steps for 3rd party filters mentioned in this blog and register the text ifilter to handle .sql  files. But I’m kind of skeptical about the results you’ll get back.

    One way to testdrive the validity of results is to rename the .sql  files to .txt and try searching through them. If this gives you relevant results, registering tquery.dll for handling .sql  should work.

  73. Johnny says:

    I am facing a problem here. I have two machines: Windows Server 2003 Standard SP2 (OEM) and Windows Server 2003 Standard SP1. I installed the TIF IFilter on both, but it only works on the Windows Server 2003 Standard SP1 machine.

    I also tried two machines with the same OS, Windows Server 2003 Enterprise Edition SR2. I installed the TIF IFilter on both, but it works on only one of them and fails on the other.

    All tests were done on clean machines with the same steps. Why does it work on one machine and not on the other?

  74. debh says:

    Johnny, you’re using the TIF filter with SPS2003, right?

  75. Johnny says:

    Deb Haldar, thanks for the reply.

    Yes, we are using the TIF filter both with SPS2003 and without it. The same problem appears in both cases. Is there any solution for this TIF filter problem? Thanks!

    Regards

    Johnny

  76. Johnny says:

    Dear Deb Haldar,

    As I found at http://support.microsoft.com/kb/837847, the solution they give is to edit the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\MSPaper,

    add a new value named PerformOCR, and set it to 1 to enable optical character recognition.

    I also tried the steps in http://support.microsoft.com/kb/291835

    I did as they said on the machine where the TIF filter does not work, and the result was the same: the TIF filter didn’t work. Is there any other solution for this?

    Thanks!

    Regards,

    Johnny

  77. debh says:

    Johnny, I’d try the following:

    1. On the machine where TIF is working, use Regmon to take a snapshot of the registry keys accessed.

    2. Do the same on the machine where TIF is not working. Here, try to find which keys were looked at and if there were any errors. Compare with the ones in step 1.

    Also, on the machine where TIF is not working, do you see any application errors in the event viewer?

    regards,

    Deb.
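
The comparison in step 2 can be partly automated. The sketch below assumes you have reduced each Regmon capture to plain text with one registry key path per line (that layout is an assumption for illustration, not Regmon's native log format), and lists the keys seen on the working machine but not on the failing one. Registry paths are case-insensitive, so keys are lower-cased before comparing. `parseKeys` and `missingKeys` are illustrative names.

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Split text into lines and return the set of non-empty, lower-cased lines.
std::set<std::string> parseKeys(const std::string& text) {
    std::set<std::string> keys;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        if (line.empty()) continue;
        std::transform(line.begin(), line.end(), line.begin(),
                       [](unsigned char c) {
                           return static_cast<char>(std::tolower(c));
                       });
        keys.insert(line);
    }
    return keys;
}

// Keys that were accessed on the working machine but never on the failing one.
std::vector<std::string> missingKeys(const std::set<std::string>& working,
                                     const std::set<std::string>& failing) {
    std::vector<std::string> out;
    std::set_difference(working.begin(), working.end(),
                        failing.begin(), failing.end(),
                        std::back_inserter(out));
    return out;
}
```

Any key this reports is only a starting point for investigation; a key can be absent from the failing machine's capture simply because the filter never got far enough to ask for it.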

  78. Johnny says:

    Deb Haldar, thanks for the fast reply…

    I started the test with a clean machine that had been formatted and had only the OS installed. I tested step by step, including taking a snapshot of the registry keys accessed using Regmon. The result is still the same: the TIF filter still does not work on the machine where it failed before, and still works on the machine where it worked before. All the steps and the OS I used are the same; the only difference between the machines is the hardware. It shouldn’t be a hardware problem. Is there any other solution?

    Thanks!

    Regards,

    Johnny

  79. Johnny says:

    Deb Haldar, no keys were found with errors. Thanks!

    Regards,

    Johnny

  80. Johnny says:

    Deb Haldar, I already tried the solution you suggested, but I didn’t find any error on the registry keys. Any other solution? Please…

    Thanks!

    Regards,

    Johnny

  81. debh says:

    Johnny, AFAIK it should not be hardware dependent. Can you check the ULS logs under %ProgramFiles%\Common Files\Microsoft Shared\Web Server Extensions and see if there are any errors?

  82. Johnny says:

    Deb Haldar,

    Over the past few days I have checked everything and done all the comparisons. There is nothing wrong with it. Do you have any other solution? Thanks for the help.

    Regards

    Johnny

  83. Bala says:

    How do I get the iFilter for Office 2007 working with SQL Server 2005? After I installed the filter pack, I checked the sys.fulltext_document_types table and did see the additional extensions recorded (docx, potx, ...). However, full-text search does not seem to index docx, xlsx, etc. Any advice?

  85. Eric says:

    Hello,

    I see where to download the iFilter pack at http://www.microsoft.com/downloads/details.aspx?FamilyId=60C92A37-719C-4077-B5C6-CAC34F4227CC&displaylang=en.

    However, there seems to be a registration process that limits the use of the filters to the following:

    Office SharePoint Server 2007

    Search Server 2008

    SharePoint Portal Server 2003

    Windows SharePoint Services v3.0

    Exchange Server 2007

    SQL Server 2005

    SQL Server 2008

    I have server 2003 enterprise with the default indexing server.  Will this filter pack allow me to find office 2007 files?

    Thanks,

    Eric

  86. DorjeM says:

    Same issue as Eric

    Server 2003, IIS 6, no Office / no SharePoint installed.

    I’ve installed the Filter pack but am still unable to index docx files using the default indexing server.

    Any suggestions on how to get this working in the environment specified ?

    DM

  87. DorjeM says:

    Deb and Eric,

    I’ve found the solution, thanks to some of the kind folks who provide Premier support.

    A registry change is required

    I’ve documented this here

    http://dorjem.blogspot.com/2008/04/office-2007-files-indexed-by-indexing.html

    DorjeM

  88. Txhiaj says:

    Hi, I have a problem with MOSS.

    It seems that I need to install two different IFilters for my users to open CAD and MPG files.

    I’ve tried searching for a DWG IFilter that is not a trial version, and also for an MPG one, both with no success.

    I would greatly appreciate it if anyone could help me with this problem.

    thanks

  89. Sameer says:

    Hi,

    I am trying to make an IFilter. I downloaded the WDS SDK and tried using the FilterSample project from Windows Search 3x SDK\Indexing\FilterSample, but I am getting many errors in 3 .idl files when I try to build the project. The .idl files with problems are

    mshtml.idl, dimm.idl, mshtmhst.idl

    These files came with the Vista SDK. So how should I proceed? Also, where can I get the other IFilter sample code mentioned on MSDN?

    Thanks.

  90. Robbie says:

    Hiya, we also use the WordML format… and we are unable to index those files. We have MOSS 2007 and WSS 3.0 and badly need an iFilter that handles the WordML format. It was mentioned that "The workaround would be to install Office 2003 on the servers where the indexing in SQL occurs…" Will that work properly?

    Is there a new kit for this problem out there that can help us? Any third-party software that can do the trick?

    Thanks.

  91. Wouter says:

    Hi,

    I have MODI set up with MOSS 2007 and it’s actually indexing the OCR information correctly when it is available.

    This is a good thing, but what I want it to do is create OCR information when it’s not available.

    Is that possible to do? I’ve set the PerformOCR registry key to 1, and restarted the search service but that did not work.

    Any ideas?

  92. Naresh says:

    Hi ,

    I have used the MODI DLL and implemented a method in VC++ to recognize the characters in .tiff as well as .bmp files.

    One thing I have noticed is that the OCR method of the IDocument interface fails when passed either a .tiff or a .bmp file containing a single character of size less than 72.

    Could you please help me fix this problem ASAP?

    It’s urgent…..

  94. coding for fun says:

    Hi, has anyone managed to install the htmlprop Ifilter with search server 2008 express?

    I am trying to map some crawled properties to types other than strings, as an example

    map <meta content="28" name="age" /> to a crawled property of type int.

    I am trying to use the htmlprop.dll that came with the CrawlingMetadataHtmlprop sdk example.

    Any advice?

  95. Available: Windows Sharepoint Services 3.0 (almost free) + Search Server 2008 Express (free)….

  96. Prakash Tandukar says:

    I am able to read the properties of MS Office 2007 (*.docx, *.xlsx) documents using IFilter (code available at

    http://vbaccelerator.com/home/Resources/Babbage/NET_IFilter/IFilter.zip)

    but could not read any property of an MS Office 2003 (*.doc) document. What should be changed in the code in order to read the properties of MS Office 2003 documents as well?

    Thanks

  97. Deb says:

    While I’m not intimately familiar with this code, it seems like we’re not passing the correct IFilter INIT flags to enable property extraction.

    The enumeration below should be extended to support the full blown IFILTER INIT flags if we want to index all properties.

    http://msdn.microsoft.com/en-us/library/ms691091(VS.85).aspx

    private enum IFILTER_FLAGS : int
    {
        /// <summary>
        /// The caller should use the IPropertySetStorage and IPropertyStorage interfaces to locate additional properties.
        /// When this flag is set, properties available through COM enumerators should not be returned from IFilter.
        /// </summary>
        IFILTER_FLAGS_OLE_PROPERTIES = 1
    }
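
    For reference, here is a rough C++ sketch of the IFILTER_INIT flag set and of a mask that asks the filter for property chunks. The values mirror filter.h as far as I know; they are copied into a made-up namespace (sketch) only so the snippet compiles without the Windows SDK, and real code should include filter.h instead. The helper propertyChunkFlags is hypothetical, and which flags a given filter actually honors is up to that filter.

```cpp
#include <cstdint>

namespace sketch {

// Mirror of the IFILTER_INIT values from filter.h (copied for illustration only).
enum IFilterInit : std::uint32_t {
    IFILTER_INIT_CANON_PARAGRAPHS        = 1,
    IFILTER_INIT_HARD_LINE_BREAKS        = 2,
    IFILTER_INIT_CANON_HYPHENS           = 4,
    IFILTER_INIT_CANON_SPACES            = 8,
    IFILTER_INIT_APPLY_INDEX_ATTRIBUTES  = 16,
    IFILTER_INIT_APPLY_OTHER_ATTRIBUTES  = 32,
    IFILTER_INIT_INDEXING_ONLY           = 64,
    IFILTER_INIT_SEARCH_LINKS            = 128,
    IFILTER_INIT_APPLY_CRAWL_ATTRIBUTES  = 256,
    IFILTER_INIT_FILTER_OWNED_VALUE_OK   = 512,
    IFILTER_INIT_FILTER_AGGRESSIVE_BREAK = 1024,
    IFILTER_INIT_DISABLE_EMBEDDED        = 2048,
    IFILTER_INIT_EMIT_FORMATTING         = 4096
};

// Hypothetical helper: the three APPLY_*_ATTRIBUTES flags combined, i.e. the
// part of the Init() mask that requests value-type property chunks.
constexpr std::uint32_t propertyChunkFlags() {
    return IFILTER_INIT_APPLY_INDEX_ATTRIBUTES |
           IFILTER_INIT_APPLY_OTHER_ATTRIBUTES |
           IFILTER_INIT_APPLY_CRAWL_ATTRIBUTES;
}

}  // namespace sketch
```

    A caller would OR this mask with whatever text-handling flags it needs before passing it as the first argument of IFilter::Init().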

  98. Prakash Tandukar says:

    Thank you Deb

    I tried to get ifilter instance using BindIFilterFromStorage() method like below,

    IFilter *pFilter = NULL;
    IStorage *pStorage = NULL;
    DWORD flags = 0;

    // Open the document as an OLE compound document.
    HRESULT hr = ::StgOpenStorageEx(filename, STGM_READ | STGM_SHARE_EXCLUSIVE, STGFMT_STORAGE,
                                    0, NULL, 0, IID_IStorage, (void**)&pStorage);
    if (FAILED(hr))
        return;

    // Bind the filter registered for this storage.
    hr = BindIFilterFromStorage(pStorage, 0, (void**)&pFilter);
    if (FAILED(hr))
    {
        // pFilter is not valid here, so release only the storage.
        pStorage->Release();
        throw exception("BindIFilterFromStorage() failed");
    }

    hr = pFilter->Init(IFILTER_INIT_INDEXING_ONLY |
                       IFILTER_INIT_APPLY_INDEX_ATTRIBUTES |
                       IFILTER_INIT_APPLY_CRAWL_ATTRIBUTES |
                       IFILTER_INIT_FILTER_OWNED_VALUE_OK |
                       IFILTER_INIT_APPLY_OTHER_ATTRIBUTES,
                       0, 0, &flags);
    if (FAILED(hr))
    {
        pFilter->Release();
        pStorage->Release();
        throw exception("IFilter::Init() failed");
    }

    Start();

    // Walk the chunks until GetChunk() returns a failure code such as FILTER_E_END_OF_CHUNKS.
    STAT_CHUNK stat;
    while (SUCCEEDED(hr = pFilter->GetChunk(&stat)))
    {
        if ((stat.flags & CHUNK_TEXT) != 0)
            ProcessTextChunk(pFilter, stat);
        if ((stat.flags & CHUNK_VALUE) != 0)
            ProcessValueChunk(pFilter, stat);
    }

    Finish();

    pFilter->Release();
    pStorage->Release();

    But this time, too, it cannot read any property of the *.doc file.

    The flags value is always 1 after calling pFilter->Init().

    The subsequent pFilter->GetChunk() calls never return CHUNK_VALUE.

    How to use the IPropertySetStorage and IPropertyStorage interfaces to locate additional properties?

    Thanks

    Prakash
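
    A note on the value 1: it matches IFILTER_FLAGS_OLE_PROPERTIES from the enumeration quoted in comment 97, whose summary says the caller should use IPropertySetStorage and IPropertyStorage to locate additional properties. The check can be sketched roughly as below. The constants are copied by hand (as I understand the values in filter.h) into a made-up namespace so the snippet compiles without the Windows SDK, and the helper names are hypothetical.

```cpp
#include <cstdint>

namespace sketch {

// Hand-copied values; real code should take them from filter.h.
constexpr std::uint32_t kIFilterFlagsOleProperties = 0x1;  // IFILTER_FLAGS_OLE_PROPERTIES
constexpr std::uint32_t kChunkText                 = 0x1;  // CHUNK_TEXT
constexpr std::uint32_t kChunkValue                = 0x2;  // CHUNK_VALUE

// True when the flags written by IFilter::Init() tell the caller to read the
// document's OLE property sets itself instead of waiting for value chunks.
inline bool callerMustReadOleProperties(std::uint32_t initOutFlags) {
    return (initOutFlags & kIFilterFlagsOleProperties) != 0;
}

// True when a chunk's flags mark it as a value (property) chunk.
inline bool isValueChunk(std::uint32_t chunkFlags) {
    return (chunkFlags & kChunkValue) != 0;
}

}  // namespace sketch
```

    When callerMustReadOleProperties(flags) is true, one possible next step on Windows is to query the opened storage for IPropertySetStorage, open a property set such as FMTID_SummaryInformation, and read values with IPropertyStorage::ReadMultiple. That part is not shown here and is untested against the Office 2003 filter.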

  99. Jim Brown says:

    Is there a filter available for Microsoft Works file types (.wps, .xlr, and .wdb)? If not, is one planned? Thanks.

  100. ken says:

    Hi Deb,

    I tried emailing you via this blog's contact link, but I guess you don't check it that often 😛

    Do you have any examples of how to get an IFilter to return a multi-value/multivalue from IFilter::GetValue?

    I tried wrapping the COM values in a SAFEARRAY, but Vista's indexing service doesn't recognize it at all.  I'm trying to test on Sharepoint 2010, but still struggling w/ the install for that so haven't been able to yet 😛

    I have put in enough instrumentation to determine that indexing service only calls ::GetValue once instead of calling it multiple times until it finds no more values, so the only other thing it can return is a SAFEARRAY.

    Also, are there limitations on multivalue data types?  I.e., can it be a multivalue of ints, dates, etc. instead of only strings?  I've found references that multivalues can be strings, but nothing else…

  101. Martin Dann says:

    Has anyone come across the issue with the 64 bit filter pack not honouring line breaks in .docx files? This is causing words on different lines to be joined together when reading from the file.