SharePoint 2007 Search – Trying to crawl Office documents that contain embedded links?

I’ve seen a couple of these issues come in from our customers, so I wanted to get the word out.

Crawling PPTX files with embedded links in SharePoint 2007 generates the following error in the crawl logs:


“The filtering process could not be initialized. Verify that the file extension is a known type and is correct”. 


Before applying the fix, confirm this is the issue you’re running into: review the crawl log for the error above, then open the affected documents to verify that they contain embedded links.
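If there are many documents to check, opening each one by hand gets tedious. The following is a minimal sketch in Python of one way to inspect a file without opening it in Office. It assumes that an "embedded link" shows up as a relationship marked TargetMode="External" in the package's .rels parts (hyperlinks and linked objects are normally recorded this way in Office 2007 formats); that assumption is mine and not something confirmed for this specific crawl error. The function name find_external_links is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the relationship parts (.rels) inside Office 2007+ files.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def find_external_links(path):
    """Return (rels_part, relationship_type, target) for each external relationship.

    `path` may be a file name or a file-like object holding a .pptx/.docx/.xlsx.
    """
    found = []
    with zipfile.ZipFile(path) as package:
        for name in package.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(package.read(name))
            for rel in root.iter(RELS_NS + "Relationship"):
                # Assumption: links of interest are flagged TargetMode="External".
                if rel.get("TargetMode") == "External":
                    # Keep only the last path segment of the type URI, e.g. "hyperlink".
                    rel_type = rel.get("Type", "").rsplit("/", 1)[-1]
                    found.append((name, rel_type, rel.get("Target")))
    return found
```

Calling find_external_links on a file listed in the crawl log returns an empty list when no external relationships are present, which would suggest the error has a different cause for that document.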


The fix is to install the Microsoft Office 2010 Filter Pack on the SharePoint 2007 server hosting the Index role.




Comments (2)

  1. Eric says:

    Two questions:

    – does this affect SharePoint 2010?

    – your article title says "Office documents" but your text only refers to PPTX files. Is it only PPTX files, or all Office 2007 files with embedded links?

  2. Russ Maxwell says:

    No, this doesn't affect SharePoint 2010. It affects all Office 2007 files with embedded links 🙂