Many people have asked for best practices or a guide to properly maintaining the Lotus Notes indexing function in SharePoint Search. So here it is. This is not an official guide, but a summary of our experience with several large customers. I will write it in a Q/A format, so you can jump to the question that applies to your current problem.
Q1. How many Lotus Notes content sources can I crawl at the same time?
A1: One content source per Domino server. If all of your content is on a single Domino server, you have to crawl the content sources one by one. But if you have several Domino servers to index, you can index them at the same time. This is a limitation of the IBM Lotus Notes C++ API, so you may need to set the crawl schedules for these content sources carefully.
Q2. How many Lotus Notes content sources shall I crawl at the same time?
A2: The only difference from the first question is CAN/SHALL. There should be a limit on this number, but what is it? I don't have a direct answer, because the number depends on your hardware performance, memory usage, network latency and bandwidth, and many other factors. For recent hardware with 8 GB of RAM, I would recommend 3, with scheduled memory recycling (see Q5).
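To make the two constraints from Q1 and Q2 concrete, here is a minimal planning sketch in Python. It is not part of SharePoint or any official tool: the function name `plan_crawl_rounds`, the content source names, and the server names are all made up for illustration. It simply groups content sources into sequential rounds so that each round has at most one content source per Domino server and at most `max_parallel` content sources in total. You would still have to turn the rounds into real crawl schedules yourself.

```python
from collections import defaultdict, deque


def plan_crawl_rounds(sources, max_parallel=3):
    """Group (content_source, domino_server) pairs into sequential rounds.

    Each round holds at most one content source per Domino server (Q1)
    and at most max_parallel content sources in total (Q2).
    All names are hypothetical; this only plans, it does not start crawls.
    """
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")

    # Queue the content sources per server, preserving input order.
    queues = defaultdict(deque)
    for name, server in sources:
        queues[server].append(name)

    rounds = []
    while any(queues.values()):
        # Prefer the servers with the most remaining work, so the
        # single-server backlog does not end up alone at the end.
        busiest = sorted(
            (server for server in queues if queues[server]),
            key=lambda server: -len(queues[server]),
        )
        rounds.append([queues[server].popleft() for server in busiest[:max_parallel]])
    return rounds


if __name__ == "__main__":
    example = [
        ("HR", "domino1"),
        ("Sales", "domino1"),
        ("Wiki", "domino2"),
        ("Docs", "domino3"),
        ("Mail", "domino4"),
    ]
    for number, batch in enumerate(plan_crawl_rounds(example), start=1):
        print(f"Round {number}: {', '.join(batch)}")
```

With the sample data, "HR" and "Sales" never share a round because they live on the same server, and no round has more than three content sources.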
Q3. I have a Notes database indexed, but why does an incremental crawl take nearly as long as a full crawl?
A3: During an incremental crawl, the SharePoint search engine checks the LastModifiedTime property of each target document or item to determine whether it needs to be fully retrieved and re-indexed. However, for certain content sources this property is not retrieved, or is mapped to something else by mistake, so the engine has to retrieve all of the content to check whether anything has changed. I'm looking into a possible solution for this problem and will update this post if I find something.
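The effect can be illustrated with a small Python model. This is only a sketch of the general idea, not the actual SharePoint crawler logic: the function `incremental_crawl`, the `index` dictionary layout, and the use of a SHA-256 digest for change detection are my assumptions. When a usable timestamp is present, unchanged items are skipped cheaply. When it is missing, every item has to be fetched in full just to find out that nothing changed.

```python
import hashlib


def incremental_crawl(items, index, fetch_body):
    """Simplified model of an incremental crawl (illustration only).

    items: iterable of (item_id, last_modified); last_modified may be None
           when the property is not retrieved from the source.
    index: dict of item_id -> {"last_modified": ..., "digest": ...}
    fetch_body: callable returning the full item content as bytes.
    Returns (list of re-indexed ids, number of full fetches).
    """
    changed = []
    fetched = 0
    for item_id, last_modified in items:
        previous = index.get(item_id)
        if (
            previous is not None
            and last_modified is not None
            and previous["last_modified"] is not None
            and last_modified <= previous["last_modified"]
        ):
            # Cheap path: the timestamp says nothing changed.
            continue

        # Expensive path: new item, newer timestamp, or no usable timestamp.
        body = fetch_body(item_id)
        fetched += 1
        digest = hashlib.sha256(body).hexdigest()
        if previous is None or previous["digest"] != digest:
            changed.append(item_id)
        index[item_id] = {"last_modified": last_modified, "digest": digest}
    return changed, fetched


if __name__ == "__main__":
    bodies = {"doc1": b"first", "doc2": b"second"}
    index = {}
    print(incremental_crawl([("doc1", 10), ("doc2", 10)], index, bodies.__getitem__))
    print(incremental_crawl([("doc1", 10), ("doc2", 10)], index, bodies.__getitem__))
    print(incremental_crawl([("doc1", None), ("doc2", None)], index, bodies.__getitem__))
```

In the last call nothing is re-indexed, yet both documents are fetched in full, which is why the incremental crawl ends up costing about as much as a full one.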
Q4. Should I use x86 or x64 for Lotus Notes indexing?
A4: Because of a limitation of the IBM Notes C++ API, the Notes Protocol Handler can only run on an x86 box. However, you can still use x64 query servers and WFEs. Remember: do not mix x64 and x86 boxes within the same tier. You can, however, have an x86 indexer tier with x64 query and x64 WFE tiers, and this is the recommended setup for Notes search in SharePoint 2007/Search Server 2008. (IBM recently released an x64 version of their API, but the current NotesPH cannot be made to work with it because too many things changed.)
Q5. You mentioned memory recycling – what does that mean?
A5: Due to the x86 limitation, the memory available to each process is capped. Because we call the Notes client through the API, it is quite possible that the MSSEARCH/MSSDMN processes will hit the memory limit after crawling a large number of documents. So I recommend recycling these processes at regular intervals, which can prevent the crawl from getting stuck. To do this, you may need to write your own scheduling program using the SharePoint search administration APIs and restart the osearch service when needed. I will also add this function to SharePoint Search Admin 0.81 and later in a few days.
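As a rough idea of what such a recycling job could look like, here is a minimal Python sketch. It does not use the SharePoint search administration APIs. Instead it assumes the Windows `tasklist` and `net` commands are available, that the search service is registered under the name `osearch`, and that a per-process threshold of about 1.5 GB is sensible for your box. The threshold and all function names are hypothetical, so treat this as a starting point rather than a finished tool.

```python
import csv
import io
import subprocess

# Process image names to watch (lower case for comparison).
WATCHED = ("mssearch.exe", "mssdmn.exe")


def parse_tasklist_csv(text):
    """Parse `tasklist /FO CSV /NH` output into {image name: largest memory in KB}."""
    usage = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 5:
            continue  # skip blank lines and informational messages
        image = row[0].lower()
        # The memory column looks like "123,456 K"; keep only the digits.
        digits = "".join(ch for ch in row[4] if ch.isdigit())
        if digits:
            usage[image] = max(usage.get(image, 0), int(digits))
    return usage


def should_recycle(usage_kb, limit_kb):
    """True if any watched process is at or above the per-process limit."""
    return any(usage_kb.get(name, 0) >= limit_kb for name in WATCHED)


def recycle_search_service():
    """Restart the search service (assumed to be named osearch)."""
    subprocess.run(["net", "stop", "osearch"], check=True)
    subprocess.run(["net", "start", "osearch"], check=True)


def check_once(limit_kb=1_500_000):
    """Run one check; the default limit is an assumed value, tune it yourself."""
    output = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    if should_recycle(parse_tasklist_csv(output), limit_kb):
        recycle_search_service()
        return True
    return False
```

You could call `check_once()` from a Windows scheduled task. Restarting the service interrupts whatever crawl is running, so pick times that fit your crawl schedules.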
Q6. Any ideas about security trimming support? What should I do on the Domino side?
A6: You can use Lotus Notes users and groups to control security, and map them to AD users to achieve security trimming of search results in SharePoint. However, it is generally advised not to use Lotus Notes roles for security control, as there is no corresponding concept in Active Directory.
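To show why roles are a problem, here is a small conceptual sketch in Python. It is not how the Notes connector stores or applies its mapping. The function `map_notes_acl`, the mapping dictionary, and all user, group, and domain names are invented for illustration. The only Notes-specific detail it relies on is that role names in an ACL are written in square brackets.

```python
def map_notes_acl(acl_entries, notes_to_ad):
    """Split Notes ACL entries into mapped AD accounts and unmappable entries.

    acl_entries: Notes user, group, or role names from a database ACL.
    notes_to_ad: hypothetical mapping of Notes names to AD accounts.
    """
    mapped, unmapped = [], []
    for entry in acl_entries:
        name = entry.strip()
        if name.startswith("[") and name.endswith("]"):
            # Notes roles are written in square brackets; AD has no
            # equivalent, so they cannot be used for security trimming.
            unmapped.append(name)
        elif name in notes_to_ad:
            mapped.append(notes_to_ad[name])
        else:
            # A user or group without a mapping entry.
            unmapped.append(name)
    return mapped, unmapped


if __name__ == "__main__":
    mapping = {
        "Jane Doe/Sales/Contoso": "CONTOSO\\jdoe",
        "SalesTeam": "CONTOSO\\Sales Team",
    }
    acl = ["Jane Doe/Sales/Contoso", "[Approvers]", "SalesTeam", "Old User/Contoso"]
    mapped, unmapped = map_notes_acl(acl, mapping)
    print("Mapped:", mapped)
    print("Needs attention:", unmapped)
```

Anything that lands in the second list is access that cannot be expressed on the AD side, which is a good reason to review those entries on the Domino side before crawling.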
Q7. To be added.
Btw, I'm moving to a new position in IW PMG as a Technical Product Manager, driving SharePoint IT Pro readiness. So in the future, more topics like SharePoint Governance will appear on this blog :).