Crawling and Large ACLs in SharePoint

A colleague of mine recently asked whether SharePoint sites can be crawled when they contain items with very large ACLs (more than 2000 ACEs). That started a debate over what actually can and cannot be crawled when a site contains one or more items with large ACLs.

If a site contains an item with a large ACL, does that mean the entire site cannot be crawled?

Or the entire list that the item resides in cannot be crawled?

Or only the item itself?

 

Since the current documentation on this matter is vague at best, instead of debating it to death we decided to set up a test lab and see for ourselves what the actual effect on crawling is.

Below are the results of our testing:

 

Below is a crawl log from a test with an overloaded ACL, in which 5000 domain users were given Full Control on a single document in a document library. The document https://teampolar/testdocs/test1.doc is our ACL-overloaded document.

Afterward we ran an incremental crawl.

A screenshot of the crawl log is below.

The ".... and monitoring.doc" item (just above it in the crawl log) is in the same https://teampolar/testdocs document library, and as you can see it was crawled fine, along with everything else currently in the testdocs library and site.

So it looks like the only thing we can't crawl is the specific item(s) that have > 1800 ACEs. The site itself was still crawled even though the site ACL contained the same 5000 user objects with the grayed-out "Limited" access right, which is typical when a user is given explicit access to an item within the site.

In the crawl log screenshot below, you can clearly see that the only item we were not able to crawl is the particular item that has the large ACL.

 

 

LargeACL_CrawlLog.jpg
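
To make the scope of the failure concrete, here is a small toy model of what we observed. It is only a sketch and not SharePoint code: the `Item` class, the `crawl` function, the `neighbor.doc` URL, and its ACE count are all made up for illustration. The limit is a parameter because the numbers quoted above differ (2000 vs. 1800), and we did not pin down the exact cutoff in this test.

```python
from dataclasses import dataclass

# Assumed cutoff for illustration only; the post quotes both 2000 and 1800.
ACE_LIMIT = 1800


@dataclass
class Item:
    url: str
    ace_count: int  # number of ACEs on this item's ACL


def crawl(items, ace_limit=ACE_LIMIT):
    """Return (crawled, failed) URL lists, deciding item by item."""
    crawled, failed = [], []
    for item in items:
        # Only the oversized item is skipped; its neighbors are unaffected.
        if item.ace_count > ace_limit:
            failed.append(item.url)
        else:
            crawled.append(item.url)
    return crawled, failed


library = [
    Item("https://teampolar/testdocs/test1.doc", 5000),    # the overloaded doc
    Item("https://teampolar/testdocs/neighbor.doc", 12),   # hypothetical sibling
]
crawled, failed = crawl(library)
print("crawled:", crawled)
print("failed:", failed)
```

The point of the model is that the decision is made per item: one overloaded document lands in the failed list while the rest of the library is still crawled, which matches what the crawl log showed.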