FAST Search for SharePoint 2010 Item-Level Security Using FAST Search JDBC Database Connector

[We now take a brief interlude. This post is written by my colleague Richard Palawan and covers the JDBC connector, a holdover from the original FAST product that was ported over for our legacy customers. There are times when it comes in handy in FAST Search Server, and Richard will cover that. Enjoy! --Carlos Valcarcel]

Richard Palawan

Microsoft Corporation, April 2012

Background

When setting up an enterprise search service, it is necessary to apply specific access restrictions to different parts of the crawled content. When crawling database content into FAST Search for SharePoint 2010, the BDC framework is the preferred method for implementing item-level security. However, there are conditions under which the FAST JDBC Database Connector is more appropriate. Refer to the online documentation for details, e.g. https://msdn.microsoft.com/en-us/library/ff383278.aspx .

There are a few references and samples that describe the process of implementing item-level security using BDC. However, there is little documentation on how to achieve the same result using the JDBC Database Connector. In this post, we walk through a JDBC-based implementation.

Security Model – Item Level

The architecture of FAST provides a means of storing a Security Descriptor (SD) for each index record. The Security Descriptor contains two lists, granted SIDs and denied SIDs, where a SID is the Security Identifier of a security principal such as a user or a group. The SID can refer to various security providers such as Active Directory (AD), Forms Authentication, and Windows Live.

Search Results Trimming

When the FAST query processor receives a query, the request also carries a claims ticket identifying the end user and the user's Security Groups (SGs). Since each record in the search index may contain a Security Descriptor, the FAST query engine compares the user's identity and associated SGs against the record's access control list (ACL). If access is granted and not denied, the record is included in the result set; otherwise, it is skipped.
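The trimming rule described above can be sketched in a few lines of Python. This is only an illustration of the grant/deny logic, not FAST's actual implementation: the `SecurityDescriptor` class and `is_visible` function are hypothetical names, the SIDs in the usage note are made up, and the sketch assumes a record with an empty granted list is hidden (the behavior for records without a descriptor is not covered here).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SecurityDescriptor:
    """Hypothetical in-memory model of a record's ACL (not a FAST type)."""
    granted: frozenset = field(default_factory=frozenset)
    denied: frozenset = field(default_factory=frozenset)


def is_visible(descriptor, user_sid, group_sids):
    """Return True when the record should be included in the user's results."""
    # The user's own SID plus the SIDs of all of the user's security groups.
    principals = {user_sid, *group_sids}
    if principals & descriptor.denied:
        return False  # any matching deny entry hides the record
    # Otherwise at least one matching grant entry is required.
    return bool(principals & descriptor.granted)
```

For example, with `granted={"S-1-5-21-1-2-3-1001"}` and `denied={"S-1-5-21-1-2-3-1002"}`, the first principal sees the record, the second does not, and an unrelated principal does not either.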

Injecting the Security Descriptor as Part of the Database Record

In order to implement this using the JDBC database connector, we need to assign the information in the security descriptor directly to a special property in the FAST engine named docacl. The feed processing pipeline recognizes the special property and does not convert the value into a crawled property, but instead assigns the value directly to the docacl field in the index.

Generating a Security Descriptor

Here is the description of the docacl value that holds the security descriptor information according to MSDN (https://msdn.microsoft.com/en-us/library/ee626835(office.12).aspx):

"Attribute key: docacl

Attribute type: cht::documentmessages::string_attribute

Attribute value:

The docacl attribute MUST contain a list of space separated entries in the following format:

<deniedRightFlag><userStoreID><securityIdentifier> . <deniedRightFlag> is "9" if the rest indicates a deny permission or "" if it is a grant permission. <userStoreID> is the user store identifier of the user/group given in <securityIdentifier>. <securityIdentifier> is the user/group security principal identifier that was granted or denied permission to the item. A docacl value MUST only contain alphanumeric characters. If the <securityIdentifier> contains other characters, the <securityIdentifier> MUST be encoded with a base-32 variant of [RFC4648] using an alphabet with a-z and 1-6 and no equal sign (=) padding at the end. "
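To make the encoding rule in the quote more concrete, here is a minimal Python sketch of a base-32 variant with an a-z and 1-6 alphabet and no padding. It assumes the variant alphabet maps position by position onto the standard RFC 4648 alphabet (A to a, ..., Z to z, 2 to 1, ..., 7 to 6); that mapping, and the function names, are my assumptions. It also says nothing about which bytes FAST feeds into the encoder, so treat the cmdlets described next as the authoritative way to produce real values.

```python
import base64

# Standard RFC 4648 base-32 alphabet and the variant described in the quote.
# Assumption: symbols correspond position by position.
_STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
_VARIANT = "abcdefghijklmnopqrstuvwxyz123456"
_TO_VARIANT = str.maketrans(_STANDARD, _VARIANT)
_TO_STANDARD = str.maketrans(_VARIANT, _STANDARD)


def encode_variant(raw: bytes) -> str:
    """Base-32 encode raw bytes, drop '=' padding, switch to the variant alphabet."""
    standard = base64.b32encode(raw).decode("ascii").rstrip("=")
    return standard.translate(_TO_VARIANT)


def decode_variant(text: str) -> bytes:
    """Reverse encode_variant: restore the alphabet and the padding, then decode."""
    standard = text.translate(_TO_STANDARD)
    standard += "=" * (-len(standard) % 8)  # b32decode requires full 8-char groups
    return base64.b32decode(standard)
```

The output contains only lowercase letters and the digits 1 to 6, which satisfies the "alphanumeric characters only" requirement of the docacl value.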

On any server where FAST Search is installed, we get two PowerShell cmdlets that encode and decode security principal entries according to the base32 encoding scheme mentioned above.

Get-FASTSearchSecurityEncodedSid

https://technet.microsoft.com/en-us/library/ff393821.aspx

Get-FASTSearchSecurityDecodedSid

https://technet.microsoft.com/en-us/library/ff393823.aspx

 

Here is an encoding example:

Get-FASTSearchSecurityEncodedSid -SID S-1-5-21-255901184-1604012920-1887928677-2217443

which results in the following encoded string:

aecqaaabgsakfiaaaakazoppz3exg16345io23kqyaa

 

Note that it probably would be easier to use the -Users option and the domain user name such as domain1\user1.

*Note: All SIDs and encoded strings used in this post are intentionally invalid. They are used only to illustrate expected formats.

To inject this as the Security Descriptor, you need to add the User Store ID, and possibly the denied flag if the principal is denied access. If this is an Active Directory account, the User Store ID is 'win'. The denied flag is '9', prepended to the overall string.

Example of an allowed account:

winaecqaaabgsakfiaaaakazoppz3exg16345io23kqyaa

Example of a disallowed account:

9winaecqaaabgsakfiaaaakazoppz3exg16345io23kqyaa

 

If you need to set multiple allowed and disallowed entries, concatenate the encoded strings using a space separator, starting with the disallowed list:

9winaecqaaabgsakfiaaaakazoppz3exg16345io23kqyaa winaecqaaabgsakfiaaaakazoppz3exg16345io23kqyaa
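The assembly steps above (prepend the User Store ID, prepend '9' for denied principals, join with spaces, denied entries first) are easy to script. The following Python sketch is one way to do it; `build_docacl` is a hypothetical helper name, and it expects SIDs that have already been encoded by Get-FASTSearchSecurityEncodedSid.

```python
def build_docacl(entries, user_store_id="win"):
    """Build a docacl string from (encoded_sid, is_denied) pairs.

    encoded_sid is assumed to be the output of Get-FASTSearchSecurityEncodedSid.
    user_store_id defaults to 'win', the value used for Active Directory.
    """
    denied, granted = [], []
    for encoded_sid, is_denied in entries:
        token = user_store_id + encoded_sid
        # A docacl value may only contain alphanumeric characters.
        if not (token.isascii() and token.isalnum()):
            raise ValueError(f"non-alphanumeric docacl entry: {token!r}")
        if is_denied:
            denied.append("9" + token)  # '9' is the denied flag
        else:
            granted.append(token)
    # Denied entries first, then granted entries, separated by single spaces.
    return " ".join(denied + granted)
```

Calling `build_docacl([(sid, False), (sid, True)])` with the sample encoded string from this post reproduces the two-entry value shown above.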

 

If you need to look at actual FAST encoded strings for your security principals, run a sample collection using the BDC SQL provider, then search the generated FiXML files for 'docacl' to capture the strings.

Adding the docacl as a Select Field in the JDBC Database Connector Configuration XML File

Once you have run Get-FASTSearchSecurityEncodedSid, prepended the User Store ID and optionally the denied flag, and concatenated multiple encoded SIDs into a single string, the result is ready to be inserted into the FAST index along with the rest of the document content. We achieve this by adding a new field named docacl to the SQL Select statement in the JDBC Database Connector configuration file.

In the following example, I need to set the same value of the Security Descriptor to all of the documents in a particular content source. In the JDBC Database Connector configuration XML file, I modified the SQL statement to add the docacl field as follows:

<parameter name="JDBCSQL" type="string">

<description>

<![CDATA[This or JDBCSQLFile must be provided. SQL query to crawl against. Note, any valid SQL is valid here. <br>Use %TIMESTAMP% where last crawl time gets inserted as a datetime value. <br>Use %TIMESTAMPSEC% where last crawl time gets inserted as number of seconds since epoch <br> Examples: <br> Oracle: SELECT * FROM tableName WHERE dateField > TO_TIMESTAMP('%TIMESTAMP%','yyyy-MM-DD"T"hh24:mi:ss')&nbsp; <br>Note that the time stamp format used must be as indicated here. <br>MS SQL Server: SELECT * from tableName WHERE dateField > convert(datetime,'%TIMESTAMP%',126)<br>select * from employees <br>Default: (none)]]>

</description>

<value>

<![CDATA[SELECT [field1],[field2] ,…,[fieldn], '9winaecqaaabgsakfiaaaakazoppz3exg16345io23kqyaa winaecqaaabgsakfiaaaakazoppz3exg16345io23kqyaa' AS [docacl] FROM [server1].[sch1].[table1]]]>

</value>

</parameter>

where the value in quotes '9winaecqaa … winaecqaaa…' was generated by running the FAST cmdlet and concatenating the output as described earlier.
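If you generate the configuration from a script, the SELECT statement can be composed programmatically. The Python sketch below is only an illustration: `build_jdbc_sql` is a hypothetical helper, it checks that the docacl value contains nothing but alphanumeric entries separated by single spaces before embedding it as a SQL string literal, and it does not validate the column or table names you pass in.

```python
import re

# One or more alphanumeric entries separated by single spaces.
_DOCACL_PATTERN = re.compile(r"[A-Za-z0-9]+( [A-Za-z0-9]+)*")


def build_jdbc_sql(columns, table, docacl):
    """Return a SELECT statement that adds a constant docacl column.

    columns: column names without brackets (not validated here).
    table:   fully qualified table name, e.g. '[server1].[sch1].[table1]'.
    docacl:  space-separated docacl entries, denied entries first.
    """
    if not _DOCACL_PATTERN.fullmatch(docacl):
        raise ValueError("docacl must be alphanumeric entries separated by spaces")
    column_list = ", ".join(f"[{name}]" for name in columns)
    return f"SELECT {column_list}, '{docacl}' AS [docacl] FROM {table}"
```

The returned string is what goes inside the CDATA section of the JDBCSQL value element.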

If you need to set a different value for each document, you would have to set the Security Descriptor encoded value in the source database table.
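As a rough illustration of the per-document case, the following Python sketch stores a docacl value in each row of a source table and reads it back with an ordinary SELECT. It uses an in-memory SQLite database purely as a stand-in for your real source database; the table name, column names, and `load_sample_rows` function are hypothetical, and the encoded string is the intentionally invalid sample used throughout this post.

```python
import sqlite3


def load_sample_rows():
    """Create a stand-in source table with a docacl column and read it back."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 (id INTEGER PRIMARY KEY, title TEXT, docacl TEXT)"
    )
    encoded = "aecqaaabgsakfiaaaakazoppz3exg16345io23kqyaa"  # invalid sample value
    conn.executemany(
        "INSERT INTO table1 (id, title, docacl) VALUES (?, ?, ?)",
        [
            # Row 1: a single granted entry.
            (1, "Granted entry only", "win" + encoded),
            # Row 2: a denied entry followed by a granted entry.
            (2, "Denied then granted", "9win" + encoded + " win" + encoded),
        ],
    )
    # The crawl query selects docacl like any other column.
    rows = conn.execute(
        "SELECT id, title, docacl FROM table1 ORDER BY id"
    ).fetchall()
    conn.close()
    return rows
```

With the values stored per row, the JDBCSQL statement selects the docacl column from the table instead of supplying a quoted constant.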

 

This completes the implementation of item-level security based on the JDBC Database Connector.