Cloud Hybrid Search Service Application

Cloud Hybrid Search-Overview

Hybrid search makes finding content easy, wherever the content lives. A company has a hybrid environment if their content and applications are spread across on-premises and Office 365. To complement the existing hybrid search solution, which is based on federating search results, a new capability has been introduced in the product. This new hybrid search solution is called cloud hybrid search and is based on indexing on-premises content in Office 365.

While the existing federated search results model of inbound and outbound hybrid search continues to be supported, this new hybrid solution brings numerous new capabilities into the hybrid sphere. The new hybrid solution takes away the complexities related to queries coming in to the SharePoint Server on- premises environment via a reverse proxy. Therefore, a public SSL certificate is no longer required for inbound authentication requests from the Secure Store application in SharePoint Online. The infrastructure requirements for configuring cloud hybrid search is discussed later on in this blog. Below is a pictorial representation describing the query and crawl overview in next generation hybrid search.

 

image   

Cloud hybrid search will be available to SharePoint Online customers and either of these solutions

§ As Public Preview in the August 2015 Public Update (PU) for SharePoint Server 2013  https://support.microsoft.com/kb/3055009      

§ In the public preview release of SharePoint Server 2016 IT Preview
https://www.microsoft.com/en-us/download/details.aspx?id=48712     

 

You may already be familiar with the federated search results model of inbound and outbound hybrid search, as well as its implementation guidelines. This blog contains a series of articles on how to configure these scenarios – see below

Office-365-configure-hybrid-search-with-directory-synchronization.aspx     

Office-365-configure-hybrid-search-with-directory-synchronization-password-sync-part2.aspx     

Identity-federation-amp-single-sign-on-deployment-for-hybrid-search-in-office-365-sharepoint-online-part3.aspx        

Sharepoint-2013-configure-on-premises-users-to-leverage-office-365-for-their-mysite-onedrive-part-4.aspx     

Configure-onedrive-for-business-as-a-hybrid-search-vertical-in-sharepoint-onpremise-search-center-part5.aspx       

Configuring-search-verticals-for-hybrid-search-experiences-in-sharepoint-2013-and-sharepoint-online-part6.aspx         

Configuring-microsoft-web-application-proxy-server-for-inbound-hybrid-topology-with-office-365-and-microsoft-sharepoint-server-2013-part7.aspx        

Troubleshooting-certificate-validation-errors-for-inbound-hybrid-search-with-office-365-and-microsoft-sharepoint-server-2013-part-8.aspx      

The-all-new-cloud-search-service-application-coming-to-sharepoint-2013-and-sharepoint-2016.aspx     

The focus of this blog is to deploy and develop an understanding of the Cloud Search service application.    

 

The cloud hybrid search solution provides the ability to crawl and parse on-premises content and then process and index it in Office 365. When users query the search index in Office 365, they thus get search results from both on-premises and Office 365 content. The parsed content from on-premises is encrypted while in transit from the on-premises crawler through to the content processing stages in Office 365. We will talk about the specifics of encryption and data transfer later in this content.

 

The crawling configuration, including that of the Search service application, content sources, crawl rules etc. is carried out in the on-premises environment. Modification to search experiences, for example search schema changes, are performed at the Office 365 level. By deploying the cloud hybrid search solution customers can finally achieve the benefits of a unified index and search experience spanning on-premises and online content in a single result set.

 

You can join the TechNet support forum for cloud hybrid search here https://social.technet.microsoft.com/Forums/en-us/home?forum=CloudSSA  

Note:  The cloud Search service application is currently not available for customers outside the regular Office 365 multitenant service, including China data centre customers and Government cloud customers.    

The deployment and configuration steps documented here are based on the preview release of the cloud Search service application. As the product moves towards General Availability and the SharePoint 2016 on premises release to market the processes may change.    

 

How to configure the cloud Search service application

 

In order to configure the cloud Search service application, the below steps needs to be performed.

Mandatory Configuration

1. Synchronize users and groups from on-premises to Office365 Azure Active Directory    

2. Create cloud Search service application   

3. Install onboarding pre-requisites

4. Execute onboarding script

5. Create on-premises content sources

6. Configure outbound query federation

7. Configure SharePoint Online search vertical

Optional configuration

1. Publish cloud Search service and consume from SharePoint 2010.

2. Customization.   

 

Synchronize users and groups from on-premises to Office365 Azure Active Directory      

A prerequisite for cloud Search service application deployment is to have a common user identity across on-premises and online, this enables search results to be correctly security trimmed against the user identity performing the query. The synchronization of user identity is performed using a directory synchronization tool. Below is the list of supported synchronization tools.    

Tool

TechNet Links

Dirsync

        https://msdn.microsoft.com/en-us/library/azure/hh967642.aspx

         https://msdn.microsoft.com/en-us/library/azure/jj151831.aspx

AADSync

         https://msdn.microsoft.com/en-us/library/azure/dn790204.aspx 

AADConnect

         https://azure.microsoft.com/en-us/documentation/articles/active-directory-aadconnect/

    

 To configure directory synchronization for your environment, follow the steps in https://blogs.msdn.com/b/spses/archive/2013/10/22/office-365-configure-hybrid-search-with-directory-synchronization.aspx     

Directory synchronization also ensures the correct user properties are available in Office 365 Azure AD to be able to carry out a process known as user rehydration.   

Create a cloud Search service application

A cloud Search service application (SSA) cannot be created using the central admin SSA creation user interface. The reason being that the cloud SSA requires a property setting that is not applied by the UI based creation process. This property is called CloudIndex and must be set to true for a cloud SSA. CloudIndex is a read-only property of any deployed SSA and as such cannot be set post creation. By definition this also implies that an existing regular SSA cannot be converted to a cloud SSA.

The property value for a SSA can be checked by executing
(get-spenterprisesearchserviceapplication).cloudindex       

 

image

The cloud SSA should be created by executing a SSA creation PowerShell script and setting the CloudIndex property to true. Later, when we execute the on-boarding script, another property called IsHybrid is set to 1 for the SSA.

New-SPEnterpriseSearchServiceApplication -Name $SearchServiceAppName -ApplicationPool $appPool -DatabaseServer $DatabaseServerName -CloudIndex $true            

 

The CloudIndex property disables the normal ContentPlugin and the IsHybrid property initializes the AzurePlugin so the content gets ready to be pushed to Azure.

The SharePoint Admin is free to use their own script to generate the cloud SSA and scale out as long as they set this property. For convenience we have provided a sample creation script named CreateCloudSSA which generates a single server search instance and sets this property correctly. By running the CreateCloudSSA script, you create a cloud SSA that crawls your on-premises content.  

Note: All scripts are provided AS IS without warranty of any kind. Please see the official TechNet guidance for links to supported scripts.

On the server that is running SharePoint Server 2013 or SharePoint Server 2016 Preview:  Copy the sample script below and save it as CreateCloudSSA.ps1 and run it.  

When prompted, type:

The name of the SharePoint Server search server.

The Search service account in this format: domain\username.      

A name of your choice for the cloud SSA.     

The name of the SharePoint Farm database server.

   

Verify that you see a message that the cloud SSA was created successfully.      

 

Sample CreateCloudSSA.ps1

-----------------------------------------------------------------------------------------------------------------------------------------------------

## Gather mandatory parameters ##    

## Note: SearchServiceAccount needs to already exist in Windows Active Directory as per Technet Guidelines https://technet.microsoft.com/library/gg502597.aspx ##    

Param(    

[Parameter(Mandatory=$true)][string] $SearchServerName,      

[Parameter(Mandatory=$true)][string] $SearchServiceAccount,     

[Parameter(Mandatory=$true)][string] $SearchServiceAppName,     

[Parameter(Mandatory=$true)][string] $DatabaseServerName     

)    

Add-PSSnapin Microsoft.SharePoint.Powershell -ea 0     

## Validate if the supplied account exists in Active Directory and whether supplied as domain\username    

    if ($SearchServiceAccount.Contains("\")) # if True then domain\username was used     

    {     

    $Account = $SearchServiceAccount.Split("\")     

$Account = $Account[1]     

    }     

    else # no domain was specified at account entry     

    {     

    $Account = $SearchServiceAccount     

    }     

    $domainRoot = [ADSI]''     

    $dirSearcher = New-Object System.DirectoryServices.DirectorySearcher($domainRoot)     

    $dirSearcher.filter = "(&(objectClass=user)(sAMAccountName=$Account))"     

    $results = $dirSearcher.findall()     

if ($results.Count -gt 0) # Test for user not found     

    {      

 

    Write-Output "Active Directory account $Account exists. Proceeding with configuration"     

## Validate whether the supplied SearchServiceAccount is a managed account. If not make it one.    

if(Get-SPManagedAccount | ?{$_.username -eq $SearchServiceAccount})      

    {     

        Write-Output "Managed account $SearchServiceAccount already exists!"     

    }     

    else     

{     

        Write-Output "Managed account does not exists - creating it"     

$ManagedCred = Get-Credential -Message "Please provide the password for $SearchServiceAccount" -UserName $SearchServiceAccount     

        try     

        {     

        New-SPManagedAccount -Credential $ManagedCred     

        }     

        catch     

        {     

         Write-Output "Unable to create managed account for $SearchServiceAccount. Please validate user and domain details"     

         break     

         } 

    }     

Write-Output "Creating Application Pool"      

$appPoolName=$SearchServiceAppName+"_AppPool"    

$appPool = New-SPServiceApplicationPool -name $appPoolName -account $SearchServiceAccount     

Write-Output "Starting Search Service Instance"     

Start-SPEnterpriseSearchServiceInstance $SearchServerName     

 

Write-Output "Creating Cloud Search Service Application"

 

$searchApp = New-SPEnterpriseSearchServiceApplication -Name $SearchServiceAppName -ApplicationPool $appPool -DatabaseServer $DatabaseServerName -CloudIndex $true     

Write-Output "Configuring Admin Component"     

$searchInstance = Get-SPEnterpriseSearchServiceInstance $SearchServerName     

$searchApp | get-SPEnterpriseSearchAdministrationComponent | set-SPEnterpriseSearchAdministrationComponent -SearchServiceInstance $searchInstance     

$admin = ($searchApp | get-SPEnterpriseSearchAdministrationComponent)     

Write-Output "Waiting for the admin component to be initialized"     

$timeoutTime=(Get-Date).AddMinutes(20)     

do {Write-Output .;Start-Sleep 10;} while ((-not $admin.Initialized) -and ($timeoutTime -ge (Get-Date)))     

if (-not $admin.Initialized) { throw 'Admin Component could not be initialized'}     

 

Write-Output "Inspecting Cloud Search Service Application"

 

$searchApp = Get-SPEnterpriseSearchServiceApplication $SearchServiceAppName     

 

Write-Output "Setting IsHybrid Property to 1"     

$searchApp.SetProperty("IsHybrid",1)     

#Output some key properties of the Search Service Application    

 

Write-Host "Search Service Properties"      

Write-Host "Hybrid Cloud SSA Name : " $searchapp.Name     

Write-Host "Hybrid Cloud SSA Status : " $searchapp.Status     

Write-Host "Cloud Index Enabled : " $searchApp.CloudIndex     

Write-Output "Configuring Search Topology"     

$searchApp = Get-SPEnterpriseSearchServiceApplication $SearchServiceAppName     

$topology = $searchApp.ActiveTopology.Clone()     

 

$oldComponents = @($topology.GetComponents())

 

if (@($oldComponents | ? { $_.GetType().Name -eq "AdminComponent" }).Length -eq 0)

 

{

$topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.AdminComponent $SearchServerName))     

}    

$topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.CrawlComponent $SearchServerName))     

$topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.ContentProcessingComponent $SearchServerName))     

$topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.AnalyticsProcessingComponent $SearchServerName))     

$topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.QueryProcessingComponent $SearchServerName))     

$topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.IndexComponent $SearchServerName,0))     

 

$oldComponents  | ? { $_.GetType().Name -ne "AdminComponent" } | foreach { $topology.RemoveComponent($_) }

Write-Output "Activating topology"     

$topology.Activate()    

$timeoutTime=(Get-Date).AddMinutes(20)     

do {Write-Output .;Start-Sleep 10;} while (($searchApp.GetTopology($topology.TopologyId).State -ne "Active") -and ($timeoutTime -ge (Get-Date)))     

if ($searchApp.GetTopology($topology.TopologyId).State -ne "Active") { throw 'Could not activate the search topology'}     

Write-Output "Creating Proxy"     

$searchAppProxy = new-spenterprisesearchserviceapplicationproxy -name ($SearchServiceAppName+"_proxy") -SearchApplication $searchApp     

Write-Output " Cloud hybrid search service application provisioning completed successfully."     

    }     

    else # The Account Must Exist so we can proceed with the script     

    {     

    Write-Output "Account supplied for Search Service does not exist in Active Directory."     

    Write-Output "Script is quitting. Please create the account and run again."     

 

    Break

} # End Else       

Successful execution of the CreateCloudSSA.ps1 script should show something similar to the printscreen below

clip_image002

Validate the search topology and cloud index parameter by executing the following script

Validate the search topology and cloud index parameter by executing the following script

Add-PSSnapin Microsoft.SharePoint.Powershell

$ssa = Get-SPEnterpriseSearchServiceApplication     

Get-SPEnterpriseSearchTopology -Active -SearchApplication $ssa     

Get-SPEnterpriseSearchStatus -SearchApplication $ssa -Text |ft Name, state,Partition,Host -AutoSize     

$ssa.CloudIndex     

The expected output is a list of Search Service Application components showing status online and the CloudIndex property of the Search Service Application showing true

clip_image002[5]

Note: Only one cloud SSA is allowed per Farm, however, you can have multiple non-cloud SSAs in the same SharePoint 2013 or 2016 farm. You can scale out the topology, follow the steps in Change from the default search topology to a small enterprise topology.

Install hybrid onboarding prerequisites

In order for you to successfully execute the on-boarding script, the following prerequisites must be installed on the SharePoint Server where the script is executed.

The following script can be used to validate and if required install the pre-requisites. Note, however that the pre-requisites must be downloaded prior to executing the script.    

Create a new folder on a server in the SharePoint Farm e.g. c:\scripts and save the script file to that location. Additionally, download the two prerequisites and save in the same folder. The two prerequisites can be downloaded from:    

Microsoft online sign in assistant - https://www.microsoft.com/en-us/download/details.aspx?id=39267      

Microsoft Azure AD PowerShell - https://go.microsoft.com/fwlink/p/?linkid=236297      

 

# This script installs the two required prerequisites:

# AdministrationConfig-EN.msi and msoidcli_64.msi.    

# It is assumed that these are available in the same folder as the script itself.     

# See the following links for downloading manually:    

# - https://www.microsoft.com/en-us/download/details.aspx?id=39267    

# - https://go.microsoft.com/fwlink/p/?linkid=236297    

#    

function Install-MSI {     

param(    

    [Parameter(Mandatory=$true)]     

    [ValidateNotNullOrEmpty()]     

    [String] $path     

)    

    $parameters = "/qn /i " + $path     

    $installStatement = [System.Diagnostics.Process]::Start( "msiexec", $parameters )      

    $installStatement.WaitForExit()     

}    

$scriptFolder = Split-Path $script:MyInvocation.MyCommand.Path     

$MSOIdCRLRegKey = Get-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\MSOIdentityCRL" -ErrorAction SilentlyContinue     

if ($MSOIdCRLRegKey -eq $null)     

{    

    Write-Host "Installing Office Single Sign On Assistant" -Foreground Yellow     

    Install-MSI ($scriptFolder + "\msoidcli_64.msi")     

   Write-Host "Successfully installed!" -Foreground Green     

}    

else    

{    

    Write-Host "Office Single Sign On Assistant is already installed." -Foreground Green     

}    

$MSOLPSRegKey = Get-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\MSOnlinePowershell" -ErrorAction SilentlyContinue     

if ($MSOLPSRegKey -eq $null)     

{    

    Write-Host "Installing AAD PowerShell" -Foreground Yellow     

Install-MSI ($scriptFolder + "\AdministrationConfig-EN.msi")     

    Write-Host "Successfully installed!" -Foreground Green     

}    

else    

{

    Write-Host "AAD PowerShell is already installed." -Foreground Green     

}    

------------------------------------------------------------------------------

Successful execution of the pre requisite installer shows the following output

clip_image001

The script has installed both components in silent mode. To validate that the components were successfully installed you can execute the following PowerShell;

To confirm the Microsoft Online Single Sign On Assistant is successfully installed

(get-service msoidsvc).Status

A response of Running is a report of success

clip_image001[5]

To confirm the Windows Azure AAD PowerShell is successfully installed import the msonline modules

import-module msonline

import-module msonlineextended       

A successful import results in no error message    

clip_image001

If you wish to double check the import then use the –verbose switch to show the cmdlets being successfully loaded

import-module msonline -verbose     

import-module msonlineextended -verbose      

clip_image001[11]

Both Office Single sign-on assistant and AAD PowerShell MUST succeed before continuing with the onboarding process.

Note: If you try to proceed to onboarding without these components installed the onboarding script will recognize this state and exit gracefully.    

 

On-boarding process      

The next stage of configuring the on-premises farm to be able to index into the Office 365 search index is called onboarding. Onboarding consists of several stages and is implemented by executing the OnBoardHybrid-Search.ps1 PowerShell script and supplying parameters for the Tenant Url and optionally the cloud SSA name.

Note: There is a process underway to add the onboarding stage to the Office 365 Hybrid Picker as an HRC based application. This will negate the need for the PowerShell script.    

 

On-boarding stages      

There are four keys stages in the onboarding process

Get-HybridSSA

This stage validates that the cloud SSA name supplied as a parameter to the script execution is valid. If multiple SSAs are found and no parameter is supplied, then it attempts to validate an SSA that has the IsHybrid property set to 1.

Prepare-Environment

This stage checks that the prerequisites for deployment are installed. It checks for MicrosoftOnline Single Sign-on Assistant and for Windows Azure PowerShell. If either of these tools are missing the script will exit with a prompt to install them.

Connect-SPFarmToAAD

This completes the OAuth trust configuration with Azure Access Control services (ACS) and deploys the ACS Proxy. Additionally, it deploys a new SPO connection proxy to enable the farm to communicate with the external endpoint of the cloud search service

Add-ServicePrincipal

The penultimate stage adds the O365 Service Principal ID to the local farm and sets the correct Service Principal Name in Azure AD for the on-premises url. This ensures outbound query federation can succeed between the O365 tenant and the on-premises farm.

The process completes by settings some additional parameters on the Cloud SSA.    

Important: Before executing the Onboard-Hybridsearch.ps1 script it is advised that you close all open PowerShell Sessions including PowerShell ISE and PowerShell cmd windows.    

The OnboardHybrid-Search.ps1 script can be downloaded along with documentation from the Microsoft Connect Site – https://connect.microsoft.com/office/Downloads/DownloadDetails.aspx?DownloadID=58777 

 

If you do not have access to the cloud Hybrid Search preview program, you can request access via the link below https://connect.microsoft.com/office/SelfNomination.aspx?ProgramID=8647&pageType=1   and we highly recommend checking the link for the latest version of the script prior to execution.

Output from executing the OnboardHybrid-Search.ps1 looks similar to the following    

 

clip_image002

 

When the message Connecting to O365 appears, you will be prompted to sign in using a tenant global admin account:

clip_image005

The next stage of execution shows the connecting to the ACS Endpoint. If a message identical to the below printscreen shows Signing Credential already exists, this means the script may have been run before and the Service Principal Names (SPNs) are already in place. This is a safe operation.

clip_image007

You may on occasion see the output matching the screenprint below. This is indicative of a connection issue with ACS and the script will retry for four minutes before exiting.

It is safe to rerun the script if you received the exception, however at this point we strongly recommend you begin a new PowerShell session and close all existing PowerShell sessions.

clip_image009

If connection is successful, the script will proceed and you should be presented with an output matching the following printscreen

clip_image011

In this printscreen you can see the TenantID, Authentication Realm and the connected endpoint address.

At this point on-boarding is completed.    

 

Configure content sources on-premises

Content sources in the Cloud SSA are no different to Content Sources in a regular Enterprise SSA. Follow the steps in https://technet.microsoft.com/en-us/library/jj219808.aspx?f=255&MSPPError=-2147217396 to create and manage content sources in the cloud SSA.

clip_image013

Crawl SharePoint 2013

The next step after on-boarding to the service and creating the on-premises content source(s), is to execute the first full crawl. The information below is from an incremental crawl to reduce the amount of information and show the pertinent data for crawling a new item and completing the indexing to the cloud.

Let us take a look at the Uls logs when we crawl a SharePoint 2013 farm from the Cloud SSA.      

The Web Application we will crawl is https://sp13      

Uploaded a text file - sampledoc.txt -> https://sp13/Shared%20Documents/sampledoc.txt .       

Turned on verbose logging and below is what we see.       

-----------------------------------------------------------------------------------------------------------------------------------------------------

Incremental Crawl started here and the CrawlID is 129

08/18/2015 20:12:57.93 w3wp.exe (0x1FE4) 0x050C SharePoint Server Search Admin Audit 97 Information An incremental crawl was started on 'SharePoint2013' by OnPremDomain\ServiceAcct e449259d-e4f8-30ff-cbe7-e91005bdfcf5

08/18/2015 20:12:58.13 mssearch.exe (0x1194) 0x1270 SharePoint Server Search Crawler:Gatherer Plugin e5ey Verbose CGatherer::LoadCrawls New Crawl Requested Component 91a6119d-38c1-4d49-be29-4cb689a77288-crawl-0, CrawlID 129, Project 1 [gatherobj.cxx:6758] search\native\gather\server\gatherobj.cxx        

 

Using web service calls to sitedata.asmx, we gather the changes on the site from previous crawl: We can see the Item we added.

08/18/2015 20:13:05.21 mssdmn.exe (0x2730) 0x2F18 SharePoint Server Search Connectors:SharePoint aiwts Verbose GetChanges CorrelationID e749259d-b72a-e0d1-f02c-0ee137b3e39c Url https://sp13/_vti_bin/sitedata.asmx Search RequestDuration 382 SPIISLatency 0 SP RequestDuration 375        

08/18/2015 20:13:05.21 mssdmn.exe (0x2730) 0x2F18 SharePoint Server Search Connectors:SharePoint allng VerboseEx GetChanges received WS response       

<GetChangesResult>      

<SPContentDatabase Change="Unchanged" ItemCount="1">      

<SPSite Change="Unchanged" ItemCount="1" Id="{1d26851d-2e5c-4b5f-a4f9-bc0f8b4107b6}">      

<Messages />      

<SPWeb Change="Unchanged" ItemCount="1">      

<SPList Change="Unchanged" ItemCount="1">      

<SPListItem Change="Add" ItemCount="0" UpdateSecurity="False" UpdateSecurityScope="False" Id="{4c464428-78c5-4373-847f-64240a2eb83a}" ParentId="{e360f42c-b82c-49c7-b8e8-9338639c92b4}_" InternalUrl="/siteurl=/siteid={1d26851d-2e5c-4b5f-a4f9-bc0f8b4107b6}/weburl=/webid={c5001554-4143-4b1d-8443-3eb98844dc89}/listid={e360f42c-b82c-49c7-b8e8-9338639c92b4}/folderurl=/itemid=2" DisplayUrl="/Shared Documents/sampledoc.txt" ServerUrl="https://sp13" CRC="0" SequenceNumber="246" Url="sampledoc.txt">      

</SPListItem>      

</SPList>      

</SPWeb>      

</SPSite>      

</SPContentDatabase>      

<MoreChanges>False</MoreChanges>      

<StartChangeId>1;0;e2453c4e-184a-4214-ab08-c4ebde2b8513;635755248354630000;8484</StartChangeId>      

<EndChangeId>1;0;e2453c4e-184a-4214-ab08-c4ebde2b8513;635755248354630000;8484</EndChangeId>      

</GetChangesResult>        

 

This item is picked up and crawled next. Per crawl logs, this DocID is 718 :

 

clip_image002

    

Corresponding ULS data :    

08/18/2015 20:13:05.21 mssdmn.exe (0x2730) 0x2F18 SharePoint Server Search Connectors:SharePoint dv9u VerboseEx Emit change log crawl link sts4://sp13/siteurl=/siteid={1d26851d-2e5c-4b5f-a4f9-bc0f8b4107b6}/weburl=/webid={c5001554-4143-4b1d-8443-3eb98844dc89}/listid={e360f42c-b82c-49c7-b8e8-9338639c92b4}/folderurl=/itemid=2, DisplayURL = https://sp13/Shared Documents/sampledoc.txt , url=sts4://sp13/contentdbid={e2453c4e-184a-4214-ab08-c4ebde2b8513}   [sts3filt.cxx:4644] 

Once MSSDmn picks up the crawled properties for the document, we can see these being sent to AzurePlugin like below :    

[    

Filter on     

Category - Crawler:Azure Plugin    

Correlation - fbda6832-4933-4b2b-8b02-42ec747ed373    

08/18/2015 20:13:05.21 mssearch.exe (0x1194) 0x0C88 SharePoint Server Search Crawler:Azure Plugin amn1g Verbose CAzurePIFilterSink::AddAttribute : attrName : GUID:#12, value : ChangeLogLink [azurepifiltersink.cxx:433] search\native\gather\plugins\azurepi\azurepifiltersink.cxx fbda6832-4933-4b2b-8b02-42ec747ed373     

08/18/2015 20:13:05.21 mssearch.exe (0x1194) 0x0C88 SharePoint Server Search Crawler:Azure Plugin amn1g Verbose CAzurePIFilterSink::AddAttribute : attrName : GUID:#12, value : sts4://sp13/siteurl=/siteid={1d26851d-2e5c-4b5f-a4f9-bc0f8b4107b6}/weburl=/webid={c5001554-4143-4b1d-8443-3eb98844dc89}/listid={e360f42c-b82c-49c7-b8e8-9338639c92b4}/folderurl=/itemid=2 [azurepifiltersink.cxx:433] fbda6832-4933-4b2b-8b02-42ec747ed373     

08/18/2015 20:13:05.21 mssearch.exe (0x1194) 0x0C88 SharePoint Server Search Crawler:Azure Plugin amn1g Verbose CAzurePIFilterSink::AddAttribute : attrName : GUID:#12, value : https://sp13/Shared Documents/sampledoc.txt  [ azurepifiltersink.cxx:433] fbda6832-4933-4b2b-8b02-42ec747ed373    

 What we see above is repeated for all the new documents/content identified and crawled under different correlation IDs. Once above details are gathered for a document, we see a SUBMIT to Azure like below. The first time this happens, we authenticate and grab a token with Azure. Example :     

08/18/2015 20:13:04.77 mssearch.exe (0x1194) 0x324C SharePoint Server Search Crawler:Azure Plugin amnzz VerboseEx AzureServiceProxy::SubmitDocuments: submitting document : 15, operation : 0, fullUri : https://searchserviceendpoint/v1.0/77133120-bdb0-42e1-815a-77e881f87db4/index/sp/sp/35c98eb9d863faeca5a7c5892fa522bc7a8bd8d4f9b1cd767f4c613035e31c71 , { "type" : "long", "value" : 2732042392} , "012357BD-1113-171D-1F25-292BB0B0B0B0:#324" : { "type" : "long", "value" : 4359478693111617600} , "noindex" : true , "oldnoindex" : true        

 08/18/2015 20:13:04.77 mssearch.exe (0x1194) 0x324C SharePoint Foundation Claims Authentication airze Verbose Current identity context: '{"elevated":"true","nameid":"."}'        

  

8/18/2015 20:13:04.77 mssearch.exe (0x1194) 0x324C SharePoint Foundation Application Authentication ahi1z Verbose Created OAuth2 bearer credentials: {"realm":"GUID","claims":{}}              

08/18/2015 20:13:04.77 mssearch.exe (0x1194) 0x1FE0 SharePoint Foundation Azure Access Control adnhy Verbose Discovery home realm: GUID'              

08/18/2015 20:13:04.77 mssearch.exe (0x1194) 0x1FE0 SharePoint Foundation Application Authentication age6i Verbose Issuing OAuth2 S2S token for identity GUID/searchserviceendpoint@GUID'. tokenType: 2        

Once all the documents are submitted, we perform Index callbacks for all the documents :    

08/18/2015 20:13:38.15 mssearch.exe (0x1194) 0x3298 SharePoint Server Search Crawler:Azure Plugin amn35 VerboseEx CAzurePlugin::StatusTask Starting, docs 1 [azurepiobj.cxx:931] search\native\gather\plugins\azurepi\azurepiobj.cxx        

08/18/2015 20:13:54.69 mssearch.exe (0x1194) 0x2460 SharePoint Server Search Crawler:Azure Plugin amnz7 VerboseEx AzureServiceProxy::GetDocumentsStatus: received Index callback for GUID/operation/ruFukw/XcbWQA/17592187534485/10    

There is a completion confirmation after the index callback for every DocID crawled:    

Detailed completion information for the DOCID we are testing (718) is in the below example:    

08/18/2015 20:13:54.69 mssearch.exe (0x1194) 0x2460 SharePoint Server Search Crawler:Azure Plugin amn4m Verbose CAzurePlugin::CompleteDocument docID 718 status 2 [azurepiobj.cxx:1189]        

08/18/2015 20:13:54.69 mssearch.exe (0x1194) 0x2460 SharePoint Server Search Crawler:Gatherer Service drb9 VerboseEx InProgress chk rel Rmv hr 0x0 Portal_Content, sts4://sp13/siteurl=/siteid={1d26851d-2e5c-4b5f-a4f9-bc0f8b4107b6}/weburl=/webid={c5001554-4143-4b1d-8443-3eb98844dc89}/listid={e360f42c-b82c-49c7-b8e8-9338639c92b4}/folderurl=/itemid=2 docid 718 type 2 flgs 0x132a sts 2 [4ba320ac-7a96-4027-9f4a-0901589cd6f4     

08/18/2015 20:13:54.69 mssearch.exe (0x1194) 0x2460 SharePoint Server Search Crawler:Gatherer Service kf7a VerboseEx Destroying CInProgressEntry sts4://sp13/siteurl=/siteid={GUID}/weburl=/webid={c5001554-4143-4b1d-8443-3eb98844dc89}/listid={e360f42c-b82c-49c7-b8e8-9338639c92b4}/folderurl=/itemid=2 docid 718 type 2 flgs 0x132a sts 2 [inprogressentry.hxx:51] 4ba320ac-7a96-4027-9f4a-0901589cd6f4     

08/18/2015 20:13:55.20 mssearch.exe (0x1194) 0x24B8 SharePoint Server Search Crawler:Gatherer Plugin e5g1 VerboseEx CGatherer::CommitTransaction succeeded CrawlID 129, DocID 718 [gatherobj.cxx:8826] search\native\gather\server\gatherobj.cxx 4ba320ac-7a96-4027-9f4a-0901589cd6f4     

  

8/18/2015 20:13:55.20 mssearch.exe (0x1194) 0x24B8 SharePoint Server Search Crawler:Gatherer Service e574 Verbose Destroying CTransaction docid 718 type 2 [gthrtrx.cxx:571] 

  

Managed property for hybrid search results    

There is a new property that has been defined for all content that has been crawled from on-premises, it’s named IsExternalContent. It’s a managed property and is set to 1 for content that is crawled on-premises.

clip_image002[10]

The property can be used to restrict a query for online/on-premises results, as a refiner or in a result source. Below are some examples

 

clip_image004

Example search verticals

Custom result source using Local SharePoint results plus a filter which excludes results from on-premises can be set up. TIP: Can be used during validation of hybrid search in the production tenant.

 

clip_image002

  

Result source query for SharePoint Online only    

{searchTerms} NOT(IsExternalContent:1)    

Result source query for as an example a filtered Support Forum    

{searchTerms} (    

Path:https://sp2010 OR

Path: OR    

Path:https://demohybrid.../../supportforum)    

For example, you can use the default result source using Local SharePoint results but can rename to "Everything" in the Search Navigation configuration. It uses Local SharePoint results plus a filter on which sites to include in the search results.     

 

People crawl scenario  

This managed property becomes very important when considering the people crawl scenario. By default, all people in the SharePoint Online User Profile application will be indexed by the Office 365 Search Service. By default, all people in the SharePoint Online User Profile application will be indexed in the Office 365 index. If you additionally crawl people from the cloud SSA, you generate a duplicate set of people content in the Office 365 index. This will be confusing to end users as searching for a person will return multiple results.    

There are two ways to approach this problem today.     

§ Make the Office 365 User Profile service the primary source of user information and let Office 365 search take care of the indexing and presentation. With this approach you do not need to crawl people on-premises. 

  § Crawl the on-premises people profile store in addition to Office 365 crawling the tenant profile store. This will result in the described scenario of duplicate search results for each person, however you can use query transformation to decide which results you want to display. Even providing the ability for end users to choose between the different result sources at query time.      
     

Businesses who have a richly populated on-premises profile store, perhaps with additional augmentation from line of business applications, may want to maintain their primary source of people information as this store. In order to avoid duplicate search results and to focus the results on the primary store, query transformation must be implemented.

To utilise the on-premises profile store as the primary people search source you should follow these step 

 

1. Create a new result source or copy the existing people results source

2.     Edit the new result source and modify the Query Transformation box to include the Managed Property IsExternalContent as below.
{?{searchTerms} ContentClass=urn:content-class:SPSPeople IsExternalcontent:1}

clip_image002

 

3. Then you can create a new search results page and configure the Core Search Results web part to consume this new search result source.

 

4. Complete the implementation by adding the new page to the search navigation settings. This will add the new page as a search vertical within the search center, as per the screenshot below.

 

clip_image004

 

To utilize the Office 365 profile store as the primary people search source you should follow the same steps but using a slightly different query transformation at step two, as follows.

{?{searchTerms} ContentClass=urn:content-class:SPSPeople NOT IsExternalcontent:1}     

Note: The difference in the two transform is the insertion of NOT before the managed property to force the exclusion of External content ie Non Office 365 People Results.    

For people results transformation you can make a copy of the inbuilt people result source and modify it to include the IsExternalContent managed property restriction for filtering against online or on-premises people sources.    

 

For on-premises people

{?{searchTerms} ContentClass=urn:content-class:SPSPeople IsExternalcontent:1}     

 

For online people

{?{searchTerms} ContentClass=urn:content-class:SPSPeople NOT IsExternalcontent:1}     

 

Publishing service applications between SharePoint Farms     

As with most service applications the cloud SSA supports cross farm publishing. This capability allows us to extend the cloud hybrid search capability to SharePoint 2010 farms as well. Following the guidelines for publishing service application you should now be able to publish Cloud SSA and consume it from a SharePoint 2010 farm. This can be achieved following TechNet guidelines here.

If SharePoint 2010 is configured to consume a published cloud SSA, then the end users would be able to leverage the default scopes like “This site” and All Sites in SharePoint 2010 to query against the cloud index. This enables you to get the same cloud hybrid search experience in SharePoint 2010 that we see in SharePoint 2013.     

 

Delve Experience     

The cloud hybrid search solution also influences what users can search for and see in Delve. In the preview version of cloud hybrid search, Delve will allow users to see search results from on-premises as well as Office 365. Gestures such as adding an item to a board will work just the same for on-premises content as for Office 365 content.

When the GA version of cloud hybrid search is released on-premises content from SharePoint will also show up as activity on the Me page or a person page in Delve.     

Content from on-premises will not show up on the Delve Home Page because the views are not sent from the Cloud Search service application to Office 365.     

POST BY Manas Biswas [MSFT] & Neil Hodgkinson [MSFT]