Moving Path Based to Host Named Site Collections

This post illustrates a problem with detaching content databases that contain site collections restored from path-based site collections to host named site collections.

Background

The recommendation for SharePoint 2013 is to use a single web application and leverage host named site collections.  In a previous post, I wrote about What Every SharePoint Admin Needs to Know About Host Named Site Collections.  In that post, I showed one approach for moving an existing path-based site collection to a host named site collection.  This is invaluable if you have too many web applications in your farm and need to consolidate the site collections while preserving URLs.  It’s also invaluable for improving the health of your farm: I have seen multiple farms whose performance issues were resolved by consolidating web applications into host named site collections.

As a reminder, I provided the following sample script:

Backup-SPSite https://server_name/sites/site_name -Path C:\Backup\site_name.bak
Remove-SPSite -Identity https://server_name/sites/site_name -Confirm:$False
Restore-SPSite https://www.example.com -Path C:\Backup\site_name.bak -HostHeaderWebApplication https://server_name

This works, and the site collection is restored successfully with the new host header.  However, there are some additional considerations you’ll want to be aware of.
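One way to confirm the restore did what you expected is to inspect the restored site collection.  The following is a minimal sketch, assuming the placeholder URL https://www.example.com from the sample script and a session where the SharePoint snap-in can be loaded:

```powershell
# Load the SharePoint cmdlets if this is a plain PowerShell session.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# https://www.example.com is the placeholder URL from the sample script.
$site = Get-SPSite -Identity https://www.example.com

# HostHeaderIsSiteName is $true for host named site collections.
$site | Select-Object Url, HostHeaderIsSiteName,
    @{ Name = 'ContentDatabase'; Expression = { $_.ContentDatabase.Name } }

$site.Dispose()
```

If HostHeaderIsSiteName comes back as False, the site was restored as a path-based site collection rather than a host named one.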

Existing Web Application With the Same URL

The first problem is that the site collection may be at the root of a web application with the same URL that you are trying to move to a host named site collection.  For example, I have a web application, Intranet.Contoso.lab, that contains a single root site collection that is path-based.  I want to move this to a host named site collection, but that URL is already in use.  The fix is to delete the web application first.  Don’t worry, you have the option of preserving the content database just in case something goes wrong, in which case you could create a new web application using the existing content database and you’ll be back to where you started.  Here is a function that you can use to move your path-based site collection to a host-named site collection and optionally delete the existing web application while preserving the original content database.

 

  
function Move-PathBasedToHNSC(
    [string]$siteUrl,
    [string]$backupFilePath,
    [string]$pathBasedWebApplication,
    [bool]$deletePathBasedWebApplication,
    [string]$hostHeaderUrl,
    [string]$hostHeaderWebApplication,
    [string]$contentDatabase)
{
    #Back up the path-based site collection before anything is removed
    Backup-SPSite $siteUrl -Path $backupFilePath

    if($deletePathBasedWebApplication)
    {
        #If the HNSC uses the same URL as an existing web application,
        #the web application must be removed (its content database is kept)
        Remove-SPWebApplication $pathBasedWebApplication -RemoveContentDatabases:$false -DeleteIISSite:$true
    }
    else
    {
        #Not removing the web application, so just remove the site collection
        Remove-SPSite -Identity $siteUrl -Confirm:$false
    }

    #Restore the backup as a host named site collection.
    #The backtick continues the command onto the next line.
    Restore-SPSite $hostHeaderUrl -Path $backupFilePath `
        -HostHeaderWebApplication $hostHeaderWebApplication -ContentDatabase $contentDatabase
}

Move-PathBasedToHNSC -siteUrl https://HNSCMoveTest2.Contoso.lab `
    -backupFilePath "C:\Backup\HNSCMoveTest2.bak" `
    -pathBasedWebApplication https://HNSCMoveTest2.contoso.lab `
    -deletePathBasedWebApplication $true `
    -hostHeaderUrl https://HNSCMoveTest2.contoso.lab `
    -hostHeaderWebApplication https://spdev `
    -ContentDatabase WSS_Content_HNSC
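Because the function removes the site collection or the web application right after the backup step, you may want to confirm that the backup file really exists before anything destructive happens.  Here is a minimal sketch of such a guard.  Assert-BackupExists is a hypothetical helper name, not part of SharePoint, and it only checks that the file is present and non-empty; it does not validate the backup contents.

```powershell
# Hypothetical helper: stops the script if the backup file is missing or empty.
function Assert-BackupExists {
    param(
        [Parameter(Mandatory = $true)]
        [string]$BackupFilePath
    )

    if (-not (Test-Path -LiteralPath $BackupFilePath -PathType Leaf)) {
        throw "Backup file not found: $BackupFilePath"
    }

    $file = Get-Item -LiteralPath $BackupFilePath
    if ($file.Length -eq 0) {
        throw "Backup file is empty: $BackupFilePath"
    }

    # Return the file so the caller can log its size or timestamp.
    return $file
}
```

One option is to call Assert-BackupExists $backupFilePath immediately after Backup-SPSite inside the function, so that a failed backup throws before Remove-SPSite or Remove-SPWebApplication runs.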

Before I run the script, here’s what the list of web applications looks like:

[Screenshot: the list of web applications before running the script]

After running the script, the web application is gone, and I now see the host named site collection in the new web application and in the content database that I specified.

[Screenshot: the host named site collection in the new web application and content database]

As the administrator, I’m happy because there’s one less web application to maintain and, likely, the performance of my farm will increase a bit. 

Detaching (Dismounting) the Content Database

Here’s where the weird things start happening.  You can detach a content database so that it’s not serving any content, but the database is still in SQL Server.  You might do this for a number of reasons, such as upgrades.  Let’s try detaching the content database using PowerShell:

 Dismount-SPContentDatabase WSS_Content_HNSC

Now we want to attach it again.

 Mount-SPContentDatabase "WSS_Content_HNSC" -WebApplication https://spdev

Go back to the browser and hit refresh, and after some time the host-named site collection will render correctly.  However, we have a few problems.  First, go look at the site collections again in Central Administration.  You might see that the site collection is gone!  We run some PowerShell to see what’s up:

PS C:\> get-spwebapplication https://spdev | get-spsite -limit all

Url
---
https://spdev

Huh?  Where’d my site collection go?  If we look in the content database, we can see the site is still there.  However, SharePoint doesn’t seem to know about it.  I tried Get-SPSite, even stsadm -o EnumSites, and the site isn’t showing anywhere.  Thanks to my colleague, Joe Rodgers, for showing me the fix.

 $db = Get-SPContentDatabase WSS_Content_HNSC
$db.RefreshSitesInConfigurationDatabase()

This refreshes the sites in the site map in the configuration database, at which point the site collection appears again in PowerShell and in the UI.

[Screenshot: the site collection listed again in the UI]

 PS C:\> get-spwebapplication https://spdev | get-spsite -limit all

Url                                                    
---                                                    
https://spdev                                           
https://hnscmovetest.contoso.lab                        
https://hnscmovetest2.contoso.lab                       
https://hnscmovetest3.contoso.lab 
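To avoid forgetting the refresh, the mount and the refresh can be wrapped together.  This is a minimal sketch rather than a tested procedure; Mount-ContentDatabaseAndRefresh is a hypothetical function name, and it relies on Mount-SPContentDatabase returning the mounted content database object.

```powershell
# Hypothetical wrapper: mounts a content database, then refreshes the site map.
function Mount-ContentDatabaseAndRefresh {
    param(
        [Parameter(Mandatory = $true)]
        [string]$DatabaseName,

        [Parameter(Mandatory = $true)]
        [string]$WebApplicationUrl
    )

    # Attach the database to the web application.
    $db = Mount-SPContentDatabase -Name $DatabaseName -WebApplication $WebApplicationUrl

    # Re-sync the site map in the configuration database.
    $db.RefreshSitesInConfigurationDatabase()

    # List the site collections now visible in this database.
    Get-SPSite -ContentDatabase $db -Limit All | Select-Object Url
}

Mount-ContentDatabaseAndRefresh -DatabaseName "WSS_Content_HNSC" -WebApplicationUrl https://spdev
```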

If you are upgrading and have used this technique to move path-based to host-named site collections, I would definitely recommend keeping this in mind.  Note that this behavior does not seem to happen when you create a new host-named site collection or a new path-based site collection.  It only seems to happen when you move an existing path-based site collection to become a host-named site collection.  I also only tested this in SharePoint 2010.
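If you have several content databases attached after an upgrade, the same refresh can be applied to each of them.  The following is a sketch, assuming the https://spdev web application used in this post; I have not verified whether the refresh is needed on versions other than the one tested here.

```powershell
# Load the SharePoint cmdlets if this is a plain PowerShell session.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Web application used in this post; change it for your farm.
$webApplicationUrl = 'https://spdev'

# Refresh the site map for every content database in the web application.
foreach ($db in Get-SPContentDatabase -WebApplication $webApplicationUrl) {
    Write-Host "Refreshing site map for $($db.Name)"
    $db.RefreshSitesInConfigurationDatabase()
}

# List the site collections that SharePoint now reports.
Get-SPWebApplication $webApplicationUrl | Get-SPSite -Limit All | Select-Object Url
```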

Summary

SharePoint scales by having many site collections instead of many web applications, and host named site collections are a fantastic way to get there without changing URLs.  Honestly, the bit about detaching and attaching the content database and losing the information in the site map seems like unintended behavior to me.  I haven’t tried this in SharePoint 2013 to see if the problem reproduces there.  I’d be interested to see if anyone reproduces this in an SP2013 environment.  If so, leave comments!