Patching SharePoint with No Downtime using SQL Server AlwaysOn

As highlighted in a previous post, it is perfectly possible to patch SharePoint without suffering any downtime at all if you have two SharePoint farms to play with. This is great if you’re worried about the update process either taking too long for a maintenance window or not completing at all on the first attempt.

Edit: here's some important info about patching farms with synchronised service-apps too - https://blogs.msdn.microsoft.com/sambetts/2016/03/23/patching-sharepoint-dr-farms-with-replicated-service-applications-with-sql-server-alwayson/

Previously I showed how this is done with SQL Server log-shipping to synchronise changes between a primary & Disaster Recovery (DR) SharePoint farm, but times move on: we now have SQL Server AlwaysOn for SharePoint content-database syncing, which is overall a much better solution than log-shipping.

So this post is basically the same thing, just with SQL Server AlwaysOn instead of log-shipping. Here’s a picture of what we’ve got set up:

[Image: the primary and DR SharePoint farm setup]

Both farms have their own configuration databases, service-application databases, etc., but both share the content databases via SQL Server AlwaysOn (preferably with synchronous commit). More about synchronising content-databases with AlwaysOn here. AlwaysOn is superior to log-shipping mainly in that switching the primary node is trivial: updates flow in either direction depending on which SQL instance is currently the primary replica, something that’s basically impossible to do with log-shipping.
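To see which instance currently holds the primary role, and whether the replicas are set to synchronous commit, you can query the AlwaysOn views from PowerShell. This is a minimal sketch rather than a definitive check: it assumes the SqlServer PowerShell module is installed (for Invoke-Sqlcmd), and the instance name SQL-PRIMARY is a made-up placeholder.

```powershell
# Assumption: 'SQL-PRIMARY' is a hypothetical instance name; replace with your own.
# Requires the SqlServer module for Invoke-Sqlcmd.
Import-Module SqlServer

# Run this against the current primary: a secondary only reports on its own replica.
$query = @"
SELECT ar.replica_server_name,
       ars.role_desc,
       ar.availability_mode_desc,
       ars.synchronization_health_desc
FROM sys.dm_hadr_availability_replica_states AS ars
JOIN sys.availability_replicas AS ar
  ON ars.replica_id = ar.replica_id;
"@

Invoke-Sqlcmd -ServerInstance 'SQL-PRIMARY' -Query $query | Format-Table -AutoSize
```

The role_desc column shows PRIMARY or SECONDARY per replica, and availability_mode_desc shows whether each replica uses synchronous or asynchronous commit.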

There are two options for this “no-downtime-while-patching” process: “safe & read-only” during the upgrade, or “less-safe but full read/write” functionality for SharePoint users throughout. The riskier route is possible only if you’re sure that your new build of SharePoint can run your un-upgraded content databases.

What does that mean? Read on…

SharePoint Database/Binary Compatibility Ranges

To explain why there are two options you need to understand SharePoint compatibility ranges for databases. In short, each SharePoint build has a compatibility range for every type of database the binaries will interact with: an optimal (supported) version, and a minimum version. Anything in that range will work (with caveats if the version isn’t optimal); anything outside that range SharePoint won’t touch, and you’ll likely just see this message in Central Administration:

[Image: Central Administration message shown for a database outside the compatibility range]

So key to being sure you can express-upgrade SharePoint is knowing that your new build will actually work temporarily with old databases. It’s very rare SharePoint won’t touch old databases but you won’t know until you try – the responsibility for checking this is entirely yours.

If you want a guarantee that the compatibility mode will work you’ll just have to dry-run the upgrade on a test environment first to be sure.
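On the test environment, a few standard SharePoint cmdlets help show where the farm and its content databases stand after patching. The sketch below is only a starting point and does not prove the compatibility range by itself; the database name WSS_Content and the URL http://sharepoint are hypothetical placeholders.

```powershell
# Assumption: run from a SharePoint server in the test farm.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Build number the farm is currently running
(Get-SPFarm).BuildVersion

# Which content databases SharePoint considers behind the current build,
# and whether they are currently read-only
Get-SPContentDatabase | Select-Object Name, NeedsUpgrade, IsReadOnly

# Report any issues SharePoint detects for one database against a web application.
# 'WSS_Content' and 'http://sharepoint' are hypothetical names.
Test-SPContentDatabase -Name 'WSS_Content' -WebApplication 'http://sharepoint'
```

A database with NeedsUpgrade set to True that still serves its sites correctly is running in the compatibility range described above.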

Compatibility Range Confirmed: Perform Read/Write Upgrade

The best thing about this method is it doesn’t limit users, hence it’s the preferred way to upgrade.

Short version: upgrade DR farm 1st (with content DBs disconnected) while users still use the primary farm. Then reconnect content DBs, switch users + content DBs at the same time to DR farm once the DR farm is upgraded. Once everyone’s on the DR farm, upgrade primary farm. Finally upgrade content-databases in PowerShell.

Long version:

  1. Leave users going to primary site and meanwhile patch disaster-recovery/secondary site.
    • Important: detach the content databases from the DR farm first, as PSConfig will fail to finish the upgrade while these databases are still connected to the farm & in read-only mode (as the DR site will still be the secondary replica in SQL Server).
    • PSConfig will update all databases except the content databases.
    • Re-attach the content databases, and maybe even run an incremental crawl if you want your indexes to include updates since the content DBs went offline.
    • Verify SharePoint is happily working again on the DR site – check logs, site-access etc.
  2. Switch users to the upgraded DR site + fail over SQL Server so the secondary node becomes the new primary. SharePoint on the DR site will use the content-databases in compatibility mode but with read/write access.
    • There will be a service interruption while this simultaneous database + user switchover happens. You might want to make the switch at night, for example.
  3. Patch primary farm, now that users are on the DR site.
    • Once the normal farm is verified as healthy again, failover users there again if you so wish.
    • Both farms are now upgraded with content-databases in compatibility mode.
  4. On the farm with read/write access to the content-databases, finish the upgrade with Upgrade-SPContentDatabase. This may cause some read-only access while the upgrade is happening but it’ll be much less read-only time than the safer method below.
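Step 1 on the DR farm (detach, patch, re-attach) could be scripted roughly as follows. Treat it as a sketch under assumptions: the web application URL and the SQL server name are hypothetical, only one web application is handled, and all three parts are expected to run in the same PowerShell session so the saved database names survive.

```powershell
# Assumption: run on a server in the DR farm.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Hypothetical names: replace with your web application URL and the SQL
# server name (or AlwaysOn listener) that the DR farm uses.
$webApp   = 'http://sharepoint'
$dbServer = 'SQL-DR'

# Step 1a: remember the content database names, then detach them from the DR farm
$dbNames = Get-SPContentDatabase -WebApplication $webApp | ForEach-Object { $_.Name }
$dbNames | ForEach-Object { Dismount-SPContentDatabase -Identity $_ -Confirm:$false }

# Step 1b: install the update binaries, then run PSConfig on each server, e.g.
#   PSConfig.exe -cmd upgrade -inplace b2b -wait -cmd applicationcontent -install -cmd installfeatures -cmd secureresources

# Step 1c: re-attach the content databases once PSConfig has finished
$dbNames | ForEach-Object {
    Mount-SPContentDatabase -Name $_ -DatabaseServer $dbServer -WebApplication $webApp
}
```

Saving the names before dismounting matters: once the databases are detached, the farm no longer lists them.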

This is the preferred way: read/write functionality still works for users almost without interruption. Users keep full functionality for far longer than was previously possible, because AlwaysOn lets us switch the primary replica back & forth trivially.
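The SQL failover in step 2 and the final content-database upgrade in step 4 might look something like this. The two parts run at different points in the process, and the instance name SQL-DR and availability group name SP-Content-AG are invented for illustration.

```powershell
# Assumptions: SqlServer module installed; hypothetical instance and group names.
Import-Module SqlServer
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$drInstance = 'SQL-DR'
$agName     = 'SP-Content-AG'

# Step 2: run the failover on the secondary that is to become the new primary.
# A plain FAILOVER (no data loss) needs the target replica to be synchronised.
Invoke-Sqlcmd -ServerInstance $drInstance -Query "ALTER AVAILABILITY GROUP [$($agName)] FAILOVER;"

# Step 4 (later, once both farms are patched): on the farm with read/write
# access, upgrade every content database that still needs it.
Get-SPContentDatabase | Where-Object { $_.NeedsUpgrade } | ForEach-Object {
    Upgrade-SPContentDatabase -Identity $_ -Confirm:$false
}
```

Failing back to the original primary is the same ALTER AVAILABILITY GROUP statement, run against the other instance.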

Let’s go Safe: Perform Read-Only Upgrade

This method guarantees binaries will only work with the optimal content database versions by basically locking out users from changing anything until the update is done fully on at least one farm. Not ideal as users won’t be fully able to work, but it does make the upgrade slightly more controlled.

Short version: make web applications read-only on primary farm while content DBs + DR farm upgrade 1st. Switch users to DR farm once it’s fully upgraded, and then upgrade primary site.

Long version:

  1. Fail over the SQL databases only to the DR site. This’ll put the primary site into read-only mode so users won’t be able to make any changes.
    • We still want users going to the primary site however just as before, but now they can’t make changes.
  2. Suspend data movement in AlwaysOn (picture below) so the upgraded content-DB schema changes don’t replicate to primary (old build) farm.
    • If the old-version farm (currently the primary site) received the new-version content databases, it would break for sure until the primary binaries are upgraded, which is why suspending data movement is very important if we care about uptime.
  3. Patch DR farm while users are still using the primary farm.
    • This will upgrade SharePoint and content databases as on the DR site the content DBs are in read/write mode, as this is the AlwaysOn primary node (from step 1).
    • Because we suspended data movement in AlwaysOn, the upgraded content databases won’t make it to the primary, still-old-build site.
  4. Switch users to use the now fully upgraded DR farm. Read/write functionality is restored and content databases aren’t in compatibility mode – they’re fully upgraded too.
  5. Important: disconnect any (read-only) content-databases from the primary farm.
    • Without doing this, when SharePoint upgrades the databases, it’ll cause a critical exception saying it was unable to upgrade the database as it’s only a read-only replica.
    • You might be able to skip this if you do step #7 before upgrading the primary farm. In my testing I've always disconnected the content-DBs 1st at least.
  6. Upgrade the primary farm now users are on the upgraded DR farm.
    • To reiterate, any read-only content databases will cause the upgrade to fail.
    • Once done successfully, both farms are now upgraded to the new SharePoint build. Hurrah!
  7. Resume data-synchronisation in SQL Server AlwaysOn.
    • The also-upgraded content databases will replicate to the primary SQL instance. With asynchronous commits, it’s difficult to know when it’s finished, which is why I’d recommend synchronous normally.
  8. Reattach content databases to primary farm now the farm is fully updated.

[Image: suspending data movement in AlwaysOn]
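Suspending and resuming data movement (steps 2 and 7) can also be done in T-SQL rather than through the management UI. A rough sketch, assuming the SqlServer module is installed; the instance name SQL-DR and database name WSS_Content are placeholders.

```powershell
# Assumptions: SqlServer module installed; hypothetical instance and database names.
Import-Module SqlServer

$drInstance = 'SQL-DR'        # the AlwaysOn primary after the failover in step 1
$database   = 'WSS_Content'

# Step 2: suspend data movement for the content database. Run on the primary,
# this stops changes flowing to the secondary replicas.
Invoke-Sqlcmd -ServerInstance $drInstance -Query "ALTER DATABASE [$($database)] SET HADR SUSPEND;"

# Step 7 (after both farms are patched): resume data movement
Invoke-Sqlcmd -ServerInstance $drInstance -Query "ALTER DATABASE [$($database)] SET HADR RESUME;"

# Check the suspended flag and how far the replicas have caught up
$stateQuery = @"
SELECT DB_NAME(database_id) AS database_name,
       is_local,
       is_suspended,
       synchronization_state_desc,
       log_send_queue_size
FROM sys.dm_hadr_database_replica_states;
"@
Invoke-Sqlcmd -ServerInstance $drInstance -Query $stateQuery | Format-Table -AutoSize
```

With asynchronous commit, a log_send_queue_size that has dropped back towards zero is a reasonable hint that the upgraded databases have reached the other replica.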

Both farms are updated and you can switch users & fail over AlwaysOn between them without having to worry about build incompatibilities. Good work.

Wrap-Up

With SQL Server AlwaysOn we get much easier primary/secondary node switching than we ever did with log-shipping. This helps a lot with patching; it gives us a chance to allow read/write functionality for much longer for our users while SharePoint is being patched-up. This is on top of the existing benefit that having a secondary SharePoint farm has during patching – the update process is much less stressful as we have a fall-back option.

Because of the complexity of SharePoint updates, and how trivial it now is to set up content-database syncing, I highly recommend having another SharePoint farm, for production sites especially. You have the technology to keep SharePoint online always, even during potentially epic maintenance windows; now go use it :)

 

Cheers,

Sam Betts