SSIS with AlwaysOn


In SQL Server 2012, AlwaysOn Availability Groups maximizes the availability of a set of user databases for an enterprise. In addition, SSIS introduces new capabilities which allow customers with the ability to easily deploy to a centralized SSIS Catalog (i.e. SSISDB user database).

In this article, we explore how you can make use of SSIS and AlwaysOn Availability Groups. Using these capabilities, it allows you to provide high-availability of SSISDB and its contents (projects, packages, execution logs, etc). We also discuss how to deal with patching with service packs.

Important Note: Enabling AlwaysOn with SSISDB does create some complications when you install a cumulative update or service pack which modifies the SSISDB schema. SSISDB patches are installed when the SQL Server instance first starts up, but putting a database into an availability group causes it to be in an OFFLINE state on startup, which will cause the patching process to fail. To address this, you MUST first remove SSISDB from the availability databases, then patch each node, then recreate the availability group.

If the above warning hasn’t scared you away, keep reading!

SSISDB in Availability Groups

To use SSIS with AlwaysOn, you’ll need to add the SSIS Catalog (SSISDB) into an Availability Group. You’ll need to do the following steps:

  • Make sure you meet the prerequisites for using AlwaysOn
  • Connect to every node and create the SSISDB catalog. We need to create the catalog even on secondary nodes to create the other server-level objects (cleanup jobs, keys, accounts etc) that are used by SSIS.
  • Delete the SSISDB databases on secondary nodes.
  • Create an availability group, specifying SSISDB as the user database
  • Specify secondary replicas.

    Failover

    The primary reason to use SSIS with AlwaysOn is to leverage the failover capabilities. When a failover occurs, one of your secondary nodes automatically becomes the new primary node.

    Unfortunately, when this happens with SSIS, you will get an error (15581) when the system tries to run an package on the new primary node:

    “Please create a master key in the database or open the master key in the session before performing this operation”.

    This occurs because of the way SSISDB encrypts sensitive information (such as connection string passwords, and certain parameters. To resolve this, the database master key needs to be re-encrypted by the service master key after failover occurs. To resolve it, you’d need to re-encrypt the database master key by running this T-SQL script:

    USE SSISDB       
    OPEN master Key decryption by password = 'x' -- Password used when creating SSISDB
    ALTER Master Key ADD encryption by Service Master Key

    Once the key has been re-encrypted, the new primary node will be able to run SSIS packages.

    Auto-Detecting Failover

    Having to manually run a script every time failover occurs isn’t ideal. Unfortunately, AlwaysOn doesn’t expose any events we can plug into. To work around this, you can setup a SQL Agent job that polls the replica status, and detects when its node has become the new primary.

    Step #1: We create a table containing with a single row in a database other than SSISDB (say, X):

    USE X;
    CREATE TABLE [dbo].[replica_role](
    [replica_role] [tinyint] NULL
    )
    INSERT INTO [dbo].[replica_role] ([replica_role]) (
    SELECT [role]
    FROM sys.dm_hadr_availability_replica_states
    WHERE is_local = 1
    );

    Step #2: On every node, create an agent job (ex, “ssisdb_failover”) with a recurring step that runs below T-SQL script every certain amount of time (say, every 2 minutes):

    USE X;

    DECLARE @last_role TINYINT;
    SET @last_role = (
    SELECT TOP 1 [replica_role]
    FROM [dbo].[replica_role]
    );

    DECLARE @current_role TINYINT;
    SET @current_role = (
    SELECT ROLE
    FROM sys.dm_hadr_availability_replica_states
    WHERE is_local = 1
    );

    IF (@last_role = 2 AND @current_role = 1) -- Last time it was secondary,
    -- currently it is primary: need re-encrypt
    -- the database master key
    BEGIN
    USE SSISDB;
    OPEN MASTER KEY DECRYPTION BY PASSWORD = 'x' -- Should be encrypted! See below
    ALTER MASTER KEY ADD ENCRYPTION BY SERVICE MASTER KEY
    END

    USE X;
    UPDATE dbo.[replica_role] SET [replica_role] = @current_role;

     

    Note, in the above script, the password for the database master key is in plain text. As a security best practice, you can store it in a table using a simple symmetric key encryption, and retrieve it later. For example:

    USE X;
    -- Add a column in which to store the encrypted data.
    ALTER TABLE dbo.replica_role
    ADD ssisdb_pwd VARBINARY(128);
    GO

    --If there is no master key for X, create one now.
    IF NOT EXISTS
    (SELECT * FROM sys.symmetric_keys WHERE symmetric_key_id = 101)
    CREATE MASTER KEY ENCRYPTION BY
    PASSWORD = 'xxx' -- Your master key password
    GO

    CREATE CERTIFICATE ssisdb_cert
    WITH SUBJECT = 'SSISDB Password';
    GO

    CREATE SYMMETRIC KEY ssisdb_key
    WITH ALGORITHM = AES_256
    ENCRYPTION BY CERTIFICATE ssisdb_cert;
    GO

    -- Open the symmetric key with which to encrypt the data.
    OPEN SYMMETRIC KEY ssisdb_key
    DECRYPTION BY CERTIFICATE ssisdb_cert;

    -- Encrypt the value in the column using the symmetric key.
    UPDATE dbo.replica_role
    SET ssisdb_pwd = EncryptByKey(Key_GUID('ssisdb_key'), N'x');
    GO

    To retrieve the password, replace the line “OPEN MASTER KEY DECRYPTION BY PASSWORD = ‘x’” in the agent job script with below lines:

    USE X;
    OPEN SYMMETRIC KEY ssisdb_key DECRYPTION BY CERTIFICATE ssisdb_cert;
    DECLARE @pwd NVARCHAR(MAX);
    SET @pwd = (SELECT TOP 1 CONVERT(NVARCHAR,DecryptByKey(ssisdb_pwd)) FROM dbo.replica_role);
    DECLARE @sqlString NVARCHAR(1024);
    SET @sqlString = 'OPEN MASTER KEY DECRYPTION BY PASSWORD = ''' + @pwd +'''';
    USE SSISDB;
    EXECUTE sp_executesql @sqlString;



    Patching with Service Packs

    SQL Server service packs usually run in single-user mode, while an availability database must be a multi-user database. Therefore, during installing a service pack, you may find all availability databases including SSISDB will be taken OFFLINE and thus fail to patch. To address this, you can first remove SSISDB from the availability databases, then patch each node (Failover Cluster Instances have to be in the same patch level), then add SSISDB back.

     

    Other Considerations

    When an unplanned failover occurs, it is important that the status of SSIS package executions reflect the most up-to-date state. You can restart the instance so that a default SSIS cleanup job will be executed to fix the status of any packages/ deployments that were running at the time the instance went down. The status will be changed to Terminated; temporary data created during project deployment will be removed.

    Note that your SSIS packages will still need to have their own failover logic in them. This article provides a good starting point for creating highly available packages.


    This post was originally written by Ke Yang and posted back in June – we took the article down when we found the issue with patching SSISDB databases while they are in an availability group. We’ve since updated and re-posted the information!

  • Comments (21)

    1. Jovialn says:

      Hello Matt, thanks for the informative blog.

      Have a query around this,

      We are setting up column level encryption on a databse server with always on. For this also we are using symmetric key.

      We are facing the same error in case of a fail over – "Please create a master key in the database or open the master key in the session before performing this operation"

      Does that mean, the only solution for us is to re-encrypt the service master key after failover occurs. And there is no automated way to do this other than the Job method you described ?

    2. mmasson says:

      Hi Amit,

      You might get look for a more official response from the AlwaysOn team (blogs.msdn.com/…/sqlalwayson), but yes, I don’t believe there are any hooks you can use to detect when failover occurs. At the moment, the agent script we provided here is one way to do this.

    3. Stanislav Novoseletskiy - MSFT says:

      Hey Matt, great article as always !!

    4. Yong says:

      Hi Matt

      Thank you for great article. The following statement caught my attention.

      "You can restart the instance so that a default SSIS cleanup job will be executed to fix the status of any packages/ deployments that were running at the time the instance went down."

      It sounds like that I'd better run SSIS cleanup job after the failover.

      1. Should I consider combining a task to run SSIS cleanup job along with "re-encrypt the database master key" script?

      2. Is "[catalog].[startup]" SSIS cleanup job?

    5. Peter Phipps says:

      Hi Matt, I like what you are trying to do here but struck an issue. I implemented your "secure" version such that the job has the code below.

      IF (@last_role = 2 AND @current_role = 1) — Last time it was secondary,

                                               — currently it is primary: need re-encrypt

                                               — the database master key

      BEGIN

      USE [Infastructure];

      OPEN SYMMETRIC KEY ssisdb_key DECRYPTION BY CERTIFICATE ssisdb_cert;

      DECLARE @pwd NVARCHAR(MAX);

      SET @pwd = (SELECT TOP 1 CONVERT(NVARCHAR,DecryptByKey(ssisdb_pwd)) FROM dbo.replica_role);

      DECLARE @sqlString NVARCHAR(1024);

      SET @sqlString = 'OPEN MASTER KEY DECRYPTION BY PASSWORD = ''' + @pwd +'''';

      USE SSISDB;

      EXECUTE sp_executesql @sqlString;

      ALTER MASTER KEY ADD ENCRYPTION BY SERVICE MASTER KEY

      END

      However, The code will not execute when SSISDB is the secondary as SQL tries to parse (create an execution plan for) the code before execution and therefore fails on the command "USE SSISDB;" with the error …

      Msg 976, Level 14, State 1, Line 38

      The target database, 'SSISDB', is participating in an availability group and is currently not accessible for queries. Either data movement is suspended or the availability replica is not enabled for read access. To allow read-only access to this and other databases in the availability group, enable read access to one or more secondary availability replicas in the group.  For more information, see the ALTER AVAILABILITY GROUP statement in SQL Server Books Online.

      To work around this I changed the code to …

      SET @sqlString = 'USE SSISDB; OPEN MASTER KEY DECRYPTION BY PASSWORD = ''' + @pwd +'''; ALTER MASTER KEY ADD ENCRYPTION BY SERVICE MASTER KEY ';

      –USE SSISDB;

      EXECUTE sp_executesql @sqlString;

         –ALTER MASTER KEY ADD ENCRYPTION BY SERVICE MASTER KEY

    6. Todd Jewell says:

      If you backup the service master key from your primary server and then restore it to the secondary server before adding an encrypted database to the AG, you don't have the problem with having to run the  open master decrypt routines on failover.

    7. Ola Bello says:

      @Peter Phipps,

      Thanks for the modified script, had similar issues and the modification worked for me.

    8. Anonymous says:

      Your database X in "We create a table containing with a single row in a database other than SSISDB (say, X)", it must *NOT* in the Availability Group, and each secondary node MUST also have this Database X correct? Otherwise, when you change the value replica_role in the primary node, all the secondary will get the same value, which defeat your purpose. Let me know if that was the case. Thanks

    9. Anonymous says:

      Ron, I believe your right. That's the way I'm implementing this. Thanks for a great article.

    10. Anonymous says:

      Excellent post Matt!

      Did anyone tried the option Todd mentioned?

    11. Anonymous says:

      Hi,

      I just tried the option Todd mentioned and it works perfectly fine. It saves a lot of work!! Thanks Matt for writing an excellent article and Todd for an out of box solution.

      Thanks,

    12. Anonymous says:

      Can someone enlighten me as to what Todd Jewell suggested?

      "restore it to the secondary server before adding an encrypted database to the AG" – so what is the sequence?

      On the Secondary Replica, I would:

      1 – Create a blank New SSIS Catalog

      2 – Then DROP the actual SSISDB from Secondary Replica

      3 – Go back to Primary Replica then add the SSISDB there to AG and restore it to Secondary Replica.

      What am I missing?

      Thank you!

    13. Anonymous says:

      Hello Matt,

      This article is Excellent and helps a lot for me on SQL Server 2012.

      Same steps i have to do on SQL Server 2014 as well.

      This doesn't fixed in SQL Server 2014.

      Thanks,

    14. Anonymous says:

      I don't understand Todd Jewell's post. Todd, would you please provide more detail?

      Thanks!

      Chris

    15. Anonymous says:

      Hi Chris,

      Assuming SSISDB is already a part of the existing 'AlwaysON availability group', you need to follow the below listed steps to get it in a working state . Works great for me! Let me know if you have any issues.

      1. Take SSISDB database out of the availability group

      2. Backup the Service master key from the Primary server and restore them on to the secondary replicas

      3. Add the SSISDB database back to the availability group.

      Thanks,

      Saurav

    16. Anonymous says:

      Hi Johnson,

      You have missed an important step which Todd has mentioned. Backup, the service key and restore it on to the secondary replicas. The nest step would be to add the databases back to the availability group.

      Let me know if you have any issues.

      Cheers,

      Saurav

    17. Anonymous says:

      Hi – I was wondering if you have AAG with several cluster nodes and SQL 2014 if (like SSRS) a SSIS package would make use of the different nodes to enhance/improve performance?  We are looking for a way to speed up our nightly ETL dataload process and we have already broken out as many parallel activities in the SSIS pkg as we can – thanks!

    18. John Peterson says:

      Dear Saurav (and others),

      I've tried the suggestion and steps that you (and Todd) have described, but I cannot get my database to successfully decrypt on failover.  The primary (after failover) always returns NULL for the decryption routines, unless I perform the ALTER MASTER KEY as per this blog.

      I've even tried it *again*, and when I try to restore the Service Master Key, it says that they're identical, and no work was performed.  But, when I add the database back to the Availability Group, after failover, the primary no longer is successful when using decryption routines.

      Do you have any additional thoughts?

    19. Simha says:

      I am looking for similar setup configuration with Database mirroring soultion on SQL 2012. Please let me know how i can do that.

      Thank You

    20. GuidoGTKP says:

      Hi, great post Matt, and Todds work around is identical to that which you have to use when using service broker and the Service master key.   This configuration removes the need to run the fudge on the key to get it working.  

      I am interested in this statement: "Unfortunately, AlwaysOn doesn’t expose any events we can plug into. " and how perhaps the solution I used that may get around this.

      If you do need to run a statement on failover – regardless of what it is actually required to do setup a SQL Server Agent Alert on:

      Error number 1480 Availability Group Role Change

      When configured, the alert can execute a job.

      If you have multiple databases in the AG, then configure the job to only respond once by altering the time between responses in the options.

      So when a failover occurs – the job is called, you can put relevent logic in the job as required.

    21. Ben Thul says:

      To be clear on the restoring master key solution that Todd Jewell advanced, the direction is backwards. That is to say that backing up the SMK on the secondary and restoring it on the primary will be the least hassle solution. The reason for this is that the databases on the secondary are read-only and thus cannot have their DMKs re-encrypted.