Windows Vista Security Series: Adding a Cipher Algorithm to Windows Vista

Dan Griffin
JWSecure, Inc.

May 2007

Applies to:
 Windows Vista

Summary: This whitepaper and accompanying sample demonstrate how to create and use a plug-in to support a new cipher algorithm in Windows, specifically Twofish.  Using a plug-in is ideal, as existing crypto-agnostic applications can continue to use the existing messaging and encryption APIs without modification (15 printed pages). Download XPS version.

Download the associated sample code, CNGSample.exe.

Dependencies
You must have the CNG SDK and the Windows SDK for Windows Vista. If you do not install these components to the default locations, which are listed below, you must update the Visual Studio Include and Library paths in the sample code accordingly.

Download the CNG SDK
The CNG SDK installs to %ProgramFiles%\Microsoft CNG Development Kit.

Download the Windows SDK for Windows Vista
The Windows Vista SDK installs to %ProgramFiles%\Microsoft SDKs\Windows\v6.0.

Introduction

This is the first in a series of articles about some of the new security-related features in Windows Vista.  My goal is to highlight these areas for the Windows developer community.  However, I’ll only be talking about stuff that I think is cool!

This article is written for any developer familiar with programming on Windows.  Most of the topics I’m addressing relate to new functionality in Windows Vista that’s only available with native APIs (as opposed to with the Microsoft .NET Framework) as of the time of this writing.  However, whenever I can tie into the managed code side of things, I will.

In summary, if you’ve previously done some Windows programming with C, C++, and/or C#, you’ll find this series to be accessible and hopefully interesting as well.

Summary of Topics in This Series

As a teaser for what’s coming, here are the topics that I plan on covering in this series.  Of course, until all is said and done, this list is subject to change:

· CNG (Crypto API: Next Generation)

· Windows Filtering Platform API

· IPsec (including AuthIP)

· Smart Card Module (“mini-driver”) plug-in interface

· IPv6

· Fuzz Testing

As I mentioned above, these are topics that really interest me right now – I think they’re cool. But more than that, these topics represent areas of significant investment and demand in the computer industry right now--namely, data encryption, network security, strong authentication, and software security testing.  Therefore, I consider it important to be as informed about them as possible.

So, let’s proceed to the first topic – Crypto API: Next Generation.

Crypto API: Next Generation

Crypto API: Next Generation (CNG) is a new suite of Win32/native cryptography-related interfaces present in Windows Vista. CNG offers a number of attractive features for application developers:

· Crypto Isolation – Long-term private key material can now be hosted in a secure system process.

· Better Extensibility – For example, a custom random number generator can now be installed, as well as multiple distinct implementations of a given cryptographic algorithm.

· Ease of Plug-in Creation – Developers need no longer implement a complex plug-in just to add a new symmetric encryption or cryptographic hash algorithm to Windows.

· “Crypto Agnostic” Support – CNG allows crypto-aware applications, including APIs built on top of cryptographic primitives, to not be bound to specific algorithms, signature schemes, or key agreement protocols.

Each of the above was difficult to accomplish by using the previous version of Crypto API.

That said, my goal in this article is not to promote the high-level benefits of the new API, but rather to buckle down and show it in action. My example focuses on the last two points above: the creation of a crypto plug-in and its smooth incorporation into the application stack. To motivate this discussion, I want to first consider a sample business problem facing a software company that produces an application with cryptographic messaging capabilities.

Sample Business Problem

In this example, imagine that I’m the lead architect at a software vendor. The company I work for, we’ll call it Contoso, is working on a standards-based application that uses the Windows Cryptographic Message Syntax (CMS) APIs. An example application would be an S/MIME-capable e-mail client like Outlook. A large, important customer of ours has been demanding that we provide support in the application for a new symmetric encryption algorithm. For this example, I’ve chosen Twofish, a block cipher that was one of the contenders a few years back for the Advanced Encryption Standard (AES) competition. This type of customer need might be compliance-driven, or even based on government-originated national security requirements. Regardless, if we want to keep their business, we had better get to work!

In summary, my goal is to use a plug-in to support a new cipher algorithm in Windows: specifically, Twofish in this example.  Using a plug-in would be ideal, so that the application can continue to use the existing messaging and encryption APIs. In fact, my secondary goal is to completely avoid changing my application code.  That said, if “crypto-agnostic” wasn’t an original requirement of the application architecture, some additional application code changes are likely to be necessary the first time around.  On the upside, once I’ve made those changes correctly, I won’t have to touch the application code at all the next time a customer requires such an extension.

Historically, adding a cipher algorithm to Windows used to be a painful thing to accomplish from an engineering standpoint.  Applications could generally be made to support new symmetric encryption algorithms in a pluggable way, but the scenario required implementing a plug-in known as a cryptographic service provider (CSP) for previous versions of Crypto API.  This was a huge undertaking, requiring support for a suite of asymmetric, hashing, and symmetric encryption functionality, not to mention enumeration and metadata-related interfaces for all of the above. To give a rough idea of the work required: implementing and testing a new CSP for the purpose of satisfying the customer requirement in this example could easily have taken six to twelve months of engineering cost for an experienced team! And that’s just for this one cipher algorithm.

In Windows Vista, by using CNG, writing a cipher plug-in is relatively trivial. However, the CNG plug-in alone is not sufficient to meet our customer’s needs. We must implement plug-in support for the new algorithm at the messaging layer as well. I’ll discuss both plug-in layers below.

Sample Solution

As discussed in the previous section, my customer wants support for the Twofish symmetric cipher to be built into my PKI-based messaging application.  From an engineering perspective, this problem can be solved relatively easily in Windows Vista by implementing the following three interfaces:

· BCRYPT_CIPHER_FUNCTION_TABLE – This is the interface that allows me to enable Twofish as a symmetric cipher to any CNG-aware application (or system component, such as the Windows cryptographic messaging API; see the next two bullets).

· CMSG_OID_CNG_GEN_CONTENT_ENCRYPT_KEY_FUNC – This is the interface that allows me to enable Twofish as a content encryption algorithm in the context of the ASN.1 and X.509 standards-based messaging APIs. These use CNG.

· CMSG_OID_CNG_IMPORT_CONTENT_ENCRYPT_KEY_FUNC – This is the interface, closely related to “GEN_CONTENT” above, that allows me to recover the key material from a decrypted message.  I then return the Twofish key handle so the rest of the message can be decrypted.

Admittedly, as I am writing this, there are some important real-world considerations to take into account:

· My theoretical customer is unlikely to be satisfied with a solution that only works for Windows Vista. The customer’s minimum requirement could be support for Windows XP and Windows Server 2003, especially in the case of existing large deployments.

· Microsoft has announced that there are no plans to port CNG down-level.

· Microsoft has indicated that the previous version of Crypto API is likely to be removed in future versions of Windows.

As mentioned above, supporting earlier versions of Windows in this scenario is feasible; it’s just more work.  However, it should be noted that, if engineered properly, the Windows Vista portion of a down-level compatible solution is pretty much a strict subset of the overall effort, due to the common characteristics of pluggable crypto interfaces.

Solution Architecture

This section presents the architectural details of the solution and provides an architectural diagram, which you’ll find below.  The overall Windows crypto architecture has some interesting cross-dependencies, many of which are not depicted in the diagram.  However, note that my plug-in module implements interfaces at two different architectural layers.  The higher layer is so-called Crypto API 2.0 (CAPI2), where the certificate and message encoding logic lives.  This is where I plug in the capability to generate and import a Twofish content encryption key.  But both capabilities – key generation and importation – are actually implemented at the CNG layer, which is where the other logical half of my plug-in is defined.  Therefore, twofish.dll is used at both architectural layers.

Why does CAPI2 require a special messaging plug-in to handle importing and generating a symmetric key when it could instead use the algorithm identifier to call directly into CNG with BCryptGenerateSymmetricKey?  The problem is that there are a variety of characteristics that define the use of a given cipher in the context of a given application.  These include key size, cipher mode, salt, initialization vector (IV), and others.  In addition, the proper wire encoding of parameters like the IV, which must be included with the message, must also be handled at this layer (as opposed to the low-level algorithmic layer – CNG).  Therefore, defining even the common encryption and encoding options without the use of a general callback scheme becomes burdensome.

The following diagram and bullets describe the logical plug-in points for the sample twofish.dll from the perspective of the test application that accompanies it.  In summary, the test application (TFTest.exe) creates a certificate and RSA key pair, encrypts a message with that certificate by using Twofish, and then decrypts the same message.

Figure 1 - Solution Architectural Diagram

The numbered steps below correspond to the numbers in the diagram:

1. The first step taken by the test is to create a persisted RSA key pair via CNG.  In general, persisted keys are handled with the NCrypt API layer.

2. Based on the RSA key pair, the test uses the X.509 digital certificate manipulation APIs in CAPI2 to create a temporary recipient certificate.  This certificate represents the party to whom I’m pretending to send an encrypted message.  Once the certificate is created, I call CryptEncryptMessage, specifying the recipient certificate as well as a custom cryptographic algorithm object identifier (also known as OID) representing Twofish.

3. The specified custom object identifier gets mapped to twofish.dll through the system registry, resulting in a call to my Twofish-specific implementation of the CMSG_OID_CNG_GEN_CONTENT_ENCRYPT_KEY_FUNC interface.

4. In order to create a Twofish content encryption key handle, I must call into CNG. Cryptographic primitives, including my implementation of the Twofish cipher algorithm, are handled by the BCrypt API layer.

5. CNG maps the custom Twofish algorithm identifier (separate from the above object identifier) back to twofish.dll through the registry.  CNG then calls the BCRYPT_CIPHER_FUNCTION_TABLE interface I’ve enabled.  The specific BCrypt cipher functions that are used in this case include querying for the key object size (for the purpose of dynamically allocating memory for the key context), querying for the cipher block length (for the purpose of generating an IV), and then instantiating the key object.

6. Once the key handle has been returned to the underlying logic of CryptEncryptMessage, it’s used for its primary purpose: encrypting the message.  Therefore, the Twofish cipher function interface is exercised again with CNG/BCrypt.

Although the diagram above and corresponding description are more focused on encryption, message decryption works much the same way. The first difference is that, in step 2, the custom algorithm object identifier is embedded in the encrypted message, rather than specified as an explicit parameter to CryptDecryptMessage.  The second difference is that, in step 3, the CMSG_OID_CNG_IMPORT_CONTENT_ENCRYPT_KEY_FUNC interface is used to reconstitute the key object and decode the IV.

Implementation Details

In this section, I’ll highlight some of the important implementation details of the Twofish messaging cipher plug-in described above.  Architecturally, I’ll start from the bottom – the CNG plug-in – and end at the top – the test application.

CNG Cipher Plug-in Implementation

See twofish\CNG.cpp in the sample code that accompanies this article.  That’s the file that implements the BCRYPT_CIPHER_FUNCTION_TABLE interface.  In other words, it’s the crypto primitive plug-in for 256-bit Twofish.

There are a few interesting points to note about the cipher plug-in.  When an application attempts to use the Twofish cipher, the first call to be received by this plug-in will be in response to BCryptOpenAlgorithmProvider.  Based on the information that the plug-in adds to the system registry, BCrypt will load twofish.dll and call GetCipherInterface (which must be provided as a named DLL export).  In response, the plug-in returns a BCRYPT_CIPHER_FUNCTION_TABLE structure initialized with the addresses of its functions that implement the cipher interface (create a key, encrypt, decrypt, etc.).  As an aside, note that the function table routines need not be implemented as DLL exports (my preference is to avoid doing anything that won’t be directly tested).  Moving on, my naming scheme for the function table routines is to replace the “BCrypt” at the beginning of each function table member with “TF” (for Twofish).  As a result, the second routine called (implicitly by the BCrypt dispatch layer) after GetCipherInterface is TFOpenAlgorithmProvider.

You’ll notice that most aspects of the Twofish CNG plug-in are quite simple.  TFOpenAlgorithmProvider is no exception.  It simply confirms that the requested algorithm name is “TWOFISH256” (see the #define for that string in the TFProv.h file), since that’s the only one it supports.  The plug-in doesn’t even require any context information to be stored in the returned BCRYPT_ALG_HANDLE, so it simply uses it as a pointer to the TWOFISH256 algorithm name string.  Downstream, for other interface routines that take that BCRYPT_ALG_HANDLE as input, I confirm that the handle still points to the string.  This is not a requirement, but it might help to catch otherwise subtle application bugs.
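The dispatch pattern and the "handle is just a pointer to the name string" trick described above can be modeled in portable C++.  This is only a sketch under stated assumptions: it does not use the CNG SDK headers, and the ModelStatus values, ModelAlgHandle, ModelCipherFunctionTable, and ModelGetCipherInterface are hypothetical stand-ins (the real BCRYPT_CIPHER_FUNCTION_TABLE has more members, NTSTATUS return codes, and wide-character algorithm names).

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, simplified stand-ins for the CNG types. The real interface
// uses NTSTATUS, wide-character algorithm names, and many more table members.
typedef long ModelStatus;
const ModelStatus kModelSuccess = 0;
const ModelStatus kModelInvalidParameter = -1;
const ModelStatus kModelInvalidHandle = -2;
const ModelStatus kModelNotFound = -3;

typedef const void* ModelAlgHandle;

struct ModelCipherFunctionTable {
    ModelStatus (*OpenAlgorithmProvider)(ModelAlgHandle* phAlg, const char* algName);
    ModelStatus (*CloseAlgorithmProvider)(ModelAlgHandle hAlg);
};

// The only algorithm this model supports. Its address doubles as the handle.
static const char kTwofishAlgName[] = "TWOFISH256";

// Table routines are internal; only the table accessor needs to be exported.
static ModelStatus TFOpenAlgorithmProvider(ModelAlgHandle* phAlg, const char* algName) {
    if (phAlg == nullptr || algName == nullptr) return kModelInvalidParameter;
    if (std::strcmp(algName, kTwofishAlgName) != 0) return kModelNotFound;
    *phAlg = kTwofishAlgName;  // no per-provider state, so the name is the handle
    return kModelSuccess;
}

static ModelStatus TFCloseAlgorithmProvider(ModelAlgHandle hAlg) {
    // Optional sanity check: the handle must still point at the name string.
    return (hAlg == kTwofishAlgName) ? kModelSuccess : kModelInvalidHandle;
}

// Stand-in for the named DLL export that hands the table to the dispatcher.
ModelStatus ModelGetCipherInterface(const ModelCipherFunctionTable** ppTable) {
    static const ModelCipherFunctionTable table = {
        TFOpenAlgorithmProvider,
        TFCloseAlgorithmProvider
    };
    if (ppTable == nullptr) return kModelInvalidParameter;
    *ppTable = &table;
    return kModelSuccess;
}
```

A dispatcher in this model would call ModelGetCipherInterface once, then reach every other routine through the returned table.  Passing any pointer other than the one returned by the open routine yields kModelInvalidHandle, which is the kind of cheap check that can surface application bugs early.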

The next notable aspect of the CNG plug-in is the TFGetProperty routine, corresponding to BCryptGetProperty.  This function confused me at first, since one of its inputs is of type BCRYPT_HANDLE.  Depending on the scenario, the actual type of the handle provided by the caller in that parameter could be a BCRYPT_ALG_HANDLE or a BCRYPT_KEY_HANDLE.  This is different from previous versions of Crypto API, which provided separate CryptGetProvParam and CryptGetKeyParam functions for the same purpose.

To see one of the important uses of BCryptGetProperty, observe that BCryptGenerateSymmetricKey takes a buffer as input, allocated by the caller.  This is intended to be used to store whatever key context structure is required by the plug-in.  In other words, a typical symmetric cipher implementation (Twofish is in this category) need not make a single dynamic memory allocation!  This is a critical difference between CNG and legacy Crypto API.  Using the latter, crypto-intensive high-performance applications would frequently be artificially governed by contention on the process heap.

Moving on: in order to make a valid call into BCryptGenerateSymmetricKey, the application must first know how big to make the key object buffer.  In this situation, BCryptGetProperty is called with a BCRYPT_ALG_HANDLE and the BCRYPT_OBJECT_LENGTH property string.
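The "query the object length, then hand the provider a caller-allocated buffer" pattern can be sketched without the Windows headers.  The names below (ModelTwofishKey, ModelGetObjectLength, ModelGenerateSymmetricKey) are hypothetical, and the key context layout is an assumption made for illustration; the point is only that the provider builds its state inside memory the caller owns, so no heap allocation is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>

// Hypothetical key context. It contains only bytes, so its alignment
// requirement is 1 and any caller buffer is suitably aligned.
struct ModelTwofishKey {
    std::uint8_t keyBytes[32];  // 256-bit key material
    std::uint8_t iv[16];        // one cipher block
};

// Stand-in for querying the object length property on the algorithm handle.
std::size_t ModelGetObjectLength() {
    return sizeof(ModelTwofishKey);
}

// Stand-in for key generation: constructs the key context inside the
// caller's buffer and returns a pointer to it, or nullptr on bad input.
ModelTwofishKey* ModelGenerateSymmetricKey(void* keyObject, std::size_t cbKeyObject,
                                           const std::uint8_t* secret, std::size_t cbSecret) {
    if (keyObject == nullptr || secret == nullptr) return nullptr;
    if (cbKeyObject < sizeof(ModelTwofishKey)) return nullptr;  // buffer too small
    if (cbSecret != sizeof(ModelTwofishKey().keyBytes)) return nullptr;

    // Placement new: no dynamic allocation, the object lives in keyObject.
    ModelTwofishKey* key = new (keyObject) ModelTwofishKey();
    std::memcpy(key->keyBytes, secret, cbSecret);
    return key;
}
```

A caller in this model asks ModelGetObjectLength for the size, reserves at least that many bytes (on the stack or in a pooled buffer), and passes the buffer in.  The returned pointer aliases the caller's buffer, which is what lets a high-throughput application avoid the process heap entirely.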

In fact, the Twofish plug-in doesn’t implement a case in which BCryptGetProperty would require a BCRYPT_KEY_HANDLE (although see my implementation of TFSetProperty, which requires a BCRYPT_KEY_HANDLE in both cases).  What’s an example of such a case for GetProperty?  Consider anything that queries dynamic key context information, such as the current IV string or cipher mode.  I readily admit that I only implemented as much of the CNG cipher interface as was absolutely required to get the messaging sample to work!

Now take a look at TFEncrypt and TFDecrypt, which are really the whole point of a cipher plug-in!  In fact, these functions would be simple to implement, if it weren’t for the BCRYPT_BLOCK_PADDING flag.  That flag allows a calling application to optionally provide a plaintext input buffer to BCryptEncrypt that is not a multiple of the cipher’s block size in length (the flag has no meaning for anything other than a symmetric block cipher, and Twofish is one).

In order for the BCRYPT_BLOCK_PADDING flag to work correctly with BCryptDecrypt, at least one additional byte of padding must have been applied.  As a result, if the flag is specified, and the input plaintext is an exact multiple of the cipher block size in length, a full block of padding must be added during Encrypt.  The Twofish plug-in uses a standard padding scheme wherein the value of each padding byte is equal to the length of the padding itself.  For example, Twofish uses a block size of 16 bytes.  A plaintext of length 6 bytes would be padded with 10 bytes of 0x0A if the BCRYPT_BLOCK_PADDING flag is specified (otherwise, TFEncrypt would return STATUS_INVALID_PARAMETER).  In summary, to support the padding flag, as well as the general API usage of dynamically querying for the required output buffer size, parameter checking becomes somewhat complex for the Encrypt and Decrypt routines.
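The padding rule and the output-size calculation described above are easy to get subtly wrong, so here is a minimal portable sketch.  It is not the code from twofish\CNG.cpp; the function names (PadBlock, UnpadBlock, RequiredCiphertextSize) and the use of std::vector are assumptions made for illustration, and a bool return stands in for the status codes the real routines return.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

const std::size_t kTwofishBlockSize = 16;  // Twofish block size in bytes

// Appends 1..16 padding bytes, each holding the padding length. An input
// that is already a whole number of blocks gets a full extra block.
std::vector<std::uint8_t> PadBlock(const std::vector<std::uint8_t>& plaintext) {
    const std::size_t padLen = kTwofishBlockSize - (plaintext.size() % kTwofishBlockSize);
    std::vector<std::uint8_t> out(plaintext);
    out.insert(out.end(), padLen, static_cast<std::uint8_t>(padLen));
    return out;
}

// Validates and strips the padding. Returns false if it is malformed.
bool UnpadBlock(const std::vector<std::uint8_t>& padded, std::vector<std::uint8_t>* plaintext) {
    if (plaintext == nullptr || padded.empty()) return false;
    if (padded.size() % kTwofishBlockSize != 0) return false;
    const std::size_t padLen = padded.back();
    if (padLen == 0 || padLen > kTwofishBlockSize) return false;
    for (std::size_t i = padded.size() - padLen; i < padded.size(); ++i) {
        if (padded[i] != padLen) return false;  // every pad byte must match
    }
    plaintext->assign(padded.begin(), padded.end() - static_cast<std::ptrdiff_t>(padLen));
    return true;
}

// Size query: how large must the ciphertext buffer be? Returns false for
// the case where the article's TFEncrypt rejects the input (a partial block
// without the padding flag).
bool RequiredCiphertextSize(std::size_t cbPlaintext, bool blockPadding, std::size_t* cbOut) {
    if (cbOut == nullptr) return false;
    const std::size_t remainder = cbPlaintext % kTwofishBlockSize;
    if (blockPadding) {
        *cbOut = cbPlaintext + (kTwofishBlockSize - remainder);
        return true;
    }
    if (remainder != 0) return false;
    *cbOut = cbPlaintext;
    return true;
}
```

With this sketch, a 6-byte input pads to 16 bytes ending in ten 0x0A bytes, and a 16-byte input pads to 32 bytes, matching the two cases discussed above.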

Rounding out the implementation of the CNG cipher plug-in are a few snippets of code in twofish\DllMain.cpp.  The BCryptRegisterProvider and BCryptAddContextFunctionProvider routines are used to register the implementation of the "TWOFISH256" algorithm.  This is done with DllRegisterServer, in support of “regsvr32.exe twofish.dll”.  Conversely, BCryptRemoveContextFunctionProvider and BCryptUnregisterProvider are called in response to DllUnregisterServer (i.e. through “regsvr32.exe /u twofish.dll”).

CMS Plug-in Implementation

Now that I’ve discussed the low-level foundation of the Twofish plug-in, I’ll move up the stack to the CMS portion of the plug-in. Overall, much like the CNG plug-in discussed above, the CMS piece seemed pretty obvious once I had it working, but there are just a few subtle tricks required in order to get to that point.

The twofish.dll binary exports just three functions:

· GetCipherInterface (required by CNG; see above)

· TFCngGenContentEncryptKey

· TFCngImportContentEncryptKey

The second two are the topic of this section, and are required in order to support use of any new cipher algorithm with CryptEncryptMessage and CryptDecryptMessage (as opposed to only enabling the algorithm with CNG, and not further up the stack, which would be kind of boring).  See the twofish\Capi2.cpp file – so named because these routines correspond to the Crypto API 2.0 portion of the application stack, of which CMS is a component.

TFCngGenContentEncryptKey

The first routine, TFCngGenContentEncryptKey, is of type CMSG_OID_CNG_GEN_CONTENT_ENCRYPT_KEY_FUNC, and allows a Twofish key and IV to be created and returned for bulk message encryption. The trick here is that the IV must accompany the message in order for the recipient to be able to decrypt it.  Therefore, the IV must be properly ASN.1 encoded as an OCTET STRING in order to adhere to the messaging standard.  Actually, this is just a matter of creating the 16 byte random IV, attaching it to the key context, and then encoding the 16 bytes with CryptEncodeObjectEx to be returned in the caller’s PCMSG_CONTENT_ENCRYPT_INFO-type input/output parameter.
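To make the encoding step concrete, here is a hand-rolled sketch of what a DER-encoded OCTET STRING looks like on the wire.  The sample itself relies on CryptEncodeObjectEx for this; DerEncodeOctetString below is a hypothetical helper written only to show the bytes, assuming standard DER rules (tag 0x04, then a definite length, then the content).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encodes content as a DER OCTET STRING: tag 0x04, definite length, content.
std::vector<std::uint8_t> DerEncodeOctetString(const std::vector<std::uint8_t>& content) {
    std::vector<std::uint8_t> out;
    out.push_back(0x04);  // universal tag for OCTET STRING

    const std::size_t len = content.size();
    if (len < 0x80) {
        // Short form: a single byte holds the length.
        out.push_back(static_cast<std::uint8_t>(len));
    } else {
        // Long form: 0x80 | count of length bytes, then big-endian length.
        std::vector<std::uint8_t> lenBytes;
        for (std::size_t n = len; n != 0; n >>= 8) {
            lenBytes.insert(lenBytes.begin(), static_cast<std::uint8_t>(n & 0xFF));
        }
        out.push_back(static_cast<std::uint8_t>(0x80 | lenBytes.size()));
        out.insert(out.end(), lenBytes.begin(), lenBytes.end());
    }

    out.insert(out.end(), content.begin(), content.end());
    return out;
}
```

For a 16-byte IV the result is 18 bytes: 0x04, 0x10, and then the IV itself.  The long-form branch is never taken for an IV of this size and is included only so the helper stays correct for larger inputs.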

TFCngImportContentEncryptKey

The second routine, TFCngImportContentEncryptKey, is of type CMSG_OID_CNG_IMPORT_CONTENT_ENCRYPT_KEY_FUNC.  It allows the encoded IV and decrypted Twofish key bytes to be restored as a BCRYPT_KEY_HANDLE, which is then used by crypt32.dll to decrypt the caller’s message via CNG.  First, the IV is decoded with CryptDecodeObjectEx.  Then the key is recreated by using the input key bytes, which have already been decrypted by crypt32.dll with the recipient’s private (RSA, in this example) key. A helper routine, _TFCngGenerateSymmetricKey, is shared between the two interface routines and is used to create a Twofish key handle from key bytes and an IV.
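The import path can be sketched the same way: decode the IV, then rebuild a key context from the already-decrypted key bytes.  This is a portable model under stated assumptions, not the code in twofish\Capi2.cpp: ModelKeyContext, DerDecodeIv, and ModelImportContentEncryptKey are hypothetical names, the decoder accepts only the exact 18-byte encoding of a 16-byte OCTET STRING, and the sample's real decoding goes through CryptDecodeObjectEx.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical key context: 256-bit key plus one block of IV.
struct ModelKeyContext {
    std::array<std::uint8_t, 32> key;
    std::array<std::uint8_t, 16> iv;
};

// Decodes a DER OCTET STRING that must hold exactly 16 bytes:
// tag 0x04, short-form length 0x10, then the IV.
bool DerDecodeIv(const std::vector<std::uint8_t>& encoded, std::array<std::uint8_t, 16>* iv) {
    if (iv == nullptr || encoded.size() != 18) return false;
    if (encoded[0] != 0x04 || encoded[1] != 0x10) return false;
    std::copy(encoded.begin() + 2, encoded.end(), iv->begin());
    return true;
}

// Rebuilds the key context from the encoded IV carried in the message and
// the key bytes that the messaging layer has already decrypted.
bool ModelImportContentEncryptKey(const std::vector<std::uint8_t>& encodedIv,
                                  const std::vector<std::uint8_t>& keyBytes,
                                  ModelKeyContext* ctx) {
    if (ctx == nullptr || keyBytes.size() != ctx->key.size()) return false;
    std::array<std::uint8_t, 16> iv;
    if (!DerDecodeIv(encodedIv, &iv)) return false;  // reject before touching ctx
    std::copy(keyBytes.begin(), keyBytes.end(), ctx->key.begin());
    ctx->iv = iv;
    return true;
}
```

Validating the IV before writing anything into the context means a malformed message leaves the caller's structure untouched, which keeps failure handling simple.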

Finally, rounding out the CMS portion of the plug-in is the object identifier registration logic in DllMain.cpp. In terms of the code and structures required, this is only slightly more complex than the CNG registration logic discussed in the previous section.  First, the content encryption object identifier – szOID_TWOFISH256 (see #define in TFProv.h) – is added by using CryptRegisterOIDInfo.  Then, the two functions mentioned above are registered with CryptRegisterOIDFunction to map to twofish.dll.  The corresponding functions – CryptUnregisterOIDInfo and CryptUnregisterOIDFunction – are used in response to DllUnregisterServer.

Test Application Implementation

The previous two sections round out the discussion of the production portion of my customer deliverable.  This means that 256-bit Twofish can now be supported through my theoretical S/MIME application simply by registering twofish.dll on the client system.  However, let’s hope we never go into production without testing the new code first!

The TFTest.exe project that accompanies this article is such a test.  This test project performs three simple operations:

1. Create a test RSA key pair and recipient certificate.
Note: CryptDecryptMessage will fail in this scenario unless the key pair is CNG (as opposed to legacy Crypto API) based.  Why?  Well, in this scenario, it seems that Microsoft could have supported decryption of “unknown” symmetric key types with “raw” RSA decryption through the legacy API, but that they chose not to.  As a result, in the case of CNG-only plug-in symmetric ciphers, asymmetric key handles used for key agreement must also be CNG-based.

2. Encrypt a test message using CryptEncryptMessage, the test recipient certificate, and the szOID_TWOFISH256 content encryption identifier registered by twofish.dll.

3. Decrypt the test message using CryptDecryptMessage.

I have one final comment about the implementation of the test, although it is not directly related to the Twofish plug-in.  While CryptEncryptMessage takes an array of recipient certificates as input, CryptDecryptMessage takes an array of certificate stores as input. For the latter, each is searched for a matching recipient certificate.  Once the matching certificate is found, it must then be linked to the corresponding private key, which is used to decrypt the Twofish key in this case.  To accomplish this private key mapping, CryptDecryptMessage requires that the recipient certificate have either the CERT_KEY_PROV_INFO or CERT_KEY_CONTEXT property set.

However, the flow of the test program is such that this step is done for me.  In summary, the following APIs are used in sequence:

1. NCryptOpenStorageProvider – Open the MS_KEY_STORAGE_PROVIDER.

2. NCryptCreatePersistedKey – Initiate the creation of a new RSA key pair.

3. NCryptFinalizeKey – Create and store the new key pair.

4. CertStrToName – Encode a subject name for the test recipient certificate.

5. CertCreateSelfSignCertificate – Create the test recipient certificate and sign it with the NCRYPT_KEY_HANDLE created above. Private key mapping information, as described above, is attached to the returned certificate context.

6. CryptEncryptMessage – Described above.

7. CertOpenStore – Create an in-memory certificate store.

8. CertAddCertificateContextToStore – Add the recipient certificate context to the store.

9. CryptDecryptMessage – Described above.

Test Run-Time Analysis

I want to make a few comments about the behavior of the plug-in interfaces discussed above, based on my observations while debugging twofish.dll and TFTest.exe.  I’ve mentioned that one of the major run-time enhancements in CNG versus the legacy API is its improved memory management.  In contrast, there were two things that surprised me about the performance of the CMS interface.

First, due partly to the fact that the CMS interface predates CNG, the former doesn’t fully implement the improved memory handling characteristics of the latter.  Specifically, in the test scenario implemented above, it would be nice if a high-performance solution could directly control all interaction with the heap, even at the messaging layer.  However, creation of the CNG key object is handled within crypt32.dll, and malloc- and free-type callbacks are not available to the application.

Second, I noticed that TFCngGenContentEncryptKey is called twice during a single call to CryptEncryptMessage.  The first call appears to be part of a length calculation in anticipation of building the encrypted message.  The call stack (from WinDbg.exe) follows:

 0:000> k
ChildEBP RetAddr  
0012f908 754b6937 twofish!TFGenerateSymmetricKey
0012f938 10030f2a bcrypt!BCryptGenerateSymmetricKey+0x8a
0012fa74 10031247 twofish!_TFCngGenerateSymmetricKey+0x12a
0012fbbc 75793c97 twofish!TFCngGenContentEncryptKey+0x127
0012fbe0 7578da72 CRYPT32!ICM_CNGGenContentEncryptKey+0x34
0012fbf8 757891a9 CRYPT32!ICM_InitializeContentEncryptInfo+0xf2
0012fc7c 75789482 CRYPT32!ICM_LengthEnveloped+0x121
0012fca0 75766ef4 CRYPT32!CryptMsgCalculateEncodedLength+0x80
0012fcd4 75767241 CRYPT32!CryptHashMessage+0x6d6
0012fd50 0042eb4c CRYPT32!CryptEncryptMessage+0x64
0012ff34 00431042 TFTest!wmain+0x32c

A second call is made to TFCngGenContentEncryptKey, and this key is apparently used for the actual encryption.

 0:000> k
ChildEBP RetAddr  
0012f920 754b6937 twofish!TFGenerateSymmetricKey
0012f950 10030f2a bcrypt!BCryptGenerateSymmetricKey+0x8a
0012fa8c 10031247 twofish!_TFCngGenerateSymmetricKey+0x12a
0012fbd4 75793c97 twofish!TFCngGenContentEncryptKey+0x127
0012fbf8 7578da72 CRYPT32!ICM_CNGGenContentEncryptKey+0x34
0012fc10 7578f593 CRYPT32!ICM_InitializeContentEncryptInfo+0xf2
0012fc80 75752352 CRYPT32!ICM_OpenToEncodeEnvelopedData+0x328
0012fca0 75766f34 CRYPT32!CryptMsgOpenToEncode+0x66
0012fcd4 75767241 CRYPT32!CryptHashMessage+0x716
0012fd50 0042ebde CRYPT32!CryptEncryptMessage+0x64
0012ff34 00431042 TFTest!wmain+0x3be

Performance would be better if the key were built only once, or at least if the callback exposed a mode indicating that only size information should be returned and not a real key.

For completeness, I’ll mention that, for CryptDecryptMessage, the plug-in entry point is only exercised once, per the following call stack.

 0:000> k
ChildEBP RetAddr  
0012f808 754b6937 twofish!TFGenerateSymmetricKey
0012f838 10030f2a bcrypt!BCryptGenerateSymmetricKey+0x8a
0012f974 100315a3 twofish!_TFCngGenerateSymmetricKey+0x12a
0012fab8 75793dff twofish!TFCngImportContentEncryptKey+0x123
0012fadc 7578e901 CRYPT32!ICM_CNGImportContentEncryptKey+0x34
0012fb04 7578ea76 CRYPT32!ICM_ImportContentEncryptKey+0x115
0012fb6c 757306de CRYPT32!ICM_ControlCmsDecrypt+0xb8
0012fba0 75767ca1 CRYPT32!CryptMsgControl+0x208
0012fc2c 7576811d CRYPT32!CryptVerifyMessageSignatureWithKey+0x816
0012fd10 75768295 CRYPT32!CryptSignAndEncryptMessage+0x392
0012fd54 0042eda6 CRYPT32!CryptDecryptMessage+0x2a
0012ff34 00431042 TFTest!wmain+0x586

The final overall debugging comment stems from some strange behavior I saw during my initial attempt to observe the test application from within WinDbg.exe (I don’t recall if the Visual Studio debugger had the same problem).  In summary, I wasn’t able to set breakpoints in twofish.dll.  The reason is this: as a pluggable object identifier handler for CMS, twofish.dll is imported with LoadLibrary using the standard path search logic – namely, first check for the DLL in the current directory, and then check system32. This behavior is Microsoft’s own best-practice recommendation from a serviceability perspective.

However, CNG will only load plug-ins from the system32 directory!  In other words, the current directory is never checked.  During the initial operation of my test, I had deployed twofish.dll to system32, but forgotten that a copy also existed in the test directory.  As a result, during the test run, CMS loaded the local copy and CNG loaded the one in system32.  And even though they were the same binary, the debugger was just as confused as I was.  In summary, when testing your own applications in this scenario, don’t forget to first remove any local copy of the plug-in, and to always deploy an up-to-date version to system32.

Conclusion

Returning to the business problem that I started with, this implementation of the CNG and CMS plug-in points should address the customer’s need.  I can now support an arbitrary encryption algorithm (and specifically Twofish) in my secure messaging application.

One final note regarding the readiness of this sample to be turned into a truly production-quality deliverable: it has indeed been tested somewhat, but only with the CMS layer.  This means that no direct testing of the CNG cipher plug-in has been done.  While I’ve omitted this for the sake of brevity in the sample, no crypto algorithm implementation should go live in production without a barrage of direct testing, including with known-good plaintext/ciphertext pairs obtained from a separate implementation.

Acknowledgements: I’d like to thank Andrew Tucker and Kelvin Yiu at Microsoft for their technical feedback pertaining to this article.

Appendix A – CNG: Kicking the Tires

Let me speak philosophically for a moment about Windows APIs in general.  When, during the course of a project, I need to ramp up on a new API set, I tend to mentally classify the APIs in two ways.  The first class consists of those routines that report or enumerate existing data.  The second class consists of those that create or change something.  Not rocket science, right?  In my experience, the first class – reporting and enumerating data – is generally easier to get working correctly.  But the second class – creating or changing something – is the reason I’m learning the API in the first place.  Taking the time to learn about the first class of APIs tends to bring some insight into how the second class works, potentially shortening the learning curve in the long run.  This is a leap of faith, though, because what I really want to do is just get my project – whatever it is – working.  Hence, I dive directly into the harder APIs and get stuck!

So for CNG, I started by creating a test program that enumerates everything made available by the API.  The test creates some things, too – namely, persisted private keys – but only to make the enumeration part more interesting.  The test program is available for download. Here’s the output from my Windows Vista laptop:

 >CngEnum.exe
Microsoft Primitive Provider - CNG registration information
Interface = 1 (BCRYPT_CIPHER_INTERFACE)
 User-mode:
  Functions: RC2, RC4, AES, DES, DESX, 3DES, 3DES_112
 Kernel-mode:
  Functions: RC2, RC4, AES, DES, DESX, 3DES, 3DES_112
Interface = 2 (BCRYPT_HASH_INTERFACE)
 User-mode:
  Functions: MD2, MD4, MD5, SHA1, SHA256, SHA384, SHA512
 Kernel-mode:
  Functions: MD2, MD4, MD5, SHA1, SHA256, SHA384, SHA512
Interface = 3 (BCRYPT_ASYMMETRIC_ENCRYPTION_INTERFACE)
 User-mode:
  Functions: RSA
 Kernel-mode:
  Functions: RSA
Interface = 4 (BCRYPT_SECRET_AGREEMENT_INTERFACE)
 User-mode:
  Functions: DH, ECDH_P256, ECDH_P384, ECDH_P521
 Kernel-mode:
  Functions: DH, ECDH_P256, ECDH_P384, ECDH_P521
Interface = 5 (BCRYPT_SIGNATURE_INTERFACE)
 User-mode:
  Functions: DSA, ECDSA_P256, ECDSA_P384, ECDSA_P521, RSA_SIGN
 Kernel-mode:
  Functions: DSA, ECDSA_P256, ECDSA_P384, ECDSA_P521, RSA_SIGN
Interface = 6 (BCRYPT_RNG_INTERFACE)
 User-mode:
  Functions: RNG, FIPS186DSARNG
 Kernel-mode:
  Functions: RNG, FIPS186DSARNG
 
Microsoft Software Key Storage Provider (Comment: (null))
Algorithms:
 Name: RSA  [Class = 3, Operations = 20, Flags = 0]
 Name: DH  [Class = 4, Operations = 8, Flags = 0]
 Name: DSA  [Class = 5, Operations = 16, Flags = 0]
 Name: ECDH_P256  [Class = 4, Operations = 24, Flags = 0]
 Name: ECDH_P384  [Class = 4, Operations = 24, Flags = 0]
 Name: ECDH_P521  [Class = 4, Operations = 24, Flags = 0]
 Name: ECDSA_P256  [Class = 5, Operations = 16, Flags = 0]
 Name: ECDSA_P384  [Class = 5, Operations = 16, Flags = 0]
 Name: ECDSA_P521  [Class = 5, Operations = 16, Flags = 0]
Implementation Type: 2
User keys:
 Name: CNG_Test_RSA_Key
  Alg = RSA, Flags = 0, KeySpec = 1, Size: 1024
 Name: CNG_Test_ECDH_Key
  Alg = ECDH_P521, Flags = 0, KeySpec = 0, Size: 521
 Name: CNG_Test_ECDSA_Key
  Alg = ECDSA_P521, Flags = 0, KeySpec = 0, Size: 521
 Name: TFTest_Key
  Alg = RSA, Flags = 0, KeySpec = 1, Size: 1024
 Name: CNG_Test_DSA_Key
  Alg = DSA, Flags = 0, KeySpec = 2, Size: 1024
 Name: CNG_Test_DH_Key
  Alg = DH, Flags = 0, KeySpec = 1, Size: 768
Machine keys:
 Name: CNG_Test_RSA_Key
  Alg = RSA, Flags = 0, KeySpec = 1, Size: 1024
 Name: CNG_Test_ECDH_Key
  Alg = ECDH_P521, Flags = 0, KeySpec = 0, Size: 521
 Name: CNG_Test_ECDSA_Key
  Alg = ECDSA_P521, Flags = 0, KeySpec = 0, Size: 521
 Name: CNG_Test_DSA_Key
  Alg = DSA, Flags = 0, KeySpec = 2, Size: 1024
 Name: CNG_Test_DH_Key
  Alg = DH, Flags = 0, KeySpec = 1, Size: 768
 
Microsoft Smart Card Key Storage Provider (Comment: (null))
Algorithms:
Implementation Type: 11
User keys:
 
Microsoft Enhanced Cryptographic Provider v1.0 - User Keys:
 CNG_Test_RSA_Key
 bf1eed19-d3cf-40ce-b500-15ae65a347b7
 le-User-df0836fb-bc75-4c74-b453-1cbc002778a2
 a7baa3a8-7561-4972-be93-53c1960181c5
 dan
 TestContainerForCngEnum
 
Microsoft Enhanced Cryptographic Provider v1.0 - Machine Keys:
 CNG_Test_RSA_Key
 TestContainerForCngEnum
 
Microsoft Enhanced DSS and Diffie-Hellman Cryptographic Provider - User Keys:
 dan
 TestContainerForCngEnum
 
Microsoft Enhanced DSS and Diffie-Hellman Cryptographic Provider - Machine Keys:
 dan
 TestContainerForCngEnum
Success
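
The Operations values in the Key Storage Provider listing above are bitmasks.  As a reading aid, the following sketch decodes them in portable C.  The flag values match the BCRYPT_*_OPERATION constants in bcrypt.h, redeclared here under local names so that the snippet compiles without the SDK.  The function name describe_operations is hypothetical, and treating the printed numbers as decimal is my assumption about how CngEnum formats them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Operation flags, mirroring the BCRYPT_*_OPERATION values in bcrypt.h.
   Local names are used so this compiles without the Windows SDK. */
#define OP_CIPHER                 0x00000001ul
#define OP_HASH                   0x00000002ul
#define OP_ASYMMETRIC_ENCRYPTION  0x00000004ul
#define OP_SECRET_AGREEMENT       0x00000008ul
#define OP_SIGNATURE              0x00000010ul
#define OP_RNG                    0x00000020ul

typedef struct {
    unsigned long flag;
    const char *name;
} op_name;

static const op_name k_op_names[] = {
    { OP_CIPHER,                "CIPHER" },
    { OP_HASH,                  "HASH" },
    { OP_ASYMMETRIC_ENCRYPTION, "ASYMMETRIC_ENCRYPTION" },
    { OP_SECRET_AGREEMENT,      "SECRET_AGREEMENT" },
    { OP_SIGNATURE,             "SIGNATURE" },
    { OP_RNG,                   "RNG" },
};

/* Writes a " | "-separated list of operation names for 'ops' into 'buf'
   and returns 'buf'. Bits with no known name are appended in hex.
   Output is truncated (but still NUL-terminated) if 'buf' is too small. */
char *describe_operations(unsigned long ops, char *buf, size_t buf_len)
{
    size_t used = 0;
    assert(buf != NULL && buf_len > 0);
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof(k_op_names) / sizeof(k_op_names[0]); i++) {
        if (ops & k_op_names[i].flag) {
            int n = snprintf(buf + used, buf_len - used, "%s%s",
                             used ? " | " : "", k_op_names[i].name);
            if (n < 0 || (size_t)n >= buf_len - used) {
                return buf;             /* out of room: stop here */
            }
            used += (size_t)n;
            ops &= ~k_op_names[i].flag; /* clear the bit just reported */
        }
    }
    if (ops != 0) {
        snprintf(buf + used, buf_len - used, "%s0x%lx",
                 used ? " | " : "", ops);
    }
    return buf;
}
```

Read this way, RSA's value of 20 combines asymmetric encryption (4) and signature (16), DH's 8 is secret agreement alone, and the 24 reported for the ECDH entries is secret agreement plus signature.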

Appendix B – System Components Using CNG

When I started working on this article, I wasn’t certain if many components of the operating system had been ported to use CNG, or if most of the crypto consumers were still using the previous version of the API.  A few simple scripts and debugging tools helped to answer that question.

Based on the CNG documentation, I knew that any component calling CNG would have a dependency on either bcrypt.dll or ncrypt.dll. The first step was to locate the processes running in Windows Vista with either of those DLLs loaded.  Having recently seen a presentation by Mark Russinovich of Sysinternals, I knew that his Process Explorer tool would be able to provide this information.  However, my preference in this case was for a command-line based tool.  It turns out that Sysinternals provides one: ListDlls (see the Resources section for download information). Using that tool, here’s what I found:

 >c:\temp\Listdlls.exe -d ncrypt.dll | findstr /i pid
lsass.exe pid: 608
svchost.exe pid: 944
svchost.exe pid: 1080
svchost.exe pid: 1504
sqlservr.exe pid: 516
taskeng.exe pid: 2352
taskeng.exe pid: 3760
OUTLOOK.EXE pid: 2588
cmd.exe pid: 3664
iexplore.exe pid: 2908
iTunes.exe pid: 5756

In analyzing the tool output, I knew I was discovering something interesting when I realized that even iTunes is using CNG!  I then realized that those applications must share a dependency which, in turn, depends on CNG.  What’s the shared dependency?  Starting iTunes in a debugger shows it easily enough.

 >windbg.exe -g -c "sxe ld:ncrypt.dll ; k20" -y srv*c:\Symbols*https://msdl.microsoft.com/download/symbols "c:\Program Files\iTunes\iTunes.exe"
 
ChildEBP RetAddr 
…
0012ebcc 7575ed3d kernel32!LoadLibraryA+0xb7
0012ec10 7575edad CRYPT32!__delayLoadHelper2+0x55
0012ec4c 75763b44 CRYPT32!_tailMerge_ncrypt_dll+0xd
0012ec70 7575ee20 CRYPT32!I_CryptGetCNGAlgorithmProvider+0x5b
0012ec84 7575f0e1 CRYPT32!I_CryptGetCNGHashProvider+0x11
0012ec9c 7575f90c CRYPT32!ICM_CreateHash+0x34
0012ecc0 7575fb89 CRYPT32!ICM_CreateHashList+0x6d
0012ed20 7575fda6 CRYPT32!ICM_UpdateDecodingSignedData+0x8c
0012ed74 74f88837 CRYPT32!CryptMsgUpdate+0x1d6
0012eda8 74f8373c WINTRUST!_GetMessage+0x18b
0012edc0 74f83615 WINTRUST!SoftpubLoadMessage+0x75
0012eed8 74f8346c WINTRUST!_VerifyTrust+0x22d
0012eefc 005c1695 WINTRUST!WinVerifyTrust+0x50
0012ef38 775d17ab iTunes+0x1c1695

Now we know the common dependency: WinVerifyTrust, the Win32 function that provides certificate-based digital signature checking.  That routine, in wintrust.dll, calls into crypt32.dll, which in turn takes a delay-load dependency on ncrypt.dll, a fact that can also be confirmed via your favorite “Portable Executable” binary dumper:

 >link.exe /dump /imports c:\Windows\System32\crypt32.dll
…
 Section contains the following delay load imports:
…
    ncrypt.dll

While it’s interesting that the operating system’s signature checking logic is now based on CNG rather than the legacy Crypto API, I was also interested to learn that Microsoft has not yet exposed pluggable signature algorithm support via WinVerifyTrust (i.e., that API is not crypto-agnostic).  So why did they bother to port it over to CNG then?  I don’t know for sure, but I speculate that one goal of the Vista release was to make the entire crypto stack CNG-based by default.

Appendix C – Additional Resources

About the Author

Dan Griffin is a software security consultant in Seattle, WA.  He previously spent seven years at Microsoft on the Windows Security development team.  Dan can be contacted at https://www.jwsecure.com.
