MSDTC: The Magic of Phase Zero (Phase0) – Or – When Using 2PC Transactions Is Not Enough

The best-known technique for implementing distributed transactions is the "two-phase commit" (2PC). Here is a quick summary of how it works:
   - Active phase: After the transaction is created and before the "commit" message is issued by the creator, the transaction is in the "active" state. During this phase, resource managers can "enlist" and become part of the transaction. When "commit" is issued, the transaction moves to Phase1.
   - Phase1 (preparing phase): after receiving the Commit message from the creator of the transaction, the superior transaction manager (TM) asks the enlisted resource managers and subordinate transaction managers to "prepare" for commit. If all of them respond with a successful "prepared" answer, the transaction moves to Phase2; otherwise the transaction is aborted.
   - Phase2 (committing phase): the superior TM writes a durable log entry recording the status of the transaction and then starts sending "commit" messages to all the resource managers and subordinate TMs that are part of the transaction. If any error occurs (communication failure, computer crash, etc.), the superior TM keeps sending the "commit" messages until it receives a successful "committed" message from every participant. Once all the participants have responded "committed", the log entry for the transaction is removed from the log and the transaction ends.
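
The summary above can be sketched as a toy, in-memory simulation. This is not the MSDTC API: the `Participant` and `Coordinator` classes, the `log` list standing in for the durable log, and the `run_two_phase_commit` helper are all hypothetical names chosen for illustration. The retry loop for failed "commit" deliveries is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical resource manager that votes in Phase1 and commits in Phase2."""
    name: str
    can_prepare: bool = True
    state: str = "active"

    def prepare(self) -> bool:
        self.state = "prepared" if self.can_prepare else "aborted"
        return self.can_prepare

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


@dataclass
class Coordinator:
    """Hypothetical superior TM; `log` stands in for the durable log."""
    participants: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def enlist(self, p: Participant) -> None:
        self.participants.append(p)   # happens during the active phase

    def commit(self) -> str:
        # Phase1: ask everyone to prepare; any "no" vote aborts the transaction.
        votes = [p.prepare() for p in self.participants]
        if not all(votes):
            for p in self.participants:
                p.abort()
            return "aborted"
        # Phase2: record the decision, deliver commit, then forget the entry.
        self.log.append("commit-decision")
        for p in self.participants:
            p.commit()
        self.log.remove("commit-decision")
        return "committed"


def run_two_phase_commit(can_prepare_flags):
    tm = Coordinator()
    for i, ok in enumerate(can_prepare_flags):
        tm.enlist(Participant(f"rm{i}", can_prepare=ok))
    outcome = tm.commit()
    return outcome, [p.state for p in tm.participants]
```

Calling `run_two_phase_commit([True, True])` commits both participants, while a single `False` vote aborts all of them.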

In addition to the "two-phase commit" technique, MSDTC supports another phase called Phase Zero (Phase0), which occurs after the creator of the transaction calls Commit and before Phase1 starts. To participate in Phase0 you need to enlist as a Phase0 participant: https://msdn.microsoft.com/library/?url=/library/en-us/cossdk/htm/transactionphase0enlistment_6j04.asp?frame=true and https://msdn.microsoft.com/library/?url=/library/en-us/cossdk/htm/pgdtc_dev_8ldf.asp?frame=true. MSDTC will always start with Phase0 and remain there until all the Phase0 participants have responded.

The great benefit of Phase0 is that any type of enlistment (including another Phase0 enlistment) is still allowed during this phase. Once the transaction moves to Phase1, new enlistments are denied. Since you can create a new Phase0 enlistment during Phase0, MSDTC executes this phase in waves. After "commit" is issued, the TM first sends "Phase0Request" messages (https://msdn.microsoft.com/library/en-us/cossdk/htm/itransactionphase0notifyasync_8ywk.asp) to all known Phase0 enlistments. If all of them successfully reply Phase0Done (https://msdn.microsoft.com/library/en-us/cossdk/htm/itransactionphase0enlistmentasync_5pt1.asp), the TM looks for new Phase0 enlistments that occurred in the meantime and starts another Phase0 wave if it finds any, and so on. Phase0 ends when no new Phase0 enlistment is found after a wave. After Phase0 ends, the transaction moves to Phase1.
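
To make the wave behaviour concrete, here is a minimal simulation of the loop described above. It is only a sketch under stated assumptions: the `Transaction` class and its `enlist_phase0`, `enlist_durable`, and `commit` methods are hypothetical, and a callback returning normally stands in for a Phase0Done reply. None of this is the real MSDTC interface.

```python
class Transaction:
    """Toy transaction that models only the Phase0 wave loop."""

    def __init__(self):
        self.phase = "active"
        self.pending_phase0 = []   # Phase0 enlistments not yet notified
        self.durable = []          # ordinary (Phase1/Phase2) enlistments
        self.waves = 0

    def enlist_phase0(self, on_phase0_request):
        self._check_enlistment_allowed()
        self.pending_phase0.append(on_phase0_request)

    def enlist_durable(self, name):
        self._check_enlistment_allowed()
        self.durable.append(name)

    def _check_enlistment_allowed(self):
        if self.phase not in ("active", "phase0"):
            raise RuntimeError("enlistment denied: transaction is past Phase0")

    def commit(self):
        self.phase = "phase0"
        while self.pending_phase0:
            # Take the current wave; enlistments made during it form the next one.
            wave, self.pending_phase0 = self.pending_phase0, []
            self.waves += 1
            for on_phase0_request in wave:
                on_phase0_request(self)   # returning stands in for Phase0Done
        self.phase = "phase1"             # 2PC would continue from here


def demo():
    tx = Transaction()

    def second(tx):
        tx.enlist_durable("DB2")

    def first(tx):
        tx.enlist_durable("DB1")
        tx.enlist_phase0(second)   # enlisting during Phase0 triggers another wave

    tx.enlist_phase0(first)
    tx.commit()
    try:
        tx.enlist_durable("DB3")   # too late: the transaction left Phase0
        late_enlistment = "accepted"
    except RuntimeError:
        late_enlistment = "denied"
    return tx.waves, tx.durable, late_enlistment
```

In this run the first wave notifies `first`, which enlists a new Phase0 participant. A second wave is therefore needed before the transaction can move on, and the enlistment attempted afterwards is rejected.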

Phase0 allows the following three scenarios: caching resource managers, protection against "early" commits and "safe" propagation of transactions in async programming.

Caching Resource Managers
Using a Phase0 enlistment, a middle-tier component can delay the connection to the database(s) as long as possible, acting as a cache for the requests from the client and thus reducing network traffic and database locks. This works well in scenarios where the client makes many changes to the same data before it finally decides to finish and save the work. When the client "ends" the work, the transaction is committed, and the TM issues a Phase0Request to the caching component. Before replying Phase0Done, the component opens all the necessary connections to the databases that need to be involved and persists the final data. These databases are now "durably" enlisted in the transaction. After Phase0Done is received, the TM continues with the 2PC and commits the transaction. See also https://msdn.microsoft.com/library/?url=/library/en-us/cossdk/htm/pgdtc_dev_0y2b.asp?frame=true
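
The caching idea can be illustrated with a small stand-alone sketch. Everything in it is an assumption made for illustration: `FakeDatabase` replaces a real database, `CachingComponent` is a hypothetical middle-tier component, and `on_phase0_request` is simply called by hand where the TM would deliver the Phase0Request.

```python
class FakeDatabase:
    """Stand-in for a real database; counts connections and stores writes."""

    def __init__(self):
        self.connections = 0
        self.rows = {}

    def open_connection(self):
        self.connections += 1
        return self

    def write(self, key, value):
        self.rows[key] = value


class CachingComponent:
    """Hypothetical middle-tier cache that touches the database only in Phase0."""

    def __init__(self, database):
        self.database = database
        self.cache = {}

    def update(self, key, value):
        self.cache[key] = value   # kept in memory; no connection is opened

    def on_phase0_request(self):
        # Connect late, persist only the final values, then return
        # (returning stands in for replying Phase0Done).
        conn = self.database.open_connection()
        for key, value in self.cache.items():
            conn.write(key, value)


def demo():
    db = FakeDatabase()
    component = CachingComponent(db)
    for balance in (100, 80, 95, 70):
        component.update("account-1", balance)
    connections_before_commit = db.connections
    component.on_phase0_request()   # the TM would trigger this after Commit
    return connections_before_commit, db.connections, db.rows
```

Four client updates result in a single connection and a single write of the final value, which is the saving the caching scenario is after.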

Protection Against "Early" Commits
In general, 2PC transactions help ensure the integrity of data in the face of various types of system errors. Using Phase0 you can also protect against badly written software.

Let me give an example: let's say a transaction is created on the client side and then propagated to a middle tier component that is asked to do work as part of that transaction. The work that this component needs to do involves two databases, DB1 and DB2, as follows:

void DoTransfer(int amount, Transaction tx)
{
  DB1.OpenConnection(tx); // enlists DB1 in the transaction
  DB1.DoDebit(amount);
  DB2.OpenConnection(tx); // enlists DB2 in the transaction
  DB2.DoCredit(amount);
}

The component uses the received transaction to communicate with the two databases to ensure the integrity of its data. Both the Debit and Credit operations must succeed, or neither should occur. So far this looks fine. But what if the client doesn't wait for DoTransfer to finish and calls Commit right after DB1.DoDebit has returned and before DB2.OpenConnection has started? A badly written client can do that; maybe the programmer wanted to use multi-threading but didn't quite get it right. The component is then left with inconsistent data in the databases: a debit operation that occurred without a credit operation. There is nothing wrong with the transaction itself; from the point of view of the 2PC transaction, only one enlistment occurred and that one responded successfully to both the "prepare" and "commit" phases. It is the application logic that is wrong here.

Can you protect against this type of situation? Yes, you can. One way is to write bug-free code, but mistakes happen. If you don't own the client code, the problem is even harder. The best solution is to use a Phase0 enlistment and modify the middle tier code as follows:

void DoTransfer(int amount, Transaction tx)
{
 EnlistPhase0(tx); // creates a Phase0 enlistment with the transaction

 DB1.OpenConnection(tx);
 DB1.DoDebit(amount);
 DB2.OpenConnection(tx);
 DB2.DoCredit(amount);

 SignalWorkCompletedToPhase0Enlistment(tx); // signals the Phase0 enlistment that the work was completed
}

Your Phase0 enlistment should start with a "WorkCompleted" flag set to false. The method SignalWorkCompletedToPhase0Enlistment mentioned above will set this flag to true. If a Phase0Request is received and the flag is still false, it means that the Commit was issued before the DoTransfer finished its work. At this point, you have two options:
- one option (the recommended one when "early" commits are not expected) is to abort the transaction and log an error; if you own the client code you might want to catch and fix these "early" commits
- the other option is to hold the Phase0Request, received while DoTransfer is still doing work, until the flag becomes true and only then let the transaction continue by replying with Phase0Done; use this option when you expect to receive "early" commits as part of your system logic and flow.
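
The flag and the two options can be sketched as follows. This is a hedged illustration, not MSDTC code: `Phase0Enlistment`, `on_phase0_request`, and the string results `"Phase0Done"` and `"Abort"` are hypothetical names. The `timeout` parameter is an extra assumption, added only so that the blocking variant cannot hang forever in the sketch.

```python
import threading


class Phase0Enlistment:
    """Hypothetical Phase0 enlistment that tracks a WorkCompleted flag."""

    def __init__(self, block_on_early_commit):
        self.block_on_early_commit = block_on_early_commit
        self.work_completed = threading.Event()   # starts unset, i.e. False

    def signal_work_completed(self):
        self.work_completed.set()

    def on_phase0_request(self, timeout=None):
        """Return 'Phase0Done' to let the commit proceed, or 'Abort'."""
        if self.work_completed.is_set():
            return "Phase0Done"
        if not self.block_on_early_commit:
            return "Abort"   # option 1: treat the early commit as a bug
        # Option 2: hold the request until the work finishes.
        finished = self.work_completed.wait(timeout)
        return "Phase0Done" if finished else "Abort"


def early_commit_outcome(block_on_early_commit):
    enlistment = Phase0Enlistment(block_on_early_commit)
    # Simulate work that finishes shortly after the early commit arrives.
    worker = threading.Timer(0.05, enlistment.signal_work_completed)
    worker.start()
    outcome = enlistment.on_phase0_request(timeout=5)
    worker.join()
    return outcome
```

With `block_on_early_commit=False` the early commit is rejected immediately. With `True` the request is held until the simulated work signals completion.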

Asynchronous Programming
This is very similar to the early-commit scenario described above, but this time the early commits are "by design". One can encounter scenarios where a piece of the application starts a transaction, delegates the work asynchronously to other parts of the application (a different thread of execution, a remote location) and, without waiting for a response, continues with committing the transaction. In these scenarios we choose the second option from above, the one that blocks the commit until the work is finished. The code is similar:
 Client code:
  Transaction tx = StartTransaction();
  AsyncDoTransfer(tx); // e.g. create a separate thread and let it do the transfer operation
  SomeOtherAsyncWork(tx);
  CommitTransaction(tx);

 Server code:
  // inside AsyncDoTransfer that receives the transaction from the client
  EnlistPhase0AndBlock(tx); // if we get a Phase0Request we will block and wait until SignalWorkCompletedToPhase0Enlistment is called and only then reply with Phase0Done
  
  DB1.OpenConnection(tx);
  DB1.DoDebit(amount);
  DB2.OpenConnection(tx);
  DB2.DoCredit(amount);

  SignalWorkCompletedToPhase0Enlistment(tx); // signals the Phase0 enlistment that the work was completed
  return;

Looks nice and safe... Well, there is a problem even with this code. What if Commit is called before we are even able to enlist in Phase0 in the AsyncDoTransfer method? We can again end up with inconsistent data. The solution is to create some sort of acknowledgment from the server to the client saying that it has at least reached the Phase0 enlistment: "hey client, I know I'm slow and I will do your requested work later, but here is my ack that you can Commit your transaction safely any time from now on". To accomplish this we have to use Phase0 enlistments on both the client and server side, as follows:

 Client Code:
  Transaction tx; // shared between MainFunction and AckFromDoTransfer

  void MainFunction()
  {
    tx = StartTransaction();

    EnlistPhase0AndBlock(tx); // if we get a Phase0Request we will block and wait until SignalWorkCompletedToPhase0Enlistment is called and only then reply with Phase0Done
    AsyncDoTransfer(tx, AckFromDoTransfer); // e.g. send an async message to a server and let it do the transfer operation

    CommitTransaction(tx);
  }

  void AckFromDoTransfer()
  {
    SignalWorkCompletedToPhase0Enlistment(tx); // signals the client's Phase0 enlistment that the server's ack has arrived
  }

 Server Code:
  void AsyncDoTransfer(tx, AckFromDoTransfer)
  {
    EnlistPhase0AndBlock(tx); // if we get a Phase0Request we will block and wait until SignalWorkCompletedToPhase0Enlistment is called and only then reply with Phase0Done
  
    AckFromDoTransfer(); // from now on the client can safely Commit the transaction, since I'm protected by Phase0
   
    // doing the work that takes a lot of time
    DB1.OpenConnection(tx);
    DB1.DoDebit(amount);
    DB2.OpenConnection(tx);
    DB2.DoCredit(amount);

    SignalWorkCompletedToPhase0Enlistment(tx); // signals the Phase0 enlistment that the work was completed
  }

The Phase0 enlistment on the client side makes sure that the transaction stays in Phase0 until we get all the acks from our servers. Think of this as a way to pass Phase0 "ownership" from the client to the servers.

All of this works thanks to the "magic" of Phase0, which allows an "infinite" number of enlistments: Phase0 continues until no new Phase0 enlistments are being created and the existing ones have all replied with a successful Phase0Done. I compare the Phase0 enlistment to the AddRef method from good old COM lifetime management. A Phase0 enlistment calls "AddRef" on the "active phase" of the transaction, while a Phase0Done calls "Release" on the counter. While the counter is higher than zero, the transaction is not allowed to enter Phase1.
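
The AddRef/Release analogy can be written down as a tiny counter model. This sketch only illustrates the analogy; it does not describe how MSDTC is implemented internally, and `Phase0Gate` and its methods are hypothetical names. The trace replays the client-to-server handoff: the client enlists first, the server enlists when it receives the transaction, the ack releases the client's enlistment, and the server releases its own once the work is done.

```python
class Phase0Gate:
    """Toy counter: Phase0 enlistments increment it, Phase0Done decrements it."""

    def __init__(self):
        self.outstanding = 0

    def enlist_phase0(self):      # plays the role of AddRef
        self.outstanding += 1

    def phase0_done(self):        # plays the role of Release
        if self.outstanding == 0:
            raise RuntimeError("Phase0Done without a matching enlistment")
        self.outstanding -= 1

    def may_enter_phase1(self):
        return self.outstanding == 0


def handoff_trace():
    gate = Phase0Gate()
    trace = []
    gate.enlist_phase0()                     # client enlists before delegating
    trace.append(gate.may_enter_phase1())
    gate.enlist_phase0()                     # server enlists on receiving tx
    gate.phase0_done()                       # server's ack releases the client
    trace.append(gate.may_enter_phase1())
    gate.phase0_done()                       # server finishes its work
    trace.append(gate.may_enter_phase1())
    return trace
```

At no point between the client's enlistment and the server's completion does the counter reach zero, which is why the transaction cannot slip into Phase1 during the handoff.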