two phase commit

You want to go out to with three of your friends (Bob, Alice and Picard) to some place for dinner. All of you just agree that you want to eat at 7pm that night but have not agreed on a location yet.

How do you actually manage to get your gang of no good friends to actually meet up at the same location?

Being the person who originally thought of going out, you have the privilege of selecting some spot. But then you have the onerous task of making sure everyone knows the place and commits to it.

You call your first friend Bob and tell him that tentatively you want to meet up at Earls at 7pm. He says OK. Then you call your friend Alice and tell her that you want to meet up at Earls at 7pm, she also says Yes. But when you call Picard and tell him about Earls, he says he hates the place. You say ugh, and then call back Bob and Alice to cancel Earls.

You now pick up a new spot, maybe Jon Howie’s steakhouse, because you really really want to hang out with your friends. You call Picard, he is cool with it and so is Bob but Alice says it sucks. So you call Picard and Bob and cancel yet again.

Attempt #3, you now try with Daniel’s Broiler. Third time’s a charm and everyone tentatively agrees to meet at Daniel’s Broiler at 7pm that night. Problem solved! But just one caveat.. in the first call to your friends you had only told them the location tentatively. Now you have to call all three friends again to confirm the location.

And now you are all set!

But what if instead of calling them in the first tentative round, you had sent them text messages? Imagine Alice and Bob had responded in ‘Yes’ but Picard, being the asshole he is, is just not responding to your text. Your whole evening is potentially in ruins and Alice and Bob are waiting for the confirmation. At this point, you have to enforce some sort of time out. E.g, after 1 hour of no response from Picard you will assume that the plans are off and will send a cancellation message to both Bob and Alice.

No dinner tonight but at least you don’t leave your other two friends hanging.

Or did you leave them hanging?

Imagine that your text messaging service is unreliable and you can’t tell whether your friends actually received your texts. So e.g, Bob did get your text for cancelation but Alice did not. Would she still show up at Daniel’s Broiler and dine alone while cursing you and your friends but mainly you because you were supposed to be the esteemed coordinator?

She won’t because she would not assume that she needs to get to Daniel’s Broiler unless she receives a second text from you confirming the plan.

So now let’s think what else could go wrong.

Ah yes, what if all three friends do accept your proposal but your second text to confirm the proposal reaches Alice and Picard but does not reach Bob. At this point we have a real problem. Because now you, Alice and Picard would indeed show up at Daniel’s Broiler but Bob would not.

This, my dear furry friends, is not a solvable problem because of the physics involved. You and your friends can complain and bicker about Bob all through the night while enjoying some juicy steaks. The only way out is to record the fact that something went wrong and then have another dinner the following week.

This is the basic flow of the two phase commit protocol. You are the transaction coordinator and your friends are the participants. This is one of the mechanisms by which disjoint computer systems that potentially span lossy communication networks are kept in sync with each other.

And the unsolvable problem mentioned above is in fact a variation of the Byzantine fault in distributed systems with lossy communication.

Leave a comment