ramblings on PHP, SQL, the web, politics, ultimate frisbee and what else is on in my life
back

Optimistic locking

I am currently touring the US as part of the CodeWorks 09 conferences, where I am holding a 3 hours tutorial on database driven PHP development. So far my sessions have consistently been less attended than the talks on testing, OO and security that are running as well. So it goes. I guess the plus side for the attendees is that I can spend more time addressing their personal needs. My tutorial essentially is based on all my previous talks on database for which I have updated all the relevant slide sets. I have also added an introduction to PDO to get things started. Since all of that will never fit into 3 hours, I let the audience pick which talks they want to hear. Unfortunately so far the Doctrine ORM talk hasn't been picked, but I am getting better and better at holding the talk about the telephone directory application and I think the work I put into making this talk very much code based is really playing off. However the slide that seems to get the most "aha, that is really cool!" is the slide about optimistic locking.

Many of our web applications require data entry forms and often times they need to handle concurrent changes. Imagine for example an address database. Multiple admins are maintaining the data and corrections and changes can happen in parallel. So if someone opens up a record, but gets side tracked, how do you prevent that person from overwritten someone else's changes when they come back after a while? One approach would be to lock the record when the edit form is opened. Of course you cannot keep that lock until forever. So the problem is not really solved at all. Worse yet, you prevent people from working concurrently together. This approach is called pessimistic locking, since you are expecting concurrent access to happen and therefore you lock things explicitly.

A much better approach in many situations is to be optimistic. Instead of locking the record when you are editing the form you simply store the time of the last change (or some counter that ensures that you can easily generate unique values that will never repeat) in your edit form. This way you can check if the record has been changed between opening and submitting the data record, enabling your application to handle this case gracefully. Like your app can present a screen showing the new current values in the database as well as those that were submitted along with some tools to merge the records.

So here is how it plays out:

Stpe one the record is created, note the "cntr" column which in this case is filled with the current unix timestamp


<?php
// id = 23, unix_t() = 1179145598 = $_SERVER['REQUEST_TIME']
$dbh->exec("INSERT INTO addr (id, cntr, street, city) VALUES (nextval(‘addr’)), unix_t(), ‘Foo Street’, ‘Bar Town’)");
?>

In the next step any number of people are currently opening the edit form for this record. Note that we fetch the data of the record, the id and also the cntr column. The id and cntr are stored in hidden form fields:


<?php
$dbh->query("SELECT id, cntr, street, city FROM addr");
?>

Now when the first users submits the page we update the data in the database as well as the cntr column. Also note that in the WHERE statement we filter by both the id and the cntr:


<?php
if ($dbh->exec("UPDATE addr SET street = ‘Foo Ave’, cntr = unix_t() WHERE id = 23 AND cntr = 1179145598") {
  // record updated
} else {
  // there was a concurrent data change
 // present screen with the current and old data
  $dbh->query("SELECT id, cntr, street, city FROM addr");
}
?>

The trick is to check the affected rows after the update. Conveniently the PDO::exec() method returns this value. If the affected rows is 0 we know that there was concurrent access. Note that this also works not only for concurrent updates, but also for deletes.

Comments



Re: Optimistic locking

The way i used to do it is almost the same. Keep auto updatable field in the table and get ti with the row itself.

Then on submission just check the time in the row you are updating in comparison to the time in the row you read.

For single rows updates that works very well as collision window is very very small (if you have db transaction there is none).

The only problem is that if you want to do mass update you would have to check earch rows time first or do a where update>max of read times or something ... any way it might be impractical at this stage.

Art