ramblings on PHP, SQL, the web, politics, ultimate frisbee and what else is on in my life
back

Musings on ordered lists inside RDBMS

On my current project my team had to develop a portlet interface. Users can load portlets and organize them in multiple tabs with 3 columns per tab. They can reorganize the order of their tabs and move portlets within a tab an also move them to new tabs. Portlets are always placed at the top left when they get added or moved to a tab. Furthermore portlets and tabs can be removed, though the last delete operation can always be undone. All of this essentially required me to devise a plan for how to manage ordered lists inside an RDBMS.

Note that while this was written for MySQL (it makes heavy use of MySQL session user variables), I am using the Oracle style named placeholder support that PDO emulates for MySQL. So do not get confused by ":foo" in the SQL statements. This is just like the "?" you should know if you ever used prepared statements with MySQL. Furthermore I am using pseudo code control logic around the SQL. I think it should be sufficiently self explanatory.

Adding a tab requires fairly simple SQL. A new tab is placed at position zero if there are no tabs for the given user yet. Otherwise the position is one larger than the current largest tab position for the given user. For those who have never used COALESCE(), it simply returns the first non NULL parameter.

INSERT INTO user_tabs (user_id, label, pos)
    (SELECT :user_id, :label, COALESCE(MAX(pos)+1, 0)
        FROM user_tabs
        WHERE user_id = :user_id GROUP BY user_id
    );

Moving a tab is already a bit more tricky and I guess I shoud fold the first SELECT into the UPDATE statements, because the "FOR UPDATE" is probably a futile attempt at controlling concurrency with InnoDB's MVCC implementation.

START TRANSACTION;
$pos = SELECT pos FROM user_tabs WHERE user_id = :user_id AND id = :id FOR UPDATE;

if ($pos > new pos) {
    // move tab to the left
    UPDATE user_tabs
        SET pos = CASE WHEN id = :id THEN :newpos ELSE (pos + 1) END
        WHERE user_id = :user_id AND pos BETWEEN :newpos AND :oldpos;
} else {
    // move tab to the right
    UPDATE user_tabs
        SET pos = CASE WHEN id = :id THEN :newpos ELSE (pos - 1) END
        WHERE user_id = :user_id AND pos BETWEEN :oldpos AND :newpos
}
similar
COMMIT;

Now adding a portlet follows a similar pattern as adding a tab. With the JOIN hidden within the SELECT inside the INSERT, I also make sure that the portlet is being added to a tab that is actually owned by the current user and that it even exists to begin with. Notice that I am adding the portlet at position negative one (or actually one lower than the current minimum). Remember that portlets are added to the top left corner, so I need to push all the other portlets down, which I cannot do inside the INSERT. When I SELECT the portlets to display (the query is not shown here) I only read those that have a positon greater or equal to zero. This way I make sure that I get a consistent view because the added portlet will not show up until the subsequent UPDATE pushes the portlets up one level.

START TRANSACTION;

INSERT INTO user_tab_portlets (tab_id, type, subtype, pos, col, is_open, config)
    (SELECT :tab_id, :type, :subtype, COALESCE(MIN(utp.pos)-1, -1), :col, :is_open, :config
        FROM user_tabs ut
            LEFT JOIN user_tab_portlets utp ON (ut.id = utp.tab_id AND utp.col = :col)
        WHERE ut.id = :tab_id
        GROUP BY utp.tab_id, utp.col);

$id = SELECT LAST_INSERT_ID();

if ($id is empty) {
    $msg = 'Tab "id" does not exist';
    ROLLBACK;
}

// move things to 0 as the start again
UPDATE user_tab_portlets SET pos = pos + 1
    WHERE tab_id = :tab_id AND col = :col;

COMMIT;

Next up is moving a portlet to a different tab. This means the portlet needs to be removed from the current tab and added to the top left of the new tab. Again my initial SELECT should probably be folded into the UPDATE. With this fancy CASE statement I am dealing with all of the cases in a single UPDATE statement. I hope Peter approves of this use of a control statement.

START TRANSACTION;

$portlet = SELECT upt.tab_id, upt.col, upt.pos FROM user_tab_portlets upt
    INNER JOIN user_tabs ut ON (upt.tab_id = ut.id) WHERE upt.id = :id FOR UPDATE;

UPDATE user_tab_portlets SET
    tab_id = (CASE WHEN id = :id THEN :newtab_id ELSE tab_id END),
    pos = (CASE WHEN id = :id THEN 0
        ELSE (CASE WHEN tab_id = :newtab_id THEN pos + 1 ELSE pos - 1 END) END),
    col = (CASE WHEN id = :id THEN :newcol ELSE col END)
    WHERE (tab_id = :oldtab_id AND pos >= :oldpos AND col = :oldcol)
        OR (tab_id = :newtab_id AND col = :newcol)
    ORDER BY pos ASC;

COMMIT;

The other move operation is putting a portlet in a different position inside the current tab. Once more the SELECT statement should probably be folded into the UPDATE statements.

START TRANSACTION;

$portlet = SELECT upt.tab_id, upt.col, upt.pos FROM user_tab_portlets upt
    INNER JOIN user_tabs ut ON (upt.tab_id = ut.id) WHERE upt.id = :id FOR UPDATE;

if (column changed) {
    UPDATE user_tab_portlets SET
        pos = (CASE WHEN id = :id THEN :newpos
            ELSE (CASE WHEN tab_id = :tab_id THEN pos + 1 ELSE pos - 1 END) END),
        col = (CASE WHEN id = :id THEN :newcol ELSE col END)
        WHERE tab_id = :tab_id AND ((pos >= :newpos AND col = :newcol)
            OR (pos >= :oldpos AND col = :oldcol))
        ORDER BY pos;
} else if (positon inside a column changed) {
    if (old pos > new pos) {
        // move portlet up
        UPDATE user_tab_portlets
            SET pos = CASE WHEN id = :id THEN :pos ELSE (pos + 1) END
            WHERE pos >= :pos AND col = :col AND tab_id = :tab_id;
    } else {
        // move portlet down
        UPDATE user_tab_portlets
            SET pos = CASE WHEN id = :id THEN :newpos ELSE (pos - 1) END
            WHERE pos BETWEEN :oldpos AND :newpos AND col = :col AND tab_id = :tab_id;
    }
}

COMMIT;

Now for the last operation I am still wondering the most if I am on the right path: pruning tabs/portlets that have been marked for deletion on the next page load or delete operation (remember only one step can be undone). What in theory should be a simple delete on two tables, in the end forced me to do some cleanups to ensure that the positions are always just one apart. This is important in order to ensure that my move operations work. Otherwise the hardcoded "-1" and "+1" would not be enough to ensure that a tab/portlet is moved.

Of course I could tweak the above queries to not have these hardcoded additions/substractions, but that would mean that I would have to somehow dynamically compute the "distance" between two items in these ordered lists. I guess it could be possible to do this if I do the UPDATE's in the moves with an ORDER BY pos and use MySQL variables to remember the position of the previous row. What do you guys think?

Also I have another problem with the current implementation is that the SELECT .. FOR UPDATE will not really be ideal to handle concurrency, since it does a LOCK of course, but worse does not even lock all the relevant rows. Folding it into the UPDATE does not however seem possible like in the above cases.

/*
is the cleaning up needed? probably yes, because otherwise I need to take into
account varying "differences" when moving tabs/portlets (-1 and +1 will not
ensure that a tab/portlet moves up/down)
*/
START TRANSACTION;

// get all tabs that were marked to be removed or that contain portlets that
// were marked to be removed for the current user
// we need this so that we know in what tabs we need to clean up the positions
$tabs = SELECT ut.id FROM user_tab_portlets utp, user_tabs ut
    WHERE utp.tab_id = ut.id AND ut.user_id = :user_id
        AND (utp.is_removed = 1 OR ut.is_removed = 1) FOR UPDATE;

// delete all portlets whos tab was marked for removal or that were marked for removal directly
DELETE utp FROM user_tab_portlets utp, user_tabs ut
    WHERE utp.tab_id = ut.id AND ut.user_id = :user_id
        AND (utp.is_removed = 1 OR ut.is_removed = 1);

if (affected rows > 0) {
    // clean up portlet positions
    foreach ($tabs as $tab) {
        SET @poscolleft := -1, @poscolmiddle := -1, @poscolright := -1;
        UPDATE user_tab_portlets SET pos =
            (
            CASE
                WHEN col = 'left' THEN (@poscolleft := @poscolleft + 1)
                WHEN col = 'middle' THEN (@poscolmiddle := @poscolmiddle + 1)
                ELSE (@poscolright := @poscolright + 1) END
            )
            WHERE tab_id = :tab_id ORDER BY pos;
    }
}

// delete all tabs that were marked for removal
DELETE FROM user_tabs WHERE is_removed = 1 AND user_id = :user_id;

if (affected rows > 0) {
    // clean up tab positions
    SET @pos := -1;
    UPDATE user_tabs SET pos = (@pos := @pos + 1)
        WHERE user_id = :user_id ORDER BY pos;
}

COMMIT;

Would love to get some feedback on these queries from you all!

Update [17/06/2008]:
Actually folding the SELECT into the UPDATE is not possible (which is why I probably did not do it back when I made the initial implementation of all of this a few weeks ago):

mysql> UPDATE user_tabs SET pos = pos WHERE pos BETWEEN 1 AND (SELECT pos FROM user_tabs WHERE id = 1);
ERROR 1093 (HY000): You can't specify target table 'user_tabs' for update in FROM clause

Comments



Re: Musings on ordered lists inside RDBMS

Regarding the FOR UPDATE for the user_tabs transaction....

why even bother at all? Would the exact same user be changing their tab/portlet preference at the same time? In my mind, there is no need for a transaction at all...

-Jay

Re: Musings on ordered lists inside RDBMS

What you're doing is similar to dealing with a nested set (hierarchy), there's an abundance of sample code available for that (see http://www.openwin.org/mike/wordpress/wp-content/uploads/2007/04/managinghierarchies.ppt), including stored procedures.

Also consider making the position a float, that makes it easier to insert without shuffling.

Re: Musings on ordered lists inside RDBMS

Actually, you can 'fold the SELECT into the UPDATE', but not in the way you think. Enter the multi-table UPDATE syntax. For example for moving the tabs:

UPDATE user_tabs u
JOIN user_tabs s
ON u.user_id = s.user_id
AND s.id = :id
AND s.pos != :pos
AND u.pos >= LEAST(s.pos, :pos)
AND u.pos <= GREATEST(s.pos, :pos)
SET u.pos = CASE
WHEN u.id = :id THEN :pos
WHEN s.pos > :pos THEN u.pos + 1
WHEN s.pos < :pos THEN u.pos - 1
END

Note #0: If this is executed in a (implicit or explicit) transaction, there is no need for a FOR UPDATE select, because the statement will be executed atomically.
Note #1: The portlet moves can be written using the same technique - just variations on the same theme.
Note #2: I guess some people would rather move some of the conditions to a regular WHERE clause but in this case I like to keep all params together
Note #3: The multi-table UPDATE is non-standard and supported by MySQL and MS SQL - not AFAIK Oracle and Postgres

For the deletion/cleanup problem, you can probably use some multi-table UPDATE and DELETE magic as well. I have a few sketches but it's taking me too long right know to write it out in full. If you really, really need it, send me a pm and I'll work with you.

kind regards,

Roland Bouman
http://rpbouman.blogspot.com/

Before you can post a comment please solve the following captcha.
your name