ramblings on PHP, SQL, the web, politics, ultimate frisbee and what else is on in my life
back

OOP<->RDBMS

It seems the slides that generated the most interested in my recent "Database meets OOP" talk were the ones on how to model inheritance inside your RDBMS. I wanted to mention two (and a half) additional approaches.

The first one I have no real experience with and is native inheritance support that can be found in Object-Relational DBMS (not to be confused with Object-Oriented DBMS, which are not relational). PostrgreSQL is a shining example of this, so they have native support for inheritance. Actually the PostgreSQL docs mention that since SQL99 inheritance is part of the spec, not sure if anyone has implemented it though. From my reading it looks like its just a convenience feature to spare you from having to writing joins (when fetching data from a child table) and also instead of 2 inserts, you can do it all in one insert (although you have to specify the proper child table, it does not magically figure out that if you insert into the parent table with additional fields for a specific child table to insert into the child table). You can get a lot more convenience out of an ORM however and I am not sure if you are getting much of any performance a gain.

The second (and a half) approach is breaking the relational setup a bit. Recently XML support has been coming increasingly powerful inside RDBMS. DB2 Viper has been at the fore front of this, allowing you to define indexes on XML data. PostgreSQL and MySQL are also working on beefing up in that department. You can validate your XML against a schema in order to prevent total chaos in your neatly organized RDBMS as well. With XPath support you can elegantly fetch and search inside your XML. This way you can essentially store all specializations in the same table. Of course you should apply reasonable constraint and not get into the idea of storing everything under the sun in a single table. But if you only manage your frequently read fields as classic columns and move all the less frequently queried fields that are dependent on specializations into XML, you can do some pretty cool stuff.

The "and half" for this solution should actually be called "minus half", since its actually a downsized solution to the XML approach that I have employed in production quite successfully (not really for inheritance, but sort of .. as the point was to store arbitrary attributes that I do not intend to query on in a CMS). For this I simply serialize the given attributes into a string and dump that into a text/clob field. With some magic inside MDB2, I could make the (un)serialization an automatic step. Using an ORM you could just as well use this to handle your inheritance, using the one table for all specializations approach. However you spare your poor database the tons of optional columns that are not relevant for all specializations. I guess you could even query on this serialized string to do some filtering. If it happens rarely enough you might be willing to go with this trade off.

I am sure there is still a lot for me to learn in this area, so I invite my readership to expand my horizon :)