REPLACE INTO is little bit annoying in case of the usage of InnoDB foreign keys. Triggers interpret it as ON UPDATE, but foreign keys do ON DELETE.
What about full transaction support for DDL changes in PostgreSQL?
Andreas you missed the topic or are you trying to imply that PostgreSQL having support for transactional DDL is a disadvantage ;)
For my last application, that heavily uses bayesian statistics, I choose MySQL because of INSERT...ON DUPLICATE UPDATE... It's brilliant for incrementing counters and keeping caches.
I couldn't find anything in Postgres that would work as nice and worked for hundreds of rows in one go.
Absolutely, INSERT .. ON DUPLICATE UPDATE is awesome. Its also available in MySQL since quite some time. However PostgreSQL 8.4 will get MERGE support, which will allow you to do the same thing (and more).
Some of these points are valid. A few of them aren't really.
"Pluggable storage engines (even without a server restart)"
This does not make sense as something that puts MySQL "ahead" of PostgreSQL. PostgreSQL intentionally prefers consistency over the jungle of conflicting limitations, behaviour and policy in MySQL. In MySQL, columns and indexes behave vastly differently depending on storage engine.
For example, InnoDB does not support fulltext, nor does it do R-tree indexing of OGC geometry data; conversely, MyISAM does not support transactions and won't store "text" and "blob" fields in the row.
PostgreSQL *always* provides transactions, full-text indexing, R-trees (actually GiST), huge text fields and so on. No exceptions, no inconsistencies, no confusion, no gotchas.
"Better 24/7 operations support due to less reliance on cleanup tasks (aka VACUUM)"
Ahem, autovacuum has been integrated since 8.1 or so, and it's completely transparent. It's not an issue.
"More control over the strictness via SQL_MODEs"
This is, unfortunately, bollocks. Unlike MySQL, PostgreSQL has a single, well-defined ANSI implementation that does not suffer from legacy cruft. There's no *need* for more than one SQL mode.
For example, inserting a wrong date, the wrong datatype or a too-long value into a column will always result in an error being thrown: MySQL has traditionally been extremely lax with checking (the February 31 issue is classic), so to fix the old cruft it has to offer a "strict" mode which is now the default. As a result, however, MySQL has to maintain the old non-strict mode, which means the codebase is still littered with crap. PostgreSQL has essentially no such cruft.
I don't see why this is mentioned as a feature. New, non-legacy MySQL apps should *always* use "strict" mode.
"A larger community (everybody knows a bit of MySQL and its gotchas, there are more books and best practices blog posts etc. .. though PostgreSQL has less quirks to begin with)"
PostgreSQL, fortunately, does not really have any significant quirks. Its community may be smaller, but it's component, and not infested with newbies who need to set up the latest phpBB.
Some time ago I wrote a post on MetaFilter about some advantages that PostgreSQL has over MySQL, such as transactional DDL (you can use "create table", "drop table" etc. inside transactions), PostGIS and the fact that PostgreSQL actually has (gosh!) a query planner and optimizer. It's worth a read: http://ask.metafilter.com/92162/MySQL-vs-PostgreSQL#1351559
@Alexander: Your points are all valid to some extend. Like I said in my comment about the SQL_MODE, I prefer flexibility. So if a tool can give me choices, I welcome this. If this leads to buggy code, then I slap the tool for bugs, which is certainly something that MySQL can be slapped for.
The same applies to the storage engines. It gives me flexibility to get a storage engine that is highly optimized for just one very specific task. For example someone is working on a storage engine that implements tree structures efficiently. There will be algorithms that do not fit into a generic storage engine. This includes PostgreSQL, this is where the storage engine concept can shine. For example MyISAM has fast COUNT(). If that is something you care for more than transaction safety, you might appreciate the better performance MyISAM will give you.
That being said, I am the first to say that multiple storage engines bring its baggage (quite significant). The first is obviously choosing one. Then you need to tune each of them separately. You end up with separate buffer pools etc.
As for autovaccum. I know it exists, but it is still going to have an effect if you are running a true 24/7 database that needs constant performance. Autovacuum obviously requires ressources, so whenever it runs there are less ressources for other things. If the system is under constant stress, then you will end up with non constant performance. This is less of a problem for MVCC'less storage engines (again, choice is good) and even for InnoDB, which automatically cleans up everything. So InnoDB cleanup costs are obviously factored into any benchmark you see that does DML.
Finally I am not sure what "MyISAM .. won't store "text" and "blob" fields in the row" is supposed to mean. MySQL's BLOB handling leaves much to be desired (Mark is working on this), but of course you can use TEXT and BLOB datatypes with MyISAM.
Thanks for not being overly defensive. :-)
"It gives me flexibility to get a storage engine that is highly optimized for just one very specific task."
I would buy your argument if MySQL had a rich and varied set of specialized sub-databases, which is not the case. In practice, I think you would choose the tool most appropriate for the task; and MySQL offers a decent semi-relational storage (InnoDB), a shoddy tuple storage (MyISAM), a shoddy in-memory tuple storage (MEMORY) and not much else. That's not much flexibility, and at worst comes across as indecisive and schizophrenic.
I would rather have a relational database that has a clear goal, a consistent personality and a well-defined feature set, and which leaves other technologies (Memcached, CouchDB, Amazon SimpleDB, Hadoop/HStore, Neo, what have you) to implement alternate non-relational concepts like trees and non-transactional key/value storage.
Incidentally, PostgreSQL has pretty good support for trees. There's a plugin giving you Oracle-type recursive queries, as well as a plugin called ltree (http://www.sai.msu.su/~megera/postgres/gist/ltree/) that indexes trees and let you do rally fast path queries like "Countries.Europe" to match leaf nodes like "France" and "Germany".
"For example MyISAM has fast COUNT()"
PostgreSQL's is pretty damn fast nowadays, but yeah, it's not as fast as MyISAM's. But that fast counting comes at the expense of many other things, so it's not like your ordinary use case is "I need something that I can use to count really quick". Your requirement would be something like "I need something that can count really quick, doesn't have transactions and corrupts tables occasionally out of sheer incompetence". That's not a requirement that I come across very often, and when I do, I tend to prefer flat files or BerkeleyDB-type storages or other tools that let me build exactly what I need. As you say, every MySQL brings baggage. For example, MyISAM only has table-level locking, so even if you have fast counting, it comes at a price: horrible concurrency.
"As for autovaccum. I know it exists, but it is still going to have an effect if you are running a true 24/7 database that needs constant performance. ... less of a problem for MVCC'less storage engines (again, choice is good) and even for InnoDB"
But InnoDB is MVCC-based, too, just like PostgreSQL. InnoDB is also doing vacuuming -- except it's called "purging". It's not advertised very often, and like PostgreSQL it's automatic. When you update the database, the database files grow until the next purge, at which point you suffer a performance hit. I believe deletes and truncates actually vacuum on the fly, unlike PostgreSQL which delays the vacuuming until later.
MySQL's counterpart to PostgreSQL's manual vacuum command is actually this:
alter table tbl_name engine=innodb;
(That's not terribly obvious or self-documenting, but oh well, this is MySQL, where there is a "grant" command that actually creates users.)
Keep in mind that autovacuuming is incremental; PostgreSQL distributes the cost of vacuuming over time instead of up front, so it's not like your "24/7" system is going to suddenly feel sluggish because somebody deleted a milion rows.
"Finally I am not sure what "MyISAM .. won't store "text" and "blob" fields in the row" is supposed to mean."
Here's what I meant: MyISAM stores blobs separately from the row, in some separate file or table. Consequently, when selecting or updating rows, MySQL has to go off and deal with that separate blob storage. This applies to both "text" and "blob" types. InnoDB, on the other hand, stores part of the column inside the row so that when you fetch the row, the data is probably there to be found.
PostgreSQL has virtually unlimited text and binary fields, and they are treated as normal columns. If you store 1GB of text in a column, part of it (usually 2K, but you can change the storage strategy, eg. to fill up the row as much as possible) is stored in the row, and the rest is stored separately and compressed (or not; this is optional). These columns are not hampered in any way -- they're indexable, they don't need an artifical limit (so you declare a string column as "text", not "text(100)"), they support full-text indexes etc.
PostgreSQL also has blobs, or "large objects". These are partitioned into chunks and streamable, and are quite fast for random access (eg., seeking halfway in and reading a chunk).
First of there is no reason to be defensive. I have appreciation for both databases (as well as others like SQLite and Firebird).
I think the idea behind 5.1 is to grow the world of storage engines because they are now pluggable, at runtime. And its interesting that you mention Memcache. There is a storage engine that can work as an interface to Memcache too. Unfortunately IBM does not seem to have interest to maintain the SolidDB storage engine after the aquisition of Solid, but these are the kinds of unique engines that I think are quite cool to get behind the same interface. But like I said, it does bring its challenges in administration and ever so subtle differences (storage engine authors could really do with a comprehensive test suite).
As for the storage of TEXT/BLOB in MySQL. I have to check this in detail. I was under the impression that MyISAM stores everything along with the rest of the row data. Which IMHO is often not ideal, since for the most part I see especially BLOBs just put into the database for referencial integrity and easy replication/backup. So the data itself is seldom read, since you tend you cache it in the file system. So storing the BLOB along with the rest of the data means that you need more I/O to read the same amount of data. This would push people to having to separate out these columns with 1:1 relations.
As always its hard to compare versions of two different databases, because their releases rarely coincide. So my estimate was that MySQL 6.0 would not be released this year, but 8.4 would be. But I guess I remembered the 8.4 release cycle incorrectly, which plans to come to a close in March 2009. Also MERGE support might not make it after all. Of course its anybodies guess when MySQL 6.0 will come along. But I guess that MySQL 6.0 will not really bring new features (I really do not consider the Falcon storage engine as a new feature relevant for this comparison) and instead will continue some of the performance improvements to subqueries etc. The big feature addition is online backup, which is already available in PostgreSQL. However I do not know how each solutions stack up to each other in a side by side comparison, but given the more advanced transaction capabilities from PostgreSQL I would assume that MySQL will have a hard time to out run the PostgreSQL solution.