And now for the final post in this series, unless I start to get some serious questions :)
But if other NoSQL databases are not trustworthy, why are JCR implementations?
First up, plenty of people run very successful large-scale, mission-critical applications on top of MongoDB and friends. The interesting bit is that JCR is an API and the implementations are free to choose any database for final persistence. Jackrabbit, for example, can be installed using any RDBMS for storage, or even the file system. Our own PHP backend implementation can easily support MongoDB, CouchDB or any RDBMS.
Wait a second, so why couldn't we have gone with a NoSQL database again?
Again, JCR is an API that is intended to solve the CMS problem and exactly that. Emphasis on API. Because it's an API, you can implement it on top of different storage layers, keeping whatever preferences you might have. So if you also need an RDBMS to store reporting data, then you might prefer to store the CMS content inside the same RDBMS. In the same way, if you are already using another NoSQL solution for some other content, then you might appreciate using it for the CMS content as well. So with JCR we simply ensure we have a high-level API that enables this choice while leveraging a common code base in the frontend, which benefits the entire community.
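To make the "emphasis on API" point concrete, here is a minimal Python sketch. It is not JCR or PHPCR code: `NodeStore`, `MemoryStore`, `SqliteStore` and `publish_page` are hypothetical names invented for this illustration. It only shows that frontend code written against one contract keeps working when the storage layer underneath is swapped.

```python
from abc import ABC, abstractmethod
import json
import sqlite3


class NodeStore(ABC):
    """Hypothetical storage contract; not part of the JCR specification."""

    @abstractmethod
    def save(self, path: str, properties: dict) -> None: ...

    @abstractmethod
    def load(self, path: str) -> dict: ...


class MemoryStore(NodeStore):
    """Keeps nodes in a plain dict, standing in for a NoSQL-style store."""

    def __init__(self):
        self._nodes = {}

    def save(self, path, properties):
        self._nodes[path] = dict(properties)

    def load(self, path):
        return dict(self._nodes[path])


class SqliteStore(NodeStore):
    """Keeps nodes in an in-memory SQLite table, standing in for an RDBMS."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE nodes (path TEXT PRIMARY KEY, props TEXT)")

    def save(self, path, properties):
        self._db.execute(
            "INSERT OR REPLACE INTO nodes VALUES (?, ?)",
            (path, json.dumps(properties)),
        )

    def load(self, path):
        row = self._db.execute(
            "SELECT props FROM nodes WHERE path = ?", (path,)
        ).fetchone()
        if row is None:
            raise KeyError(path)
        return json.loads(row[0])


def publish_page(store: NodeStore, path: str, title: str) -> dict:
    """Frontend code: it depends only on the NodeStore contract."""
    store.save(path, {"title": title})
    return store.load(path)
```

Calling `publish_page(MemoryStore(), "/cms/home", "Home")` and `publish_page(SqliteStore(), "/cms/home", "Home")` goes through exactly the same frontend code; only the persistence choice differs.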
Ok, so how do JCR implementations and databases play together?
Let's take Jackrabbit as an example, which basically uses two "databases": one is used for final persistence and the other for full text search. The choice of final persistence layer mostly depends on what sort of tools one requires. For example, transaction capabilities depend on the storage layer. The same goes for replication and clustering. Backup and failover also depend, at least to some extent, on the storage layer. Again, JCR is an API (I know I am repeating myself). Now Jackrabbit basically stores the contents of each node as a blob indexed by its PK, plus some extra data to provide efficient tree traversal. It delegates to Lucene for the implementation of the normal search. Again, this model works well because content in a CMS isn't used in reporting queries. JCR implementations can use these assumptions to focus on efficiently solving the CMS problem rather than trying to be general purpose solutions.
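As a rough illustration of that split, here is a toy Python sketch. It is not how Jackrabbit is actually implemented: `TinyRepository` and all of its methods are hypothetical, a dict of JSON strings stands in for the persistence layer, and a word-to-id map stands in for Lucene. The point is only the shape: opaque blobs keyed by id, a small structure for walking the tree, and search handled by a separate index.

```python
import json
import uuid
from collections import defaultdict


class TinyRepository:
    """Toy model: blobs keyed by id, a child map for traversal, a word index."""

    def __init__(self):
        self.blobs = {}                      # node id -> serialized properties
        self.children = defaultdict(dict)    # parent id -> {child name: child id}
        self.index = defaultdict(set)        # word -> node ids (search stand-in)
        self.root = self._store({})

    def _store(self, properties):
        """Serialize the node as one blob and feed its words to the index."""
        node_id = str(uuid.uuid4())
        self.blobs[node_id] = json.dumps(properties)
        for value in properties.values():
            for word in str(value).lower().split():
                self.index[word].add(node_id)
        return node_id

    def add_node(self, parent_id, name, properties):
        node_id = self._store(properties)
        self.children[parent_id][name] = node_id
        return node_id

    def get_by_path(self, path):
        """Walk the tree one path segment at a time, then decode the blob."""
        node_id = self.root
        for name in path.strip("/").split("/"):
            if name:
                node_id = self.children[node_id][name]
        return json.loads(self.blobs[node_id])

    def search(self, word):
        """Full text lookup never touches the tree, only the index."""
        return [json.loads(self.blobs[i]) for i in self.index.get(word.lower(), ())]
```

Note that nothing here supports a reporting-style query such as "sum a property over all nodes" without decoding every blob, which is exactly the trade-off described above.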
But I have user stats, orders and inventory that I do need to run reporting queries on! What now?
Simple: do not use the JCR API for this data. Simply store references where needed. Use the best tool for the job. So in a webshop you might use Symfony CMF to store the product information, while using an RDBMS to manage orders and inventory. Eventually we will likely be able to provide a solution that helps manage these references, so that, for example, it would be possible to lazy load references in both directions.
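A minimal Python sketch of the "store references" idea follows. Everything in it is an assumption made for illustration: the `inventory` table, the `sku` field, the `product_page` dict standing in for a content node, and the `LazyStock` helper. It is not an existing Symfony CMF feature. The content side holds only a key, and the relational row is fetched the first time it is actually needed.

```python
import sqlite3

# Relational side: inventory lives in an RDBMS and is queried with plain SQL.
shop_db = sqlite3.connect(":memory:")
shop_db.execute("CREATE TABLE inventory (sku TEXT PRIMARY KEY, stock INTEGER)")
shop_db.execute("INSERT INTO inventory VALUES ('SKU-1', 12)")
shop_db.execute("INSERT INTO inventory VALUES ('SKU-2', 30)")

# Content side: stand-in for a CMS node that only stores a reference (the SKU).
product_page = {"title": "Blue mug", "body": "A sturdy mug.", "sku": "SKU-1"}


class LazyStock:
    """Resolves a SKU reference against the inventory table on first access."""

    def __init__(self, db, sku):
        self._db = db
        self._sku = sku
        self._value = None
        self.loaded = False

    def get(self):
        if not self.loaded:
            row = self._db.execute(
                "SELECT stock FROM inventory WHERE sku = ?", (self._sku,)
            ).fetchone()
            self._value = row[0] if row else None
            self.loaded = True
        return self._value


def total_stock(db):
    """A reporting query stays on the relational side, where it belongs."""
    return db.execute("SELECT SUM(stock) FROM inventory").fetchone()[0]
```

Rendering the page would call `LazyStock(shop_db, product_page["sku"]).get()` only when the stock level is displayed, while reports such as `total_stock(shop_db)` never touch the content repository at all.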
All dandy, but I really do not want to install Java
Fair enough. Right now work hasn't started on a PHP backend. That being said, a simple implementation shouldn't take too long to write. Versioning might not make it into that initial implementation, though with CouchDB one could come up with a solution for simple versioning fairly easily. So just start it if you need it. People will be ready to help. Liip, for example, doesn't have that big of a need for this, so we haven't started it. Most of our projects use Java-based Solr anyway, so there isn't any reason not to simply use Jackrabbit. But we do recognize that it's important for the long-term success of the Symfony CMF to give as much flexibility in terms of the requirements as possible. After all, this was one of the main reasons for choosing JCR in the first place!
So when will this all be ready?
I am hoping that we can have a first release this summer that provides a basic toolchain and by the end of the year I hope we have a decent PHP backend implementation. Then again all of this could come sooner if we get more contributions from the community. The recent increase in progress on PHPCR ODM makes me quite hopeful.
Perhaps you could say a word or two on Adobe? Is there any news about them integrating JCR stuff into their publishing products, after having swallowed Day?
I think we (aka. you ;)) have to prototype a robust PHP backend solution for Jackalope first, because then we can interest more people in contributing. I suggest either using the rather primitive MySQL EAV schema that Jackrabbit also uses, or using CouchDB. It should be a weekend's worth of work for 2-3 people.
Yeah, it would give one less excuse. Then again, everybody has Java installed on their dev machines and all it takes is "java -jar jackrabbit-snapshot.jar" which is no harder than starting MongoDB or MySQL. So I fear once we have solved this irrational fear, the next one will come up about JCR. Which could mean that our strategy will fail because of these irrational fears, which would of course suck.
Anyway, OSS contributions to the ODM are picking up, meaning the reasons I can present to Liip for investing in a PHP backend are increasing. But I still think that a pure PHP backend should be spearheaded by the community with support from Liip, and not the other way around.
I agree about getting Jackrabbit running on a dev machine. It is simple.
The risk I see is running it in production. I have to acquire skills in performance optimization, backup and monitoring strategies that I just don't have (or want to acquire) at the moment.
It's not about Liip sponsoring this alone. We just have to have a prototype and then find someone who picks up the torch.
I think as soon as you mention the Java backend to a PHP developer they immediately think:
1. Is it easy to install?
2. Do I need to learn Java?
3. How do I debug it if the Java goes wrong?
I asked those questions when I started using Solr, but was surprised at how easy it all was. It's strange how just the mention of a Java implementation puts off many PHP developers, but they don't have any problems diving into using MongoDB, MySQL, memcache etc.
Maybe it would help if you wrote a blog post about how easy it is to install/run Jackrabbit on different OSs?
@beberlei: Yeah, of course running Jackrabbit in production is another story. Not rocket science, of course. It's really all about people contributing, and thanks to Jackrabbit you can contribute on frontend bundles, on middleware like the ODM, or by implementing a PHP backend. I just do not see Java aversion as a legitimate reason for not joining now.
Then again, I think I am starting to drive home this point, so there is hope :)