The observant reader of the PHP internals mailing list might have noticed me crying about the lack of support for getting associative arrays instead of stdClass from ext/soap, which in turn causes issues with caching SOAP replies in APC.
I am working in a setup where we have a backend JBoss server that provides the PHP frontend with information via a SOAP API. Depending on what information I am fetching, the backend can set a caching strategy (as in what parameters are relevant to determine the cache key) as well as an expiration time. For most services the expiration time is set to 1 hour or more, with some going up to a week. We are expecting far more reads than writes (though our preliminary user tracking analysis is still a bit lacking to get proper numbers). Now all I am looking for is a decently fast way to write into a cache, with a very fast way to read from the cache.
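To make the caching strategy a bit more concrete, here is a minimal sketch of how a cache key could be derived from only the parameters the backend marks as relevant. The helper name `buildCacheKey`, the service name and the parameter names are all made up for illustration; this is not the code from my setup.

```php
<?php
// Hypothetical helper: build a cache key from only those parameters
// that the backend declared relevant for this service.
function buildCacheKey($service, array $params, array $relevant)
{
    // Drop every parameter that is not listed as relevant.
    $subset = array_intersect_key($params, array_flip($relevant));
    // Sort by parameter name so that argument order does not change the key.
    ksort($subset);
    return $service . ':' . md5(serialize($subset));
}

$key = buildCacheKey(
    'getProduct',                                             // made-up service name
    array('id' => 42, 'lang' => 'en', 'sessionId' => 'abc'),  // made-up parameters
    array('id', 'lang')                                       // relevant for the key
);
```

The expiration time sent by the backend would then be passed as the TTL when storing the reply under `$key`.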
So in an effort to understand the performance implications of the various options, I wrote up a micro benchmark to give me a better idea. What I found were some surprising results. So surprising that I am actually wondering if I screwed up the benchmark itself. As you can see in the code I tested the following options:
(*) Note that using var_export() on an object (including stdClass instances) does not produce proper PHP code you can dump into a file and then include.
Now here are the results (all writes were performed 1000 times to give a better idea of the importance of write versus read performance):
Now there are a number of surprises here. First, writing a nested array into APC without prior serialization takes quite some time. Less surprising is that reading this data is then the fastest. I also did not expect the read performance for the include to be this slow. This seems to imply that if you are using PHP arrays for configuration, it's faster to store them serialized in a file.
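For anyone who wants to try the file-based comparison themselves, here is a minimal sketch. It is not the benchmark used for the numbers in this post: it leaves out APC entirely, and the helper `timeIt` and the sample data are made up.

```php
<?php
// Sketch only: times reading the same nested array from a serialized
// file versus from an included PHP file. No APC involved.
function timeIt($fn, $iterations)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// Made-up nested configuration data.
$data = array(
    'db'       => array('host' => 'localhost', 'port' => 3306),
    'features' => array('cache' => true, 'ttl' => 3600),
);

// Write both variants once before measuring reads.
$serFile = tempnam(sys_get_temp_dir(), 'ser');
$phpFile = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($serFile, serialize($data));
file_put_contents($phpFile, '<?php return ' . var_export($data, true) . ';');

$readSerialized = function () use ($serFile) {
    return unserialize(file_get_contents($serFile));
};
$readInclude = function () use ($phpFile) {
    return include $phpFile;
};

// Keep one result of each so the round trip can be checked.
$fromSerialized = $readSerialized();
$fromInclude    = $readInclude();

printf("unserialize: %.4fs\n", timeIt($readSerialized, 1000));
printf("include:     %.4fs\n", timeIt($readInclude, 1000));

unlink($serFile);
unlink($phpFile);
```

The absolute numbers will depend heavily on the PHP version and on whether a byte code cache is active, so treat any result as a hint rather than a verdict.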
At any rate, the solution that best fits my needs atm is writing serialized nested instances of stdClass into APC, since I have no way of easily casting all the stdClass instances returned by the various SOAP services into arrays.
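For completeness, one possible workaround would be a recursive walk over the reply that replaces every stdClass instance with an array. The sketch below uses a made-up helper name and made-up sample data, I have not tested it against real SOAP replies, and it adds an extra pass on every write, so it is not what I ended up using.

```php
<?php
// Hypothetical helper: recursively turn stdClass instances into
// associative arrays, leaving scalars untouched.
function objectsToArrays($value)
{
    if ($value instanceof stdClass) {
        $value = get_object_vars($value);
    }
    if (is_array($value)) {
        foreach ($value as $key => $item) {
            $value[$key] = objectsToArrays($item);
        }
    }
    return $value;
}

// Made-up reply, shaped like the nested stdClass trees ext/soap returns.
$reply = new stdClass();
$reply->id = 42;
$reply->tags = array('a', 'b');
$reply->owner = new stdClass();
$reply->owner->name = 'Jane';

$asArray = objectsToArrays($reply);
```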
It is also clear that if ext/soap could get an option to serialize arrays, it would be faster still, but Dmitry is not convinced that this is a good idea. The best solution, however, would be if APC could treat stdClass instances as glorified associative arrays. Gopal hinted on IRC that he might be inclined to do this, though he is currently more concerned with QA'ing APC than with adding features like this.
UPDATE 11/06/07 I also added JSON. As you can see, it's slower than PHP's native serialize, but still not too shabby for something a lot more portable. I have updated the entire benchmark result listing. Generally I should note that I did not take down other services on the machine while running these tests, but while the numbers jump around a bit, the order of what is fast and slow remains unchanged. They are just a bit closer now and then.
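One thing worth noting about JSON in the context of the array problem: the second argument of json_decode() decides whether objects come back as stdClass instances or as associative arrays. A small sketch with made-up sample data:

```php
<?php
// Made-up nested reply built from stdClass instances.
$reply = new stdClass();
$reply->id = 42;
$reply->owner = new stdClass();
$reply->owner->name = 'Jane';

$json = json_encode($reply);

// Default: JSON objects are decoded into stdClass instances.
$asObjects = json_decode($json);

// With the second argument set to true, they are decoded into
// associative arrays instead.
$asArrays = json_decode($json, true);
```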
Hmm, I just realized that I probably need to revisit the var_export() solution to ensure that the generated file is properly picked up by the byte code cache. But I am actually pretty certain that it should have ended up in the byte code cache on the first read iteration.
BTW: I should also explain that in the micro benchmark, I wrote once to all variants before running the benchmarks, because I did not want to have the write performance screwed up by file or cache entry creation, since usually there will be a prior version that will be updated.
It doesn't surprise me that include is slow. I didn't test with any cache engaged, but unserialize is faster than json_decode, which is in turn faster than include, for reading a bunch of nested arrays in 5.1.x.
I'm pretty sure unserialize will always win because it stores string-lengths, so that it doesn't have to parse every character. Although that may change for PHP6 when you need to unescape the Unicode.
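To illustrate the string-length point, here is what the two formats look like for a few small made-up values. The serialized form prefixes each string with its length, while JSON has to escape and later scan for quotes:

```php
<?php
// serialize() stores the byte length in front of every string.
$serString = serialize('hello');           // s:5:"hello";
$serArray  = serialize(array('a' => 1));   // a:1:{s:1:"a";i:1;}

// JSON has no length prefix, so embedded quotes must be escaped.
$jsonString = json_encode('say "hi"');     // "say \"hi\""

echo $serString, "\n", $serArray, "\n", $jsonString, "\n";
```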
Have you tried sending PHP serialized format from the Java app as the response to the SOAP request instead of a structured response? In Java, assembling a PHP-serialized string should be as fast as assembling an XML SOAP response. Then the return value of your SOAP request would only be a string that you can directly store into whatever caching solution you want.
Gaylord
Well not exactly, but it might be worth investigating a bit, since this way we might be able to get arrays instead of objects out of the SOAP replies. The serialization is not the expensive or difficult part, since we do far fewer writes than reads. The real problem is getting the data out quickly. But if we only get arrays, we could unserialize the reply and stick the nested arrays into APC, which is the fastest solution for reads.
Well I just talked to the Java guys and they said that it would require hooking into the AXIS framework in order to change the serialization. Instead of PHP serialization we would probably then explore JSON, so I might try to add JSON to my little micro benchmark.
I'm not sure I'm following what you are trying to do, but isn't the ext/soap classmap support a way to get away from the stdClass stuff you get back?
--Tony
Well, with the classmap I can force it to use other classes. However, I do not want classes, I want arrays. On top of that, I am not sure if I can force everything to be cast to a single specific class that would then implement some magic to get me arrays (not sure what that magic would be at any rate).