Posted on Saturday 16 December 2006
I just read Mike Potter's blog post on his speed comparisons between amfphp, JSON and REST-style vanilla XML, and I'd like to clear up some FUD, as I believe it is unfair both to amfphp and JSON (I was on vacation at the time it was published). I am sure that Mike had no bad intentions when he wrote this, as he has invited me to write an article on amfphp at DevNet and steadily covered progress in amfphp over the past months, but I think his article is a bit misguided.
The key thing here is that there are several elements to consider when comparing speed between RPC models. The post, while being well-intentioned, doesn't differentiate between these different steps, and so gives a skewed view of what influences responsiveness in an app.
In most any RPC model, there are the following steps:
- Serialization of the request on the client-side
- Uploading of the message to the server
- Deserialization of the request on the server-side
- Dispatching on the server-side
- Serialization of the response on the server-side
- Downloading of the message to the client
- Deserialization of the message on the client-side
So, first things first. The client may need to serialize a request to the server-side. In the AMF case, the serialization is done in C in the Flash player, so it's extremely fast. In the JSON and XML cases, it's a little more complicated. If the sent data is "simple", then trivial serialization needs to be done through the query string. If it's not "simple", then in the JSON case it would need to be encoded using an ActionScript class, and in the XML case it would need to be constructed manually.
Then comes the uploading. The question is: how efficient is data encoding in AMF vs. JSON vs. REST-style XML? Well, if we examine the data being sent through the MochiTest.getTable example, which is behind this page, we find that the result is about 800 bytes when coming back from AMF and about 400 bytes when coming from JSON. But there's a catch here. When AMF is used, it sends back a few headers and a surrounding body to deal with such things as batched calls, while JSON doesn't. This surrounding data accounts for about 300 to 400 bytes per request. So if instead of sending back 5 results, we sent back 500, we would get a dead heat between JSON and AMF with about 40 KB each, give or take 2-3 KB. In some cases AMF is more efficient: for long strings, for example, say 100K long, AMF would beat JSON by about 400 bytes, assuming that one character in 256 is a double-quote. AMF can also refer back to strings and objects it has already written, so if you repeat a lot of the same data, it will win. If it sounds like I'm splitting hairs, it's because I am. Both formats are very, very efficient and, provided that you use an appropriate data structure for both, they will almost always stay within 10% of the size of each other.
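To make the arithmetic above concrete, here is a rough sketch of the size model. It assumes a fixed envelope plus a constant per-row cost that is the same for both formats, which is a simplification; the function name estimateSize and the exact figures are mine, rounded from the numbers quoted above.

```php
<?php
// Rough size model: total = fixed envelope + rows * bytes per row.
// All figures are approximations rounded from the paragraph above.
function estimateSize(int $envelopeBytes, int $bytesPerRow, int $rows): int
{
    return $envelopeBytes + $bytesPerRow * $rows;
}

$bytesPerRow  = 80;  // assumption: ~400 bytes of JSON for 5 rows
$amfEnvelope  = 400; // upper end of the 300-400 byte estimate
$jsonEnvelope = 0;   // assumption: JSON adds no surrounding envelope

foreach ([5, 500] as $rows) {
    $amf  = estimateSize($amfEnvelope, $bytesPerRow, $rows);
    $json = estimateSize($jsonEnvelope, $bytesPerRow, $rows);
    printf(
        "%d rows: AMF ~%d bytes, JSON ~%d bytes (AMF is %.1f%% larger)\n",
        $rows,
        $amf,
        $json,
        ($amf - $json) / $json * 100
    );
}
```

Under these assumptions the envelope doubles the size of a 5-row response but adds only about 1% at 500 rows, which is why the fixed overhead stops mattering as the payload grows.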
What about XML? Well, since I'm using a MochiKit example, I looked at the equivalent XML furnished in the toolkit and it's about 800 bytes (after removing whitespace), so it's twice as big as the other two. Of course there are many ways in which the schema could be optimized, but it raises the question: should I really be thinking about optimizing XML schemas instead of features? Suppose, however, that we have some smart cookie of a developer who does think about such things; then we can reasonably expect XML to be about 50% more verbose than its equivalent JSON or AMF representation.
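If you want to check the verbosity claim against your own data, a minimal sketch like the following encodes the same rows both ways and compares byte counts. The sample rows and field names are hypothetical, not the MochiKit data, and the XML layout (one element per field) is only one of many possible schemas.

```php
<?php
// Hypothetical sample rows; the field names are illustrative only.
$rows = [
    ['domain' => 'example.com', 'tld' => 'com', 'count' => 12],
    ['domain' => 'example.org', 'tld' => 'org', 'count' => 7],
];

// JSON: a single call to the native extension (PHP 5.2 and later).
$json = json_encode($rows);

// XML: built by hand, one element per field, values escaped.
$xml = '<rows>';
foreach ($rows as $row) {
    $xml .= '<row>';
    foreach ($row as $field => $value) {
        $xml .= "<$field>" . htmlspecialchars((string) $value) . "</$field>";
    }
    $xml .= '</row>';
}
$xml .= '</rows>';

printf("JSON: %d bytes, XML: %d bytes\n", strlen($json), strlen($xml));
```

The ratio you get depends heavily on how long the tag names are compared to the values, so treat the output as a measurement of your schema rather than of XML in general.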
Then comes the deserialization of the request on the server-side. If we are passing arguments through the query string in JSON or XML, then there is practically no processing involved. If not, then the data will need to be deserialized manually by the developer in the XML case or through a function using JSON. In the AMF case, the gateway will deal with the details of deserialization, which will certainly be slower than dealing with query-string arguments, but it should be about the same speed as deserializing JSON using a PHP class such as PEAR JSON (more on this later).
Now, as for dispatching, REST-style XML usually has a one-to-one correspondence between remote methods and PHP pages, so it's a very simple system that doesn't have any overhead. The JSON case is similar, if you write the code yourself. Amfphp has a dispatching mechanism which can deal with batched calls, debugging info, security and instantiating your classes, which adds some overhead. How much overhead? Somewhere in the range of 1-5 milliseconds per call. This overhead is the same in AMF mode and with the new JSON mode.
Now, for the serializing. If you use REST-style XML, you have the speed advantage, as you know the data being sent, and therefore you don't need if..else clauses to determine how data is being written. In the AMF case, the serializing is done through PHP, and it determines types for you, so it's slower. The PEAR JSON case is very similar to the AMF case. However, if you use a native JSON extension, such as the one built into PHP 5.2, serializing is going to be very, very fast. Faster than both the XML and the AMF case. The JSON extension page states that you can expect it to be 50 to 100 times faster than the equivalent PEAR JSON case. That's really, really fast, kids.
The downloading is the same issue as the uploading, and here AMF and JSON are significantly more efficient than XML.
The deserializing on the client-side is native in the AMF case, so it's very fast. In the XML case, if you're using AS3, then it should be fast, although trying to use XPath on 5000 records like in Mike Potter's example might slow the player to a crawl if you're using AS2. JSON here lags behind because it's not native.
Interpreting Mike's results
I hope that I've been sufficiently "fair and balanced" here. I don't want readers to feel that I've used loaded terms or discredited the other sides for partisan reasons. With that in mind, let's interpret Mike's results.
First off: JSON. Why is JSON significantly slower than the other two? Two things: first of all, serialization on the PHP side. Zend JSON was used, and according to this test, it's about 20-50 times slower on encoding than the PHP C extension. Using json_encode will yield much closer results. Second of all, decoding on the client-side is done through ActionScript, and that is going to be a major bottleneck IF there is lots of data being sent, as in Mike's test case. For reasonable amounts of data, such as what you would receive from Flickr for example, decoding is not going to be such a bottleneck, especially if using AS3.
Second of all: REST-style XML. Of course it's faster than the other two solutions. It's native in the player, so it doesn't suffer from the decoding bottleneck. The tests were done through localhost, so the significant difference in message size is not taken into account like it would be in the real world. In fact, one of the readers commented:
"in my tests, the AMFPHP was faster of what the XML the test remotely (my emphasis) it made the difference =)
Best Regards Leonardo França [This matches what Adobe consulting told me as well... I suggest that people do their own tests on their own applications. - Mike]"
Finally, for amfphp, the main bottleneck is definitely the encoding step on the PHP side. Decoding and dispatching take very little time, encoding/decoding on the client-side take almost no time, and the test is through localhost, so any difference in message size will not have been noticed. So all we're testing when comparing the speed of REST-style XML vs. amfphp is the speed of the encoder. Well, amfphp seems to be about 25%-40% slower in this case, and I think considering all the internal processing that goes on to encode the arrays and type everything automatically, amfphp does a really good job in this case.
All the XML code does is loop through the array and output a string. In the amfphp case, it goes through the serializer's writeData method. The writeData function then asks: is it a string? Is it a boolean? Is it an array? Yes. Is it an associative array? Yes. Then encode an object. Then for each element in the associative array, it asks: is it an int? Is it an object? Is it a string? Yes. Then encode a string. And so on for 5000 iterations. That it would hold its own in such a stress test shows that I must have done some good ;)
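The chain of questions described above can be sketched as follows. This is a hypothetical illustration of runtime type dispatch, not amfphp's actual writeData: it emits readable tags instead of AMF bytes, and the helper isAssociative is my own.

```php
<?php
// Hypothetical sketch of runtime type dispatch; NOT amfphp's real serializer.
// It returns readable tags instead of writing AMF bytes.
function isAssociative(array $value): bool
{
    // A plain list has keys 0..n-1 in order; anything else counts as associative.
    return $value !== [] && array_keys($value) !== range(0, count($value) - 1);
}

function writeData($value): string
{
    if (is_string($value)) {
        return 'string(' . $value . ')';
    }
    if (is_bool($value)) {
        return 'boolean(' . ($value ? 'true' : 'false') . ')';
    }
    if (is_int($value) || is_float($value)) {
        return 'number(' . $value . ')';
    }
    if ($value === null) {
        return 'null';
    }
    if (is_array($value)) {
        $associative = isAssociative($value);
        $parts = [];
        foreach ($value as $key => $item) {
            // Every element runs through the same chain of checks again.
            $encoded = writeData($item);
            $parts[] = $associative ? $key . '=' . $encoded : $encoded;
        }
        return $associative
            ? 'object{' . implode(',', $parts) . '}'
            : 'array[' . implode(',', $parts) . ']';
    }
    throw new InvalidArgumentException('Unsupported type: ' . gettype($value));
}

echo writeData([['name' => 'King', 'count' => 2]]), "\n";
```

Each of the 5000 rows, and each field inside it, pays for these checks, whereas the hand-written XML loop already knows every field is a string and skips them.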
In fact, it's a bit too fast to accurately represent what it really does. The DataGrid component must be taking a good amount of CPU time to render all the elements, and that's probably why XML doesn't beat amfphp by a large margin in this case.
So, while I think the example is contrived, amfphp does a really good job here, as others have verified that the slower encoding is made up for by the native support in the player and by the significant difference in message size. So even though this example puts amfphp in the worst possible light and exploits its bottlenecks, the fact that it still beats XML hands-down when the test is run live through an internet connection is quite impressive. But...
It's not a realistic test case
Would you really send 5000 rows back to a client for the fun of it? Well, I hope you're smarter than that. Real-world use of amfphp, xml and json is very different. It's possible to send that much data back to a client, for various reasons, but the data will look very different. Let's see a real-world example.
The app: Zillow. This app sends a load of data back and forth through the wire to update the house lots and such. The first time I looked at it through ServiceCapture I had noticed it sent most of its results through XML. Now it looks like they are slowly migrating to JSON, which I think is a smart move. Let's look at a service invocation to a page called RetrieveParcelResults:
<parcelresults>
<messages>
<warning>
<message>There are too many homes to display. Please zoom in.</message>
</warning>
</messages>
<query>
<region>[mapped area]</region>
</query>
<results></results>
<map>
<regions>
<zindexregion>
<name>Kitsap</name>
<fullName>Kitsap</fullName>
<regionType>County</regionType>
<zindexValue>$276K</zindexValue>
<geoPageUrl>/local/Washington/Kitsap</geoPageUrl>
<regionBounds>
<bottomLeft>
<latitude>47.403172</latitude>
<longitude>-123.036507</longitude>
</bottomLeft>
<topRight>
<latitude>47.938976</latitude>
<longitude>-122.443352</longitude>
</topRight>
</regionBounds>
<location>
<latitude>47.671074</latitude>
<longitude>-122.73993</longitude>
</location>
</zindexregion>
<zindexregion>
<name>King</name>
<fullName>King</fullName>
<regionType>County</regionType>
<zindexValue>$421K</zindexValue>
<geoPageUrl>/local/Washington/King</geoPageUrl>
<regionBounds>
<bottomLeft>
<latitude>47.084343</latitude>
<longitude>-122.539398</longitude>
</bottomLeft>
<topRight>
<latitude>47.780499</latitude>
<longitude>-121.064598</longitude>
</topRight>
</regionBounds>
<location>
<latitude>47.432421</latitude>
<longitude>-121.801998</longitude>
</location>
</zindexregion>
</regions>
</map>
</parcelresults>
I'm sure you can imagine the code required to generate that. If it was PHP (it's most likely Java as there is a JSESSIONID in the cookies, but for the argument's sake), it would likely look like this:
<?php
// Assuming $messages has been retrieved beforehand
$xml = "<parcelresults><messages>";
foreach ($messages as $message)
{
    // Each message is wrapped in an element named after its level (e.g. <warning>)
    $xml .= "<" . $message['level'] . "><message>" . htmlspecialchars($message['message']) . "</message></" . $message['level'] . ">";
}
$xml .= "</messages>";
// Escape user input before writing it back into the document
$xml .= "<query><region>" . htmlspecialchars($_GET['regionType']) . "</region></query>";
// and so on and so forth for another 50 lines or so
$xml .= "</parcelresults>";
echo $xml;
?>
Of course, if this were amfphp or JSON, that whole XML serialization step would be skipped. On the Flash side, since they are using AS2, it would be as painful as on the PHP side, unless they are using XPath, in which case it would slow things down a bit. Also, everything returns as a string, so they will have to wrap location.latitude and location.longitude in Number(). As far as size goes, well, specifying only two locations yields about 1 KB of data. If they had 100 locations (as could definitely happen with house lots), that would be 50 KB right there, while the equivalent JSON or AMF would surely be under 20 KB. Yes, they could shrink the size of the XML by changing schemas, but they wouldn't have to think about schemas if they had used AMF or JSON.
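For comparison, here is a minimal sketch of what the JSON version might look like with the native json_encode. The array is hypothetical data shaped loosely like the RetrieveParcelResults response above (trimmed to one region), and I am not claiming this is how Zillow structures its JSON.

```php
<?php
// Hypothetical data shaped like the RetrieveParcelResults response (one region only).
$response = [
    'messages' => [
        ['level' => 'warning', 'message' => 'There are too many homes to display. Please zoom in.'],
    ],
    'query'   => ['region' => '[mapped area]'],
    'results' => [],
    'map'     => [
        'regions' => [
            [
                'name'        => 'Kitsap',
                'regionType'  => 'County',
                'zindexValue' => '$276K',
                'geoPageUrl'  => '/local/Washington/Kitsap',
                // Numbers stay numbers, so the client does not need Number().
                'location'    => ['latitude' => 47.671074, 'longitude' => -122.73993],
            ],
        ],
    ],
];

// One call replaces the hand-written concatenation; quoting and escaping are handled for you.
$json = json_encode($response);
echo $json, "\n";
```

The point is less the byte count than the fact that there is no tag-by-tag string building left to get wrong.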
So, think to yourself: How much time was wasted by the developers to think about the current schema? How much time did they have to play with the code so they wouldn't forget the occasional htmlspecialchars or \"? How much time was wasted on debugging that thing? How much time was wasted while the client was waiting for the XML file being downloaded? How much time was wasted while the client processed the info using XPath or childNodes.childNodes.childNodes?
It is in these real world situations that JSON and AMF shine. And when I talk about AMF, I don't mean just amfphp, I mean SabreAMF, WebOrb, OpenAMF, Fluorine, Adobe's offerings, Red5 (soon to come), you name it. That I can take the same PHP file and make it run as a service in SabreAMF, WebOrb or amfphp with practically no modifications speaks volumes. Similarly, I can take a Flash file, change the gateway location and plug my RIA that was developed with amfphp into Fluorine.
In those real world situations, JSON and AMF will not only be faster to develop, to deploy, and to debug, they will be faster to run thanks to the efficient message encoding and built-in language support (meaning, JSON in JavaScript and AMF in ActionScript).
Conclusion
Don't get me wrong, REST-style XML has its uses. If you want to read an RSS feed for example, I wouldn't suggest passing it through amfphp (as some have suggested in the past). If you're using a ready-made web-service that is only available through SOAP, then by all means do it.
If a provider, like Flickr or Yahoo!, gives you the choice between JSON and XML in an AJAX app however, I would definitely suggest using the more efficient JSON. If you have the choice between JSON and XML for a Flash app, I would say it's a toss-up (decoding being non-native in the JSON case, XML being more bloated in the other case).
Now if you're developing the back-end services yourself, then please, by all means, use AMF if you're using Flash, and JSON if you're using AJAX. You don't have to choose between the two, now that amfphp and WebOrb are moving towards unified platforms for both. In the end, using what's native makes the most sense.


