Extra Cheese

A Blog


When JSON isn't JSON

Jul 23, 2007

JSON is so simple that you can specify it on an index card, but we still can't get it right. For example, here's what happens when simplejson and python-cjson talk about slashes:

>>> import simplejson, cjson
# simplejson correctly decodes cjson's data
>>> print simplejson.loads(cjson.encode('/'))
/
# cjson fails to decode simplejson's data
>>> print cjson.decode(simplejson.dumps('/'))
\/

In this case, the problem is that cjson doesn't handle the backslash-escaped slash correctly. There is more than one way to say "/" in JSON; the two that matter here are "/" and "\/". When encoding, simplejson always escapes slashes, but cjson never does:

>>> print simplejson.dumps('/')
"\/"
>>> print cjson.encode('/')
"/"

The reverse is also true: simplejson knows how to decode "\/", but cjson decodes it incorrectly:

>>> print simplejson.loads('"\/"')
/
>>> print cjson.decode('"\/"')
\/
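For comparison, here is a minimal sketch of what a conforming decoder should do with each spelling. It assumes Python 3 and the standard-library json module, neither of which appears in the sessions above; the names spellings, decoded, and encoded are only for illustration.

```python
import json

# Three spellings of the same one-character JSON string:
# a bare slash, a backslash-escaped slash, and a Unicode escape.
spellings = ['"/"', '"\\/"', '"\\u002f"']

# A conforming decoder maps all three to the same Python string.
decoded = [json.loads(s) for s in spellings]
for raw, value in zip(spellings, decoded):
    print(raw, "->", value)

# The standard-library encoder leaves the slash unescaped.
encoded = json.dumps("/")
print(encoded)
```

Every spelling should decode to a plain "/", so it doesn't matter which one an encoder picks as long as the decoder on the other end accepts them all.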

So there you go: simplejson and cjson don't interoperate. This bit me when I tried to move BitBacker from simplejson to cjson for performance reasons. The live alpha server had a few thousand records encoded with simplejson, all of which included slashes. When I switched to cjson, everything broke because every "/foo/bar" entry in the database came back as "\/foo\/bar".
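One possible workaround, sketched here rather than taken from what I actually ran, is to rewrite the stored JSON text so that it no longer contains "\/" before handing it to a decoder that doesn't understand that escape. The helper name unescape_slashes is hypothetical, the sketch assumes the input is valid JSON (so backslashes only appear inside strings), and it uses Python 3 with the standard-library json module standing in as the reference decoder.

```python
import json
import re

# Matches one backslash escape at a time: a backslash plus the next character.
_ESCAPE = re.compile(r"\\(.)", re.DOTALL)


def unescape_slashes(json_text):
    """Hypothetical helper: replace each escaped slash with a bare slash."""

    def fix(match):
        # Only the escaped slash is rewritten; every other escape
        # (including an escaped backslash) is returned unchanged.
        return "/" if match.group(1) == "/" else match.group(0)

    # Escapes are consumed left to right, so an escaped backslash that
    # happens to be followed by a slash is not mistaken for an escaped slash.
    return _ESCAPE.sub(fix, json_text)


stored = r'"\/foo\/bar"'   # what simplejson wrote to the database
cleaned = unescape_slashes(stored)
print(cleaned)

# Both forms mean the same thing to a conforming decoder.
assert json.loads(cleaned) == json.loads(stored)
```

Matching whole escape sequences instead of doing a blind string replace is the point of the design: a naive replace of "\/" would corrupt a string that contains an escaped backslash followed by a slash.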

As far as I'm concerned, this problem with JSON is actually an argument for simple data formats like JSON. If we can't get full interoperability between two implementations of something as stupidly simple as JSON, how did anyone ever expect WS-* to work?



Showing 2 comments

Posted by Mark Ramm at Thu Dec 27 11:31:41 2007

Were you using the version of simplejson with the optional c-speedups included? 

I haven't done any formal benchmarking, but it seems like that significantly reduces the performance difference between simplejson and cjson.


Posted by Matt Billenstein at Fri Feb 1 13:51:54 2008

A fix here:  http://www.vazor.com/cjson.html

