A blog by Gary Bernhardt, Creator & Destroyer of Software

When JSON isn't JSON

23 Jul 2007

JSON is so simple that you can specify it on an index card, but we still can't get it right. For example, here's what happens when simplejson and python-cjson talk about slashes:

# simplejson correctly decodes cjson's data
>>> print simplejson.loads(cjson.encode('/'))
/
# cjson fails to decode simplejson's data
>>> print cjson.decode(simplejson.dumps('/'))
\/

In this case, the problem is that cjson doesn't handle backslash-escaped slashes correctly. There are two common ways to say "/" in JSON: "/" and "\/". When encoding, simplejson always escapes slashes, but cjson never does:

>>> print simplejson.dumps('/')
"\/"
>>> print cjson.encode('/')
"/"
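
If you don't have either library installed, the two spellings can still be compared with Python's standard-library json module. This is only a sketch: json is swapped in here so the example runs anywhere, and it is not one of the two libraries above. It shows that a conforming decoder treats both spellings as the same one-character string.

```python
import json

# Two JSON documents that spell the same one-character string.
plain = '"/"'
escaped = '"\\/"'  # quote, backslash, slash, quote

# A conforming decoder returns a bare slash for both spellings.
print(json.loads(plain) == "/")
print(json.loads(escaped) == "/")

# Either spelling is valid output for an encoder; the standard-library
# encoder happens to pick the unescaped one.
print(json.dumps("/") == '"/"')
```

All three lines print True.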

Decoding has the mirror-image problem: simplejson decodes "\/" to "/", but cjson leaves the backslash in place:

>>> print simplejson.loads('"\/"')
/
>>> print cjson.decode('"\/"')
\/

So there you go: simplejson and cjson don't interoperate. This bit me when I tried to move BitBacker from simplejson to cjson for performance reasons. The live alpha server had a few thousand records encoded with simplejson, all of which included slashes. When I switched to cjson, everything broke because every "/foo/bar" entry in the database came back as "\/foo\/bar".
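
One way out of that kind of bind is to rewrite the stored records before switching decoders: decode each one with a decoder that understands "\/", then re-encode it with an encoder that doesn't escape slashes. The sketch below uses the standard-library json module for both steps. The function name normalize_record and the sample record are made up for illustration, and nothing here describes what was actually done to the BitBacker database.

```python
import json

def normalize_record(raw):
    """Hypothetical helper: decode one stored JSON document and
    re-encode it so that slashes come out unescaped."""
    return json.dumps(json.loads(raw))

# A record in the style simplejson wrote: every slash is escaped.
old_record = '"\\/foo\\/bar"'
new_record = normalize_record(old_record)

# The rewritten record contains no backslashes at all, so a decoder
# that mishandles "\/" never sees the escape.
print(new_record)

# The decoded value is unchanged by the rewrite.
print(json.loads(new_record) == json.loads(old_record))
```

This prints "/foo/bar" (with the quotes) and then True. Note that json.dumps with default settings may change other details of a document, such as whitespace and how non-ASCII characters are escaped, so a real migration would want to check that the decoded values still match.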

As far as I'm concerned, this problem with JSON is actually an argument for simple data formats like JSON. If we can't get full interoperability between two implementations of something as stupidly simple as JSON, how did anyone ever expect WS-* to work?