Friday, June 17, 2011

Jackson and I

Now I need a JSON parser library suitable for App Engine. My needs are pretty modest: a few arrays of simple objects, mostly to handle message lists. So I do the usual thing and look around the internet for a library.

I've written parsers before, lots of them, and there are a couple of ways to go about it. Some parsers care about speed, some care about good error messages, some care about helpful data conversion. It's important to pick one that matches your problem space. In this case, I intend to parse lots of JSON, both arriving from the Internet over HTTP and read back from strings stored securely in the database. That means speed and robustness (against attacks) matter, while I can be lax about things like error reporting. If the AJAX string doesn't parse because of a network burp, it just retries: it's not going to bother the user with debug screens. In fact, too much user error reporting is a security leak for what I'm going to be doing.

I'd like something well maintained, with all the bugs worked out. That's always the great wish.

So I come across Jackson: [ Jackson JSON Processor | Jackson In Five Minutes ]

It's been out for about two years now, and has gone through the expected flurry of patches as it settles in. A couple of other major OSS projects (e.g. Spring) use it as their JSON library, so it's presumably running in a lot of places already. The features read like a personal wishlist:
  • Streaming (reading, writing)
  • FAST (measured to be faster than any other Java JSON parser and data binder)
  • Powerful (full data binding for common JDK classes as well as any Java bean class, Collection, Map or Enum)
  • Zero-dependency (does not rely on other packages beyond JDK)
  • Open Source (LGPL or AL)
  • Fully conformant
  • Extremely configurable
I downloaded the version 1.8.2 tgz file and used Eclipse to import the packages into my App Engine project (straight out of the archive file, nice one Eclipse!). I was instantly grateful I had grabbed the source release rather than the pre-packaged JARs, because some methods on the parser read and write FileStreams, and GAE blacklists those classes.

At the source level, it's an obvious compile error/warning. I just comment those few constructors and methods out, and everything comes up green in the editor. I wouldn't have been able to do that with the JARs. And the Google tools for Eclipse integrate so well that this was all caught by the IDE before I even did an explicit compile. Probably saved me an hour of screaming frustration right there.

The File-based methods excise cleanly, and the one missing dependency (the "org.joda.time" libraries; so much for "zero dependency" :-) just needs renaming to "com.google.appengine.repackaged.org.joda.time" now that we're inside GAE, a fix suggested by the editor itself.

I like these new tools. I'm never going back to vi.

So Jackson passes my 'can I get it to compile' test in less than five minutes. Good. Therefore, it's time to read the documentation and try some basic operations.

All software has a personality. Mostly it comes from the programmer, but some comes from the problem itself. Jackson, to me, is an elder Australian tradesman, with a wide toolbelt and depth of experience:
"What's yer problem, mate?"
"I've got some JSON I need to decode."
"Ah, yeah. Much to do?"
"Just a few object lists."
"No worries. There's three ways we can do this."
"Three?"
"Yup. The first is, you tell me everything in advance, fill out the names of all the classes you expect to get, and I'll build up all the objects from scratch. It's called 'mapping'. Good stuff."
"What if I don't know what I'm going to get in advance?"
"Ah, that's option two. I just parse all the data into generic collections, maybe a tree if you like, lots of people like trees these days, and I give you a DOM thing when it's done."
"What's option three?"
"That's for our 'advanced' customers. I don't think ya want that one, not yet."
"What is it?"
"Well, third option is: I just shovel it at you as fast as it comes in, and it's up to you. No storage fees. Good for bulk jobs."
"I see. And you can encode back the other way?"
"Of course, mate. That's the easy part."
I think Jackson and I are going to get along just fine.
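
For the record, the tradesman's three options look roughly like this in code. This is only a sketch against the Jackson 1.x API (the org.codehaus.jackson packages that version 1.8.2 ships); the Message class, its id and subject fields, and the method names are all invented for illustration, and real message lists would need their own classes.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.codehaus.jackson.JsonFactory;
import org.codehaus.jackson.JsonNode;
import org.codehaus.jackson.JsonParser;
import org.codehaus.jackson.JsonToken;
import org.codehaus.jackson.map.ObjectMapper;

class JacksonThreeWays {

    /** Hypothetical message shape; the field names must match the JSON keys. */
    public static class Message {
        public int id;
        public String subject;
    }

    private static final ObjectMapper MAPPER = new ObjectMapper();

    /** Option one, data binding: name the class up front, get objects back. */
    static Message[] bind(String json) throws IOException {
        return MAPPER.readValue(json, Message[].class);
    }

    /** Option two, tree model: no classes needed, walk generic nodes instead. */
    static List<String> subjectsFromTree(String json) throws IOException {
        List<String> subjects = new ArrayList<String>();
        JsonNode root = MAPPER.readTree(json);
        for (JsonNode item : root) { // iterating an array node yields its elements
            subjects.add(item.get("subject").getTextValue());
        }
        return subjects;
    }

    /** Option three, streaming: handle tokens as they arrive, store nothing else. */
    static List<String> subjectsFromStream(String json) throws IOException {
        List<String> subjects = new ArrayList<String>();
        JsonParser parser = new JsonFactory().createJsonParser(json);
        try {
            while (parser.nextToken() != null) {
                if (parser.getCurrentToken() == JsonToken.FIELD_NAME
                        && "subject".equals(parser.getCurrentName())) {
                    parser.nextToken(); // advance from the field name to its value
                    subjects.add(parser.getText());
                }
            }
        } finally {
            parser.close();
        }
        return subjects;
    }
}
```

Fed a string like [{"id":1,"subject":"hello"},{"id":2,"subject":"world"}], all three methods see the same two messages. The difference is how much is held in memory along the way: whole typed objects, a whole generic tree, or just the token currently under the parser.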

Because I'm used to PHP's server-side JSON, I made up a couple of wrappers to emulate those functions:


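// Jackson 1.x lives in the org.codehaus.jackson packages.
import org.codehaus.jackson.map.ObjectMapper;

// Created on first use, then shared by both wrappers.
private ObjectMapper mapper;

// Like PHP's json_encode(): returns the JSON text, or null on any failure.
protected String json_encode(Object x) {
    if (mapper == null) mapper = new ObjectMapper();
    try {
        return mapper.writeValueAsString(x);
    } catch (Exception e) {
        // Deliberately silent: the caller only needs to know it failed.
        return null;
    }
}

// Like PHP's json_decode(): returns generic Maps, Lists, Strings, numbers
// and Booleans, or null on any failure.
protected Object json_decode(String x) {
    if (mapper == null) mapper = new ObjectMapper();
    try {
        return mapper.readValue(x, Object.class);
    } catch (Exception e) {
        return null;
    }
}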

This generic encode function turns out to be useful for debugging Java state. It's not so useful for direct encoding of JDO data, however, because a lot of intermediate objects clutter up the serialization: every 'Text' property gets a sub-object with a 'Value' property rather than just a simple string. I think Jackson knows how to deal with this, and that the default serialization behavior can be overridden with some Java annotations. That would be useful.
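
One way the override might go is Jackson's mix-in annotations, which attach an annotation to a class you can't edit. The sketch below is untested against real JDO data and rests on assumptions: Text and Message are hypothetical stand-ins (the real datastore class is com.google.appengine.api.datastore.Text), and it assumes the wrapper exposes its string through a getValue() method.

```java
import java.io.IOException;

import org.codehaus.jackson.annotate.JsonValue;
import org.codehaus.jackson.map.ObjectMapper;

class TextFlattening {

    /** Hypothetical stand-in for a Text-like wrapper we cannot annotate directly. */
    public static class Text {
        private final String value;
        public Text(String value) { this.value = value; }
        public String getValue() { return value; }
    }

    /** Hypothetical entity with one wrapped property. */
    public static class Message {
        private final String subject;
        private final Text body;
        public Message(String subject, Text body) {
            this.subject = subject;
            this.body = body;
        }
        public String getSubject() { return subject; }
        public Text getBody() { return body; }
    }

    /**
     * Mix-in: never instantiated. Jackson copies the annotation onto the
     * method of the target class with the same name and parameters.
     */
    public abstract static class TextMixIn {
        @JsonValue
        public abstract String getValue();
    }

    /** Encodes x, writing every Text as a bare string instead of a sub-object. */
    static String encode(Object x) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        mapper.getSerializationConfig().addMixInAnnotations(Text.class, TextMixIn.class);
        return mapper.writeValueAsString(x);
    }
}
```

With the mix-in registered, the body of a Message should come out as a plain JSON string rather than an object wrapping one. Whether this is the tidiest fix for JDO entities is an open question; a custom serializer would be the heavier alternative.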

I'll have to read some more. In fact, now that it's installed and working I can justify the time to read the documentation. Pity it's never the other way around.
