Inefficient Android JSON parser popularized by Stack Overflow

Below is a rant / tip I posted on my school's internal Computer Science Facebook page. It was popular enough that I've decided to immortalize it here. If you find in inaccuracies with it, or have further suggestions, please let me know.

If you're just interested in what a good way to load JSON would be, OkHttp + GSON is probably decent.

Real quick Java tip, for those of us who have to, or choose to, keep using Java after our time at UCF (thanks Android). This wasn't covered in my Java classes at UCF, but maybe I just got unlucky (read: javax.swing all semester). But I think this is really important to keep in mind, especially since I keep seeing it in code I'm working on.

So first off, think about this example:

String str = "Hello World";
String newString = "";
for (int i = 0, int len = str.length(); i < len; i++) {
    newString += str.charAt(i);

What's the runtime? O(n^2), where n = len. Why? Because we're allocating a new String object for every letter of the input string, and copying the new string letter-by-letter into the new string instance. By the time we reach the end of the string, we have to do another n operations. Ok, so technically we're not strictly doing n * n operations, but it's still approximately quadratic when it should be linear.

Even further, what happens to all those old String instances that we're no longer using? They get GC'd. And we can end up allocating so many of them throughout that loop that we freeze the main thread to do a GC operation, dropping frames in an Android app. Not great.

But there's a ton of sample code out on StackOverflow that I keep finding, and once used, in Android applications, that has horrible performance. It looks something like this:

HttpResponse response = httpclient.execute(httpget);
InputStream inputStream = response.getEntity().getContent();
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(inputStream));
String line;
String result = "";
while ((line = bufferedReader.readLine()) != null)
    result += line;
return new JSONObject(result);

Not only do we have the same problem as before (but with lines instead of characters), but we're also generating an entire string from a stream, and then piping that string into a parser to turn it into an object. That's a waste of CPU since we could just stream the input stream into a JSON parser and have our JSON object built as we're getting data streamed over the network.

There's a slightly improved version of the above example floating around Stackoverflow, too, that uses a StringBuilder instead of a String. A StringBuilder is basically an array-backed data structure that you can pipe characters into and construct into a String once all the characters are added. It's pretty awesome and is used to solve the problem in the first example. However, it's worth keeping in mind that the StringBuilder is backed by an array, and the initial size is usually 16. That's tiny in terms of JSON data. And it seems to grow mostly linearly, too:

Capacity: 16, Length: 0
Capacity: 65, Length: 65
Capacity: 132, Length: 130
Capacity: 266, Length: 195
Capacity: 266, Length: 260
Capacity: 534, Length: 325

Like an ArrayList, every time the capacity is exhausted, the array needs to be trashed and reallocated with a new length, and with it's default size and appending long strings, we're pretty much reallocating the array on every append, as seen above.

So three things:

  1. Use a stream, don't build a String unless you need it (JsonObject from GSON can be constructed with an input stream)
  2. If you have to use an StringBuilder, set the initial capacity to something you expect your input to be (hint: HTTP response headers should have a 'Content-length' field)
  3. Everywhere in life, think about how an underlying data structure works, whether it's Python, JavaScript, etc. Here's a neat table about Python operations and their time complexities:

Ok, whew, got through all that. I don't have a high enough rank on StackOverflow to fix the answers in this question, but if someone does, pleaseee do so. That code needs to die.

Also, I genuinely mean for this post to be helpful, not a show-off or a rant. If I'm totally wrong about something, please let me know. Trying to get much better at Java and this ended up being one of the major bottlenecks in several Android apps I've worked on.

Alright, peace!

Original source, for those with UCF Facebook access.