Friday, August 13, 2010

We Don't Need No Stinking Garbage Collection

iPhone developers are sometimes called names or excluded from polite circles for not having Garbage Collection. Recently I gave a presentation at the London iPhone Developers Group meeting which covered this matter from a few angles - including some techniques of my own. There was enough interest that I thought I should write it up in a more widely distributed format - where I could also expand on some points I glossed over due to time constraints (it was a 20 minute slot).

[Image: Kovik waste disposal site (Koviks avfallsdeponi)]

What's that Smell?

So what's the stink about? Well, if you develop for the iPhone you're going to have to use Objective-C (at least for some of your code - e.g. the UI). Until recently Objective-C did not have any form of Garbage Collection. Memory must be managed manually (although helped by a reference counting system). As of Objective-C 2.0 there is now a GC facility, but this is only available on Mac OS X - from Leopard on.

Most languages in common use today have Garbage Collection. There are three stand-out exceptions: Objective-C (on the iPhone), C and C++. As it happens this is precisely the same trio that is sanctioned for use on the iPhone (I'm deliberately avoiding the issue of front-end languages/ platforms such as MonoTouch - especially while their status with respect to the developer licence remains in doubt)!

So why no GC on the iPhone? Before we look at that I think it's worth a quick review of why anyone would think it was needed in the first place.

Remembering to forget your memory usage

Consider this typical snippet of Objective-C code:

NSString* str = [[NSString alloc] initWithFormat: @"One: %d", 1];

// ...

[str release];

The use of NSString here is not that interesting. What we're looking at is that, because we used alloc to get the string, we need to remember to call release when we're done. Note that release doesn't necessarily deallocate the memory for the string there and then. The Objective-C runtime uses a reference counting (or retain counting, in Obj-C parlance) system to track when the last reference is released - at which point dealloc will be called - giving the object a chance to clean up its own resources.

This is hardly rocket science, and it already has many advantages over the raw C way of doing things, where ownership can be passed around like a hot potato and it is often difficult to know whether you are responsible for freeing or not (yes - conventions exist to mitigate this - but they are not standardised).

But when you start putting stretches of code in the middle, maybe with multiple exit points (unless you're a SESE - Single Entry, Single Exit - fanatic and eschew exceptions), it already starts to add mental overhead. Not much, maybe, especially to an experienced developer - but spread across thousands of instances it adds up. When you're writing code you want to focus as much as possible close to the domain level of abstraction - and these language mechanics issues can detract from that.
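
To see why, consider a variation of the earlier snippet with an early return (a contrived sketch - processingSucceeded is an illustrative variable, not from the original example):

NSString* str = [[NSString alloc] initWithFormat: @"One: %d", 1];

if( !processingSucceeded ) {
	[str release]; // this release is easy to forget...
	return;
}

// ...

[str release]; // ...and so is this one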

As well as the pattern shown in the example above there are other variations to consider. If the object is an instance variable you have to put the release in the dealloc method (or set it to nil via a property if you prefer). If the object is obtained from a static convenience constructor (such as [NSString stringWithFormat:]), or if you send it the autorelease message, then you should not call release yourself - but you do need to be aware that the object will live beyond its last use - which may be significant.
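
For example (a minimal sketch of the convenience constructor case):

NSString* str = [NSString stringWithFormat: @"One: %d", 1];

// ... use str ...

// no release here - the object is cleaned up when the enclosing
// autorelease pool is drained, which may be some time after its last use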

If you are given an object, but not passed ownership, and need to keep hold of it then you will need to send it the retain message (then later remember to call release again).
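
The classic example is a setter for a retained instance variable (a sketch - name is an illustrative ivar of type NSString*):

- (void) setName: (NSString*) newName {
	[newName retain]; // take ownership of the new object first,
	[name release];   // in case newName and name are the same object
	name = newName;
}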

Whichever case it may be, doing it wrong has consequences. Failing to call release will result in leaks (which may lead to a crash - but much later). Failing to call retain could mean that you are using an object that may have been deallocated. This will likely lead to a crash sooner. If you're really lucky it will crash close enough to the source of the problem to help you find it - unless your users find it first.

Leaks and crashes are serious problems. Even if 99% of your code is correct in its memory management, just one or two such bugs can ruin your user experience (not to mention your app store ratings).

Deodorant

There are some tools that can help. For leaks we now have the LLVM static analyser integrated with Xcode. This will work with the compiler to analyse your code paths, looking for misplaced releases or retains. It does a pretty good job and I highly recommend using it regularly. But it's not perfect. It misses quite a few cases - especially if the code paths get complex. It can also give false positives.

At runtime we can also use the Leaks instrument, which will track your references and see when objects are still alive after their last use. I suspect that internally it implements garbage collection, or something like it, to do the reference tracking. Instruments itself has become very good lately - making it much easier to filter out all the noise and see just your bits of the code. Again I highly recommend using this tool. But again it won't catch everything. In particular it will only test the code paths you exercise while it's running.

For over-releases, or releasing too early, we can check stack dumps. This may help us to track down the source of a crash - if it's not too far from the suspect code. A better tool, though, is NSZombies - enabled with the NSZombieEnabled environment variable. With this in operation objects that would normally be dealloc'ed get transmuted into zombie objects instead. These will detect if you send any messages to them and log the fact. There are also a handful of other tools and techniques for tracking down leaks and over-releases after the event.
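
For example (a deliberately broken sketch), with NSZombieEnabled set to YES this over-release is reported at the point of the errant message send, rather than as a crash somewhere unrelated:

NSString* str = [[NSString alloc] initWithFormat: @"One: %d", 1];
[str release];                 // the retain count hits zero here
NSUInteger len = [str length]; // logged as a message to a deallocated instance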

So we can cover up the smell to some extent - but just as with real smells, masking is not the same as removing. We still have additional mental overhead distracting us from the task - and extra tools and techniques to apply after the event.

That's rubbish

So why is Garbage Collection not provided for iOS, given that it's been available on the Mac for about three years?

The short answer is: performance. The slightly longer answer is that GC has an overhead. It's an overhead that you pay every time you run your app. Much of the time the overhead may be invisible, or at least barely noticeable. Sometimes, though, it becomes very noticeable. Common tasks such as scrolling through long lists are famously smooth user experiences on iOS devices. Not so on other platforms, which can be glitchy and jerky.

While bad code is something that can plague any platform (and there are other factors, such as hardware acceleration) - at least it is something you control. Issues caused by GC, though, are outside your immediate control. Depending on the GC implementation you may have access to an API that allows you some degree of control - but this is usually in the form of hints.

Whether you are willing to pay the cost or not, whether you think it's an issue or not, others clearly do have issues with it (and they may be your users). So Apple, as platform provider, has taken the design decision to hold back on providing garbage collection.

This leaves us with managing memory in our own apps. Notice I didn't say "manually managing memory" there. Any good developer - faced with implementing the same patterns of code over and over - even small snippets of code - but especially where missing them is easy and dangerous - will find ways to factor out the duplication.

This is a problem that was solved years ago in C++, employing a concept known as RAII.

RAII of light

What is RAII, and how can it help us in Objective-C?

RAII stands for Resource Acquisition Is Initialisation - which is a long and complicated sounding name for a simple but powerful concept. It's also an unfortunate name as it sounds like the focus is on "acquisition" and "initialisation", yet the most interesting part is at the end of the lifetime. To explain what I mean let me cast it in Objective-C terms (if you're familiar with C++ and this concept feel free to skip ahead):

An Objective-C class has one or more initialisers, and a dealloc method. We use these methods to manage our resources (usually pointers to other objects).

In our init method (or designated initialiser) we typically either allocate other objects, or retain objects passed in. Either way we own references to them that we must release later, in our dealloc method.

Of course, we never call dealloc directly - it is called by release once the retain count drops to zero.
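
Putting the two halves of the pattern together (a minimal sketch, with an illustrative name instance variable):

- (id) initWithName: (NSString*) aName {
	self = [super init];
	if( self ) {
		name = [aName retain]; // we own a reference now...
	}
	return self;
}

- (void) dealloc {
	[name release]; // ...so we must release it when we are deallocated
	[super dealloc];
}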

In C++ the equivalent of init methods are called constructors and the analog to dealloc is the destructor. Where C++ differs is that instead of all objects being reference counted heap objects we have either plain heap objects or value types (which typically live on the stack or, if they are member (instance) variables, are scoped by the lifetime of the object they belong to).

The nice thing about value types is that they are destroyed immediately, and predictably, at the end of their scope. This feature is so important it has its own name: Deterministic Destruction.

{
	MyClass myObject;

	// ...

} // destructor will be called here

Now because the destructor is called automatically at the end of the scope - regardless of how it leaves that scope (so it allows for early returns or exceptions) - and we can write arbitrary code in the destructor, this gives us a hook. We can use this mechanism to write code that will be called as execution leaves a scope. In fact this technique, along with templates and a bit of operator overloading, has been used for years to implement wrapper classes for raw C++ pointers that look and feel like pointers but have some sort of automatic memory management mixed in. These classes are known, collectively, as smart pointers. Reference counting smart pointers are just one possibility, but there are many smart pointer traits that can be captured this way - such is the flexibility (and complexity) of C++.
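
To give a flavour, here's a greatly simplified sketch of a reference counting smart pointer (real implementations, such as boost::shared_ptr, also handle assignment, conversions and thread safety):

template<typename T>
class counted_ptr {
public:
	explicit counted_ptr( T* p ) : m_p( p ), m_count( new int( 1 ) ) {}
	counted_ptr( const counted_ptr& other ) : m_p( other.m_p ), m_count( other.m_count ) { ++*m_count; }
	~counted_ptr() { if( --*m_count == 0 ) { delete m_p; delete m_count; } } // deterministic destruction cleans up
	T& operator*() const { return *m_p; }
	T* operator->() const { return m_p; }
private:
	counted_ptr& operator=( const counted_ptr& ); // assignment omitted for brevity
	T* m_p;
	int* m_count;
};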

This is all well and good - but we don't have custom value types, templates or operator overloading in Objective-C. So what use has this discussion been?

Objective-C++

Apple's officially sanctioned languages for iPhone development are Objective-C, C and C++. But actually there is a fourth language, not listed there because it's a hybrid. That "language" is Objective-C++. I use the word language lightly here because Objective-C++ is more of a glue language than a language you would use in its own right. Typically it is used to bridge the worlds of pure Objective-C and pure C++ in order to keep them at arm's length (pun intended) from each other.

But there is no technical reason that you couldn't write an entire application, top-to-bottom, in Objective-C++. The reason you wouldn't typically do so is that they have very different syntaxes, strengths and weaknesses, and design trade-offs. Mixing the two just doesn't look like code written in a single language. C++ is oil to Objective-C's water (and we'll resist the temptation to point out how big the leaks can get if you mix oil and water!).

We're going to carve out a new niche here. We're not going to freely mix the languages without constraint. But we're not going to keep them siloed either. Instead we're going to use some judicious helpers from the C++ side to assist our otherwise pure Objective-C. I don't suggest this lightly. There are certainly some who would take issue with this approach. We'll discuss the trade-offs later.
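
As a small taste of what this enables (a sketch only - not the OCPtr class from the follow-on article - and it must live in a file compiled as Objective-C++, e.g. with a .mm extension), a C++ destructor can send release for us:

#import <Foundation/Foundation.h>

class ScopedStr {
public:
	explicit ScopedStr( NSString* str ) : m_str( [str retain] ) {}
	~ScopedStr() { [m_str release]; } // released when the scope is left - however it is left
	NSString* get() const { return m_str; }
private:
	ScopedStr( const ScopedStr& );            // copying and assignment
	ScopedStr& operator=( const ScopedStr& ); // omitted for brevity
	NSString* m_str;
};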

In the follow-on article I'm going to put this all together and show you my Objective-C++ solution: OCPtr.

Reader Comments (7)

Amen brother!

No garbage collection also helps weed out lame developers ;-)

Btw, your font size is a bit too large though...

Thanks

August 13, 2010 | Unregistered CommenterSasmito Adibowo

I wouldn't go quite that far. GC is a perfectly legitimate technology - which has other advantages too (such as defragging or compacting memory). But it does have some drawbacks in embedded and HPC environments.
As for the font, I'm having some issues with certain browsers (I think Safari is one of them) which I'll be trying to rectify shortly.

August 13, 2010 | Registered CommenterPhil Nash

On GC I do have to admit it saved me on the Mac when Apple's NSXMLDocument had a double-free issue – it looks like a certain kind of HTML document causes it to over-release some object, even though the HTML flag was set when parsing the document. Since I don't have access to the code (and currently haven't found a more suitable HTML parser), I just turned on garbage collection to solve the problem. Turning it on isn't all that's needed, though: I had to audit all my dealloc methods to see which ones release CoreFoundation objects, and create finalize methods for those (CoreFoundation objects are not really garbage collected, except the toll-free bridged ones).

The memory defragmentation advantage may be moot, since the growth of address space is usually a lot faster than the growth of normally installed RAM. That is, by the time most computers come with 4GB standard, the address space has already expanded to 64-bit. And since the virtual memory system already "moves" memory blocks around, the defragging benefit of a GC is rather small.

I suppose mobile devices will follow a similar growth curve to desktops in terms of processing power and RAM. Nowadays 32-bit ARM smartphones have 256MB-1GB of RAM. So even in a 1GB RAM smartphone, the theoretical maximum that an app can use is significantly less than 1/4 of the 32-bit address space available, since there isn't any paging file.

October 4, 2010 | Unregistered CommenterSasmito Adibowo

I think GC on iOS will make more sense after:

(1) Multi-core iOS devices are released;
(2) GPU becomes a standard on high-end mobile devices;
(3) Android and Microsoft Windows Phone 7 prove to be serious competitive threats to iOS.

GC interleaved on one CPU with the main thread pretty much kills the CPU cache efficiency, which results in UI jerkiness, so it is indeed very helpful to have a second CPU to run the GC thread.

Even the first iteration of Windows Phone 7 already demonstrated that one can indeed have both garbage collection and smooth UI, if said UI is mostly rendered by the GPU.

Only competitive pressure from other players will compel Apple to introduce a more modern app dev platform, based on, say, Scala, which should be much easier to integrate with Cocoa libraries than C# or Java, due to Scala's immense flexibility and expressive power.

January 3, 2011 | Unregistered CommenterSergei Lopatin

@Sergei: You are quite likely correct. I never said that jittery scrolling was inevitable with GC - just that they do seem to currently be strongly correlated.
I'm not sure that the GPU is enough, though (although it certainly helps). Even when scrolling there is a lot more going on than just GPU operations. New cells have to be created/ reused and populated - and that information loaded into the GPU on the fly. Even with GPU acceleration such code often has to be optimised to be smooth - so it can be a fine line - and one that non-deterministic garbage collection may push you over. An extra core might make the difference, though (as would a faster CPU and/ or more memory).

I fully expect iOS to get optional GC within the next two years (hey, it's the time of year for finger-in-the-air predictions, right?).

January 5, 2011 | Registered CommenterPhil Nash

I come from a Python programming background!
I recently started iOS development. In my code I always get these 'EXC_BAD_ACCESS' crashes and memory leaks. It's very-very-very annoying!

I think not having GC makes it difficult for novice iOS programmers like me to write programs. It also makes development very slow because you're always thinking about optimizing the code rather than the output.
No GC is a characteristic of an obsolete language. A GC is very much needed.

September 28, 2011 | Unregistered CommenterAnish Kumar

@Anish Kumar

If by 'novice programmer' you mean not a computer scientist, i.e. you have no concept of how to manage data and thus memory, then it's probably for the best if you stay away from writing iOS apps anyway...

November 12, 2011 | Unregistered CommenterN R
