Monday, August 10, 2009

iPhone tutorial: Memory management - the basics

In this tutorial I will try to explain how to manage dynamically allocated memory in Objective C, or more specifically in Cocoa, which is the object oriented programming environment ("API") used for most iPhone applications.


I will start with a long introduction to general memory management so if you already know everything about that, feel free to skip to the Cocoa specifics. The reason for the long introduction is that it's memory management in general that is complex, not the particulars of Cocoa memory management.


A long introduction to general memory management


When you need a block of memory for something in a computer program you normally allocate it by making a call to a memory manager. The memory manager might be provided by the operating system, a library, a framework, or something similar. When you don't need the allocated block of memory any longer you normally free it - also by calling the memory manager - to make it available for allocation again.


In most languages with manual memory management, that is, where you, the programmer, has to decide when to allocate and free memory there is one thing you have to remember: free the memory that you allocated! If you forget to free a block of memory, the computer will sooner or later run out of memory, which will cause most programs to stop function properly. Forgetting to free allocated memory is often referred to as leaking memory.


In simple programs, it's quite easy to decide when to allocate and when to free memory, since these programs usually do things in a linear fashion; you allocate memory, do some processing and then free it.


memoryBlock = allocateMemory(10); // allocate a 10 byte big memory block

process(memoryBlock); // do some processing

free(memoryBlock); // free the allocated memory block


However, as the complexity of a program increases, you might have to keep the allocated memory around for longer periods of time and the linear chain of events (allocate, process, free) might be broken. Furthermore, you will probably have to allocate multiple blocks of memory and perhaps free them in a different order than the one you allocated them in.


When you start interacting with code you don't control yourself (libraries, classes, operating systems, frameworks, etc.) things can get complicated even if the underlying principle is very simple (free what you allocated). It can get especially nasty when you allocate a block of memory and then "give it away to someone else" or if you get a memory block from someone. Who should free that memory block and, just as importantly, when should it be freed? To answer those questions, there has to be a way of deciding who owns a block of memory and/or when it is not needed any longer - something we will dive deeper into later in this tutorial.


Addresses and pointers


When you allocate a memory block from a memory manager you normally get the address of the memory block as the result. The address is simply an offset into the computer's (or program's) memory space, where each byte has a unique address starting from 0 (zero) and increasing upwards.


Let's say that we have a very simple computer system, with only 256 bytes of memory. That is, there are memory address ranging from 0 to 255. If we start by allocating 10 bytes of memory, you would get the address 0 (zero) as the result. If you then allocate 20 bytes you would get the address 10 (0 + 10) as the result since addresses 0 to 9 are already allocated. Allocating yet another 30 bytes would get the address 30 (10 + 20), and so on.


(Memory address: Memory block size)

0...9: 10 bytes

10...29: 20 bytes

30...59: 30 bytes


If you store the address returned by the memory manager in a variable, that variable is often called a pointer, since it "points to" a memory block. In most languages you normally specify that the variable is going to be used as a pointer, since this will allow you to perform "pointer operations" on that variable. It will also enable sanity and error checks. Furthermore, if you can inform the language what type of information the memory block will contain - a number, a string, a specific data structure, an array of integers, etc. - you often get access to even more operations and error checking.


Note that if you don't store the address of the allocated memory block in a variable (or somewhere else) it will be impossible the free the memory block, and you will leak memory!


It's important to understand that a pointer isn't something magical - it's just a value! This means that you can use pointers anywhere you can use variables and values; passing them as arguments to a function or a method for example. You can also create copies of pointers without any strange side effects - it won't copy the memory block! - all that happens is that you now have two variables pointing to the same memory block.


pointer = allocate(10); // allocate a 10 bytes big memory block and store the address in a variable called 'pointer'

pointerCopy = pointer; // assign the value (address) stored in 'pointer' to a variable called 'pointerCopy'


If you want to free the allocated memory block you can do it either by calling free(pointer) or free(pointerCopy) since they both point to the same memory block, but it's very important to free the block only once! Calling both free(pointer) and free(pointerCopy) could have catastrophic consequences! Therefore, once you start copying pointers or passing them as arguments to functions or methods you have to be very careful to ensure that the memory is freed only once!


Ownership of dynamically allocated memory


The simplest way of deciding who should free a block of memory and when is to let the owner of the memory block decide it. In simple programs, the code allocating the memory block is the owner, but in more complicated programs ownership of a memory block can be transferred between different parts of the system (library, operating system, framework).


In some cases it's obvious who should free the memory block. For example, if you have an arrangement where a producer task allocates memory and hands it over to a consumer task, the consumer will free the memory when it is done with it. The same arrangement is often seen when a program interacts with a library or a framework. The library or framework allocates memory for you, but it's your responsibility to free it. In both these arrangements, ownership of the memory block has been transferred from the one that allocated it to someone else.


But what happens if the producer would like to keep the memory block around for its own use after passing it to the consumer? Or what if the producer wants to send the same memory block to a logging task as well as to the consumer task? In a situation like this, the consumer and logger are probably not aware of each other and therefore both would think that they would be responsible for freeing the memory block given to them by the producer.


producer: pointer = allocateMemory(10)

producer: process(pointer)

producer: sendToConsumer(pointer)

producer: sendToLogging(pointer)

...

logger: pointer = getMemoryBlockFromProducer()

logger: process(pointer)

logger: free(pointer)

...

consumer: pointer = getMemoryBlockFromProducer()

consumer: process(pointer)

consumer: free(pointer)

...


If the logger is executed before the consumer, like in the scenario above, the memory block would already be freed by the logger once the consumer gets it. This is catastrophic since anything could have happened to that memory block after it was freed. The worst scenario is that someone else could have allocated the same memory block and changed the contents of it.


The problem here is that it is not clear who owns the memory block - the producer, consumer or logger - and therefore it is not clear who should free it. In fact, one can say that there are multiple owners of the memory block!


Copying memory to avoid ownership problems


A simple way of avoiding ownership problems is to ensure that each memory block only has one owner and that that owner is responsible for freeing it. This can be achieved by allocating a new memory block, copy the contents of the original block into the new block and then hand over the copy to the one needing it.


producer: pointer = allocateMemory(10) // allocate a 10 bytes memory block

producer: process(pointer) // do some processing


producer: pointer2 = allocateMemory(10) // allocate a second 10 bytes memory block

producer: copyMemory(pointer2, pointer) // copy the contents of the first memory block into the second memory block

producer: sendToConsumer(pointer) // send the second memory block to the consumer


producer: pointer3 = allocateMemory(10) // allocate a third 10 bytes memory block

producer: copyMemory(pointer3, pointer) // copy the contents of the first memory block into the third memory block

producer: sendToLogging(pointer3) // send the third memory block to the logger

...


producer: free(pointer) // free the first memory block since we are the owner of it


The consumer and logger both free their respective memory blocks once they are done with it. This works very well, but there are two things to consider; more memory is used since we make a lot of copies and a lot of time is wasted with copying the information from the original block into the copy, which will make the program run slower. On modern computers with plenty of memory and fast CPUs, this is probably no catastrophe, but still a little wasteful. On the other hand, the wastefulness might be a cheap price to pay for avoiding memory leaks or other nasty memory related problems!


Another thing to consider is that the information in the memory blocks (the original and the copies) start "living their own lives", that is, when the information in the original block is changed the information in the copies remains unchanged and vice versa. In some situations this doesn't matter or might even be required, but in others it is catastrophic. The computer doesn't know which behaviour you want/require, so you as the programmer need to inform the computer by writing code that produces the required behaviour.


Copying memory that contains pointers


We have previously hinted that a pointer doesn't have to be stored in a variable - it can also be stored "somewhere else". This other place is normally in a memory block. We have emphasized that a pointer is just a value, but we haven't mentioned anything about what type of value it is. It's easiest to consider a pointer to be a positive integer with a value between 0 (zero) and some maximum value. The simple computer system used above only has 256 bytes of memory, that is addresses between 0 and 255, which means that the maximum value for a pointer in this system is 255. This, in turn, means that a pointer can be represented using a single byte.


Deep and shallow copies


Let's say we have the following arrangement. We allocate three 1-byte memory blocks, starting at address 50 and fill the blocks with the integers 1, 2 and 3 respectively. After that we allocate a 3-byte memory block at address 53 (50 + 3) and fill the block with the values 50, 51 and 52.


(Address: Memory block size: Contents)

50: 1 bytes: 1

51: 1 bytes: 2

52: 1 bytes: 3

53: 3 bytes: 50, 51, 52


What we have created in the 3-byte memory block is effectively an array of integers, but instead of containing the integers themselves, the array contains pointers to the integers.


What if we want to make a copy of this array? If the array had contained the integers themselves, there wouldn't be much to think about; just allocate a second 3-byte memory block and copy the contents of the first 3-byte memory block into the second. But now the array contains pointers to the integers instead of integers, so if we try the simple copy just described we would end up with the following:


(Address: Memory block size: Contents)

50: 1 bytes: 1

51: 1 bytes: 2

52: 1 bytes: 3

53: 3 bytes: 50, 51, 52

56: 3 bytes: 50, 51, 52


What happens is that we have copied the pointers. Nothing else. This is similar to what we did above when we copied a variable containing a pointer. Just as above, we now have two pointers pointing to the same memory block. If we change the integer at address 50 from 1 to 10, this change will be "seen" by both the original array and the copy.


Sometimes this is what one needs or requires, but sometimes a "real" - fully independent - copy is required, that is, a copy which won't be affected by changes in the original.


To accomplish this, we need to copy the three 1-byte memory blocks instead of the 3-byte memory block containing the pointers. That is, we have to allocate three new 1-byte memory block and copy the contents of the first three 1-byte memory blocks into the three new ones. After that we have to allocate a second 3-byte memory block and put the addresses of the three new 1-byte memory blocks in it.


(Address: Memory block size: Contents)

50: 1 bytes: 1

51: 1 bytes: 2

52: 1 bytes: 3

53: 3 bytes: 50, 51, 52

56: 1 bytes: 1

57: 1 bytes: 2

58: 1 bytes: 3

59: 3 bytes: 56, 57, 58


Now we have made a truly indepent copy of the array. This is called making a deep copy. Just copying the pointers as we did initially is called making a shallow copy. If we now change the integer at address 50 from 1 to 10, this will only be "seen" by the original array, not the copy.


As mentioned above, the decision to make a deep or shallow copy can only be made by the programmer, since the computer doesn't know what kind of behaviour your require.


Reference counting


As we have mentioned before, it's quite common to get into situations where a memory block has several owners or users. We have also mentioned that it's hard to decide when and who of the owners that should free such a block. A technique called reference counting solves both these problems. Provided that all the owners play by the same rules, that is.


A simple set of rules that is quite easy to follow and also quite easy to implement in the memory manager is a technique called reference counting. As the name implies, the technique works by counting how many references (pointers) there are to a certain memory block, that is how many owners the block has.


When the memory block is first allocated, by the first owner, the counter is set to 1. If the first user "gives away" the block to someone else, for example by passing it to another task, the counter is increased by one 1 to signal that there now are two owners of the block - or two references to it.


When any of the owners frees the block, the counter is decreased and when it reaches zero it is freed. If the counter is greater than zero, nothing happens! This means that the memory block is "kept around" until all users are done with it - very simple and very efficient!


In normal memory management all you do is allocate memory, passing and copying pointers, and freeing memory. The only thing that is different from "normal" memory management when you use reference counting is that someone actively has to increase the reference counter. Also, you won't be able to call the normal free function when you're done with a memory block; you instead have to call a special kind of free that decreases the reference counter and don't call the normal free until the counter reaches zero.


To summarise, when you use reference counting you have to inform the computer when you want to add an owner (increase the counter) or remove an owner (decrease the counter) to a memory block. When it's appropriate to do this is completely up to the programmer to decide since the computer has no way of knowing what type of behaviour you want, and that is where the complexity of reference counting lies - calling the add/remove owner functions at the right time and place.


So how could reference counting be implemented by the memory manager? A very simple solution is to allocate a little more memory each time someone requests memory and use this extra memory to hold the reference counter for the memory block. In our simple 256-byte computer, this could mean allocating one extra byte for each memory block. If some requests 3 bytes of memory, the memory manager actually returns 4 bytes of memory but instead of returning the address to the first byte in the memory block, it returns the address to the second byte.


50: 1 (this address holds the reference counter and is initialised to 1 by the memory manager)

51: ??? (this is the address returned by the memory manager)

52: ???

53: ???


When someone calls the "remove owner" function on the memory block above, the argument to the function will be a pointer containing the value 51 (not 50!). However, the function knows that there actually is an extra byte stored before the memory block, so it decreases the value stored in the extra byte by one and checks if it becomes zero (no more owners left). If it reaches zero, the 4-byte memory block is freed "for real."


The best thing with this very simple solution is that it can be implemented "on top of" a normal, non-reference counting, memory manager. This makes it possible to write a reference counting memory manager in a library (or even in the application itself!). After that you could start writing applications that use the library and these applications would then get support for reference couting memory management even though the underlying memory manager (for example an operating system) doesn't support it.


So how big an impact does reference counting have on the code that you write? Do you have to start thinking in completely new ways to use it properly? Well, you do have to start using at least two new function calls; one for adding an owner to a memory block and one to remove an owner from a memory block. You also have to starting thinking a little differently since you need to call these two functions at appropriate places in the code.


In an example above (in the section about copying memory blocks to avoid ownership problems), we had the producer copy a memory block before giving it to the consumer and logger respectively. Could we have made it the "other way round", that is, having the consumer and logger copy the block given to them? Yes, that would have worked just as well! Who should make the copy is just a matter of perspective, that is, who should be aware of the ownership problems - the giver or the receiver?


The same reasoning applies to reference counting, but instead of copying memory blocks an owner is added to a memory block. Adding an owner can be made either by the giver or the receiver. However, often it's "nicer" or more convenient to have the receiver add itself as an owner instead of having the giver add the receiver as an owner. Later, when the receiver is done with the memory block it can remove itself as an owner. That way, the memory management issues can be completely "contained" in the receiver (application, class, library, framework, etc.).


Objective C memory management


In object oriented languages like Objective C, memory is usually allocated in order to create an object. Therefore, one usually talks about objects instead of memory blocks and references to objects instead of pointers even though a reference is nothing else but a variable containing the address of an object. Just as with pointers, references are normally "typed" to inform the language what type of object that is "referenced", but there is also a general reference type called 'id' which can reference any type of object.


As with all posts in this blog, remember that I am writing about stuff I have recently begun to understand myself and this is particularly true for the memory management parts of Objective C and Cocoa. That said, as far as I understand, Objective C doesn't have any standardised memory management scheme apart from functions malloc() and free from the standard C library.


Cocoa memory management


Cocoa can be explained as a "programming environment", API or class library intended to radically simplify the development of complex (graphical) applications. In this tutorial, we're going to focus on what it has to offer when it comes to memory management.


Cocoa uses reference counting to manage memory ownership problems and it's usually the receiver of a reference that "adds itself as an owner" to an object. In Cocoa, the reference counting scheme is accessed through the Cocoa root class NSObject, or actually the NSObject protocol. Since almost all objects in Cocoa implement this protocol, the reference counting scheme is available to "everyone". Furthermore, a set of rules have been developed by the Cocoa developers, meaning that almost all Cocoa classes are compatible with each other with regard to memory management. These rules also simplifies the life for the application developer (you!).


Allocating an object


When you want to allocate memory for an object you call the 'alloc' method of the class.


MyClass *myClass = [MyClass alloc];


This line of code first declares an object reference called 'myClass' of the type 'MyClass *'. The asterisk (*) is a syntax inherited from the C language, where it declares a variable as a pointer. In this case a pointer to an object of the type MyClass.


[MyClass alloc] sends the message/calls the method 'alloc' of the class MyClass, which allocates a sufficient amount of memory for an object of the type MyClass. It also sets the reference counter of the memory block used for the object to 1 (one) to indicate that there now is one owner associated with the memory - the one calling 'alloc'.


Initialising an object


Before an object can be used it has to be initialised, which normally is done right after memory has been allocated for the object. Initialising an object puts it into a known initial state; assigning default values to member variables, etc. You do this by calling the 'init' method of the NSObject protocol.


MyClass *myClass = [MyClass alloc];

[myClass init];


Note that 'init' is called on the object 'myClass' and not the class 'MyClass', that is, it uses the an object/instance of the type MyClass rather than the class MyClass itself.


Init also returns an object reference (pointer) and since an 'init' method is allowed to return another object than the one it was called on, it's very important to use the pointer returned by init:


MyClass *myClass = [MyClass alloc];

myClass = [myClass init];


Since both 'alloc' and 'init' return an object reference and you have to use the one returned by 'init', it's often more convenient to use "cascading" method calls to allocate and initialise an object with a single statement.


MyClass *myClass = [[MyClass alloc] init];


Freeing an object


When you are done with an object you need to free it, or actually "remove yourself as an owner" as Cocoa uses reference counting rather than manual memory management. You do this by calling the 'release' method of the NSObject protocol.


[myClass release];


This method decreases the reference counter of the memory associated with the object and if the counter reaches zero, the memory is freed. However, before the memory is freed, the 'dealloc' method of the class is called to make it possible for the object to "clean up" before it's freed. Cleaning up basically boils down to releasing all references held by the object. If this step is skipped, memory will leak since no one will take care of releasing the references held by the freed object since it no longer exists!


Adding an owner to an object


When an object should be shared/used by multiple users, the additional users need to add themselves as owners. This is done by calling the 'retain' method of the NSObject protocol.


MyClass *myClass = [[MyClass alloc] init]; // allocate and initialise an object

[myClass retain]; // add an owner (increase the reference counter)

MyClass *myClassReferenceCopy = myClass; // copy the reference


Now the same object has two references; myClass and myClassReferenceCopy. Furthermore, the object has a reference count of two indicating that it has two owners (or references). Conveniently, 'retain' returns a reference to the object, which makes it possible to perform the operations above in two statements instead of three.


MyClass *myClass = [[MyClass alloc] init]; // allocate and initialise an object

MyClass *myClassReferenceCopy = [myClass retain]; // retain the object and copy the reference


Balancing retain and release


A very important rule with reference counting is to balance the addition and removal of ownership, or in Cocoa terms, balancing the calls to retain and release. Balance means that for each call to 'retain' a call to 'release' is required. If this balance isn't maintained, the reference counter of some object will never reach zero and thus will never be deallocated. This effectively creates a memory leak!


Setter methods


One of the most common scenarios for calling 'retain' on a reference is in a "setter" method of a class, that is, when a user of an object wants to assign a value to one of the objects member variables. For example, say that we have two classes A and B, where objects of type A contains a reference to an object of type B.


(ClassA.h)

@interface ClassA : NSObject {

ClassB *classB;

}


-(void)setClassB:(ClassB *)reference;


Here we see that there is a member variable called 'classB' which is a reference to an object of type ClassB. If an application allocates and initialises one instance of ClassA and one instance of ClassB and then call 'setClassB' on A, the reference to the ClassB object will be "given to" the ClassA object.


(MyApplication.m)

ClassA *classA = [[ClassA alloc] init];

ClassB *classB = [[ClassB alloc] init];

[classA setClassB:classB];


A naive implementation of a setter method which simply assigns the value of the supplied parameter to the member variable


(ClassA.m)

-(void)setClassB:(ClassB *)reference {

classB = reference; // store the new reference in a member variable

}


does indeed work in simple situations, but relies on the application not to release the reference it supplied as a paramter to the setter method. If it does, the ClassB object will be deallocated and the ClassA object will have a reference to a deallocated object, which is a catastrophe!


The ClassA object needs to be sure that the ClassB object it is using doesn't get deallocated and to guarantee this it needs to add itself as an owner.


A somewhat less naive implementation is to retain the reference passed as an argument


(ClassA.m)

-(void)setClassB:(ClassB *)reference {

[reference retain]; // retain the new reference

classB = reference; // store the new reference in a member variable

}


So, what is wrong with this implemenation? Well, what happens if setClassB is called more than once? It will lead to a situation where the 'retain' calls aren't balanced by 'release' calls. To sort this out, we have to ensure the setClassB method maintains the balance.


A quite good implementation which maintains the balance between retain and release


(ClassA.m)

-(void)setClassB:(ClassB *)reference {

[classB release]; // release the old reference

[reference retain]; // retain the new reference

classB = reference; // store the new reference in a member variable

}


So, why is this implementaion just "quite good"? What else needs to be done but balancing the 'retain' and 'release' calls? Well, there is a fringe case that isn't handled by that implementation. A case that can be triggered by calling the setter multiple times with the same reference! Doing this could lead to a situation where the object referenced by the 'reference' parameter is deallocated before 'retain' is called on it in the setter.


(MyApplication.m)

ClassA *classA = [[ClassA alloc] init]; // classA reference count == 1

ClassB *classB = [[ClassB alloc] init]; // classB reference count == 1

[classA setClassB:classB];

[classB release]; // release the old reference (nil; nothing happens)

[reference retain]; // retain classB; reference count == 2

classB = reference; // store classB in a member variable

[classA setClassB:classB];

[classB release]; // release the old reference (classB; reference count == 1)

[reference retain]; // retain classB; classB reference count == 2

classB = reference; // store classB in a member variable



Hmm, looking at the 'release' calls in 'setClassB' we see that the reference count temporarily reaches 1 before going back to 2 again, but we never see it reaching zero and then be deallocated.


What saved us this time is that the application still is an owner of the object, and that's why we only reached 1 instead of 0. But doesn't the application has to own the object in order to supply a reference to it as a parameter to 'setClassB'? Not necessarily, all it needs is the reference to the object - it does not have to own it. To protect us from this fringe case we need to reorder the operations in the setter method a bit so that 'retain' is called before 'release'. This will temporarily increase the reference count instead of temporarily decrease it when the setter is called with the same reference multiple times. Temporarily increasing the reference counter is risk free since nothing happens behind the scenes when you do it, in contrast to what can happen when you temporarily decrease.


A perfect implementation of a setter method


(ClassA.m)

-(void)setClassB:(ClassB *)reference {

[reference retain]; // retain the new reference

[classB release]; // release the old reference

classB = reference; // store the new reference in a member variable

}


Getter methods


Getter methods for member variables that are object references can usually be made quite simple since it's customary to let the one calling the getter be responsible for managing the reference counting (calling 'retain' if it wants to keep it around and later, when it is done with it, calling 'release' on it).


(ClassA.m)

-(ClassB *)getClassB {

return classB; // return the reference without calling 'retain' on it

}


(MyApplication.m)

ClassB *classB = [classA getClassB];


[classB retain];

...

[classB release];

Cleaning up


As we mentioned earlier, if a class has member variables that are references to other objects, objects of that class need to "clean up" before being deallocated. We also mentioned that cleaning up often boils down to releasing references that had previously been retained.


In Cocoa, the cleaning up method 'dealloc' gets called on an object when it is time to deallocating the object, so if you have any references retained that's the place to do your class specific cleaning up.


A broken implementation of a dealloc method


(ClassA.m)

-(void)dealloc {

[classB release]; // release the reference that was retained in 'setClassB'

}


Why did we call this implementation broken? What more could possibly be required from us than that we clean up our own mess? Don't worry, there's no more boring household chores for you to do, but you do have to tell your superclass that it should clean up its mess ;)


A perfect implementation of a dealloc method


(ClassA.m)

-(void)dealloc {

[classB release]; // release the reference that was retained in 'setClassB'

[super dealloc]; // tell your superclass that it is time to clean up

}


An alternative implementation of a dealloc method


(ClassA.m)

-(void)dealloc {

[setClassB nil]; // "unset" the reference by setting it to nil

[super dealloc]; // tell your superclass that it is time to clean up

}


By calling the setter method instead of doing an explicit call to 'release' we do all the memory management of the reference in the setter method. Since memory management is complex it's a good idea to do as little as possible of it.


Summary


This tutorial is very long, but if you understand all of it you are well equipped to understand the more complicated parts of memory management - in general and in Cocoa. Basic understanding of dynamic memory management and pointers is important for most types of software development, even though some languages claim that you don't have to worry about it. The time spent on understanding it is therefore usually a good investment for the future.


If there is one thing you should remember from this tutorial it is: free what you allocated, or in Cocoa terms, release what you retained (or allocated).


The title of this post indicates that there will be several parts of this tutorial and those will probably deal with stuff like autorelease pools and deep and shallow copying in Cocoa, and probably some words on how to do memory management when you're dealing with objects created from NIB files.


As always, feel free to suggest improvements and point out errors!