Writing code that manages memory effectively is hard (and it also takes longer).
But it isn't just that the implementation is harder. You probably think of memory leaks when you hear "managing memory is hard", but it's more than that.
Designing complicated software that effectively manages memory such that the program is maintainable is very hard.
C++ has made a lot of great strides in alleviating some of this, the newest standard in particular. Shared pointers, an explicit multithreading memory model, and a number of other things being added to the core language help a lot in this regard. But Asterisk is written in C, so we don't get that stuff. We have to do it ourselves.
So, consider, for example, a multi-threaded program where two threads have to manipulate an instance of struct foo.
foo.h:
struct foo {
    int bar;
};
struct foo *create_foo(void);
void destroy_foo(struct foo *obj);
Say we have a thread running in foo.c that has to interact with an instance of foo, and a thread running in bar.c that also interacts with the same instance of foo. The classic design question is: who owns foo? Does foo.c? Or does bar.c?
The easiest answer is to say "the thread running in foo.c will own the instance of foo. bar can interact with the instance, but it has to get it from foo.c."
So how do we then synchronize an instance of foo?
Well, one potential solution would be to put a mutex directly in the foo instance:
struct foo {
    pthread_mutex_t mutex;
    int bar;
};
But then you have to remember to lock the foo every time you want to use it. Plus, nothing prevents bar from just freeing the instance of foo whenever it wants, leaving foo.c with nothing but a pointer to some memory it doesn't own! What's worse, it doesn't know that it doesn't own it - and since the mutex lived inside foo, it went away with the instance, so there's nothing left to synchronize on! As a single plus, both foo.c and bar.c get full access to the instance, so there's nothing additional you have to write and maintain in foo.h.
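To make the "remember to lock every time" cost concrete, here is a minimal sketch of the mutex-in-the-struct approach using POSIX threads. It assumes the mutex is stored by value in struct foo; foo_add, foo_get, worker and run_two_threads are hypothetical names invented for this illustration, not anything from Asterisk.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct foo {
    pthread_mutex_t mutex; /* protects bar */
    int bar;
};

struct foo *create_foo(void)
{
    struct foo *obj = calloc(1, sizeof(*obj));

    if (!obj) {
        return NULL;
    }
    if (pthread_mutex_init(&obj->mutex, NULL) != 0) {
        free(obj);
        return NULL;
    }
    return obj;
}

void destroy_foo(struct foo *obj)
{
    if (!obj) {
        return;
    }
    /* Nothing here stops another thread from still using obj. */
    pthread_mutex_destroy(&obj->mutex);
    free(obj);
}

/* Hypothetical helper: every access to bar has to take the lock. */
void foo_add(struct foo *obj, int delta)
{
    assert(obj != NULL);
    pthread_mutex_lock(&obj->mutex);
    obj->bar += delta;
    pthread_mutex_unlock(&obj->mutex);
}

/* Hypothetical helper: reads need the lock as well. */
int foo_get(struct foo *obj)
{
    int value;

    assert(obj != NULL);
    pthread_mutex_lock(&obj->mutex);
    value = obj->bar;
    pthread_mutex_unlock(&obj->mutex);
    return value;
}

/* Stands in for the threads running in foo.c and bar.c. */
static void *worker(void *data)
{
    struct foo *obj = data;

    for (int i = 0; i < 1000; i++) {
        foo_add(obj, 1);
    }
    return NULL;
}

/* Runs both threads against one foo; returns the final bar, or -1 on error. */
int run_two_threads(void)
{
    pthread_t a, b;
    int result;
    struct foo *obj = create_foo();

    if (!obj) {
        return -1;
    }
    if (pthread_create(&a, NULL, worker, obj) != 0) {
        destroy_foo(obj);
        return -1;
    }
    if (pthread_create(&b, NULL, worker, obj) != 0) {
        pthread_join(a, NULL);
        destroy_foo(obj);
        return -1;
    }
    /* Destruction is only safe here because both threads were joined first. */
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    result = foo_get(obj);
    destroy_foo(obj);
    return result;
}
```

Note that the sketch only stays correct because run_two_threads joins both workers before calling destroy_foo. Nothing in the type enforces that ordering, which is exactly the ownership problem described above.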
If we want to get around the destruction problem, we need some form of reference counting. When bar is done with its foo instance, it can decrement the ref count - if the ref count hits 0, the object gets nuked. Same thing with foo - it decrements the ref count when it's done with its instance. This solves the ownership problem, but it does introduce a number of complexities:
- Ref counting leaks - bar bumps the ref count on its instance, but never decrements it. The instance never goes away. (More on this in the next blog post!)
- Circular references: struct foo has a reference to a ref counted object, and that ref counted object has a reference back to foo. Neither object will ever go away.
- Synchronization can still be tricky - you can't decrement the ref count until you unlock the object, and once you unlock the object someone else can decrement the ref count as well. Your ref counting library has to be pretty robust to survive those kinds of scenarios.
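As a rough illustration of the idea (and not Asterisk's own reference counting code), a bare-bones ref count can be sketched with C11 atomics. The names foo_alloc, foo_ref and foo_unref are hypothetical, and the sketch assumes a caller only bumps the count while it already holds a valid reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct foo {
    atomic_int refcount; /* number of outstanding references */
    int bar;
};

/* Allocate a foo; the caller owns the single initial reference. */
struct foo *foo_alloc(void)
{
    struct foo *obj = calloc(1, sizeof(*obj));

    if (!obj) {
        return NULL;
    }
    atomic_init(&obj->refcount, 1);
    return obj;
}

/* Take an extra reference. The caller must already hold one. */
void foo_ref(struct foo *obj)
{
    atomic_fetch_add(&obj->refcount, 1);
}

/*
 * Drop a reference. Whoever drops the last one frees the object.
 * Returns 1 if the object was destroyed, 0 otherwise.
 */
int foo_unref(struct foo *obj)
{
    int old = atomic_fetch_sub(&obj->refcount, 1);

    assert(old > 0); /* catches dropping more references than were taken */
    if (old == 1) {
        free(obj);
        return 1;
    }
    return 0;
}
```

A ref counting leak in this sketch is simply a foo_ref with no matching foo_unref: the count never reaches zero and free is never called. The sketch also does nothing about circular references or about combining the count with a per-object lock, which is where a real library has to do most of its work.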
Well, we could be restrictive and only give a handle to the instance of foo back to bar. Our foo header file would end up looking something like:
struct foo_handle;
struct foo_handle *get_foo(void);
void destroy_foo_handle(struct foo_handle *obj);
void do_operation(struct foo_handle *obj);
All of the details of foo won't be exposed in the header - instead, struct foo will just live inside foo.c. This "opaquification" of foo (in this case, by providing an opaque object foo_handle to reference the actual foo object) is a pretty good solution, but it does mean that all of the operations on foo have to be done through its handle, which could mean a lot of operations defined in the header. Still, the handle approach is probably better, and safer - it's much clearer who owns the object, and you can define an explicit contract between foo.c and bar.c as to what operations are safe to do on an instance of foo through its handle foo_handle.
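One way the foo.c side of such a handle could look is sketched below. This is only an illustration under several assumptions: foo.c owns a single static foo, do_operation is taken to mean "increment bar", and foo_handle_get_bar is a hypothetical accessor added to show a read through the contract.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* ---- What foo.h would expose: no struct layout, only operations. ---- */
struct foo_handle;
struct foo_handle *get_foo(void);
void destroy_foo_handle(struct foo_handle *obj);
void do_operation(struct foo_handle *obj);
int foo_handle_get_bar(struct foo_handle *obj); /* hypothetical accessor */

/* ---- Everything below would be private to foo.c. ---- */
struct foo {
    pthread_mutex_t mutex; /* protects bar */
    int bar;
};

struct foo_handle {
    struct foo *target; /* the real object, never handed out directly */
};

/* Assumption: foo.c owns exactly one foo for the life of the program. */
static struct foo the_foo = { .mutex = PTHREAD_MUTEX_INITIALIZER, .bar = 0 };

struct foo_handle *get_foo(void)
{
    struct foo_handle *handle = malloc(sizeof(*handle));

    if (!handle) {
        return NULL;
    }
    handle->target = &the_foo;
    return handle;
}

/* Frees only the handle; the foo itself stays owned by foo.c. */
void destroy_foo_handle(struct foo_handle *obj)
{
    free(obj);
}

/* Locking is done here, so callers cannot forget it. */
void do_operation(struct foo_handle *obj)
{
    assert(obj != NULL);
    pthread_mutex_lock(&obj->target->mutex);
    obj->target->bar++;
    pthread_mutex_unlock(&obj->target->mutex);
}

int foo_handle_get_bar(struct foo_handle *obj)
{
    int value;

    assert(obj != NULL);
    pthread_mutex_lock(&obj->target->mutex);
    value = obj->target->bar;
    pthread_mutex_unlock(&obj->target->mutex);
    return value;
}
```

Code in bar.c can only call the four functions from the header. It cannot free the foo, touch bar without the lock, or depend on the struct layout, so the locking rules live in one file instead of in every caller's memory.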
I think that's actually understated. Think of it this way: on a major project with tens of thousands of lines of code, can you really expect yourself to remember all of the semantics of an object years after you've written it? Can you really expect someone else to learn it and remember it? How do you prevent a new coder from introducing subtle race conditions or deadlocks without a contract that they have to explicitly violate (probably by trying to sneak a change to the contract by you in a code review)?
Ever wonder why there have been so many deadlocks in Asterisk?
Consider the channel object - that little object in Asterisk (how many fields are in that struct now? 50? 100?) that is almost always acted on by two threads: the pbx_thread servicing the channel in the dialplan, and the various thread(s) that provide the protocol for the channel technology. Those threads both operate on the channel; those threads both change the state of the channel; those threads can both cause the destruction of the channel.
This is a really long-winded way of getting around to what we did for the channel object in Asterisk: we opaquified it. It's now no longer accessible outside of channel_internal.c - not even channel.c can see it. Neither can the channel technology implementations. All of the operations on the channel are hidden behind specific calls that define the contract for that channel.
If you need to know a property of the channel: use one of the ast_channel_* calls. If you need to change the state of the channel: use one of the ast_channel_* calls. You get the idea.
Currently, that contract is huge: while we opaquified the channel object, we did not change the semantics of what you can do with that channel. All of the "dangerous" stuff is still there - you can still change the state willy-nilly; masquerades are still in; you can still cause deadlocks by holding the channel lock while queuing an indication on the channel. Good times. But the contract is in place, and it'll probably get tighter in the future.
This is really all a slow migration towards an easier-to-manage, easier-to-use channel object. Before reference counted objects, you had race conditions over who destroyed the channel, and the issue of "who owns this channel" was a real problem. Now, we're narrowing the scope even further, from "who owns this channel" to "what is allowed on this channel".
With any luck, in a few more major versions, people won't look at the channel object and ask themselves, "do I lock the private first or the channel? Do I have to bump the ref count on the channel before I unlock it and queue this frame? Do I even unlock it before queuing this frame?"
One can only hope!




