Vlad Lazarenko

... making all this up as I go along

Automatic Resource Management in Programming Languages

Many high-level programming languages let developers clean up resources automatically: closing an open file when it is no longer used, freeing dynamically allocated memory, and so on. This article touches on the automatic resource management features available in modern programming languages. It shows examples demonstrating how these features make code easier to write, and how not using them may lead to potentially critical program failures. The primary focus is on the C++ and C (surprise!) languages.

C#

There are a few ways to automatically clean up resources in C# — a finally block and a using statement.

A finally block is a bit lower level than a using statement. It guarantees that all statements inside the finally block are executed when the try block exits, even if an unexpected exception occurs. For example:

ResourceType resource = expression; // Allocate, initialize or acquire some resource.
try {
    // Do something with it.
    statement;
} finally {
    // Free, destroy or release a resource.
    // This is guaranteed to happen.
    ((IDisposable)resource).Dispose();
}

Another, more convenient way of doing the same is to employ a using statement, which guarantees that the Dispose() method is called automatically upon leaving the scope of the using block:

using (ResourceType resource = expression) {
    statement;
}

Java

Java also provides a finally block similar to that seen in C#:

FileReader reader = null;
try {
    reader = new FileReader("/dev/null");
    // Do something with a file...
} finally {
    if (reader != null)
        reader.close();
}

Starting with version 7, Java provides the AutoCloseable interface along with the try-with-resources statement, which are basically the same things as IDisposable and the using statement in C#:

try (FileReader reader = new FileReader("/dev/null"))
{
    // Do something with a file...
}

Other Languages

Since the concept of something being done automatically is quite popular, it is present in many other languages. Python has both a finally clause and a with statement. Ruby has an ensure clause, and methods such as File.open accept a block and close the resource when the block exits. Of course, this is also available in related languages such as Visual Basic, which is built on top of .NET, Groovy, which is built on Java, and so on.

C++ — The King of RAII

As we have seen, many languages provide different syntax to do essentially the same thing. It is a great concept indeed. It changed the way programmers write code and significantly improved productivity. And behind every invention that we cannot imagine our lives without, there is an author. So who is behind automatic resource management?

The real name of this concept is Resource Acquisition Is Initialization, or simply RAII. It was invented by Bjarne Stroustrup, the original author of the C++ programming language. And of course this concept first appeared in C++, way before C# or Java were created.

C++ is the king of RAII. Many concepts are built upon it. RAII is the reason why object destructors are called automatically. It is the reason why C++ developers don’t need to care much about freeing dynamically allocated memory or closing a file descriptor that is no longer used. It is why, when handling an error, it is enough to simply throw an exception or return an error code from a function, without writing tons of “cleanup” code or even thinking about the order in which resources must be cleaned up (usually the reverse of the order of allocation). In fact, RAII should be used whenever possible to avoid serious errors, some of which can even cause a denial of service. For example, consider the following code:

#include <mutex>
#include <vector>

// Data structure that is supposed to be accessed by multiple threads.
struct shared_resource {
    std::mutex       mutex;
    std::vector<int> data;
};

void foo(shared_resource & res)
{
    // Acquire an exclusive lock to protect access to
    // the object in multi-threaded environment.
    res.mutex.lock();

    // Modify the contents of the vector by adding 10 integers into it.
    for (int i = 0; i < 10; ++i)
        res.data.emplace_back(i);

    // Once done changing the object, release the lock so that other threads
    // can work with it. If we forget to unlock it, the process will basically
    // hang trying to lock the mutex again, rendering our program useless.
    res.mutex.unlock();
}

The above code is a classic example of an error that could lead to serious consequences. The problem is that anything could happen between the mutex.lock() and mutex.unlock() statements. Should data.emplace_back() throw an exception (std::bad_alloc, for example), mutex.unlock() is never executed and the mutex stays locked forever.

Taking good care of exceptions is a concept called exception safety. The above code is indeed not exception safe. However, this is not only about exceptions. For instance, if the emplace_back() method did not throw exceptions and used a return code to signal an error instead, the erroneous code could have looked like this:

int foo(shared_resource & res)
{
    res.mutex.lock();
    for (int i = 0; i < 10; ++i) {
        if (res.data.emplace_back(i) != 0) {
            return -1;
        }
    }
    res.mutex.unlock();
    return 0;
}

Exception safety is out of the equation, yet the code is buggy. This is why RAII should (almost) always be used. The correct code should look like this:

void foo(shared_resource & res)
{
    std::lock_guard<std::mutex> lock(res.mutex);
    for (int i = 0; i < 10; ++i)
        res.data.emplace_back(i);
}

In the above example, the std::lock_guard object locks the mutex in its constructor and unlocks it in its destructor. Thanks to RAII, the destructor is guaranteed to be called when the object goes out of scope, whether the function returns normally or an exception propagates out of it. C++ is full of such “guard” objects. Many of them are part of the standard library, but developers can always create their own. File streams work the same way:

#include <cctype>
#include <fstream>
#include <iostream>

int main()
{
    std::ifstream file("test.txt");
    char c;
    while ((file >> c))
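        // Cast to unsigned char: passing a negative char to std::isalpha is undefined.
        std::cout << (std::isalpha(static_cast<unsigned char>(c)) ? c : '*');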
}

In the above example, the file object automatically closes the file when it goes out of scope. In other words, C++ clearly wins the RAII battle against all other languages, since there is no need for any finally blocks, using statements or even Java’s new try() blocks.

What is interesting is that languages such as Java and Python did not even have “using”-like statements for a long time. There was only the try-catch-finally construct, and programmers had no choice but to write a lot of boilerplate code. Although some programmers consciously suffered from the lack of proper RAII support, many didn’t know that C++ not only supports RAII but also is the only language that does it properly. One day those guys wanted to write something in C++ and of course started to look for similar self-torture methods. Unable to find any, they asked Bjarne Stroustrup why C++ doesn’t provide a finally construct, and here is what he had to say:

Because C++ supports an alternative that is almost always better: The “resource acquisition is initialization” technique (TC++PL3 section 14.4). The basic idea is to represent a resource by a local object, so that the local object’s destructor will release the resource. That way, the programmer cannot forget to release the resource. … In a system, we need a “resource handle” class for each resource. However, we don’t have to have an “finally” clause for each acquisition of a resource. In realistic systems, there are far more resource acquisitions than kinds of resources, so the “resource acquisition is initialization” technique leads to less code than use of a “finally” construct.

Boy, was he right. Most of today’s languages are now trying to do the same.

RAII in C

Everyone knows that C neither supports exceptions nor has a concept of RAII. That is not entirely true, but for the most part it is. You will not find anything about those features in the C89, C99 or even C11 language standards. Therefore, C developers have to be careful, disciplined, and clean up after themselves. Oftentimes, especially in somewhat low-level code like device drivers, we run into code like this:

static int pci_probe(struct pci_dev *pci_dev,
                     const struct pci_device_id *dev_id)
{
    struct my_dev *dev;
    int r;

    dev = kzalloc(sizeof(struct my_dev), GFP_KERNEL);
    if (unlikely(!dev))
        goto on_err;
    pci_set_drvdata(pci_dev, dev);
    dev->pci_dev = pci_dev;
    r = pci_enable_device(pci_dev);
    if (unlikely(r))
        goto on_enable_err;
    pci_set_master(pci_dev);
    pci_try_set_mwi(pci_dev);
    dev->bar0 = ioremap_nocache(pci_resource_start(pci_dev, 0),
                                pci_resource_len(pci_dev, 0));
    if (unlikely(!dev->bar0))
        goto on_bar0_map_err;
    dev->bar2 = ioremap_nocache(pci_resource_start(pci_dev, 2),
                                pci_resource_len(pci_dev, 2));
    if (unlikely(!dev->bar2))
        goto on_bar2_map_err;
    r = a2gx_add_cdev(dev);
    if (r)
        goto on_cdev_err;
    return 0;

  on_cdev_err:
    iounmap(dev->bar2);
  on_bar2_map_err:
    iounmap(dev->bar0);
  on_bar0_map_err:
    pci_disable_device(pci_dev);
  on_enable_err:
    kfree(dev);
  on_err:
    return -1;
}

Many will find the above code difficult to read, hard to write, or both. Some would love to punch the author in the face for using goto. Others may not understand what is going on in there at all. But C is not just a programming language. C is a religion. For some, C code is a lot cleaner than anything else: nothing is hiding behind the scenes, and what you see is what you get. C code is usually thought through, because you have to think twice before you write it. In fact, some wouldn’t trade it for anything else. At any rate, C developers usually clean up after themselves.

On the other hand, the benefits of RAII are also obvious, and there are people who would love to see a concept of RAII in C. But it is not a part of the language specification. However, imagine a world where programmers did not use anything that is not standard. That world would be terrible. Just imagine for a second that C++ developers could not have written multi-threaded programs until the C++11 standard was ratified with all of its quirks and perks like std::thread, std::mutex, thread-local storage and so on.

Or how about not being able to specify a symbol’s visibility or use other attributes? Luckily, being a non-standard feature doesn’t restrain people from using it. And the more people use a feature, the better its chances of being standardized. That said, C++11 now allows for generalized yet compiler-specific attributes, threading, static assertions, and tons of other stuff that was available and used before but wasn’t a part of the standard. Well, now it is.

That being said, C also supports RAII. However, it is not part of the standard, at least not yet. Here is how it works: it is possible to specify a cleanup function for any automatic (function-scope) variable. In order to do that, the non-standard cleanup attribute must be specified. If you remember the example of a buggy non-exception-safe C++ function that uses a mutex, that code would look something like this in C:

#include <pthread.h>

/* struct int_vector is assumed to be defined elsewhere. */
struct shared_resource {
    pthread_mutex_t   mutex;
    struct int_vector data;
};

int int_vector_push_back(struct int_vector *data, int value);

int foo(struct shared_resource *res)
{
    int i;

    if (pthread_mutex_lock(&res->mutex) != 0)
        return -1; /* Oops, cannot lock the mutex. */
    for (i = 0; i < 10; ++i) {
        if (int_vector_push_back(&res->data, i) != 0) {
            /* Oops, cannot add data into the vector! */
            return -1;
        }
    }
    pthread_mutex_unlock(&res->mutex);
    return 0; /* All is good */
}

It is a little bit hard to imagine a C programmer writing code like that, though. The code would probably look more like this (which is also functionally correct this time):

int foo(struct shared_resource *res)
{
    int r;
    int i;

    r = pthread_mutex_lock(&res->mutex);
    if (r)
        goto out;
    for (i = 0; i < 10; ++i) {
        r = int_vector_push_back(&res->data, i);
        if (r)
            break;
    }
    pthread_mutex_unlock(&res->mutex);
  out:
    return r;
}

Below is a RAII-like version of the code where the mutex is guaranteed to be unlocked automatically, so it is possible to simply return from the function at any point without worrying about it. This code closely resembles the behavior of the C++ example using std::lock_guard: lock_guard_ctor mimics the std::lock_guard constructor that locks a mutex, and lock_guard_dtor is like the std::lock_guard destructor that unlocks the mutex (if it was locked) and is guaranteed to be called automatically:

pthread_mutex_t *lock_guard_ctor(pthread_mutex_t *mutex) {
    return pthread_mutex_lock(mutex) == 0 ? mutex : NULL;
}

void lock_guard_dtor(pthread_mutex_t **mutex_ptr) {
    pthread_mutex_t *mutex = *mutex_ptr;
    if (mutex != NULL)
        pthread_mutex_unlock(mutex);
}

int foo(struct shared_resource *res) {
    int i;
    pthread_mutex_t *lock_guard
        __attribute__((cleanup(lock_guard_dtor)))
        = lock_guard_ctor(&res->mutex);
    if (lock_guard == NULL)
        return -1;
    for (i = 0; i < 10; ++i) {
        if (int_vector_push_back(&res->data, i) != 0)
            return -1;
    }
    return 0;
}

The above code can also be simplified a bit in two simple steps. Step one is to define the lock_guard_ctor and lock_guard_dtor functions somewhere else and make them reusable, because nobody writes a custom constructor and destructor for a class every time they use it. Step two is to get rid of a lot of typing with a macro. For example, the common code that is written once could be:

#include <pthread.h>

extern pthread_mutex_t *lock_guard_ctor(pthread_mutex_t *mutex);
extern void lock_guard_dtor(pthread_mutex_t **mutex_ptr);

#define LOCK_GUARD(name, mutex)                   \
    pthread_mutex_t * name                        \
        __attribute__((cleanup(lock_guard_dtor))) \
        = lock_guard_ctor((mutex))

And the code using this feature becomes a lot simpler:

int foo(struct shared_resource *res)
{
    int i;
    LOCK_GUARD(lock, &res->mutex);
    if (!lock)
        return -1;
    for (i = 0; i < 10; ++i) {
        if (int_vector_push_back(&res->data, i) != 0)
            return -1;
    }
    return 0;
}
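
For readers who want to try this out, here is a minimal self-contained sketch of the same idea. The fixed-capacity int_vector, the fill() function and the mutex_is_free() helper are assumptions made up for this illustration (the article never shows int_vector), and the code relies on the GCC/Clang cleanup attribute:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical fixed-capacity vector standing in for the article's int_vector. */
struct int_vector {
    int    items[4];
    size_t size;
};

/* Returns 0 on success, -1 when the vector is full. */
static int int_vector_push_back(struct int_vector *v, int value)
{
    if (v->size == sizeof v->items / sizeof v->items[0])
        return -1;
    v->items[v->size++] = value;
    return 0;
}

struct shared_resource {
    pthread_mutex_t   mutex;
    struct int_vector data;
};

/* "Constructor": returns the mutex if it was locked, NULL otherwise. */
static pthread_mutex_t *lock_guard_ctor(pthread_mutex_t *mutex)
{
    return pthread_mutex_lock(mutex) == 0 ? mutex : NULL;
}

/* "Destructor": called automatically when the guard variable leaves scope. */
static void lock_guard_dtor(pthread_mutex_t **mutex_ptr)
{
    if (*mutex_ptr != NULL)
        pthread_mutex_unlock(*mutex_ptr);
}

#define LOCK_GUARD(name, mutex)                   \
    pthread_mutex_t *name                         \
        __attribute__((cleanup(lock_guard_dtor))) \
        = lock_guard_ctor((mutex))

/* Tries to append `count` integers; fails early once the vector is full. */
static int fill(struct shared_resource *res, int count)
{
    LOCK_GUARD(lock, &res->mutex);
    if (!lock)
        return -1;
    for (int i = 0; i < count; ++i) {
        if (int_vector_push_back(&res->data, i) != 0)
            return -1; /* Early return: the guard still unlocks the mutex. */
    }
    return 0;
}

/* Returns 1 if the mutex can be locked right now (and unlocks it again). */
static int mutex_is_free(pthread_mutex_t *mutex)
{
    if (pthread_mutex_trylock(mutex) != 0)
        return 0;
    pthread_mutex_unlock(mutex);
    return 1;
}
```

Calling fill(&res, 10) on an empty, initialized resource fails once the four-element vector is full, yet mutex_is_free(&res.mutex) afterwards reports that the mutex has been unlocked again.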

There are two possible ways for a compiler to implement the cleanup attribute. If exceptions are not enabled, it generates boilerplate code similar to what the programmer would otherwise write manually. If exceptions are enabled (-fexceptions in GCC), the compiler also generates code that executes the cleanup function during stack unwinding (yes, there are exceptions in C, too).
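
The observable effect of the first approach can be sketched with a small experiment. The trace buffer and the record(), release(), work() and run() names are hypothetical helpers invented for this illustration. Each handler appends its variable's tag to the trace, so the trace shows which handlers ran on each exit path and in what order:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical trace buffer that records the order of events. */
static char trace[16];

static void record(char tag)
{
    size_t n = strlen(trace);
    if (n + 1 < sizeof trace) {
        trace[n] = tag;
        trace[n + 1] = '\0';
    }
}

/* Cleanup handler: receives a pointer to the variable leaving scope. */
static void release(char *tag)
{
    record(*tag);
}

static int work(int fail_early)
{
    char a __attribute__((cleanup(release))) = 'a';
    if (fail_early)
        return -1;  /* Only `a` is in scope here, so only its handler runs. */
    char b __attribute__((cleanup(release))) = 'b';
    record('w');    /* The actual "work". */
    return 0;       /* The handler for `b` runs first, then the one for `a`. */
}

/* Clears the trace, runs work() and returns the resulting trace. */
static const char *run(int fail_early)
{
    trace[0] = '\0';
    work(fail_early);
    return trace;
}
```

With GCC and Clang, run(1) leaves "a" in the trace and run(0) leaves "wba": handlers run in reverse order of declaration on every exit path, just like the labels in a hand-written goto ladder.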

Note that the cleanup attribute cannot be used for global variables; the constructor and destructor function attributes should be used instead. The constructor attribute causes a function to be called automatically before execution enters main(). Similarly, the destructor attribute causes a function to be called automatically after main() completes or exit() is called. Functions with these attributes are useful for initializing data that is used implicitly during the execution of the program. It is also possible to control the order of execution manually by specifying an optional priority number.
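
A minimal sketch of the two attributes follows. The squares table and the squares_init(), squares_fini() and square_of() names are hypothetical; only the attribute syntax comes from GCC and Clang:

```c
/* Hypothetical lookup table used implicitly by the rest of the program. */
static int squares[8];
static int squares_ready;

/* Runs automatically before execution enters main(). */
__attribute__((constructor))
static void squares_init(void)
{
    for (int i = 0; i < 8; ++i)
        squares[i] = i * i;
    squares_ready = 1;
}

/* Runs automatically after main() returns or exit() is called. */
__attribute__((destructor))
static void squares_fini(void)
{
    squares_ready = 0;
}

/* Callers never initialize the table themselves; n must be in 0..7. */
static int square_of(int n)
{
    return squares[n];
}
```

A priority can be supplied as, for example, __attribute__((constructor(101))). Constructors with lower numbers run first, and GCC reserves priorities up to 100 for the implementation.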

The techniques described above are available in at least today’s leading C compilers, GCC and Clang. Remember, the future of the language is in the hands of its users. If you really like a feature and it is not standard, use it, spread the word, and send feedback to compiler developers and the language standard committee. It will definitely help the feature make it into the next standard revision.

Let the force be with you!