Tag Archives: C++

Book Review: 21st Century C

I recently got back into C programming as part of my day job, and I decided to pick up a few books to refresh my memory. 21st Century C by Ben Klemens wasn’t quite what I was looking for, but it caught my eye due to the unusual title, if nothing else. People assume that C is an ossified language from an earlier time, but the author’s bold contention is that much has changed in the last decade or so (primarily with the 1999 ISO standard update) and that there is plenty new to learn. It’s an intriguing idea, but does it hold up in practice?

Klemens is certainly right that most people haven’t kept up to date with the changes to the C standards. However, there’s a very good reason for that: most people who are using C professionally are doing it in environments where there isn’t a choice. The only new projects for which C is an appealing choice are cases where the code needs to compile on legacy platforms. Perhaps some projects that are using C for historical reasons could update, but there’s usually a lot of resistance to changing idioms while maintaining legacy code. Perhaps more awareness of what people are missing out on could swing more projects in favour of using C99 (or even the updated 2011 standard), but I’m not holding my breath.

So if we put aside the idea that C is a much-improved language due for a renaissance, how does the book stack up? Overall I’d say not that bad, although it doesn’t seem all that clear on who its target audience is. People who use C on a daily basis will find something of value in the big picture, but too much time is spent re-treading familiar ground. People who don’t know C but want to learn (if there are any) will find it lacking in introductory material. Though it covers new standards, it’s definitely not a book for language lawyers. Perhaps it will work best for someone like me, who used to use C but hasn’t picked it up for a while, although even there the usefulness is patchy.

The book is framed using a device that is all too common in technical books, and that I hate: the author takes a preferred hobby of theirs (in this case, punk rock) and proceeds to apply aspects of it to the material presented (often just as a way of setting the scene for chapters, but sometimes as direct analogies to technical concepts). This rarely works all that well. For a start, authors tend to choose things that they care about (and that they want the world to know they care about) rather than things that make for good exposition. Secondly, it tends to strike a discordant note with the rest of the content of the book. A good technical book can have personality, but rarely is it good because of the personality it presents. In this case the references to music don’t add anything of value.

The first half of the book is spent on material that has nothing to do with C as such: how to install libraries, how to use version control, and how to package code using Autotools. This works better than you might expect, but a lot of it feels like filler. Once again, the author seems vague about his target audience: some attention is paid to options for compiling code on Windows, but not enough for anyone to fully adapt the techniques shown. It would be better to abandon Windows and declare this a UNIX book, or to give comprehensive guidance for Windows best practices.

The second half settles down into the material where the book has something unique to offer: best practices for writing C, with particular reference to the 1999 and 2011 revisions to the standard. Even this part of the book I found to be patchy. There are some sensible idioms making use of variadic macros, anonymous structs and the _Generic keyword that have some potential. On the other hand, the author’s attempts to make C look like Python seem to be missing the point, and his descriptions betray confusion on one or two technical points.

As one example, the author provides the following code for freeing several pointers in one line:

#include <stdlib.h>

#define Fn_apply(type, fn, ...) {                                      \
    void *stopper_for_apply = (int[]){0};                              \
    type **list_for_apply = (type*[]){__VA_ARGS__, stopper_for_apply}; \
    for (int i=0; list_for_apply[i] != stopper_for_apply; i++)         \
         fn(list_for_apply[i]);                                        \
}

#define Free_all(...) Fn_apply(void, free, __VA_ARGS__);

int main(){
    double *x= malloc(10);
    double *y= malloc(100);
    double *z= malloc(1000);

    Free_all(x, y, z);
}

The only advantage this gives is that you save yourself a couple of lines when you have a list of pointers to be freed all in one place. However, it has a significant drawback: NULL is used to terminate the variable-length list, so if any of the pointers in the list is NULL you will leak memory. This is an important change to the semantics of free() (which is deliberately defined such that free(NULL) is a no-op) and turns a safe function into a dangerous one. This kind of C-Python hybrid is cute but doesn’t solve any of the problems of C in the real world.

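Edit: The paragraph above is completely wrong, as Ben points out in the comments. The list is terminated by stopper_for_apply, the address of a dedicated compound literal, not by NULL, so a NULL pointer in the list is simply passed to free() like any other. Mea culpa.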

Probably the best way to look at a book like this is to treat it like Alexandrescu’s Modern C++ Design (which is admittedly a far better book): as a source of inspiration and ideas, without applying any of the techniques directly. It’s great to keep abreast of the changes in the world of C, even if few people can use that knowledge just now. And although this book has its weak points and a lot of padding, it might be the only game in town.

Compile-time type identification

Yesterday I was working on a class to produce natural-language translations of complex C++ types, and I said that there was no general way to map a fundamental type to a name. Of course, there is:

#include <string>
#include <typeinfo>  // required before typeid can be used

using std::string;

template<typename T>
class TypeDecryptor {
public:
    static string getName() {
        return typeid(T).name();
    }
};

Of course, not quite everything in the garden is lovely. This provides an implementation-defined string, which as far as I can see is not even required to be unique for a given type. In point of fact, it’s extremely unlikely an implementation will return a non-unique string since each type has to be mapped to a string for linkage purposes anyway. The problem is that the string isn’t necessarily human-readable.

I think that on reasonable implementations there should be a function available to demangle this string back into a human-readable type name, which means you can potentially make a completely general type decryptor.
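On GCC and Clang, which follow the Itanium C++ ABI, such a function does exist: abi::__cxa_demangle, declared in <cxxabi.h>. What follows is only a minimal sketch under that assumption. The helper names demangle and readableTypeName are made up for illustration, and on any other compiler the sketch just falls back to whatever name() returns.

```cpp
#include <cstdlib>
#include <string>
#include <typeinfo>

#if defined(__GNUC__)
#include <cxxabi.h>  // abi::__cxa_demangle (GCC and Clang only)
#endif

// Hypothetical helper: turn the result of type_info::name() into readable text.
inline std::string demangle(const char* mangled) {
#if defined(__GNUC__)
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with
    // malloc, so it has to be released with std::free.
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status == 0 && readable != nullptr) {
        std::string result(readable);
        std::free(readable);
        return result;
    }
    std::free(readable);  // harmless when readable is null
#endif
    return mangled;  // fallback: return the implementation-defined string as-is
}

// Hypothetical wrapper: readable name for a type known at compile time.
template <typename T>
std::string readableTypeName() {
    return demangle(typeid(T).name());
}
```

One limitation worth knowing: typeid ignores top-level const and references, so readableTypeName<const int>() and readableTypeName<int&>() both describe plain int. The exact spelling of compound types is also up to the demangler.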

C++ type-declaration decoder

Unfortunately, since I wrote this article the markup for the code samples got corrupted during a backup and restore cycle. I’ve put some of it back in place from memory, but I need to test it properly. For now, treat this as a sketch of what the solution might look like.

Despite the somewhat self-aggrandising title, I have to admit that Expert C Programming: Deep C Secrets by Peter van der Linden is the single most beneficial programming book I’ve ever read. I believe the point at which I came across it marked the first step on a road from being an amateurish hacker who was happy with anything as long as it compiled, to being a software professional. Of course, by this point I’d already long been paid as if I were a software professional, simply because I have a degree in mathematics from a high-ranking university. Such is the way of things.

Anyway, the most useful section of Deep C Secrets is the one that gives a simple algorithm for understanding a complicated C type declaration. You know the kind of thing:

void (*signal(int sig, void (*func) (int) ) ) (int) ;

The algorithm, by the way, is given in a section that has the delicious title “The Piece of Code that Understandeth all Parsing.” I’d forgotten how funny that book is.
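To see what the signal declaration is saying without working through it by hand, one option is to rebuild the same type from named pieces and let the compiler confirm that both spellings agree. This is only a sketch, assuming C++11. The function is renamed my_signal to avoid clashing with the real signal, and the aliases Handler and SignalFn are made up for illustration.

```cpp
#include <type_traits>

// The declaration from the text, renamed so it cannot clash with <csignal>.
// It is only declared, never defined or called.
void (*my_signal(int sig, void (*func)(int)))(int);

// The same type, built up one step at a time.
using Handler  = void (*)(int);          // pointer to function(int) returning void
using SignalFn = Handler(int, Handler);  // function(int, Handler) returning Handler

// Fails to compile if the two spellings name different types.
static_assert(std::is_same<decltype(my_signal), SignalFn>::value,
              "my_signal should have the type SignalFn");
```

Read aloud, the declaration is "a function taking an int and a handler, returning a handler", where a handler is a pointer to a function taking int and returning void.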

The main problem with the algorithm is that it’s a run-time operation that has to take the type declaration as a string. Parsing it involves a whole bunch of logic that already exists in the compiler but has to be re-implemented. When I was stumbling through some declarations in C++ Templates: The Complete Guide it dawned on me that maybe C++ can do better.

Understanding any type declaration can be broken down into two parts:

  • Knowing where to start
  • Knowing which piece to process next

The first of these is often rendered difficult by the fact that there are several identifiers in a typedef, and you have to know which is the one being defined (because this is where you start to parse the type). In an anonymous type, there may be no identifiers at all:

doStuff( static_cast< int (*) ()>( foobar ) );

In this case you start with the *, but it’s not easy to see how to know this in general.

The second difficulty (knowing which piece of the type to handle next) is complicated by the fact that you may need to proceed left-to-right or right-to-left, which depends on precedence rules that most people understand only implicitly, and often only by instinct.

Using the C++ rules for template argument deduction, we can ignore most of these issues. The core idea is to declare a template that takes a single compound type and expresses the type in terms of one or more simpler components. For example, we can write a class that deals with a pointer:

template<typename T>
class TypeDecryptor<T*> {
public:
	static string getName() {
		ostringstream output;
		output << "pointer to "
		       << TypeDecryptor<T>::getName();
		return output.str();
	}
};

What this says is that the getName() method on a TypeDecryptor applied to a pointer type will return the string “pointer to” followed by whatever the type decryptor tells us the pointed-to type should be called. We can do something very similar for const:

template<typename T>
class TypeDecryptor<const T> {
public:
	static string getName() {
		ostringstream output;
		output << "const " << TypeDecryptor<T>::getName();
		return output.str();
	}
};

We’ve already got something useful, because it can deal with all that int const * const * stuff that people sometimes have problems with. OK, so you also need a TypeDecryptor specialisation for each of the fundamental types, which just prints out the type name:

template<>
class TypeDecryptor<int> {
public:
	static string getName() {
		return "int";
	}
};

Annoyingly, I can’t find any way to generalise this, so you need an explicit specialisation for any fundamental or user-defined class type you want to support. It could be streamlined with a macro, of course.
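The macro route might look something like the sketch below. DECLARE_TYPE_NAME is a hypothetical name, Person is a stand-in class, and leaving the primary template undefined is an assumption made here so that unsupported types fail at compile time.

```cpp
#include <string>

// Primary template: declared but not defined, so asking for an
// unsupported type is a compile error rather than a silent blank.
template <typename T>
class TypeDecryptor;

// Hypothetical macro: stamps out one explicit specialisation whose
// getName() returns the type's spelling, produced by the # operator.
// It must be used at namespace scope.
#define DECLARE_TYPE_NAME(Type)                          \
    template <>                                          \
    class TypeDecryptor<Type> {                          \
    public:                                              \
        static std::string getName() { return #Type; }   \
    }

struct Person {};  // stand-in user-defined class for illustration

DECLARE_TYPE_NAME(int);
DECLARE_TYPE_NAME(char);
DECLARE_TYPE_NAME(unsigned long);
DECLARE_TYPE_NAME(Person);
```

A type whose spelling contains a comma, such as a template with two arguments, cannot be passed to this macro directly and would need a typedef first.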

So at this point, our decryptor can do things like this:

cout << TypeDecryptor<const int * const *>::getName() << endl;
// Outputs "pointer to const pointer to const int"

cout << TypeDecryptor<int * const * const>::getName() << endl;
// Outputs "const pointer to const pointer to int"
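Because the fragments in this article were restored from memory, it may help to see the pointer, const and int pieces assembled into one translation unit that a compiler can check. This is a sketch, not the original code. It swaps the ostringstream for plain string concatenation and assumes the primary template is left undefined.

```cpp
#include <string>

template <typename T>
class TypeDecryptor;  // primary template, deliberately left undefined

// Leaf case: a fundamental type simply names itself.
template <>
class TypeDecryptor<int> {
public:
    static std::string getName() { return "int"; }
};

// T* is matched only by pointer types without top-level const.
template <typename T>
class TypeDecryptor<T*> {
public:
    static std::string getName() {
        return "pointer to " + TypeDecryptor<T>::getName();
    }
};

// const T is matched only when the const is at the top level,
// so the two partial specialisations never compete for the same type.
template <typename T>
class TypeDecryptor<const T> {
public:
    static std::string getName() {
        return "const " + TypeDecryptor<T>::getName();
    }
};
```

Each specialisation peels off exactly one layer and recurses, so the order of the words in the output follows the structure of the type rather than the order of the tokens in the declaration.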

But this is just getting started. Similar things can be done with function pointers, arrays, references, pointers-to-member, etc. One of the more complex cases is:

template<typename R, typename S, typename T>
class TypeDecryptor<T (R::*)(S)> {
public:
	static string getName() {
		ostringstream output;
		output << "pointer to a member function (on type "
		       << TypeDecryptor<R>::getName()
		       << "), taking one argument of type "
		       << TypeDecryptor<S>::getName()
		       << " and returning "
		       << TypeDecryptor<T>::getName();
		return output.str();
	}
};

This allows us to deal with pointers-to-member-function with one argument. Annoyingly, you need an explicit specialisation for every different number of function arguments there can be (one for zero-argument functions, one for one-argument functions, etc…) and also a different specialisation for pointers to data member from pointers to member functions. This means the number of explicit specialisations gets quite large quite quickly.
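With a compiler that supports C++11 variadic templates (an assumption, since the code in this article does not use them), the per-arity explosion for member functions can probably be avoided. The sketch below uses one partial specialisation for any number of arguments. ArgNames is a hypothetical helper, and the output wording is adjusted to list the arguments in parentheses.

```cpp
#include <string>

template <typename T>
class TypeDecryptor;  // primary template, deliberately left undefined

// Leaf types needed by this sketch.
template <> class TypeDecryptor<int> {
public: static std::string getName() { return "int"; }
};
template <> class TypeDecryptor<char> {
public: static std::string getName() { return "char"; }
};
template <> class TypeDecryptor<double> {
public: static std::string getName() { return "double"; }
};

struct Person {};  // stand-in user-defined class for illustration
template <> class TypeDecryptor<Person> {
public: static std::string getName() { return "Person"; }
};

// Hypothetical helper: comma-separated names of a parameter pack.
// The primary template is only ever selected for the empty pack.
template <typename... Args>
struct ArgNames {
    static std::string get() { return ""; }
};

template <typename First, typename... Rest>
struct ArgNames<First, Rest...> {
    static std::string get() {
        std::string names = TypeDecryptor<First>::getName();
        if (sizeof...(Rest) > 0)
            names += ", " + ArgNames<Rest...>::get();
        return names;
    }
};

// One specialisation for member functions of any arity.
template <typename C, typename R, typename... Args>
class TypeDecryptor<R (C::*)(Args...)> {
public:
    static std::string getName() {
        return "pointer to a member function (on type "
             + TypeDecryptor<C>::getName()
             + "), taking (" + ArgNames<Args...>::get()
             + ") and returning " + TypeDecryptor<R>::getName();
    }
};
```

Const-qualified member functions and pointers to data members are different types, so they would still need their own specialisations.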

On the plus side, you don’t need to know anything about type precedence rules to write this code, nor have to make any decisions about where in the type declaration to start processing. The C++ compiler does all the hard work. Without too much effort I was able to get something that could parse:

char (Person::*)(int (&)[42])

into

pointer to a member function (on type Person), taking one argument of type reference to array (of size 42) of instances of int and returning char
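For reference, here is a sketch of the specialisations that this last example seems to need: one for references, one for arrays of known size, and the one-argument member function case. It is a reconstruction under assumptions, not the original code. Person is a stand-in class, the primary template is left undefined, and the phrases are chosen to match the output shown above.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

template <typename T>
class TypeDecryptor;  // primary template, deliberately left undefined

// Leaf types needed by this sketch.
template <> class TypeDecryptor<int> {
public: static std::string getName() { return "int"; }
};
template <> class TypeDecryptor<char> {
public: static std::string getName() { return "char"; }
};

struct Person {};  // stand-in user-defined class for illustration
template <> class TypeDecryptor<Person> {
public: static std::string getName() { return "Person"; }
};

// Lvalue references.
template <typename T>
class TypeDecryptor<T&> {
public:
    static std::string getName() {
        return "reference to " + TypeDecryptor<T>::getName();
    }
};

// Arrays of known size; the bound is deduced as a non-type parameter.
template <typename T, std::size_t N>
class TypeDecryptor<T[N]> {
public:
    static std::string getName() {
        std::ostringstream output;
        output << "array (of size " << N << ") of instances of "
               << TypeDecryptor<T>::getName();
        return output.str();
    }
};

// Pointer to member function taking exactly one argument.
template <typename R, typename S, typename T>
class TypeDecryptor<T (R::*)(S)> {
public:
    static std::string getName() {
        return "pointer to a member function (on type "
             + TypeDecryptor<R>::getName()
             + "), taking one argument of type "
             + TypeDecryptor<S>::getName()
             + " and returning " + TypeDecryptor<T>::getName();
    }
};
```

The argument type int (&)[42] is first matched by the reference specialisation, which leaves int[42] for the array specialisation, which in turn leaves int for the leaf case.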