Archive for the ‘Software Development’ Category

C++ pointer template parameters are weird

Sunday, February 20th, 2011

I was browsing through the C++ standard the other day (as you do), and I was surprised to learn that the following are all legal as non-type template parameters:

  • integral or enumeration type
  • pointer to object or pointer to function
  • reference to object or reference to function
  • pointer to member

If you’ve ever done metaprogramming then you’ll certainly have used integral types as template parameters. And I’ve already come across pointers to member as template parameters in possibly my gnarliest typedef ever, which for the record was something like:

// C++11 alias template; in C++03, wrap the typedef in a struct instead
template< typename T >
using any_member =
    boost::variant< int T::*, std::string T::*, double T::*, bool T::* >;

You can use this type to refer to any (modulo ramming enough different types in the variant<> list) member on any type at compile time. I had this rigged up into rather a nice compile-time ORM system, where you could define a type and then declare a static instance of

std::map<string, any_member<T> >

that maps strings to members at compile time. You can then have generic code that pulls data out of a database and then assigns it to instances of the class. So much of it is done at compile time that it was actually pretty reliable code for something so hairy.
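
For the curious, here is a minimal sketch of that string-to-member map. It is not the original code: it assumes C++17 and substitutes std::variant for boost::variant, and Person, personColumns, parseInto, assignField and loadExamplePerson are all names invented for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <variant>

// Hypothetical record type standing in for a class loaded from a database.
struct Person {
    std::string name;
    int age = 0;
    double height = 0.0;
    bool active = false;
};

// Assumption: std::variant (C++17) replaces boost::variant here.
template< typename T >
using any_member = std::variant< int T::*, std::string T::*, double T::*, bool T::* >;

// Column name -> member of Person. The member pointers are compile-time
// constants; the map itself is filled in on first use.
inline const std::map< std::string, any_member<Person> > & personColumns() {
    static const std::map< std::string, any_member<Person> > columns = {
        { "name",   &Person::name   },
        { "age",    &Person::age    },
        { "height", &Person::height },
        { "active", &Person::active },
    };
    return columns;
}

// One overload per supported member type converts the database text.
inline void parseInto( int & out, const std::string & text )         { out = std::stoi( text ); }
inline void parseInto( double & out, const std::string & text )      { out = std::stod( text ); }
inline void parseInto( bool & out, const std::string & text )        { out = ( text == "true" ); }
inline void parseInto( std::string & out, const std::string & text ) { out = text; }

// Generic assignment: look the column up, then let the variant pick the overload.
template< typename T >
bool assignField( T & object, const std::map< std::string, any_member<T> > & columns,
                  const std::string & column, const std::string & text ) {
    auto it = columns.find( column );
    if ( it == columns.end() ) return false;   // unknown column
    std::visit( [&]( auto member ) { parseInto( object.*member, text ); }, it->second );
    return true;
}

// Usage sketch: fill a Person from (column, text) pairs, as a result row might supply them.
inline Person loadExamplePerson() {
    Person p;
    bool ok = assignField( p, personColumns(), "name", "Ada" )
           && assignField( p, personColumns(), "age", "36" );
    assert( ok );
    (void) ok;   // silences the unused-variable warning when NDEBUG is defined
    return p;
}
```

A mistyped member name in the table fails to compile, while an unknown column name is only caught at run time, which is roughly where the reliability came from.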

But I digress. The point is, pointers to members act in this context quite a lot like enumerated types: there’s only a finite (usually very small) number of valid values that is obviously determinable at compile time. Even integral types, in the contexts where they are used, are finite and typically small. Pointers to objects, on the other hand, usually don’t exist until run time and can take any one of an effectively infinite range of values at run time.
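
To make the pointer-to-member case concrete, here is a small sketch of a pointer to member used directly as a non-type template parameter. Point, readMember and exampleSum are hypothetical names chosen for the illustration.

```cpp
#include <cassert>

// Hypothetical type used only for illustration.
struct Point {
    int x;
    int y;
};

// The member to read is baked in at compile time: readMember<&Point::x>
// and readMember<&Point::y> are two separate instantiations.
template< int Point::* Member >
int readMember( const Point & p ) {
    return p.*Member;
}

inline int exampleSum() {
    Point p = { 3, 4 };
    assert( readMember< &Point::x >( p ) == 3 );
    return readMember< &Point::x >( p ) + readMember< &Point::y >( p );
}
```

Point has exactly two int members, so there are exactly two possible arguments for Member, which is the enumeration-like behaviour described above.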

So the declaration

#include <iostream>

template< int * Foo >
class Bar {
  public:
    // Print the int that Foo points to.
    void doStuff() {
        std::cout << *Foo << std::endl;
    }
};

is perfectly legal. But how can it be used? Remember, each specialisation of Bar is not just a different object but a wholly different class. On the face of it, this might seem to allow creation of types at run time, which is obviously impossible.

It turns out that they’ve thought of that. When you try to instantiate your template it all goes wrong:

int main( int argc, char ** argv ) {
    int * ptr = new int(42);
    Bar<ptr> myObj;
}

This is rejected by g++ with:

error: 'ptr' is not a valid template argument of type 'int*' because 'ptr' is a variable, not the address of a variable

Full marks to g++ for a good error message here. This immediately prevents the possibility that we can generate an unlimited number of types at run time, since we have only a finite number of variables defined in our program, fixed at compile time. But what about the address of a stack variable? Taking the address of the stack variable doesn’t return a value fixed at compile time, because it depends on how deep we are in the stack when we do it:

void confuseTheCompiler() {
    int someVar;
    Bar< &someVar > myObject;
}

Fortunately, they’ve thought of this as well:

error: '& someVar' is not a valid template argument of type 'int*' because 'someVar' does not have external linkage

This is made explicit, as you would expect, in the C++ standard (C++03 at the time of writing). It’s in 14.3.2, paragraph 1:

A template-argument for a non-type, non-template template-parameter shall be one of:

  • an integral constant-expression of integral or enumeration type; or
  • the name of a non-type template-parameter; or
  • the address of an object or function with external linkage, including function templates and function
    template-ids but excluding non-static class members, expressed as & id-expression where the & is
    optional if the name refers to a function or array, or if the corresponding template-parameter is a reference;
    or
  • a pointer to member expressed as described in 5.3.1.

So if we restrict ourselves to variables with external linkage then we’re back to the case where we have a fixed number of possible values at compile time, and everything in the garden is lovely.
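
A quick sketch of what that means in practice, assuming a C++11 compiler for static_assert and <type_traits>. The variables counterA and counterB, the value() member and exampleRead are hypothetical additions for the illustration.

```cpp
#include <cassert>
#include <type_traits>

template< int * Foo >
class Bar {
public:
    int value() const { return *Foo; }   // reads through the pointer fixed at compile time
};

// Hypothetical namespace-scope variables; both have external linkage.
int counterA = 1;
int counterB = 2;

static_assert(  std::is_same< Bar<&counterA>, Bar<&counterA> >::value,
               "same variable, same class" );
static_assert( !std::is_same< Bar<&counterA>, Bar<&counterB> >::value,
               "different variable, different class" );

// The address is fixed at compile time, but the value is still read at run time.
inline int exampleRead() {
    Bar<&counterA> a;
    counterA = 7;
    assert( a.value() == 7 );
    return a.value();
}
```

Each variable yields exactly one class, so the number of possible Bar types is bounded by the number of suitable variables in the program.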

I’m not sure if this would ever be useful. I came up with this toy example:

#include <iostream>
#include <string>
 
using namespace std;
 
template< int * foo >
class VariableNamer {
public:
	static string name;
};
 
extern int baz;
 
int baz = 42;
 
template<>
string VariableNamer< &baz >::name = "baz";
 
int main( int argc, char ** argv ) {
	cout << VariableNamer<&baz>::name << endl;
}

It enables you to associate one and only one name with a variable that has external linkage, at compile time. If you forget to assign a name for the variable you’ll get an error at build time (strictly at link time, because the static member is declared but never defined), which I suppose makes it better than having a global

std::map< int *, std::string> variable_names;

I’m sure with a little extra finesse it could be modified to assign a name to any type of variable, not just ints.

No, my password is not too weak

Sunday, September 12th, 2010

First of all, I must confess: despite dire warnings to the contrary, I use the same password for more than one web login. But I’m sure everyone does, simply because so many sites require logins that having a different password for each site will be less secure by virtue of needing to be written down or otherwise stored in plaintext. And really, the consequences of getting an account hacked (and the benefits to any prospective hacker) are so low that it isn’t worth bothering. So a hacker can tweak my Delicious bookmarks, or see my bit.ly statistics, or whatever. Big deal.

Even worse, my password for such low-value sites is a dictionary word. Admittedly it’s a dictionary word chosen at random from the Shorter OED, which gives a theoretical 500,000 possibilities. Even if we suppose I ignored overly short words and didn’t choose entirely at random, the odds of someone guessing it have to be one in 200,000 or longer.

This seems reasonable to me, but plenty of sites aren’t having it. As far as they are concerned, protecting me from the consequences of bad password choice is their problem, not mine, and the way to do that is to ensure that my password has at least one digit in it. Or at least four digits. Or maybe a non-printable character. Or at least 2 capital letters. They don’t seem to be able to make their minds up.

So I just do what everyone else does in these circumstances: I use the same password, and make minimal tweaks to satisfy the algorithm. I capitalise the first letter. Or I add a standard number on the end of it (hey hackers, I’ll tell you this for free: I always use the number 12).

Security is always a tradeoff with usability, and the tradeoff here is terrible. Users will always use one of a tiny number of tricks (appending 123, using l33t-sp3ak, etc.) that make minimal increase to the search space. Hackers aren’t greatly inconvenienced, but the user has to remember exactly what set of requirements they were trying to satisfy when they created their password. Here’s a hint to designers: if I find myself having to think hard about what combination of capitalisation and numbering I used when I signed up, and I don’t really like your site, then I just won’t bother logging in again.

But the really dumb thing is that this is defending against the wrong problem. Passwords aren’t typically compromised by brute-force attacks across a large search space; they’re compromised by being human-understandable and relevant to the person that created them (personal names, dates, etc.). Checking the password against common human names and date formats (e.g. 19xx) would be more effective.
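
As a rough sketch of what such a check could look like (the function names, the choice of 19xx/20xx as year patterns and the example name list are all my own assumptions, and a real list of names would be far longer):

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Lower-case copy, so that "Alice" and "alice" compare equal.
inline std::string toLower( std::string text ) {
    std::transform( text.begin(), text.end(), text.begin(),
                    []( unsigned char c ) { return static_cast<char>( std::tolower( c ) ); } );
    return text;
}

// True if the password contains something that looks like a year: 19xx or 20xx.
inline bool containsYear( const std::string & password ) {
    for ( std::size_t i = 0; i + 4 <= password.size(); ++i ) {
        bool century = password.compare( i, 2, "19" ) == 0
                    || password.compare( i, 2, "20" ) == 0;
        if ( century
             && std::isdigit( static_cast<unsigned char>( password[i + 2] ) )
             && std::isdigit( static_cast<unsigned char>( password[i + 3] ) ) ) {
            return true;
        }
    }
    return false;
}

// True if the password contains any of the supplied names, ignoring case.
inline bool containsName( const std::string & password,
                          const std::vector<std::string> & names ) {
    const std::string lowered = toLower( password );
    for ( const std::string & name : names ) {
        if ( !name.empty() && lowered.find( toLower( name ) ) != std::string::npos ) {
            return true;
        }
    }
    return false;
}

inline bool looksPersonal( const std::string & password,
                           const std::vector<std::string> & names ) {
    return containsYear( password ) || containsName( password, names );
}

inline void exampleChecks() {
    const std::vector<std::string> names = { "alice", "smith" };   // hypothetical list
    assert(  looksPersonal( "Alice12", names ) );
    assert(  looksPersonal( "rover1987", names ) );
    assert( !looksPersonal( "quixotic12", names ) );
}
```

It is about as much code as the two-digits regexp deserves, and it targets the passwords people actually choose.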

And if you’re worried that the web application you’re writing will be subject to brute-force attacks, then it’s easy to defend: limit the rate of queries to human speeds (no more than 5 incorrect passwords per IP address per day) and use CAPTCHAs. Detect suspicious patterns of activity and take action (block the IP address, warn the user being targeted, whatever). Properly implemented, even brute-forcing a mother’s maiden name or a birthday (requiring just a couple of hundred attempts) becomes an uninviting task, particularly when the reward for success is non-monetary.
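
A minimal in-memory sketch of the "5 incorrect passwords per IP address per day" rule, assuming a single process and no persistence. FailedLoginLimiter and exampleLockout are hypothetical names, and the caller passes the current time in so that the behaviour is easy to test.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <map>
#include <string>

// Sliding-window limiter: at most maxFailures failed logins per key
// (here an IP address string) within any period of length window.
class FailedLoginLimiter {
public:
    using Clock = std::chrono::steady_clock;

    FailedLoginLimiter( std::size_t maxFailures, Clock::duration window )
        : maxFailures_( maxFailures ), window_( window ) {}

    // Record one incorrect password from this address.
    void recordFailure( const std::string & ip, Clock::time_point now ) {
        failures_[ip].push_back( now );
    }

    // True if this address may still attempt a login at time now.
    bool allowed( const std::string & ip, Clock::time_point now ) {
        auto it = failures_.find( ip );
        if ( it == failures_.end() ) return true;
        std::deque<Clock::time_point> & times = it->second;
        // Drop failures that have aged out of the window.
        while ( !times.empty() && now - times.front() >= window_ ) times.pop_front();
        if ( times.empty() ) {
            failures_.erase( it );
            return true;
        }
        return times.size() < maxFailures_;
    }

private:
    std::size_t maxFailures_;
    Clock::duration window_;
    std::map< std::string, std::deque<Clock::time_point> > failures_;
};

// Usage sketch: five failures lock the address out until a day has passed.
inline bool exampleLockout() {
    FailedLoginLimiter limiter( 5, std::chrono::hours( 24 ) );
    const FailedLoginLimiter::Clock::time_point start = FailedLoginLimiter::Clock::now();
    for ( int i = 0; i < 5; ++i ) limiter.recordFailure( "192.0.2.1", start );
    const bool blockedNow = !limiter.allowed( "192.0.2.1", start );
    const bool freeLater  =  limiter.allowed( "192.0.2.1", start + std::chrono::hours( 25 ) );
    assert( blockedNow && freeLater );
    return blockedNow && freeLater;
}
```

A real deployment would need shared storage and some thought about users behind a common proxy, but the core bookkeeping is not much more than this.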

So why do programmers do this? It’s like the old joke about the drunk who dropped his keys in a dark street and resolved to look for them under a street lamp since the light was better there. Adding rate-limiting isn’t that hard, but it’s orders of magnitude harder than writing a regexp to ensure the password has two digits in it. Even if this doesn’t solve the problem, it gives the impression that we’re making an effort.

Is the rockstar programmer dead?

Friday, September 10th, 2010

What is it with technology? We can put men on the moon, but we can’t create an airport baggage system that doesn’t foul up. Or a word processor that doesn’t crash. Or a web application with a consistent user interface.

One explanation could be that there are actually two sources of difficulty in most human endeavours: necessary difficulties, where a task is at the limit of or beyond a person’s capability, and accidental difficulties, where a task is perfectly achievable but we are open to simple human errors. Though the progress of technology, training and specialisation has raised the absolute limits of what is achievable in most disciplines, human error remains stubbornly a fact of life and little has been done to ameliorate it.

This is an idea that is explored at length in The Checklist Manifesto. The author, Atul Gawande, is a surgeon who became suspicious of the number of failures that were happening in surgery that are attributable to human error, often at the cost of grievous injury and death. It would be funny if it wasn’t tragic, that no amount of training (and surgeons are extraordinarily well-trained) is enough to stop a surgeon cutting off the wrong limb, or forgetting to administer vital drugs.

Gawande pioneered a remarkably simple solution to this in the form of a pre-surgery checklist. The evidence is that this reduces complications by double-digit percentages. The main problem in implementing it seems to be in persuading alpha-male surgeons to admit that they are fallible, and that something as simple as a 30-second checklist can make a difference.

It’s not just surgery either. Gawande gives examples of checklists being successfully applied to contain human error in other fields such as finance and, of course, air flight. As a software developer, the following commentary caught my eye:

Tom Wolfe’s The Right Stuff tells the story of our first astronauts and charts the demise of the maverick, Chuck Yeager test pilot culture of the 1950s. It was a culture defined by how unbelievably dangerous the job was. Test pilots strapped themselves into machines of barely controlled power and complexity, and a quarter of them were killed on the job. The pilots had to have focus, daring, wits, and an ability to improvise—the right stuff. But as knowledge of how to control the risks of flying accumulated—as checklists and flight simulators became more prevalent and sophisticated—the danger diminished, values of safety and conscientiousness prevailed, and the rock star status of the test pilots was gone.

As a discipline, software is remarkable for the speed at which it has developed from being on the bleeding edge of research to being an integrated part of our lives, and this is something that the well-worn analogies between software engineering and civil engineering inevitably fail to model. I’ve often wondered whether this rapid growth, and the small number of generations between the bleeding edge and the current state of the art, have distorted our view of what good software engineering should be. This is an idea explored in more detail, and specifically about software, in The Inmates are Running the Asylum, a book I encourage all developers to read.

So, how applicable are the solutions in The Checklist Manifesto to software development? That’s where it gets difficult. Much as I’d like to see the discipline mature, the opportunities for checklists per se are small. There are certainly opportunities to ensure best practices in things like code reviews, merging to release branches or building installers. But ideally any purely repetitive task should be automated anyway, and this covers a lot of ground: software development differs from surgery in that while surgery only looks repetitive, software behaviour genuinely is deterministic. Automated tests, continuous integration and 1-click installer builds are probably the closest thing we have to the surgery checklist, and they are already best practices (though like the surgery checklist, often ignored by alpha males who think they know better).

I’m going to continue investigating the checklist idea and see whether I can integrate any of the spirit of it into my work; I’ll update here if I have any noteworthy progress. But it would be remiss of me not to emphasise one very important point: this isn’t about taking away the creativity of the job, it’s about minimising the cognitive load of routine tasks so that tasks requiring creativity and judgment can be given more effort, not less. There is an art to writing a good checklist, and much of it is in minimising the checklist to give it maximum impact with minimal weight.

If software engineering could become more like flying, perhaps it would be no bad thing. Being a pilot is still a highly technical job that requires great experience and commands great respect. The skill of the pilot is indispensable when something unexpected happens, but that doesn’t mean they carry out routine takeoffs and landings by the seat of their pants.

The Checklist Manifesto: How To Get Things Right (http://www.amazon.co.uk/gp/product/1846683130?ie=UTF8&tag=reviewtfm-21&linkCode=as2&camp=1634&creative=19450&creativeASIN=1846683130)