Archive for September, 2009

On the internet, nobody knows that you’re using your real name

Sunday, September 20th, 2009

It was a different world back when I first ventured onto the internet. There was a definite feeling that it was something apart from one’s normal life; a world in which nobody knew you were a dog. I also recall a sense of danger. The cautionary voices of the older generation outnumbered the enthusiastic calls of the more adventurous, and those with actual experience to draw on were all but non-existent. It seemed natural, then, to insist on an anonymous identity.

Things have moved on, and crucially the positive effects of one’s online life on real life are much more visible, whether you are making friends on Twitter or networking on LinkedIn. More and more I’m getting to the point where I don’t want to use anything but my real name for an online identity. Apart from anything else, hours of my life have now been spent sitting at account creation screens trying forlornly to find a name that is amusing enough to make me seem fun, serious enough to be seen by colleagues and above all else not already taken. My Twitter username, @timmartin2, is a sad testament to the eventual abandonment of this process.

My Twitter username is also a good illustration of what might be a real problem with internet applications. My first and last names combined are enough to uniquely identify me among everyone I’ve ever spoken to (not just almost everyone, but literally everyone). Even for people with much more common names than mine, genuine duplicate names are quite rarely encountered in practice. Crucially, rather than try to pre-empt any problems of name uniqueness at birth, the preferred human solution has been to muddle through until a conflict occurs and then solve it ad hoc.

The design assumptions underlying internet technology seem to let us down here. The easiest way to deal with this uniqueness issue in designing a software solution is simply to declare that names must be unique and let the user figure it out. This isn’t just slightly easier but is orders of magnitude easier. The problem of identifying someone with a handle that is locally-unique but globally-duplicated hasn’t been solved, but then it hasn’t really been tackled. We’ve just been happy to let the technological tail wag the usability dog, as with so many usability issues.

I wonder whether the reasons for this are not wholly technical, but also partly born out of accidental properties of the early internet. The anonymity and the domain-name goldrush of the first tech boom have set in our minds the model that identities must be unique in a way that doesn’t serve us well in the era of the more social web. The Facebook URL land-grab of recent months seems like a good example of where the ordinary user is losing out.

One important result of socialising pseudonymously is that it changes your behaviour. Trolling is rare in real life, and mostly restricted to children who haven’t yet learned the importance of keeping a good reputation. Sites like Stack Overflow and Hacker News do well because they tilt the balance of payoff away from the troll and allow people to benefit from a positive reputation, but this still doesn’t spill over from the site into other facets of life. It would be impractical (and probably counterproductive) to attempt to port human social structures directly onto the web, but maybe real life still has a few tricks left to teach.

Optimising this blog

Tuesday, September 15th, 2009

Keen followers of this blog will probably have noticed that it runs pretty slowly. Since this is only the second post, I think we can rule out massive amounts of traffic as a cause. Obviously something is wrong in my configuration. Those of you who aren’t interested in server optimisation can probably ignore this post; I’ll try to write something for a more general audience later in the week.

According to top, the percentage of time the CPU spends in the wait state is close to 25%, which on 4 (virtual) CPUs means I’m probably blocking entirely on I/O: one CPU’s worth of time is a quarter of the total. Since I’m shifting around a few kilobytes of data, disk access ought to be trivial. The only likely candidate is swap usage, and sure enough I’m running with only a few meg of physical memory free.
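As a rough sketch of where a figure like that comes from: on Linux the wait percentage can be derived from two samples of the aggregate cpu line in /proc/stat. The function names and the sample lines below are made up for illustration, and this is not necessarily how top itself does the calculation.

```python
def parse_cpu_line(line):
    """Return the numeric fields of the aggregate 'cpu' line from /proc/stat."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def iowait_percent(before, after):
    """Percentage of elapsed CPU time spent in iowait between two samples."""
    # Use only the first eight fields (user, nice, system, idle, iowait,
    # irq, softirq, steal); guest time is already counted inside user.
    a = parse_cpu_line(before)[:8]
    b = parse_cpu_line(after)[:8]
    deltas = [y - x for x, y in zip(a, b)]
    total = sum(deltas)
    # iowait is the fifth field, index 4.
    return 100.0 * deltas[4] / total if total else 0.0


# Hypothetical samples: 400 ticks elapse across 4 CPUs, 100 of them in
# iowait, which is what one CPU blocked on I/O for the whole interval
# would look like.
before = "cpu  1000 0 500 8000 200 0 0 0"
after = "cpu  1020 0 510 8270 300 0 0 0"
print(iowait_percent(before, after))
```

With four CPUs, a single process stuck waiting on disk for the whole interval shows up as roughly a quarter of the total, which is the pattern described above.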

I’m hosted on a minimally-sized Slicehost slice, which means I have 256MB of physical RAM to play with. Apache is using 220MB of virtual address space, and MySQL is using 158MB, so both ought to be candidates for optimisation.

This post talks about optimising MySQL for a small slice, but doesn’t reach any firm conclusions. I can obviously cut down my MySQL caches quite a bit compared to the stock config file, so I changed the following:

key_buffer = 8M
query_cache_size = 8M

Both settings were reduced from 16MB, which frees up 16MB in total. Apparently the sort buffer is another variable that can be reduced since I shouldn’t have too many large data sets to sort.

sort_buffer_size = 128K

Changing this setting took effect (as evidenced by SHOW VARIABLES), but didn’t result in any reduction in memory use reported by top – perhaps this buffer is only allocated while a sort is in progress? Net effect is that MySQL is now using 142MB, down from 158MB, which matches the 16MB trimmed from the two caches.

On the Apache side, this post suggested disabling Apache modules that aren’t in use. I disabled mod_cgi, which seemed to take Apache memory usage down to 200MB. I also ought to be able to optimise the server for low traffic by tweaking the number of threads. I discovered I was running the prefork MPM by running:

$ /usr/sbin/apache2 -l

This meant I could use the settings the post suggested by reducing the minimum number of servers available. I turned both StartServers and MinSpareServers down to 1, and reduced MaxSpareServers to 5. After this Apache still has the same footprint per process, but there are fewer processes running so memory usage is reduced. It’s hard to estimate how much since the footprint per process reported in top contains an unknown amount of shared libraries that don’t add extra overhead per process.
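One way to get around the shared-library problem on Linux is to add up the Pss ("proportional set size") figures in /proc/<pid>/smaps, which divide each shared page between the processes mapping it. The following is a minimal sketch under that assumption; the helper names and the sample excerpt are invented for illustration, and it is not something the measurements in this post were based on.

```python
import re

# Matches lines such as "Pss:                 100 kB" at the start of a line.
PSS_LINE = re.compile(r"^Pss:\s+(\d+)\s+kB$", re.MULTILINE)


def total_pss_kb(smaps_text):
    """Sum the Pss fields (in kB) found in the text of /proc/<pid>/smaps."""
    return sum(int(kb) for kb in PSS_LINE.findall(smaps_text))


def read_pss_kb(pid):
    """Read /proc/<pid>/smaps (Linux only; usually needs the owner or root)."""
    with open(f"/proc/{pid}/smaps") as f:
        return total_pss_kb(f.read())


# Hypothetical excerpt covering two mappings of one Apache child process.
sample = """\
08048000-080bc000 r-xp 00000000 08:01 1234 /usr/sbin/apache2
Size:                464 kB
Rss:                 400 kB
Pss:                 100 kB
b7e00000-b7f00000 r-xp 00000000 08:01 5678 /lib/libc.so.6
Size:               1024 kB
Rss:                 800 kB
Pss:                 160 kB
"""
print(total_pss_kb(sample))
```

Summing read_pss_kb over every Apache process ID should give a fairer total than adding up the per-process figures from top, since shared pages are only counted once overall.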

The final score is that there is now around 100MB of physical memory free on the box, compared to 4-6MB before I attempted to optimise. The blog feels much nippier, though that could be partially due to the effects of caching.

So, I finally have a blog

Sunday, September 13th, 2009

Those of you who know me well will be surprised to find me embracing the world of blogging, as I have been something of an outspoken critic of it. I have perfectly reasonable practical reasons for setting up a blog at this point, but I thought it would be good for me to reflect on whether I was wrong in the past, or whether I’m just being inconsistent.

First, the practical reasons: I need somewhere to write stuff that I want to announce to nobody in particular, but that is too big to fit in a tweet. In the past I guess these kinds of things might have been posted to Usenet, but those days are gone. I used to use Drupal for this, but I finally came to the conclusion that this was just a blog by any other name.

Secondly, there has been a technological change (albeit one where I am late to the party): RSS, Twitter, Delicious and various other link-sharing media have more or less obviated the difficulty of ploughing through posts that don’t interest you or blogs that haven’t been updated. It seems that whatever pretence had existed of blogs being a coherent series of time-based updates has been replaced with an à la carte approach where people can pick and choose what they want to listen to.

So why did I have so much hate for blogs in the past? I think one of the reasons also forms the root cause of my change of heart. Back in the good old days, there was a certain amount of optimism that the internet would be used for stimulating dialogue and exchange of ideas. At the time I was a keen student of Neil Postman, and saw an answer to many of the mysteries of modern society in his idea that changing the medium of exchange alters the sorts of social interchange that can be had, and potentially alters the whole course of society. To me, blogging seemed an obvious case of a medium with insidious weaknesses: encouraging input from uninformed amateurs, and drawing focus to the frivolous daily events over longer-term issues.

In recent years the position of blogging seems to have swung the other way: social media increasingly allows us to take part with almost no barrier to entry (Twitter being the obvious example). In this world blogs stand head and shoulders above the landscape by being written in complete sentences and expressing coherent thoughts with appropriate context.

So maybe I was wrong about blogging in particular, though I believe I was largely right about the shift in communication style and content. It’s worth noting that some of the problems thrown up by technology (wasting time reading blogs) have been solved by further technology (RSS). I remain an optimist about the possibilities of technological progress, provided we don’t accept change uncritically.