<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tim Martin&#039;s blog &#187; Database</title>
	<atom:link href="http://blog.asymptotic.co.uk/category/softwaredev/database-softwaredev/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.asymptotic.co.uk</link>
	<description>On the human side of software</description>
	<lastBuildDate>Fri, 10 Sep 2010 08:28:55 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>You were never meant to do that with SQL</title>
		<link>http://blog.asymptotic.co.uk/2009/12/you-were-never-meant-to-do-that-with-sql/</link>
		<comments>http://blog.asymptotic.co.uk/2009/12/you-were-never-meant-to-do-that-with-sql/#comments</comments>
		<pubDate>Wed, 30 Dec 2009 17:10:13 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Database]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[Optimisation]]></category>
		<category><![CDATA[programming languages]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://blog.asymptotic.co.uk/?p=422</guid>
		<description><![CDATA[There seems to be a lot of hatred for SQL in the world at the moment: I can&#8217;t think of any other reason why the term NoSQL would catch on in the way that it has, when the key technological distinction is actually the lack of ACID guarantees (which are entirely orthogonal to whether or [...]]]></description>
			<content:encoded><![CDATA[<p>There seems to be a lot of hatred for SQL in the world at the moment: I can&#8217;t think of any other reason why the term <a href="http://en.wikipedia.org/wiki/Nosql">NoSQL</a> would catch on in the way that it has, when the key technological distinction is actually the lack of <a href="http://en.wikipedia.org/wiki/ACID">ACID guarantees</a> (which are entirely orthogonal to whether or not SQL is used, as evidenced by non-ACID MySQL and <a href="http://wiki.apache.org/hadoop/Hive/LanguageManual">HiveQL</a>, which offers a pretty familiar SQL-like interface on an entirely non-traditional backend).</p>
<p>I wonder whether one of the unspoken reasons for this hatred is that at one point or another almost everyone has ended up doing this sort of thing:</p>
<pre>   builder.Add("SELECT foo FROM bar WHERE id = ");
   builder.Add(id.ToString());

   if (additionalConstraint)
   {
      builder.Add(" AND frobbable = 1 ");
   }

   /* ... ad nauseam ... */</pre>
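<p>For comparison, here is a minimal sketch of the same pattern in Python using the standard <code>sqlite3</code> module. The table <code>bar</code> and its columns follow the snippet above; the function name <code>find_foo</code> and the sample rows are assumptions made for illustration. Binding <code>id</code> as a parameter avoids converting it to text by hand, but the optional clause is still glued on as a string.</p>
<pre>import sqlite3

def find_foo(conn, row_id, additional_constraint):
    # The id is bound as a parameter rather than spliced into the SQL text.
    sql = "SELECT foo FROM bar WHERE id = ?"
    params = [row_id]
    if additional_constraint:
        # The optional clause is still added by string concatenation.
        sql += " AND frobbable = 1"
    return conn.execute(sql, params).fetchall()

# Hypothetical in-memory data, only so that the sketch can be run.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bar (id INTEGER, foo TEXT, frobbable INTEGER)")
conn.executemany("INSERT INTO bar VALUES (?, ?, ?)",
                 [(1, "a", 1), (2, "b", 0)])</pre>
<p>Even in this small case the shape of the query is decided by the host language rather than by SQL itself.</p>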
<p>SQL is a hard language to like: it&#8217;s never been properly standardised (or rather, it has, but the standard has never been implemented) meaning that you spend too much time worrying about compatibility. Its theoretical underpinning is poor, leading to constructions that are hard for the engine to optimise (meaning more manual work).</p>
<p>However, SQL is a language in its own right, and was never intended to be generated programmatically by another programming language. This shouldn&#8217;t come as a surprise, as I struggle to think of any programming language that has been designed to work in this way.</p>
<p>SQL used from a decent command-line environment is a powerful tool, and often a pleasure to use. By comparison, generating SQL programmatically is an abomination that would be worthy of <a href="http://thedailywtf.com/">The Daily WTF</a> were it not for the fact that nobody&#8217;s ever invented an API that offers the same flexibility.</p>
<p>Personally, I blame the vendors. Until RDBMSs can offer the same quality of optimisation that modern compilers can (that is, write in a high-level language and never even think about micro-optimisation) high-performance relational database access will remain a sea of vendor-specific optimiser hacks. Maybe there&#8217;s a theoretical reason why optimisers will never be this good, in which case perhaps we do need to abandon the relational model in practice. But let&#8217;s not pretend it has anything to do with SQL.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.asymptotic.co.uk/2009/12/you-were-never-meant-to-do-that-with-sql/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Relational Database Basics: What is a relation?</title>
		<link>http://blog.asymptotic.co.uk/2009/12/relational-database-basics-what-is-a-relation/</link>
		<comments>http://blog.asymptotic.co.uk/2009/12/relational-database-basics-what-is-a-relation/#comments</comments>
		<pubDate>Wed, 09 Dec 2009 10:19:31 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Database]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[relational theory]]></category>

		<guid isPermaLink="false">http://blog.asymptotic.co.uk/?p=321</guid>
		<description><![CDATA[The biggest misunderstanding people tend to have with the relational model must be the understanding of the term &#8220;relation&#8221; itself. Since people tend to learn relational theory as an add-on to learning about SQL, they naturally learn that the things you put the data in are called &#8220;tables&#8221; and that tables are related to each [...]]]></description>
			<content:encoded><![CDATA[<p>The biggest misunderstanding people tend to have about the relational model must be the meaning of the term &#8220;relation&#8221; itself. Since people tend to learn relational theory as an add-on to learning about SQL, they naturally learn that the things you put the data in are called &#8220;tables&#8221; and that tables are related to each other. The natural (but incorrect) assumption, then, is that &#8220;relational&#8221; refers to the relationships that exist <em>between</em> tables, and this couldn&#8217;t be more wrong.</p>
<h3>The simplest explanation</h3>
<p>Put simply, a &#8220;relation&#8221; is what SQL calls a table. If you learn nothing else about relational theory, at least understand this. This is an oversimplification of course, but it&#8217;s close enough to being true that if you don&#8217;t want to learn any theory, it will at least make the discussions of theorists more comprehensible.</p>
<h3>The mathematical explanation</h3>
<p>This isn&#8217;t the best way to <em>understand</em> what a relation is, but if you intend to have meaningful discussions with other practitioners, you will need to have a common understanding based on a definition that is unambiguous. I&#8217;ll therefore get the mathematical explanation out of the way here; if it doesn&#8217;t make much sense, return to it after the more intuitive description below. Though this is a relatively formal description, it doesn&#8217;t come close to being totally precise, and anyone who wants to know more is encouraged to investigate a book on the subject.</p>
<p>An <em>attribute</em> is a combination of a <em>name</em> and a <em>type</em> identifier, where we can for the moment treat a type as being a (possibly infinite) set of values with some operators defined on it. Think of an attribute as being like a column definition.</p>
<p>A <em>tuple</em> is a set of distinct attributes, where each attribute is associated with one value that is an instance of the type for that attribute. The members of a tuple do not possess an inherent order, and tuples ordered in different ways for display purposes nevertheless represent the same tuple.</p>
<p>A <em>relation</em> consists of a <em>heading</em> and a <em>body</em>. The heading is a (possibly empty) set of attributes with distinct names. The body is a (possibly empty) set of tuples, each of which has the same set of attributes as the heading of the relation.</p>
<p>To put this in terms familiar to an SQL user: an attribute is analogous to a column definition, a tuple is analogous to a row and a relation is analogous to a table. Note that this is an over-simplification, mostly because we think of the rows and columns of a table as possessing an inherent order, and mathematical relations have no such order.</p>
<p>One other thing that bears stating at this point is that a relation is technically an <em>immutable</em> value, and is held in a mutable variable called a <em>relvar</em>. This is analogous to common programming practice where an integer like 5 is immutable, but held in a mutable integer variable. If you &#8220;insert a row into a table&#8221;, then actually you change the contents of that relvar from one relation to another. This distinction is rarely of relevance in discussing theoretical issues.</p>
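<p>The definitions above can be sketched in Python, using sets so that neither attributes nor tuples carry an order. This is an illustration under assumptions, not a rigorous model: the <code>Relation</code> class and the <code>relation</code> helper are hypothetical names, Python types stand in for attribute types, and the row for Bob is made up.</p>
<pre>from dataclasses import dataclass

@dataclass(frozen=True)
class Relation:
    heading: frozenset  # set of (attribute name, type) pairs
    body: frozenset     # set of tuples; each tuple is a frozenset of (name, value) pairs

def relation(heading, rows):
    """Build a relation value, checking every row against the heading."""
    names = set(heading)
    body = set()
    for row in rows:
        if set(row) != names:
            raise ValueError("tuple attributes must match the heading")
        for name, attr_type in heading.items():
            if not isinstance(row[name], attr_type):
                raise TypeError("wrong type for attribute " + name)
        body.add(frozenset(row.items()))
    return Relation(frozenset(heading.items()), frozenset(body))

heading = {"name": str, "team": str, "position": str}
players = relation(heading, [
    {"name": "Alice", "team": "W1", "position": "GA"},
    {"name": "Charles", "team": "M1", "position": "WA"},
])

# The same attributes and tuples written in a different order give an equal relation.
reordered = relation(
    {"position": str, "team": str, "name": str},
    [
        {"team": "M1", "position": "WA", "name": "Charles"},
        {"position": "GA", "name": "Alice", "team": "W1"},
    ],
)

# "Inserting a row": the relvar is rebound to a new relation value.
# Bob, M1 and GS are made-up values for illustration.
relvar = players
relvar = relation(heading, [dict(t) for t in relvar.body]
                  + [{"name": "Bob", "team": "M1", "position": "GS"}])</pre>
<p>Here <code>players == reordered</code> holds, and after the &#8220;insert&#8221; the value originally bound to <code>players</code> is unchanged; only the variable <code>relvar</code> refers to a different relation.</p>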
<h3>The intuitive explanation</h3>
<p>Unless you&#8217;re already familiar with relational theory, that was probably all rather unclear, in which case the only vital things to take away are: columns are unordered, and rows are unordered. If you are familiar with relational theory, you&#8217;re probably angry at me for making so many mistakes, in which case please point them out in the comments.</p>
<p>So what does this mean in intuitive terms? A common intuitive feeling about tables is that they represent a list of entities, and indeed this understanding works nicely for simple cases. Take a table of salaried employees in FictoCorp:</p>
<div class="mceTemp">
<dl id="attachment_324" class="wp-caption alignnone" style="width: 543px;">
<dt class="wp-caption-dt"><img class="size-full wp-image-324" title="employees_table" src="http://blog.asymptotic.co.uk/wp-content/uploads/2009/11/employees_table.png" alt="Table showing list of employees in a fictional company" width="533" height="184" /></dt>
</dl>
</div>
<p>The head of HR for this company might look at the table and say, &#8220;yep, those are my employees—I&#8217;d recognise &#8216;em anywhere.&#8221; As far as they&#8217;re concerned, each row in this table represents one of the employees they have to deal with. Furthermore, no row represents more than one employee, and there&#8217;s no employee of the company who doesn&#8217;t have a row.</p>
<p>It just so happens that FictoCorp (who have a lot of important customers in the netball industry) has a policy that all employees must play for one of the company&#8217;s netball teams. In order to keep track of this, the team captain keeps the following table in the company database:</p>
<div class="mceTemp">
<dl id="attachment_325" class="wp-caption alignnone" style="width: 543px;">
<dt class="wp-caption-dt"><img class="size-full wp-image-325" title="players_table" src="http://blog.asymptotic.co.uk/wp-content/uploads/2009/11/players_table.png" alt="A table showing the netball teams and positions of fictional players" width="533" height="184" /></dt>
</dl>
</div>
<p>We&#8217;ll simplify things by only displaying four employees, though obviously there would be more.</p>
<p>As an aside, netball has the nice property that the positions are named and unique; no player can be on the same team playing in the same position as another player. Therefore the combination of Netball Team and Position uniquely identifies a single employee. Obviously this constraint makes it impossible for FictoCorp to hire or fire people other than in unisex groups of 7 (in order that they can add or remove an entire netball team at once), but hey, it&#8217;s worth it for all the lucrative netball-industry contacts.</p>
<p>As far as the netball club captain is concerned, the entries in this table <em>are</em> the employees. Any employee will be in this table, and anyone in this table is an employee. So who is right, the HR manager or the netball club captain? Which table &#8220;holds&#8221; the employees? And if one table &#8220;is&#8221; the set of employees, what does that mean about the other table?</p>
<h4>A digression</h4>
<p>FictoCorp&#8217;s netball teams are so successful that the major league teams start to send talent scouts to their games. One day, the manager of a professional team rings up to enquire about hiring one of FictoCorp&#8217;s players.</p>
<blockquote><p>&#8220;He was brilliant, we just have to have him &#8230; Any price, any price at all &#8230; His name? I don&#8217;t remember that, but he was definitely playing Wing Attack for your Men&#8217;s First team&#8221;</p></blockquote>
<p>Luckily, this information is all that is needed to identify the player in question as Charles. The table of netball players works equally well as a way of finding a player from their netball team and position as vice versa.</p>
<p>From the point of view of an outsider to FictoCorp, the table <em>is</em> a list of teams and playing positions, with the useful effect that the player&#8217;s name can be looked up. The talent scout&#8217;s view and the club captain&#8217;s view of the meaning of the table are different, but both are using the same table.</p>
<h4>Resolving the ambiguity</h4>
<p>The netball players table is neither a container of people, nor a container of playing positions. Both of these are <em>extrinsic</em> to the table: they will continue to exist if the table is deleted, though FictoCorp may no longer have the information it needs to get the necessary work done.</p>
<p>One way to think of the relation is in terms of the corresponding <em>predicate</em>: a function that takes a group of objects and produces a true or false value. An informal definition of the predicate for the netball players table might be:</p>
<p style="padding-left: 30px;">There exists a player called X, who plays on team Y in position Z</p>
<p>If we substitute into this values from the table, we get true values from the function:</p>
<p style="padding-left: 30px;">There exists a player called <strong>Alice</strong>, who plays on team <strong>W1</strong> in position <strong>GA</strong> (<em>true</em>)</p>
<p style="padding-left: 30px;">There exists a player called <strong>Charles</strong>, who plays on team <strong>M1</strong> in position <strong>WA</strong> (<em>true</em>)</p>
<p>If we substitute in other values, we get false values from the function:</p>
<p style="padding-left: 30px;">There exists a player called <strong>Charles</strong>, who plays on team <strong>W1</strong> in position <strong>WA</strong> (<em>false</em>)</p>
<p>You can think of this as a function on a 3-dimensional space, where one dimension is the list of every person in the world, one dimension is every netball team FictoCorp has and the final dimension is every possible position in a netball team:</p>
<div class="mceTemp">
<dl id="attachment_332" class="wp-caption alignnone" style="width: 356px;">
<dt class="wp-caption-dt"><img class="size-full wp-image-332" title="relation_visualisation" src="http://blog.asymptotic.co.uk/wp-content/uploads/2009/11/relation_visualisation.png" alt="Diagram of a relation on a 3-dimensional space" width="346" height="351" /></dt>
</dl>
</div>
<p>The predicate is a function over this entire 3-dimensional space. The tuples (rows) in the relation represent points in this space for which the function evaluates to true. Tuples that could be in the table, but aren&#8217;t, represent points in this space for which the predicate evaluates to false.</p>
<p>Things to note:</p>
<ul>
<li>The predicate evaluates to true or false on every point in this space; nowhere in the space is the predicate undefined</li>
<li>The predicate can&#8217;t be evaluated anywhere <em>but</em> points in this space; it would be meaningless to do so</li>
</ul>
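<p>As a rough sketch, the predicate can be written as a Python function whose truth value is simply membership in the current set of tuples. The names <code>PLAYERS</code>, <code>plays</code> and <code>who_plays</code> are assumptions for illustration, tuples are written positionally for brevity, and only the two rows quoted above are included. The three domains themselves are not modelled.</p>
<pre># Tuples currently in the relation, as (name, team, position).
PLAYERS = frozenset({
    ("Alice", "W1", "GA"),
    ("Charles", "M1", "WA"),
})

def plays(name, team, position):
    """There exists a player called name, who plays on team in position."""
    return (name, team, position) in PLAYERS

def who_plays(team, position):
    """The talent scout's lookup: names for a given team and position."""
    return {n for (n, t, p) in PLAYERS if (t, p) == (team, position)}</pre>
<p>With this data, <code>plays("Charles", "M1", "WA")</code> is true and <code>plays("Charles", "W1", "WA")</code> is false, while <code>who_plays("M1", "WA")</code> finds Charles from the same tuples.</p>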
<p>In a sense, the predicate gives the <em>meaning</em> of the table, and this meaning won&#8217;t change as we add and remove players from various teams. The tuples in the relation (the rows in the table) show us what is currently true <em>in the real world</em>. It is a goal of a well-maintained database that the facts implied by the table always remain a true representation of what is true in the real world, for drawing conclusions about the real world is the reason databases exist.</p>
<h4>Objections to this model</h4>
<p>One obvious objection to this model is that if people, salaries, netball team positions etc. are all extrinsic to the tables, how do we keep track of an entity that happens not to appear in any of the tables? If FictoCorp has a contractor called Edgar working for the company who isn&#8217;t in the employees table, and is excused from being in any of the netball teams, how do we keep track of this person?</p>
<p>The answer is that the database contains all the information we want to store, <em>and nothing else</em>. If the database system needs to be able to be used to answer questions about contractors, it will have a contractors table in which Edgar will appear. If for some reason FictoCorp doesn&#8217;t care to know what contractors it has relationships with, then Edgar will be a non-entity as far as the database is concerned.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.asymptotic.co.uk/2009/12/relational-database-basics-what-is-a-relation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Book Review: SQL and Relational Theory</title>
		<link>http://blog.asymptotic.co.uk/2009/12/book-review-sql-and-relational-theory/</link>
		<comments>http://blog.asymptotic.co.uk/2009/12/book-review-sql-and-relational-theory/#comments</comments>
		<pubDate>Tue, 01 Dec 2009 14:43:09 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Database]]></category>
		<category><![CDATA[Reviews]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[relational theory]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://blog.asymptotic.co.uk/?p=341</guid>
		<description><![CDATA[The first thing to know about SQL and Relational Theory is that it&#8217;s largely a retread of Chris Date&#8217;s previous excellent book Database in Depth. The latter is a favourite of mine: extremely readable, yet with enough theoretical clout to change the way I looked at databases forever. The new volume carries over large [...]]]></description>
			<content:encoded><![CDATA[<div class="mceTemp">
<dl id="attachment_340" class="wp-caption alignleft" style="width: 132px;">
<dt class="wp-caption-dt"> <a href="http://www.amazon.co.uk/gp/product/0596523068?ie=UTF8&amp;tag=reviewtfm-21&amp;linkCode=as2&amp;camp=1634&amp;creative=19450&amp;creativeASIN=0596523068"><img class="size-full wp-image-340" title="51cUKPgnCyL._SL160_" src="http://blog.asymptotic.co.uk/wp-content/uploads/2009/11/51cUKPgnCyL._SL160_.jpg" alt="Front cover of the book &quot;SQL and Relational Theory&quot;" width="122" height="160" /></a></dt>
</dl>
</div>
<p>The first thing to know about <a href="http://www.amazon.co.uk/gp/product/0596523068?ie=UTF8&amp;tag=reviewtfm-21&amp;linkCode=as2&amp;camp=1634&amp;creative=19450&amp;creativeASIN=0596523068">SQL and Relational Theory</a> is that it&#8217;s largely a retread of Chris Date&#8217;s previous excellent book <a href="http://www.amazon.co.uk/gp/product/0596100124?ie=UTF8&amp;tag=reviewtfm-21&amp;linkCode=as2&amp;camp=1634&amp;creative=19450&amp;creativeASIN=0596100124">Database in Depth</a>. The latter is a favourite of mine: extremely readable, yet with enough theoretical clout to change the way I looked at databases forever. The new volume carries over large chunks of the text from the older one, with some minor tweaks. As the name suggests, it brings in substantial additional material to link relational theory in with SQL, the only practical implementation of the model in current use.</p>
<div class="mceTemp">
<dl id="attachment_344" class="wp-caption alignright" style="width: 132px;">
<dt class="wp-caption-dt"><a href="http://www.amazon.co.uk/gp/product/0596100124?ie=UTF8&amp;tag=reviewtfm-21&amp;linkCode=as2&amp;camp=1634&amp;creative=19450&amp;creativeASIN=0596100124"><img class="size-full wp-image-344" title="41MQ41V09GL._SL160_" src="http://blog.asymptotic.co.uk/wp-content/uploads/2009/11/41MQ41V09GL._SL160_.jpg" alt="The front cover of &quot;Database in Depth&quot;" width="122" height="160" /></a></dt>
</dl>
</div>
<p>In the preface, Date explains that the motivation for the new book was the realisation that practitioners weren&#8217;t able to figure out for themselves how to apply his theoretical ideas within SQL. Clearing up this difficulty is an admirable goal, and illustrates well that Date&#8217;s approach is practical and not meant as ivory-tower theory, but I can&#8217;t help but wonder if one of the reasons he didn&#8217;t state was that books sell better with &#8216;SQL&#8217; in the title.</p>
<p>The additional material has resulted in a book that is roughly twice as long. This isn&#8217;t a problem in itself, though it does spoil one of the things I loved about &#8220;In Depth&#8221;: that it could be read in a couple of evenings&#8217; work by a sufficiently motivated person. The importance of making a book light enough that you can sit and read it on the sofa without looking like a database nerd should not be understated.</p>
<p>The prose remains clear and readable, and strikes a nice balance that makes it approachable to relative beginners while avoiding ever sounding patronising. Date&#8217;s style is precise to a fault, and some people will find it needlessly pedantic; nevertheless, there isn&#8217;t any pointless pedantry here, and if you stick with it you&#8217;ll learn why subtle distinctions need to be made.</p>
<p>So how useful are the new insertions on SQL? I find it difficult to tell. On the one hand, it makes it much easier to relate the ideas in this book to discussions of theory that actually occur in the real world, since SQL is the <em>lingua franca</em>. In the old book, it was certainly annoying to have all the examples written in Tutorial D, without a real specification of how the language works. On the other hand, Date&#8217;s examples in this book are still in a mythical beast called &#8220;Standard SQL&#8221;, of which no practical implementation exists. What is good practice in standard SQL might be impossible in your chosen implementation, or there might be a better way to achieve the same thing.</p>
<p>It&#8217;s certainly worth buying one of the two books here, but the choice of which is not as obvious. If you already own &#8220;In Depth&#8221;, the updated version probably isn&#8217;t worth buying. If you don&#8217;t, then &#8220;SQL and Relational Theory&#8221; is the thing to buy, unless you&#8217;re after a lighter and more portable read.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.asymptotic.co.uk/2009/12/book-review-sql-and-relational-theory/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Relational Database Basics: What is Atomicity?</title>
		<link>http://blog.asymptotic.co.uk/2009/11/relational-database-basics-what-is-atomicity/</link>
		<comments>http://blog.asymptotic.co.uk/2009/11/relational-database-basics-what-is-atomicity/#comments</comments>
		<pubDate>Mon, 16 Nov 2009 21:37:16 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Database]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[relational theory]]></category>

		<guid isPermaLink="false">http://blog.asymptotic.co.uk/?p=245</guid>
		<description><![CDATA[Atomicity is an important concept in databases, indeed it&#8217;s a key part of the definition of first normal form. But it&#8217;s a surprisingly slippery concept, and our intuitive ideas don&#8217;t seem to serve us well enough.
Codd gave the definition that atomic data is data that &#8220;cannot be decomposed into smaller pieces by the DBMS (excluding [...]]]></description>
			<content:encoded><![CDATA[<p>Atomicity is an important concept in databases; indeed, it&#8217;s a key part of the definition of first normal form. But it&#8217;s a surprisingly slippery concept, and our intuitive ideas don&#8217;t seem to serve us well enough.</p>
<p>Codd gave the definition that atomic data is data that &#8220;cannot be decomposed into smaller pieces by the DBMS (excluding certain special functions)&#8221;. Taken literally and not allowing for ad hoc exclusions, this definition would require that every field be a single boolean value: a string can be decomposed into characters and even an integer can be decomposed into prime factors, if we care to do so. Clearly we can choose a set of allowable operators that give a sensible definition of atomicity, but we risk begging the question.</p>
<p>The observation above leads fairly naturally to the idea that the concept of atomicity is a product of the operators we intend to use on the data. When you start to look at things this way, the intuitive grasp of which relations are in first normal form turns out to be more complicated than you might think. Take the following relation for example, which I&#8217;m going to assume everybody will agree is in first normal form:</p>
<p><img class="alignnone size-full wp-image-248" title="Database atomicity uncontroversial example" src="http://blog.asymptotic.co.uk/wp-content/uploads/2009/11/Database-atomicity-uncontroversial-example.png" alt="Database atomicity uncontroversial example" width="394" height="180" /></p>
<p>Let&#8217;s assume that Alice, Bob and Charles all work on the market selling fruit and vegetables, and that in their part of town the only products that customers have any interest in are Apples, Bananas, Cherries and Durians.</p>
<p><img class="alignnone size-full wp-image-253" title="Database atomicity controversial example" src="http://blog.asymptotic.co.uk/wp-content/uploads/2009/11/Database-atomicity-controversial-example1.png" alt="Database atomicity controversial example" width="394" height="180" /></p>
<p>Many people would claim that this is not in first normal form, since the &#8220;products sold&#8221; field is non-atomic. However, there is a fairly simple isomorphism between the two cases.</p>
<p>For a start, we can map our unordered set of products sold into an ordered tuple quite easily, since there is a finite number of elements that are allowed to be in the set (since greengrocers in this part of town can sell only the four products).</p>
<p><img class="alignnone size-full wp-image-255" title="Database atomicity isomorphism" src="http://blog.asymptotic.co.uk/wp-content/uploads/2009/11/Database-atomicity-isomorphism1.png" alt="Database atomicity isomorphism" width="337" height="112" /></p>
<p>However, there&#8217;s also a trivial isomorphism between ordered tuples of booleans and integers in an appropriate range, given by the binary encoding of the integer. It so happens that if we assume karate ranks run from 10th Kyu to 6th Dan (essentially -10 to +6, with no zero) we can biject these with the numbers 0 to 15. If you turn the sets of products into tuples this way, and then turn them into numbers, then map these numbers to karate grades, you&#8217;ll find that the output data is exactly the same as the first relation, which is in first normal form.</p>
<p>How to make sense of this? Normal forms eliminate (some) redundancy, but they don&#8217;t enforce good design. The second table may be in first normal form, but it isn&#8217;t good design. The reason that it isn&#8217;t good design has nothing to do with relational theory and everything to do with the way in which we intend to use the data. &#8220;Does Alice sell Durians?&#8221; is a reasonable question to ask, but &#8220;Is Alice&#8217;s karate rank isomorphic to an odd number?&#8221; is a directly equivalent but unreasonable question to ask. As a database designer, it is your job to anticipate as many valid questions as possible, without over-complicating the model to support invalid questions.</p>
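<p>The chain of mappings described above can be sketched in Python. This is an illustration under assumptions: the function names are hypothetical, and the bit order (Apples as the most significant bit, Durians as the least) is chosen so that &#8220;sells Durians&#8221; corresponds to an odd number, as in the question above. The rows from the tables shown as images are not reproduced.</p>
<pre>PRODUCTS = ("Apples", "Bananas", "Cherries", "Durians")

def ordinal(n):
    # Good enough for 1..10, which is all the ranks need.
    return str(n) + {1: "st", 2: "nd", 3: "rd"}.get(n, "th")

# 10th Kyu up to 1st Kyu, then 1st Dan up to 6th Dan: sixteen ranks, indexed 0..15.
RANKS = ([ordinal(k) + " Kyu" for k in range(10, 0, -1)]
         + [ordinal(d) + " Dan" for d in range(1, 7)])

def products_to_rank(products):
    # Set -> ordered tuple of booleans -> integer (binary encoding) -> rank.
    bits = tuple(p in products for p in PRODUCTS)
    number = 0
    for bit in bits:
        number = number * 2 + int(bit)
    return RANKS[number]

def rank_to_products(rank):
    # The inverse mapping: rank -> integer -> set of products.
    number = RANKS.index(rank)
    return {p for i, p in enumerate(PRODUCTS)
            if (number // 2 ** (3 - i)) % 2 == 1}

def sells_durians(rank):
    # "Is the karate rank isomorphic to an odd number?"
    return RANKS.index(rank) % 2 == 1</pre>
<p>Every one of the sixteen ranks round-trips through <code>rank_to_products</code> and <code>products_to_rank</code>, which is all that the isomorphism claims. The awkwardness of <code>sells_durians</code> is the design problem, not any loss of information.</p>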
]]></content:encoded>
			<wfw:commentRss>http://blog.asymptotic.co.uk/2009/11/relational-database-basics-what-is-atomicity/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Customising sort order in MySQL</title>
		<link>http://blog.asymptotic.co.uk/2009/11/customising-sort-order-mysql/</link>
		<comments>http://blog.asymptotic.co.uk/2009/11/customising-sort-order-mysql/#comments</comments>
		<pubDate>Thu, 05 Nov 2009 09:39:46 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Database]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[mysql]]></category>

		<guid isPermaLink="false">http://blog.asymptotic.co.uk/?p=215</guid>
		<description><![CDATA[I discovered a way to modify the sort ordering in MySQL, in order to sort specific text values lower than they normally would, without having to write a custom function.]]></description>
			<content:encoded><![CDATA[<p>I came across a situation the other day where I needed to sort my result set in the database for efficiency: we were selecting a small number of rows from a very large result set, so sorting on the client would require the entire data set to travel over the wire. Unfortunately, the requirements of the front end weren&#8217;t compatible with MySQL sort order, since the customer wanted empty strings sorted to the bottom of the list.</p>
<p>In other words, I wanted to be able to do something like the following:</p>
<pre>  SELECT id, name
    FROM people
ORDER BY name
   LIMIT 20;</pre>
<p>and end up with the result set</p>
<table border="0">
<tbody>
<tr>
<th>id</th>
<th>name</th>
</tr>
<tr>
<td>3</td>
<td>Alice</td>
</tr>
<tr>
<td>7</td>
<td>Bob</td>
</tr>
<tr>
<td>4</td>
<td></td>
</tr>
</tbody>
</table>
<p>Changing the schema wasn&#8217;t an option, and nor was doing the majority of the sort in MySQL and post-filtering, since that would cause the wrong number of results to be returned after some had been shuffled to the bottom of the list. Luckily, there&#8217;s a quick hack in MySQL that allows this sort of thing to be done:</p>
<pre>  SELECT id, name
    FROM people
ORDER BY FIELD(name, ''), name
   LIMIT 20;</pre>
<p>This works because the <code>FIELD()</code> function returns the index of the name value in the list of values given, or 0 if it is not present. That is, if the name is an empty string, this expression will return 1, and if not it will return 0. Because 1 sorts after 0, empty strings are sorted to the end of the list as desired.</p>
<p>Note that this prevents an index from being used for the ordering, which may be a problem depending on the size of the result set. In my case, the query was such that an index couldn&#8217;t be used anyway, so there was no substantial loss of efficiency.</p>
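<p>The effect of the two-part sort key can be tried out with a small Python sketch. Note the assumptions: SQLite (via the standard <code>sqlite3</code> module) stands in for MySQL, and because SQLite has no <code>FIELD()</code> function, the comparison <code>(name = '')</code> is used in its place; both give 1 for an empty string and 0 otherwise. The rows are the three from the result set shown earlier.</p>
<pre>import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany("INSERT INTO people VALUES (?, ?)",
                 [(3, "Alice"), (4, ""), (7, "Bob")])

# Default ordering: the empty string sorts before every other name.
plain = conn.execute(
    "SELECT id, name FROM people ORDER BY name LIMIT 20").fetchall()

# First key is 1 for empty names and 0 otherwise, so empty names sort last;
# the second key orders the remaining names as usual.
custom = conn.execute(
    "SELECT id, name FROM people ORDER BY (name = ''), name LIMIT 20").fetchall()</pre>
<p>Here <code>plain</code> starts with the empty name, while <code>custom</code> lists Alice, then Bob, then the empty name, and the <code>LIMIT</code> still applies after the reordering.</p>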
]]></content:encoded>
			<wfw:commentRss>http://blog.asymptotic.co.uk/2009/11/customising-sort-order-mysql/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
