Databasen - NULL dates

David Cantrell david at cantrell.org.uk
Wed Oct 18 15:29:03 BST 2006


On Wed, Oct 18, 2006 at 12:38:07PM +0100, Neil de Carteret wrote:
> On 18/10/06, Nigel Rantor <wiggly at wiggly.org> wrote:
> >We got into a discussion about whether or not to use NULL values for
> >dates that are as-yet-unknown.
> I think we're unanimous.  So go on then, what reasons were cited for
> not using NULL in this case?

NULL screws up comparisons.  You can always get around it, but it's a
pain.

I suppose that in the case where users can, for instance, optionally let
you know their date of birth, then I'd just put in a NULL.  But in quite
a lot of cases, when you don't know a date it's because an event hasn't
happened yet.  For example, signing up to a mailing list - I wouldn't
just have one table like this:
  email
  date_asked_to_join
  date_confirmed

I'd have two tables, one for people who have confirmed that they want to
join, and one for those for whom we are awaiting confirmation.  When
they confirm, move their record from one table to the other in a
transaction (INSERT with a sub-SELECT and then DELETE).  This has several
advantages:
  * no NULLs
  * faster query to get list of addresses on the list
    (the query that you'll execute the most)
    SELECT email FROM people_who_have_confirmed is always going to be
    faster than SELECT email FROM people WHERE date_confirmed IS NOT
    NULL
  * by needing one less index it might save space

On the other hand, it smells of denormalisation even when I don't think
it really is, as data isn't duplicated.  Certain types of query might
also need duplicating because you can't use placeholders for table
names - stored procedures will help here.

-- 
David Cantrell | top google result for "internet beard fetish club"

You can't spell "slaughter" without "laughter"


More information about the london.pm mailing list