Wednesday, July 31, 2013

Documenting DATEDIFF

Does anyone know how DATEDIFF works, in SQL Anywhere?

To put it another way, do you know exactly what DATEDIFF does, for all the date units: year, month, day and so on? Or even what it does for some of the date units?

As it turns out, I didn't, not exactly. And because DATEDIFF has appeared so many times in this blog, it's time for a closer look... especially after publishing Should Examples Work?

DATEDIFF hasn't just been used in this blog, it's been a cornerstone feature of how-to posts like Today's Tip: Counting Days of the Week and Everything Looks Like a Database as well as the subject of in-depth discussions like:
  • EXPRTYPE, DATEDIFF, MICROSECOND, BIGINT and Volker Barth announced that starting with Version 12 DATEDIFF returns BIGINT instead of INTEGER for hour, minute, second, millisecond and microsecond date parts.

  • Let's play "Gotcha!" - Round Two warned "don't let your SQL code stray outside the true useful range for the TIMESTAMP data type which is 1600-02-28 23:59:59 to 7910-12-31 23:59:59".

  • Let's play "Gotcha!" - Round Three warned "don't let the fact that DATEDIFF has been enhanced to return BIGINT values lead you to assume that DATEADD will accept a BIGINT... it won't; you're stuck using INTEGER values."

  • Beware CURRENT TIMESTAMP talked about how Windows might reset CURRENT TIMESTAMP backwards in time if a drifting system clock can't be handled by adjusting the clock rate.
Here's what the Help says about DATEDIFF:
This function calculates the number of date parts between two specified dates. The result is a signed integer value equal to (date-expression-2 - date-expression-1), in date parts.

The DATEDIFF function results are truncated, not rounded, when the result is not an even multiple of the date part.

When you use day as the date part, the DATEDIFF function returns the number of midnights between the two times specified, including the second date but not the first.

When you use month as the date part, the DATEDIFF function returns the number of first-of-the-months between two dates, including the second date but not the first.

When you use week as the date part, the DATEDIFF function returns the number of Sundays between the two dates, including the second date but not the first.
Here's what the Help should say:
DATEDIFF ( year, date-expression-1, date-expression-2 ) returns the integer number of year boundaries between date-expression-1 and date-expression-2.

DATEDIFF ( quarter, date-expression-1, date-expression-2 ) returns the integer number of quarter boundaries between date-expression-1 and date-expression-2.

DATEDIFF ( month, date-expression-1, date-expression-2 ) returns the integer number of month boundaries between date-expression-1 and date-expression-2.

DATEDIFF ( week, date-expression-1, date-expression-2 ) returns the integer number of week boundaries between date-expression-1 and date-expression-2.

DATEDIFF ( day, date-expression-1, date-expression-2 ) returns the integer number of day boundaries between date-expression-1 and date-expression-2.

DATEDIFF ( dayofyear, date-expression-1, date-expression-2 ) returns the integer dayofyear difference = DATEPART ( dayofyear, date-expression-2 ) - DATEPART ( dayofyear, date-expression-1 ).

DATEDIFF ( hour, date-expression-1, date-expression-2 ) returns the bigint number of hours between date-expression-1 and date-expression-2.

DATEDIFF ( minute, date-expression-1, date-expression-2 ) returns the bigint number of minute boundaries between date-expression-1 and date-expression-2.

DATEDIFF ( second, date-expression-1, date-expression-2 ) returns the bigint number of seconds between date-expression-1 and date-expression-2.

DATEDIFF ( millisecond, date-expression-1, date-expression-2 ) returns the bigint number of millisecond boundaries between date-expression-1 and date-expression-2.

DATEDIFF ( microsecond, date-expression-1, date-expression-2 ) returns the bigint number of microseconds between date-expression-1 and date-expression-2.
The following example shows all the DATEDIFF calculations for two timestamps that are exactly one microsecond (0.000001 second) apart. It shows
  • that almost all of them return the number of unit boundaries (1 for year, quarter, etc),

  • two return the number of units (0 for hour and second),

  • one (microsecond) effectively returns the number of units because it's at the limit of timestamp precision so the difference between "unit" and "unit boundary" is moot, and

  • one (dayofyear) returns a simple difference that disregards the context altogether; the value -364 might look funky but it makes sense when you consider the definition above.
SELECT DATEDIFF ( year,        '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS year,
       DATEDIFF ( quarter,     '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS quarter,
       DATEDIFF ( month,       '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS month,
       DATEDIFF ( dayofyear,   '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS dayofyear,
       DATEDIFF ( day,         '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS day,
       DATEDIFF ( week,        '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS week,
       DATEDIFF ( hour,        '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS hour,
       DATEDIFF ( minute,      '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS minute,
       DATEDIFF ( second,      '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS second,
       DATEDIFF ( millisecond, '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS millisecond,
       DATEDIFF ( microsecond, '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS microsecond;

year  quarter  month  dayofyear  day  week  hour  minute  second  millisecond  microsecond
  1      1       1      -364      1     1     0      1       0         1            1
Whether the difference between "units" and "number of unit boundaries" is important depends on what your code is trying to do.

For example, if you're trying to calculate age, then DATEDIFF really sucks; it only gets the right answer half the time.

Want proof? Here it is...
If a baby was born on 2012-07-27 and the CURRENT DATE is 2013-07-28, DATEDIFF is correct in telling us the baby's age is 1. A baby born two days later, however, has not reached its first birthday so DATEDIFF is wrong.
SELECT DATEDIFF ( year, '2012-07-27', '2013-07-28' ) AS correct_age,
       DATEDIFF ( year, '2012-07-29', '2013-07-28' ) AS incorrect_age;

correct_age  incorrect_age
      1           1
Conclusion: DATEDIFF might be of assistance to underage drinkers, but otherwise it isn't much use for calculating age.
Since many (most?) uses of DATEDIFF are a variation on the age calculation (number of days, seconds, whatever), the "unit boundary" calculation may indeed have serious implications for program (in)correctness. Sadly, none of the examples in the DATEDIFF Help topic demonstrate how the unit boundary calculation differs from one that counts actual units... but other examples in the Help may be (adversely?) affected by it (e.g., Should Examples Work?)

Need a laugh?


That big ugly example above comes from the Microsoft SQL Server version of DATEDIFF:
1> SELECT DATEDIFF ( year,        '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS year,
2>        DATEDIFF ( quarter,     '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS quarter,
3>        DATEDIFF ( month,       '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS month,
4>        DATEDIFF ( dayofyear,   '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS dayofyear,
5>        DATEDIFF ( day,         '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS day,
6>        DATEDIFF ( week,        '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS week,
7>        DATEDIFF ( hour,        '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS hour,
8>        DATEDIFF ( minute,      '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS minute,
9>        DATEDIFF ( second,      '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS second,
10>        DATEDIFF ( millisecond, '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS millisecond,
11>        DATEDIFF ( microsecond, '2011-12-31 23:59:59.9999999', '2012-01-01 00:00:00.0000000' ) AS microsecond
12> GO

 year  quarter  month  dayofyear  day  week  hour  minute  second  millisecond  microsecond
 ----  -------  -----  ---------  ---  ----  ----  ------  ------  -----------  -----------
    1        1      1          1    1     1     1       1       1            1            1
If you like [cough] consistency, you should love SQL Server: no more zeroes, no more -364 :)


No comments: