Here is a dirty little secret: most of us who live in single-time-zone countries are not comfortable with time zones. Hardly surprising given that we tend to treat wall clock time as a law of nature that only get adjusted twice a year. Now relating one’s day-to-day life to local time is all very well, but is this a good way to store timestamps in your data?
The simple solution is to store all times in UTC, and many systems are set up like that. The beauty of UTC is that it can be unambiguously translated to anything else you’d want as display time. However, once you are in the habit of storing local time – and more importantly, thinking in it – making the move could be a tricky culture shift.
We have a peculiar pitfall here in the UK, where local time happens to correspond to UTC during the winter months. This means that code written during the winter has a chance of showing up with a time zone bug when summer rolls around, either through straightforward confusion about what the code is persisting or as an integration problem where you thought both systems were speaking local time but it turns out that that one was UTC and one was local. Make no mistake, consistent use of UTC is a solid, simple solution, but that is easier said than done outside of greenfield development.
Since its 2.0 release, .Net had a slight edge of SQL Server in that its Datetime data type could carry a tiny chunk of time zone metadata, namely whether the time payload is local, UTC or unspecified. Compare this to the SQL Server situation, where Datetime has no embedded metadata whatsoever. Another tiny advantage of .Net’s Datetime is that the repeated hour during the time change into winter could be disambiguated from the pre-clock change hour. Anthony Moore’s A brief history of datetime and its follow-up did a great job of documenting the difficult journey towards the v2.0 Datetime and the inevitability of an eventual DatetimeOffset type, which SQL Server then also gained.
Now DatetimeOffset actually has multiple identity crises, which I’ll use to support my case for treating it as the stepping stone or gateway drug to UTC. Here is how a DatetimeOffset of 2pm UK time in summer tends to get thought about:
|Displayed time||Offset from UTC|
|1 Aug 2015 14:00||+01:00|
However, here is what is actually stored:
|Underlying UTC time||Display offset from UTC|
|1 Aug 2015 13:00||+01:00|
This has tremendous practical implications for querying indexed DatetimeOffset columns. Predicates on such columns completely ignore the display offset and are purely done on the underlying UTC datetime – the basic premise is that we are searching for a given “now”, whether that moment is expressed in Pacific Standard Time, Azerbaijan Time or any other time zone. As such, there is no question of translating between time zones for the purpose of data storage or retrieval; this is purely a presentation issue.
The second identity crisis concerns what is persisted. Sticking with my 2pm UK time example, we can confidently say that we stored a 1pm UTC time point with an implied locale of some place that had a one hour offset from UTC. What we are certainly not persisting is that this was a UK time. If we cared about associating geopolitical metadata with whatever is being logged, that is a separate concern and one I won’t go into here – something like the combination of a simple UTC timestamp and a .Net TimeZoneInfo object (or a surrogate geopolitical descriptor) would feature there. In the absence of anything this fancy, we aren’t really storing anything all that useful by adding the time zone offset to a time stamp, and we’re just buying the illusion that locale can be inferred from the stored time.
All this adds up to the argument that DatetimeOffset looks flashy, but the advantage over Datetime or Datetime2 is largely cosmetic. Nonetheless, in an argument between storing local time and storing UTC, it can be a fantastic compromise. The “local” side wins by having the stored time rendered their way, and the UTC side wins by having a UTC time actually stored. Do not underestimate the weight of politics here – this goes far beyond a technical argument.
What DatetimeOffset doesn’t buy you
As mentioned in my previous blog post, DatetimeOffset and Datetime2 don’t necessarily gain the advantage of higher resolution over Datetime unless time stamps are sourced from outside Windows. I don’t personally feel that a 1ms realisable resolution is worth getting excited about when your baseline is 3.3ms.
What remains is the argument that the correct display offset from UTC will always be retrieved by the use of sysdatetimeoffset() and its Windows cousins. This brings up an interesting edge case around seasonal clock changes. Sure, these only happen twice a year, but that is still two occasions per year when behaviour you count on is actually undefined, and it’s a devil of an edge case to code for and test.
The problem in a nutshell: clock changes should notionally happen at a predefined instant, but in practice Windows acts rather like a human with an electromechanical kitchen clock, updating the current time when it eventually gets around to it. Over a number of daylight savings changes I have indulged in the rather sad practice of running a script that logs timestamps in a loop with a 1 second wait between iterations, and I have seen the system clock getting updated anywhere between a few seconds to a few minutes late. The exact degree of lag is really irrelevant though; anything other than “textbook perfect” means that you are given bad time stamps, including ones that don’t exist in the local calendar. This problem exists equally for Datetime (getdate) and DatetimeOffset (sysdatetimeoffset).
Why this isn’t a problem for DatetimeOffset
So let’s say you got yourself such a nonexistent time, e.g. 01:00:15 UK time on the morning where the clocks were supposed to go straight from 1AM to 2AM. In an integration scenario, I have seen software reject this as input, since it really isn’t a valid local time. If this was stored as DatetimeOffset though, e.g. 01:00:15 +00:00, while it would still indicate an invalid wall clock time for the locale (wall clock should have read 02:00:15 +01:00) it’s suddenly less of a problem if you focus on the fact that the underlying UTC timestamp is identical in both cases, and exposing it with the correct offset is a simple presentation concern.
Allow me to drive that point home, because this is data type bait-and-switch in action. You sell the idea that you are storing local time, which appears to be the case based on how query results get displayed. And then when local timestamps let you down, the solution involves using the underlying UTC time which you’ve actually been storing all along. Sneaky, but nobody loses.
One place where the argument gets more interesting is that a Datetime fits in eight bytes (=one processor register now that we’re in a 64-bit-only world) whereas DatetimeOffset requires at least nine bytes. Ignoring the 12.5% increase in memory footprint, to somebody who wants to live close to the metal, a nine-byte structure smells bad, but you can at least console yourself with the thought that the 8-byte option would likely not have been quadword-aligned!
Bonus material: How many hourly job invocations are there in a 25-hour day?
When running in a locale with clock changes, SQL Agent does some creative stuff during a clock change. I can kind of see this being sensible from a user viewpoint, but I don’t believe it is well understood. Here is what happens to a job scheduled to run every hour on the hour:
- When the clocks move forward, we have a 23-hour day. Sticking with my familiar UK example, the midnight invocation is followed by the 2am one (although getdate() might still call it 1am!) and then the 3am one. So we get 23 job invocations on that calendar day, one for each hour that exists in that day’s calendar.
- When the clocks move back, we have a 25-hour day. Here it gets more interesting. In the UK we’d have two instances of 1am before 2am rolls around, but only one of them gets a job invocation, so the 25-hour day only gets 24 job invocations, one for each distinctly labelled hour. And one hour goes by without the job running.
I am personally fond of DatetimeOffset, but I am starting to view it as syntactic sugar over a hidden decision to work with UTC. And this way of thinking about it can be a very useful tool to have around for compromise solutions.
For further reading, have a look at this excellent blog post by Troy Hunt.
Enough of time for the time being. Some SQLOS internals should be coming your way soon.