Times and Dates

Today a coworker and I grappled with a weighty issue: Time.

The thing is, time is one of the most complicated concepts humans have invented for themselves, but we've managed to hide the complexity behind artificial facades. For example, all businesses record the time and date when transactions occur. Businesses want to know how much money was made when, and this information is summarized on a daily, monthly, or quarterly basis, but in the computer realm we deal in seconds or milliseconds.

How many milliseconds are there in a month?

It's a trick question, because the length of a month depends on which month it is, and on whether it's a leap year. People expect to be able to calculate the time between two dates, and also to determine what the date will be some number of seconds from now. Both of these tasks are extremely annoying because dates and times use convoluted units of measure.
  1000 milliseconds in a second
  60 seconds in a minute
  60 minutes in an hour
  24 hours in a day
  7 days in a week
  28, 29, 30, or 31 days in a month
  12 months in a year
  365 (or 366) days in a year
  almost 52 weeks in a year
  almost 4 weeks in a month
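To make the annoyance concrete, here's a small sketch using Java's java.time API (the years and months are arbitrary examples of mine) showing why "milliseconds in a month" has no single answer:

```java
import java.time.Duration;
import java.time.YearMonth;

public class MonthLengths {
    public static void main(String[] args) {
        // The number of milliseconds in a month depends on which month it is,
        // and on whether the year is a leap year.
        for (YearMonth ym : new YearMonth[] {
                YearMonth.of(2023, 2),   // February, common year: 28 days
                YearMonth.of(2024, 2),   // February, leap year:   29 days
                YearMonth.of(2024, 4),   // April:                 30 days
                YearMonth.of(2024, 12)   // December:              31 days
        }) {
            long millis = Duration.ofDays(ym.lengthOfMonth()).toMillis();
            System.out.println(ym + " has " + ym.lengthOfMonth()
                    + " days = " + millis + " ms");
        }
    }
}
```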

Computers typically represent time as seconds or milliseconds since some arbitrary point. Linux, Unix, and Java use Jan 01 1970 00:00:00 UTC as the reference time. Thus the number 378113400000 (in milliseconds) represents Fri, 25 Dec 1981 07:30:00 GMT, while the number 378563400000 represents Wed, 30 Dec 1981 12:30:00 GMT. This is useful for determining how much time has passed from one instant to another, but it makes it difficult to determine what the date is at a particular instant.

This difficulty is compounded by the fact that the calendar isn't continuous: during the Gregorian switchover (when most Christian nations switched from the Julian Calendar to the Gregorian Calendar) the date went from Thursday, October 4, 1582 straight to Friday, October 15, 1582. Except in countries that didn't switch, including the United Kingdom, which rejected everything the Pope said and so held on to the Julian Calendar until 1752, when Wednesday, September 2, 1752 was followed by Thursday, September 14, 1752 (an 11-day jump, because in the intervening years the Julian Calendar had drifted even further). It is difficult to accurately represent times in this era, mainly because not everyone agrees on what a specific time should be called.
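As a sanity check on those millisecond values, here's a short sketch with Java's java.time classes that converts them back into human-readable dates (note that java.time uses the proleptic Gregorian calendar, so it won't reproduce Julian dates from the switchover era):

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class EpochMillis {
    public static void main(String[] args) {
        DateTimeFormatter rfc1123 = DateTimeFormatter.RFC_1123_DATE_TIME;

        // Milliseconds since Jan 01 1970 00:00:00 UTC (the Unix epoch).
        Instant a = Instant.ofEpochMilli(378113400000L);
        Instant b = Instant.ofEpochMilli(378563400000L);

        System.out.println(rfc1123.format(a.atOffset(ZoneOffset.UTC)));
        // Fri, 25 Dec 1981 07:30:00 GMT
        System.out.println(rfc1123.format(b.atOffset(ZoneOffset.UTC)));
        // Wed, 30 Dec 1981 12:30:00 GMT
    }
}
```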

Then there are time-zones: there are dozens of them, and they cause havoc with time. In local terms some days are 23 hours long and some are 25 hours long, when the clocks change to or from Daylight Saving Time. Luckily most computer software uses UTC, which has no time-zones or Daylight Saving Time. Internally, computers represent time in UTC and only convert to local time for display purposes. But there are still issues, because the computer needs to know the rules for converting to local time, and sometimes those rules change, necessitating a software update. In some jurisdictions the rules for Daylight Saving Time change every year.
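Here's roughly what that conversion looks like in Java with java.time, which bundles the tz database rules; the zone (America/New_York) and the 2024 spring-forward date are just examples I've chosen. It also shows a local day that isn't 24 hours long:

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;

public class ZoneDemo {
    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("America/New_York"); // example zone

        // Internally: an instant on the UTC timeline.
        Instant now = Instant.now();
        // For display only: convert to local wall-clock time using the tz rules.
        System.out.println(now.atZone(zone));

        // The local day of the spring-forward transition (March 10, 2024
        // in this zone) is only 23 hours long.
        LocalDate springForward = LocalDate.of(2024, 3, 10);
        Duration dayLength = Duration.between(
                springForward.atStartOfDay(zone),
                springForward.plusDays(1).atStartOfDay(zone));
        System.out.println(springForward + " lasts " + dayLength.toHours() + " hours");
    }
}
```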

Which brings me to the problem we are facing at my office: for better or worse we have a system where dates and times are recorded in a non-UTC format, which is to say, in the local timezone. This arrangement makes it difficult to examine data and compare it to other systems, because the other systems use UTC. Except for the ones that don't, because they use some other timezone. This makes it problematic to look at logs and discuss the order of events, because the timestamps all mean different things. It's like England in the 18th century: when an Englishman spoke of the date September 1st 1752, any Frenchman nearby would have to mentally translate it to September 12th 1752. However, fixing the clocks and time-zone settings on the various systems won't solve the problem of systems that don't speak UTC and can't be made to. For those there is nothing to be done. But as a lesson to future system designers: do everyone a favour and
  1. Synchronize the clocks of your computer gear, and
  2. Always store dates/times in UTC
The maintainers of your system will thank you.
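In Java's java.time terms, that second rule amounts to passing around Instant values (or ISO-8601 strings ending in 'Z') and only attaching a zone when a legacy local-time stamp has to be normalized. A minimal sketch, with made-up example values:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class StoreUtc {
    public static void main(String[] args) {
        // Rule 2: record new events as points on the UTC timeline
        // and persist/transmit them as ISO-8601 with the trailing 'Z'.
        Instant recordedAt = Instant.now();
        String stored = recordedAt.toString();   // e.g. 2024-06-01T08:15:00.123Z
        System.out.println(stored);

        // Cleaning up a legacy system: a timestamp written in local time
        // can only be normalized if you know which zone it was written in.
        LocalDateTime legacyStamp = LocalDateTime.parse("2024-06-01T09:15:00");
        ZoneId assumedZone = ZoneId.of("Europe/London"); // an assumption
        Instant normalized = legacyStamp.atZone(assumedZone).toInstant();
        System.out.println(normalized);          // 2024-06-01T08:15:00Z
    }
}
```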
