The Kobayashi Maru of Comparing Dates with Times
Saw a fascinating tweet this morning from the excellent Userlist team.
At a minimum, one hundred replies in a Slack thread is the going rate for asking a question about an edge case about times and dates in programming. Just for future reference.
Since Twitter embedding crops things a bit aggressively, here’s the image in the tweet directly:
Immediately, the tweet replies were chaos. The answers included “Yes/Yes/No”, “No/No/No” (which was my initial response, but I’ve also chaotically changed my mind since then), “No/No/Yes”, “Dates are the worst”, and “Have you tried using 🍺? Maybe lots and lots of 🍻 and then maybe just like pretending this never happened?”
As someone who has written entirely too much about dates and times and edge cases, I found this discussion to be an exciting, total waste of time, and I really couldn’t wait to dig into it some more.
First of all, it’s all wrong
If you find yourself in this position, clearly something has gone wrong. And I say this without judgement, because unless you work in a clean room on a product that has no users and the code is no more than one hour old, you’re going to have issues with inconsistent data, and inconsistent storage of datestamps and timestamps is incredibly common. I ran into a similar problem just this week with a multitude of inconsistent datestamps and timestamps I had inherited.
And that’s what happened here:
These are inputs from the user interface and data from a database. So it’s also very much about what (both technical and non-technical) people expect to happen.— Benedikt Deicke (@benediktdeicke) February 12, 2022
What’s more, there’s an additional issue in the original question that was left out, which astute readers of UTC is Enough for Everyone, Right? might have picked up on already: there’s no timezone data! That’s fairly understandable for the date values, since timezones don’t tend to be needed (which, like most things in this realm, is not always true), but it would be needed for the time values to ensure a valid comparison (conceivably a comparison of 12:00:00 on the same “day” could be off by as much as 26 hours, depending on the timezone). And, of course, you’d need to store more than just the offset if you properly wanted to account for historical daylight saving time.
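To put a rough number on that, here’s a minimal Python sketch. It assumes the two most extreme UTC offsets in current use, UTC+14 and UTC-12 (my assumption, not something from the original tweet), and measures how far apart the “same” wall-clock noon really is. The helper name is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets at the two extremes (assumed for illustration).
EARLIEST = timezone(timedelta(hours=14))   # UTC+14
LATEST = timezone(timedelta(hours=-12))    # UTC-12


def gap_between_same_wall_clock(naive: datetime) -> timedelta:
    """Return how far apart one wall-clock reading is in the two extreme zones."""
    first = naive.replace(tzinfo=EARLIEST)  # happens first in absolute terms
    last = naive.replace(tzinfo=LATEST)     # happens last in absolute terms
    return last - first


noon = datetime(2022, 2, 12, 12, 0, 0)  # no timezone attached, like the tweet's values
gap = gap_between_same_wall_clock(noon)
print(gap)  # 1 day, 2:00:00
```

Same date, same clock reading, and still more than a day of slack between them.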
Apples and oranges and orange-colored apples
So this is kind of a no-win scenario. I can make a valid argument for pretty much every answer in the tweet replies (especially the one about beer). It’s particularly frustrating because we make these snap decisions all the time in our day-to-day lives, and it’s pretty easy. When I tell you to meet me for lunch at noon today, it’s highly likely that we’re talking about being in the same place, the same timezone, and the same day as each other, so we don’t have to spend a lot of time thinking about this.
The issue is that dates and times are kind of supersets of each other. Which is a bit of a paradox. But let’s look at it further:
- A Date can be thought of as a superset of a Time, because all hours and minutes and seconds naturally belong to a certain day.
- A Time can be thought of as a superset of a Date, if you consider timespans to be an extension of a particular Time. Noon to noon spans two separate days. Or, well, just one day, if it’s a 24-hour day instead of a calendar day. Even my paradox explanations have paradoxes in them.
Both of these are kind of wrong definitions, and also kind of right, which is why you can make these justifications for so many answers here. It really comes down to what you value.
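To make “what you value” concrete, here’s a small Python sketch of two equally defensible policies for asking whether a timestamp comes after a bare date. Both function names are hypothetical, timezones are ignored entirely, and neither policy is the “right” one.

```python
from datetime import date, datetime, time, timedelta


def is_after_as_instant(ts: datetime, d: date) -> bool:
    """Policy A: treat the date as one instant, midnight at the start of the day."""
    return ts > datetime.combine(d, time.min)


def is_after_as_whole_day(ts: datetime, d: date) -> bool:
    """Policy B: treat the date as the whole day, so "after" means the day has ended."""
    next_midnight = datetime.combine(d + timedelta(days=1), time.min)
    return ts >= next_midnight


noon = datetime(2022, 2, 12, 12, 0, 0)
the_day = date(2022, 2, 12)

print(is_after_as_instant(noon, the_day))    # True: noon is later than midnight
print(is_after_as_whole_day(noon, the_day))  # False: the day isn't over yet
```

Same inputs, opposite answers, and both are easy to justify in a code review.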
Literally we’re all just making it up
This is the crux of why I find this stuff so fucking fascinating. We programmers famously tend to view the world in a real binary sense: the code runs or it doesn’t. The tests pass or they don’t. The record is valid or it isn’t. It’s a very helpful way of looking at the world… until it isn’t. Usually it’s not baked directly into the system, though.
There are certain examples where inherently correct answers don’t exist. My favorite example of this — and yes, I have favorite examples of time issues — is recurring events over a location-dependent daylight saving boundary.
Say you have a weekly 10am Pacific Time recurring calendar event with the rest of your dev team. Like a proper remote-first company, you’re broadly distributed instead of just having a main team in one city and one rando working “remote”. That’s great, but it also increases the odds of one city or region or country having different daylight saving policies than the others (if they even have one at all).
You keep having your 10am call for a few months, and then Virginia, who lives in West Virginia, starts her daylight saving time in March. So now the 10am Pacific meeting time, which was 1pm her time, would still be 1pm for her since Pacific time always follows DST, too (except for Yukon, in Canada, which decided in 2020 to stop following DST, of course). But your teammate Brooklyn, who lives in Phoenix, doesn’t have daylight saving time because she lives in Arizona, which doesn’t follow daylight saving time (but don’t forget that the Navajo Nation in the northern part of Arizona does follow DST, so be sure to keep that in mind, too).
So: does Brooklyn keep her normal 11am meeting time, or does she move with Virginia and the rest of the team and start meeting at 10am? Nothing in her life changed. It’s weird for a meeting to suddenly land an hour earlier in her day when it didn’t before.
The answer is: there is no answer. We’re just making it up. And that’s the answer. Google has a writeup on their developer API docs for Calendar broadly talking about how recurring events interact with invited attendees. One aspect:
For recurring events a single timezone must always be specified.
This is really the only reasonable approach that can be taken here. They decided that the creator of the original event wins out, and the effects of that permeate out to attendees as well.
This is also detailed in RFC 5545:
* The “DTSTART” and the “TZOFFSETFROM” properties MUST be used when generating the onset DATE-TIME values (instances) from the “RRULE”.
(RFC 5545 is a great, titillating read, too. Thought I was reading a Harlequin romance novel with how vividly they describe all the RRULE specifications and permutations.)
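Here’s a rough Python sketch of the “organizer’s timezone wins” rule. It isn’t how Google Calendar or any RFC 5545 library actually implements recurrence; the IANA zone names for Virginia and Brooklyn are my guesses based on where they live, the function names are invented, and it assumes Python 3.9+ with timezone data installed.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

# The organizer's zone owns the recurrence; attendee zones are only for display.
ORGANIZER_TZ = ZoneInfo("America/Los_Angeles")
ATTENDEE_ZONES = {
    "Virginia": ZoneInfo("America/New_York"),  # West Virginia is on Eastern time
    "Brooklyn": ZoneInfo("America/Phoenix"),   # no daylight saving time
}


def weekly_occurrences(first: datetime, count: int) -> list[datetime]:
    """Repeat the same wall-clock time every week in the organizer's zone.

    Adding a timedelta to an aware datetime keeps the clock-face time and the
    tzinfo, so 10:00 stays 10:00 in Los Angeles across the DST change.
    """
    return [first + timedelta(weeks=i) for i in range(count)]


def local_times(occurrence: datetime) -> dict[str, str]:
    """Render one occurrence in each attendee's local time as HH:MM."""
    return {
        name: occurrence.astimezone(tz).strftime("%H:%M")
        for name, tz in ATTENDEE_ZONES.items()
    }


# The Monday before and the Monday after the US DST change on 2022-03-13.
first_meeting = datetime(2022, 3, 7, 10, 0, tzinfo=ORGANIZER_TZ)
for occurrence in weekly_occurrences(first_meeting, 2):
    print(occurrence.date(), local_times(occurrence))
```

Under this rule Virginia’s 13:00 never moves, while Brooklyn’s slot shifts from 11:00 to 10:00 even though her own clock never changed. Somebody has to absorb the change, and here it’s whoever doesn’t share the organizer’s DST rules.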
So just make it up
So: bringing it back to the original topic at hand… I think this is a case where we get to just make the arbitrary decision. There’s no “right” or “wrong” answer.
I was curious how others broadly thought about the question, so I looked into it some more.
Ruby has a lovely treatise about Shakespeare in its discussion of DateTime, which touches on some of these issues. It’s worth reading the whole section, but this part is most relevant here:
So when should you use DateTime in Ruby and when should you use Time? Almost certainly you’ll want to use Time since your app is probably dealing with current dates and times. However, if you need to deal with dates and times in a historical context you’ll want to use DateTime to avoid making the same mistakes as UNESCO. If you also have to deal with timezones then best of luck - just bear in mind that you’ll probably be dealing with local solar times, since it wasn’t until the 19th century that the introduction of the railways necessitated the need for Standard Time and eventually timezones.
.NET also kind of goes with a “oh god just try to avoid any of this type of thing” philosophy:
To determine the relationship of t1 to t2, the Compare method compares the Ticks property of t1 and t2 but ignores their Kind property. [Ed. note: Kind here is effectively timezone-related data.] Before comparing DateTime objects, ensure that the objects represent times in the same time zone.
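The same trap is easy to sketch in Python. This is an analogy rather than .NET code: the first helper mimics a comparison that looks only at the clock-face value and ignores the zone, and the second compares the actual instants. The fixed UTC-5 and UTC-8 offsets and both function names are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for Eastern and Pacific standard time (assumed).
EASTERN = timezone(timedelta(hours=-5))
PACIFIC = timezone(timedelta(hours=-8))


def compare_wall_clock(a: datetime, b: datetime) -> int:
    """Compare only the clock-face fields, ignoring the zone (the pitfall)."""
    a_naive = a.replace(tzinfo=None)
    b_naive = b.replace(tzinfo=None)
    return (a_naive > b_naive) - (a_naive < b_naive)


def compare_instants(a: datetime, b: datetime) -> int:
    """Compare the actual moments; aware datetimes are ordered by UTC instant."""
    return (a > b) - (a < b)


eastern_noon = datetime(2022, 2, 12, 12, 0, tzinfo=EASTERN)   # 17:00 UTC
pacific_eleven = datetime(2022, 2, 12, 11, 0, tzinfo=PACIFIC)  # 19:00 UTC

print(compare_wall_clock(eastern_noon, pacific_eleven))  # 1: "noon" looks later
print(compare_instants(eastern_noon, pacific_eleven))    # -1: it actually happens first
```

Both helpers return -1, 0, or 1, and here they disagree about which value comes first.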
Something called a “TIBCO ActiveMatrix BPM 4.2” punted on it entirely:
The method still works if the date/times are more than 14 hours apart, but if they are less than 14 hours apart, the result is deemed to be indeterminate.
Moment.js also has some caveats with comparisons:
As the second parameter determines the precision, and not just a single value to check, using day will check for year, month and day. [Ed. note: this basically means you have to choose which precision to compare at. In the case of the original tweet, you’d pick either days or seconds.]
[ … ]
If the two moments have different timezones, the timezone of the first moment will be used for the comparison.
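Those two rules (pick a precision, and let the first value’s timezone win) are easy to mimic in Python. This is a sketch of the idea, not Moment.js’s implementation: the is_same name, the two supported units, and the fixed UTC-8 offset are all assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fields to zero out for each supported precision (only two, for illustration).
_TRUNCATE = {
    "day": {"hour": 0, "minute": 0, "second": 0, "microsecond": 0},
    "second": {"microsecond": 0},
}


def is_same(a: datetime, b: datetime, unit: str) -> bool:
    """Compare two aware datetimes at the given precision, in a's timezone."""
    b_in_a_zone = b.astimezone(a.tzinfo)
    fields = _TRUNCATE[unit]
    return a.replace(**fields) == b_in_a_zone.replace(**fields)


PACIFIC = timezone(timedelta(hours=-8))  # fixed offset, assumed for illustration

late_evening = datetime(2022, 2, 12, 23, 30, tzinfo=PACIFIC)
morning_utc = datetime(2022, 2, 12, 18, 0, tzinfo=timezone.utc)  # 10:00 Pacific

print(is_same(late_evening, morning_utc, "day"))     # True: both are Feb 12 in Pacific
print(is_same(morning_utc, late_evening, "day"))     # False: 23:30 Pacific is Feb 13 in UTC
print(is_same(late_evening, morning_utc, "second"))  # False: different moments
```

Note that swapping the argument order flips the “same day” answer, which is exactly the kind of arbitrary-but-documented decision this whole post is about.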
I don’t believe in the no-comparison date scenario
This whole post is basically a long-winded way of reiterating that humans create some very weird scenarios when it comes to dates and times. Raphael Schaad, founder of the fantastic Cron calendar, has a long-running thread on broken assumptions of calendaring that, frankly, don’t always have logical solutions. It’s all a bunch of trade-offs.
And that’s why I found the original tweet so interesting. We’ve all had situations where the data wasn’t consistent, or a third party munged data incorrectly for us, or where we’ve inherited data that was saved under incorrect assumptions (correct as they might have been at the time). The thought process can lead to a hundred-reply Slack thread. Or an entirely too-long and too-detailed Saturday morning post about dates and times.
So, with all that said… enjoy the rest of your Saturday!
Just kidding. I know some of you are on to Sunday already. And some of you will — gasp — read it years from now! Pretty sure no one read this a week ago, though. If you did, please remind me a few days ago that I will need to write this a few hours ago from now.