A lot of people in this thread are criticizing this move, but let me offer an opposite view.
One of the largest electronic health records systems has code that predates the UNIX epoch. Much of the time handling code is custom written to deal with this. However, the code was so poorly written that the system would lose data during the double 1 am window that occurs during daylight savings time shift. Hospitals would just shut off all of their computers during this time to deal with it.
As the article notes, issues with leap seconds have also brought down reddit and cloudflare. Many people in this thread are treating this like some sort of display of incompetence, but if you've ever written code that deeply interacts with time, you'd know how difficult it is to get right. A sign of a good system is one where it is difficult to fuck up.
IMO it is better to guarantee that time always moves forward rather than trying to match computer time to human time.
I don't see how replacing all UTC in software with TAI is more realistic than breaking UTC sync with UT1 (isn't it literally doing the same thing?). The whole point is that going forward, leap seconds are going to get harder to deal with. Especially in the case of a negative leap second, which seems like a more "true" y2k-like scenario.
The difference is that replacing usage of UTC with TAI is a voluntary choice made for each program, but redefining UTC to be a fixed offset relative to TAI, which is effectively just redefining UTC to be TAI, is a forced change on everything everywhere all at once that everybody has to handle because one of their dependencies changed.
It would be like silently changing the start of unix epoch time to 1800 instead of adding a new “Unix time since 1800” and asking people to switch.
Not at all. Everybody using UTC would just not need to deal with leap seconds anymore. A UTC second is the same as a TAI second. It's a no-op for the vast majority of UTC users. UTC will just drift slightly more from UT1.
This change only affects people who need UTC to be close to UT1 and also somehow don't know what UT1 is.
Sure, everybody using UTC when they actually want TAI would be a no-op, but then you irreversibly break everybody who actually wants UTC and assumed that UTC would not change meanings.
The people who would be unaffected by the redefinition can already just trivially switch manually (as we already assumed that just redefining things under them would work), leaving the UTC people alone. There is no good reason to silently break all programs carefully designed to use UTC correctly to fix all of the programs haphazardly written by people who did not know what they were doing and used UTC when they actually wanted TAI. Especially since fixing the wrong use of UTC is so trivial that we assume it can be done with no modification.
‘Programs carefully designed to use UTC’ would only irreversibly break by very slowly becoming out of sync with the rotation of the earth.
A few applications should switch standards. The question is whether solar-concerned applications should switch to UT1, or continuity-concerned applications should switch to TAI. The former is simpler, easier, cheaper, and only causes unexpected behavior (quite slowly), NOT systematic failure.
> IMO it is better to guarantee that time always moves forward rather than trying to match computer time to human time.
Not sure if you're playing Cunningham's Law or if you don't know this was the line of thought until everything was so far out of touch with reality, 10 days of time never existed, and official records were kept with dual-dates.
> However, the code was so poorly written that the system would lose data during the double 1 am window that occurs during daylight savings time shift.
> [...]
> Many people in this thread are treating this like some sort of display of incompetence, but if you've ever written code that deeply interacts with time, you'd know how difficult it is to get right.
Your example only speaks for the incompetence argument.
In reality, times and dates are really complicated. Luckily, the engineers at Facebook, Reddit, and Cloudflare are being paid hundreds of thousands of dollars to show off their expertise. Is it that much to ask for them to read into details like leap seconds?
It is too much. I was Google SRE and there is an internal meme showing a time series graph jumping backwards during the double 1am at DST. These mistakes happen everywhere and are best avoided by a system that doesn't allow them to happen in the first place.
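For anyone who hasn't seen that graph, here is a minimal sketch of how it happens. It assumes US Eastern time on 2023-11-05, when clocks fell back at 06:00 UTC, and hard-codes the two offsets instead of using a tz database. `wall_clock` is a made-up helper for illustration, not anything from a real monitoring stack.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for illustration only; real code should use a tz database.
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# Assumption: US Eastern fell back at 06:00 UTC on 2023-11-05.
SWITCH = datetime(2023, 11, 5, 6, 0, tzinfo=timezone.utc)


def wall_clock(ts_utc):
    """Hypothetical helper: the naive local time a wall clock would show."""
    zone = EDT if ts_utc < SWITCH else EST
    return ts_utc.astimezone(zone).replace(tzinfo=None)


# Two samples taken 20 minutes apart, in the order they really happened.
samples_utc = [
    datetime(2023, 11, 5, 5, 50, tzinfo=timezone.utc),
    datetime(2023, 11, 5, 6, 10, tzinfo=timezone.utc),
]
samples_local = [wall_clock(ts) for ts in samples_utc]

for utc, local in zip(samples_utc, samples_local):
    print(utc.isoformat(), "->", local.isoformat())
```

The second sample lands at 01:10 local, earlier than the first at 01:50, so anything keyed on the naive local timestamp sees time jump backwards while the UTC keys stay ordered.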
So advocates of memory safe (or even high level, period) programming languages are just showing off their incompetence in your book?
Would you say to an advocate of C (much less ... Rust): Look man, real programmers write in boolean circuits. Programming is hard, sure, but the engineers at Facebook, Reddit, and Cloudflare are being paid hundreds of thousands of dollars to show off their expertise. Is it that much to ask for them to read into details like multiplication circuits?
:)
Leapseconds causing widespread failures isn't a hypothetical, just like buffer overflows aren't. Yet, in theory, with perfectly competent development ...
Yet even with perfect competence leapseconds are still pretty gnarly: they require that systems have a trustworthy and consistent source of the list of leapseconds, and they mean that you fundamentally cannot predict the amount of time between two UTC timestamps when one or more of them is more than 6 months in the future... and no amount of competence can fix that.
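A rough sketch of why the list matters, assuming a deliberately partial table holding only the two most recent leap seconds (inserted at the end of 2015-06-30 and 2016-12-31). `elapsed_si_seconds` and `LEAP_SECONDS_AFTER` are names made up for this example; real code would load the full published list.

```python
from datetime import datetime, timezone

# Partial, illustrative table: the UTC midnight immediately following each
# inserted leap second. A real system needs the complete, current list.
LEAP_SECONDS_AFTER = [
    datetime(2015, 7, 1, tzinfo=timezone.utc),
    datetime(2017, 1, 1, tzinfo=timezone.utc),
]


def elapsed_si_seconds(start, end):
    """Hypothetical helper: real elapsed seconds between two UTC timestamps."""
    # datetime arithmetic pretends every day has exactly 86400 seconds.
    naive = (end - start).total_seconds()
    # Add one second for every leap second inserted inside the interval.
    leaps = sum(1 for t in LEAP_SECONDS_AFTER if start < t <= end)
    return naive + leaps


start = datetime(2015, 1, 1, tzinfo=timezone.utc)
end = datetime(2018, 1, 1, tzinfo=timezone.utc)
naive_seconds = (end - start).total_seconds()
real_seconds = elapsed_si_seconds(start, end)
print(real_seconds - naive_seconds)
```

If `end` is a year out, the table cannot contain leap seconds that haven't been announced yet, so the answer is only a guess. That is the part no amount of care in the code can fix.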
> Hospitals would just shut off all of their computers during this time to deal with it.
FWIW, there are many things that deal with leap seconds that way too. Too much risk of ending up in a difficult to fix or silently corrupt state, while coming up from a reboot is highly tested and known to work.
The cost of leapseconds is quite significant.
> but if you've ever written code that deeply interacts with time, you'd know how difficult it is to get right.
Good odds that even someone who has written that kind of code got it wrong and doesn't know it, especially when it comes to leapseconds: they're fairly hard to test, especially with distributed systems, and infrequent enough that you may not realize the cause even when you've suffered from an issue.
If you are relying on the clocks of all actors in a distributed system being perfectly in sync, you already have a bug, leap seconds or not (unless you are Google Spanner).
For timers within a single system, use the monotonic clock of your own CPU.
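For the single-system case, a minimal sketch in Python using `time.monotonic()`, which is documented as a clock that cannot go backwards and is unaffected by system clock updates. `timed` is just an illustrative wrapper name.

```python
import time


def timed(fn):
    """Run fn() and return (result, elapsed seconds) using a monotonic clock."""
    start = time.monotonic()
    result = fn()
    # Unlike time.time(), this difference cannot go negative when the wall
    # clock is stepped (NTP correction, DST change, leap second handling).
    elapsed = time.monotonic() - start
    return result, elapsed


result, elapsed = timed(lambda: sum(range(1000)))
print(result, elapsed)
```

The readings are only meaningful as differences on the same machine, so they are no use for ordering events across hosts.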