It adds levity to the article and also introduces the reader to the sorts of things that can go wrong if they try it at home.
The last paragraph highlights how they fixed one of the main pitfalls I normally see in this sort of thing, where floating-point operations are mangled in myriad ways in the name of efficiency (almost always correct for physics or whatever, but a single bit being incorrect will occasionally mangle this compression scheme).
Mind you, actually doing what they claimed in that last paragraph is usually painful. The easiest approaches re-implement floating-point operations in software using integer instructions, and the complexity increases from there.
It's not just efficiency. If you have, e.g., floating-point values arriving asynchronously to be accumulated, you'll always get a slightly unpredictable result, because floating-point addition is not associative and the sum depends on arrival order.
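To make the accumulation point concrete, here is a minimal Python sketch. The values and the `accumulate` helper are made up for illustration and assume IEEE 754 doubles (which CPython uses on essentially every platform); this is not what any particular model runtime does, just the smallest case I know of where arrival order changes the sum.

```python
import math

def accumulate(values, order):
    """Add values one at a time in the given arrival order."""
    total = 0.0
    for i in order:
        total += values[i]
    return total

# Hypothetical inputs: two huge values that cancel, plus two small ones.
values = [1e16, 1.0, -1e16, 1.0]

# Arrival order A: the first 1.0 is absorbed by 1e16 (doubles near 1e16 are spaced 2 apart).
in_order = accumulate(values, [0, 1, 2, 3])

# Arrival order B: the large values cancel first, so both 1.0s survive.
reordered = accumulate(values, [0, 2, 1, 3])

print(in_order, reordered)  # 1.0 2.0

# math.fsum returns the correctly rounded exact sum, so it does not depend on order.
print(math.fsum(values), math.fsum(reversed(values)))  # 2.0 2.0
```

Same four numbers, different answers, and that is before any threads or GPUs get involved. Something like `math.fsum` sidesteps the ordering problem for a plain sum, but at a cost you usually can't afford inside a matrix multiply.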
Fun fact: Gemini 2.0 Flash is 100% deterministic at temperature 0, unlike most models. This must be related to TPUs somehow, though I'm not sure why all previous Gemini versions aren't like that.