
I think it's axiomatically true and ML models get a pass on being inefficient because people are so busy being impressed by their magic tricks.

In particular, using ML models for compression would get a free advantage because they're often several GBs where a traditional codec is KBs, so they can hide data in there, but most people wouldn't think of that as being part of the compressed message size.
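A toy way to see the accounting issue, using zlib's preset dictionary as a stand-in for "data hidden in the codec". This is only a sketch: the `compressed_size` helper, the sample strings, and the choice of zlib are my own illustration, not anything an ML codec actually does.

```python
import zlib

def compressed_size(message: bytes, preset_dict: bytes = b"") -> int:
    """Return the zlib-compressed size of message, optionally with a preset dictionary."""
    if preset_dict:
        comp = zlib.compressobj(level=9, zdict=preset_dict)
    else:
        comp = zlib.compressobj(level=9)
    return len(comp.compress(message) + comp.flush())

# Hypothetical message and a shared dictionary that happens to contain it.
message = b"ML models get a pass on being inefficient because people are impressed."
shared = (
    b"A traditional codec is a few KBs. "
    + message
    + b" An entropy coder usually runs as the last step of compression."
)

plain = compressed_size(message)               # nothing hidden in the codec
with_dict = compressed_size(message, shared)   # looks much smaller...
honest = with_dict + len(shared)               # ...until the dictionary is counted too

print(f"plain={plain} with_dict={with_dict} honest={honest}")
```

The dictionary-backed number only looks good because `shared` is not counted as part of the message. Once you add it back in, the total is worse than plain zlib for a single message.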

But they could have very differently shaped dependency trees, especially if you somehow managed to not run an entropy coder as the last step of compression.



Totally agree: general compression that requires the dictionary to be transmitted with every message is not the best use case (if there even is one). Minimum description length is actually a nice information-theoretic measure of an ML model's power to compress. However, I have also seen plenty of theoretical arguments that there is probably no universally optimal compression method (I looked at grammar-based vs. entropy coding some time ago), hence there is probably no optimal hardware in general either. In the end it depends on your assumptions about the data you want to compress (quality, structure, size, chunking, ...).
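For reference, the usual two-part form of minimum description length, with notation that is mine rather than from the thread ($M$ is the model or dictionary, $D$ the data, $L(\cdot)$ a code length in bits):

```latex
L(D) = L(M) + L(D \mid M)
```

A model that is several GBs makes $L(M)$ huge, so it only pays off if that cost is amortized over a lot of data or the model is already shared by both sides.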



