
That depends on how large the misprediction penalty is and how many times the branch executes. Branch prediction isn't magic. It never makes branches beneficial. The best it can do, with perfect accuracy, is bring the amortized misprediction penalty down to zero. The key is that Duff's Device also relies on branching, and that branch is less predictable and more likely to be mispredicted. On the other hand, one branch-prediction miss is still cheaper than several hits plus one miss at the end. On the other other hand, there are issues to consider like code size and cache turnover (favors a loop), handling of special cases where larger loads/stores can be used (favors Duff's Device), processor extensions, etc. If you're really trying to write an optimal memcpy-like function, you'll probably end up using a hybrid of several approaches - including Duff's Device.
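To make the two kinds of branch concrete, here is a minimal sketch in C. The function names `copy_loop` and `copy_duff` are made up for illustration, and the Duff's Device variant increments both pointers (memcpy-style) rather than writing to a fixed output register as the original did. It is meant to show where the branches sit, not to be a tuned implementation.

```c
#include <stddef.h>

/* Plain loop: one backward conditional branch per byte. It is taken on
 * every iteration except the last, so a predictor typically gets it
 * right until the final exit. */
void copy_loop(unsigned char *to, const unsigned char *from, size_t count)
{
    while (count-- > 0)
        *to++ = *from++;
}

/* Duff's Device, unrolled 8x (hypothetical memcpy-like variant).
 * The switch is a computed jump whose target depends on count % 8;
 * that is the data-dependent branch. The loop-closing branch then
 * runs once per 8 bytes instead of once per byte. */
void copy_duff(unsigned char *to, const unsigned char *from, size_t count)
{
    if (count == 0)
        return;

    size_t n = (count + 7) / 8;   /* number of passes through the do-while */

    switch (count % 8) {
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

Both functions copy the same bytes. Which one is faster depends on the factors above (penalty size, how often each branch executes, code size), so the only way to know for a given target is to measure.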

