Notes on debugging HotSpot's JIT compilation (2023) (jornvernee.github.io)
59 points by lichtenberger on March 26, 2024 | 32 comments


I don't really see why:

  public class TestJIT {
      public static void main(String[] args) {
          for (int i = 0; i < 20_000; i++) {
              // payload takes two ints; the result is deliberately discarded
              payload(1, 2);
          }
      }

      public static int payload(int a, int b) {
          return a + b;
      }
  }
shouldn't be optimized into a 'no-op'. The end-effect is the same.


Things like that often will be, which is why you generally need to use the JMH harness to do microbenchmarking on the JVM. It uses internal APIs to stop the compiler treating results as dead and eliminating them.

In this case it doesn't happen because seeing that the entire operation is dead requires the compiler to inline payload into main, and the author says he disabled inlining for that method specifically so it wouldn't happen. Recall that the goal is to see the assembly for a block of code in isolation, not to demo what the JVM can do when given free rein.
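For anyone who wants to see the "result is not dead" idea without pulling in JMH, here is a minimal sketch. The class name and the `sink` field are hypothetical and not from the article. The volatile write is an observable effect, so the sum cannot be discarded outright, though the compiler may still simplify how it is computed, which is why JMH remains the more robust option. To keep payload out of line as the article does, something like `-XX:CompileCommand=dontinline,ConsumedResultDemo::payload` should work on HotSpot.

```java
final class ConsumedResultDemo {
    // Hypothetical sink: a volatile write is observable, so the JIT must keep it.
    static volatile long sink;

    static int payload(int a, int b) {
        return a + b;
    }

    // Unlike the original loop, every result feeds into a value that escapes.
    static long run(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += payload(i, 1);
        }
        sink = sum; // publish the result so it is not dead
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(run(20_000));
    }
}
```

This only stops the whole computation from being treated as dead; it says nothing about how the loop itself gets compiled.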


Well, inlining is not necessary; good old interprocedural analysis would resolve it too.


Sure, but neither javac (the Java-to-bytecode compiler) nor HotSpot is doing that. The former tries to preserve as much as possible, and for the latter interprocedural analysis is too costly at run-time.


Could javac do the analysis and record it in the bytecode for HotSpot to optimize? Or is this kind of hybrid teamwork not done?


It is done, but for this case the problem is partial compilation. For this you'd need methods to be tagged as pure, but that assumption needs to propagate and it could be violated by a library being upgraded.
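A rough sketch of why a recorded "pure" tag is fragile. Everything here is hypothetical: `V1` and `V2` stand in for two versions of the same library method, and `calls` is just a visible side effect. If a caller had been compiled on the assumption that the method is pure and its loop deleted, swapping in the second version would silently change behavior.

```java
import java.util.function.IntBinaryOperator;

final class PurityAssumptionDemo {
    // Observable state that only the "upgraded" version touches.
    static int calls = 0;

    // "Library v1": pure, so a loop that discards the result is removable.
    static final IntBinaryOperator V1 = (a, b) -> a + b;

    // "Library v2": same return value, but now with a side effect.
    static final IntBinaryOperator V2 = (a, b) -> {
        calls++;
        return a + b;
    };

    // The caller discards the result, just like the loop in TestJIT.
    static void loop(IntBinaryOperator payload, int iterations) {
        for (int i = 0; i < iterations; i++) {
            payload.applyAsInt(i, i);
        }
    }

    public static void main(String[] args) {
        loop(V1, 1_000);
        System.out.println("after v1: " + calls);
        loop(V2, 1_000);
        System.out.println("after v2: " + calls);
    }
}
```

Deleting the loop is only correct for V1, and nothing in the caller's bytecode says which version it will be linked against.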


There are two aspects to this:

- If payload is not inlined, the loop can't be optimized away. The iteration itself may be a desirable side effect (spin-wait/pause), and a stricter compiler can't assume otherwise, unlike GCC or Clang.

- If payload is inlined, it should be a no-op. If it's not, and its result is consumed by an opaque "sink" method, there may be limitations.

On interproc analysis - don't forget you can dynamically load code and access payload through reflection too. This limits certain optimizations that are otherwise legal in AOT compilation. .NET has similar restrictions and corresponding differences when publishing binaries with JIT vs AOT - the former gets to enjoy DynamicPGO (HotSpot kind of optimizations), the latter gets to enjoy frozen world (with exact devirtualization, faster reflection, auto-sealing, etc. but overall not as good as DynamicPGO with guarded devirt, branch reordering, etc.).
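To make the reflection point concrete, a minimal sketch (class and helper names are hypothetical). The method is looked up by strings that could come from anywhere at run time, so a whole-program analysis cannot enumerate every caller of payload up front.

```java
import java.lang.reflect.Method;

final class ReflectiveCallDemo {
    public static int payload(int a, int b) {
        return a + b;
    }

    // Resolves a static (int, int) method by name at run time and invokes it.
    static int callByName(String className, String methodName, int a, int b) {
        try {
            Class<?> type = Class.forName(className);
            Method method = type.getDeclaredMethod(methodName, int.class, int.class);
            return (Integer) method.invoke(null, a, b); // null receiver: static method
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective call failed", e);
        }
    }

    public static void main(String[] args) {
        // The names are plain strings here, but could just as well be user input.
        String self = ReflectiveCallDemo.class.getName();
        System.out.println(callByName(self, "payload", 2, 3));
    }
}
```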


Early in the article, the author tells the compiler not to inline calls to payload(). When that isn't inlined, the compiler can't tell if the body of the loop has side effects or not, so it won't be able to eliminate the loop.


Why would you want to remove those calculations?

You would change behavior of this program


How? Except that it would finish sooner. There is no behavior.


There is a side effect like cpu temperature increase

If I put my leg on my desktop tower then I may feel enjoyably warm or if I put some chocolate on my laptop then it may start melting

Also fans will be louder


None of that is behavior according to the Java (or any sane) language specification, so it can be optimized out of existence.


It is written in other, more general "specification" called physics

Computers aren't purely abstract; they exist in the real world and are affected by it, so let's not pretend otherwise


So according to that "specification" no optimization is allowed. Since that would almost always change the "heating behavior" of code. Therefore, it is absurd.


None at all, even out of order execution. For that matter, executing the same code on different hardware is right out. Every program must be implemented on single-purpose hardware, and you can't even manufacture two of them.


Yet the compiler writers care only about the language spec, and you can bet that failing to optimize this as dead code would be considered a compiler bug.

This goes not only for the Java compiler, but for many other languages as well.


Sometimes compiler engineers try to be too smart and they end up creating a mess :)

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8537

>When optimizing code for "dead store removal" the optimizing compiler may remove code necessary for security.


I'm going to point out that the bug is marked "RESOLVED INVALID".


That's a political thing, but the bug created by the removal is real :)


Well you posted the link… next time at least try to post one that supports your idea instead of the opposite.


The bug is real, but it isn't the compiler's.


What exactly do you expect an optimizing compiler to do?


Leave code without obvious side effects alone (this is different from dead code)


What would "leave alone" even be? There's no "default" state of performance of Java code; it would be ridiculously stupid for there to be something saying that, say, "a+b" for int type values has to take at least 1 nanosecond or something. And you can't use big O complexity here either - the int type has a maximum of 2 billion, and thus a loop over it is trivially O(1), just with a potentially-big constant factor. (or, alternatively, the loop was sped up by a constant factor of 2 billion, and optimizing compilers should extremely obviously be allowed to optimize code by a constant factor)


But this is code obviously without side effect.


There are side effects, but in the real world, not the abstract one


So non-obvious side effects?

You can say that about any code at all, so no optimising would ever be possible. The program running faster is a side effect after all.


No, that code has no side effects. The implementation is free to produce whatever side effects it wants or needs as part of execution, but that is absolutely none of the compiler's business.


Not sure if you're being facetious, but FWIW you can't really rely on these. Next year you'll get a much faster CPU and memory and the timing will be all different. Or tomorrow you run it while encoding video, which eats 99% of the CPU, and it's a hundred times slower.


Obviously, there is an xkcd for that: https://xkcd.com/1172/


Excellent. Next time I’ll have something to refer to when making memes like this one: https://twitter.com/_saagarjha/status/1576961522936340480. Maybe I can avoid building OpenJDK too!


Some self-advertising of a Linux tool I made which can display perf record data with Java JIT disassembly (doesn't need hsdis): https://github.com/dzaima/grr?tab=readme-ov-file#java-jit



