I think the problem is actually the opposite. There aren't enough abstractions that help programmers write good code. So when a programmer has to do something complex, they end up doing the easy thing, which is just importing every library they need. If instead there were a way to say "only import this when I'm doing this action", and no way for the programmer to explicitly say what to import, then that problem would almost disappear.
1) In dynamic languages, it's not possible to detect whether a function is used or not in the general case. For example, consider accessing an object's members by a string computed at runtime. If the compiler is not sophisticated enough to resolve the set of possible strings at compile time (or such analysis would unacceptably increase compile times), then you can't shake out unused methods on that object. [1]
2) For languages like C and C++, the compiler cannot tree shake because it only knows about a single file at a time (translation unit, to be precise). You would have to rely on link-time optimizations to effectively tree shake, but LTO is not well-supported by all toolchains.
Tree shaking also has a cost that I mentioned earlier -- it increases compile times. Both LTO and tree shaking in dynamic languages increase compile times superlinearly [citation needed] with respect to the size of your application. As other commenters have mentioned, it's better to avoid including unnecessary libraries in the first place.
[1] For the pedants: yes, I know resolving the possible set of values (stricter than "all possible values for that type") for a variable is undecidable in the general case.
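A minimal sketch of the string-access case from point 1, written in Python only for illustration. The `Handlers` class and `dispatch` function are made up; the point is that no method name appears literally at the call site, so a tool that wants to drop `on_export` has to prove that `event_name` can never be "export".

```python
class Handlers:
    def on_save(self):
        return "saved"

    def on_load(self):
        return "loaded"

    def on_export(self):
        # Never referenced by name anywhere else in this file.
        return "exported"


def dispatch(obj, event_name):
    # The attribute name is built from runtime data, so which on_*
    # methods are reachable depends on every value event_name can take.
    return getattr(obj, "on_" + event_name)()
```

If `event_name` comes from user input or a config file, all three methods have to stay in the output.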
You could perhaps use a lazy-loading strategy instead of static analysis. (But this would change the semantics of an existing language that allows side effects while loading modules, unless you have a lazy strategy for those too; and then there are the load errors you'd have to deal with.)
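A rough sketch of how lazy loading can change observable behavior, assuming a module whose load has a side effect. Everything here is hypothetical: `load_plugin` stands in for the module's top-level code, and `Lazy` is a toy wrapper, not any real loader's API.

```python
log = []


def load_plugin():
    # Stand-in for a module whose top-level code has a side effect.
    log.append("plugin loaded")
    return {"run": lambda: "ran"}


class Lazy:
    """Defers calling the loader until the value is first needed."""

    def __init__(self, loader):
        self._loader = loader
        self._value = None
        self._loaded = False

    def get(self):
        if not self._loaded:
            # Any exception from the loader surfaces here, at first use,
            # rather than at program start.
            self._value = self._loader()
            self._loaded = True
        return self._value


def eager_program():
    log.clear()
    plugin = load_plugin()      # side effect happens up front
    log.append("main started")
    plugin["run"]()
    return list(log)


def lazy_program():
    log.clear()
    plugin = Lazy(load_plugin)  # nothing loaded yet
    log.append("main started")
    plugin.get()["run"]()       # side effect happens here instead
    return list(log)
```

The two programs record the same events in a different order, which is the semantic change mentioned above.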
To truly achieve the same thing as "tree shaking" this way, the function call overhead would be abysmal. You'd have to check whether the module was loaded already (with synchronization if your program is multithreaded). For single-threaded programs, you could avoid this by hot-patching your machine code, but there's no way around some synchronization barrier in multithreaded programs [1]. In JavaScript (or any language where you want to avoid sending a large bundle over the network), you'd incur the latency of a network request for the first call to any function.
You're right that people are already splitting apps into bundles, but that is usually done at the page level.
[1] You could probably avoid having to take a mutex after the first call to the function with self-patching code, but that sounds incredibly ugly and could have other implications (self-modifying code could be detected as a virus; could be used as a gadget to exploit some other vulnerability).
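To make the per-call check concrete, here is a sketch of a function stub that loads its implementation on first call, using a lock plus a re-check. `LazyFunction` and `load_square` are invented names, and the unlocked fast-path read leans on CPython treating the attribute read and write as atomic; in C or C++ that read would need at least an acquire load, which is the barrier the comment refers to.

```python
import threading


class LazyFunction:
    """Loads the real function on first call (hypothetical helper)."""

    def __init__(self, loader):
        self._loader = loader
        self._fn = None
        self._lock = threading.Lock()
        self.load_count = 0

    def __call__(self, *args, **kwargs):
        fn = self._fn
        if fn is None:                # this check is paid on every call
            with self._lock:          # slow path, only until loaded
                if self._fn is None:  # another thread may have loaded it
                    self._fn = self._loader()
                    self.load_count += 1
                fn = self._fn
        return fn(*args, **kwargs)


def load_square():
    # Stand-in for importing or fetching the real implementation.
    return lambda x: x * x


square = LazyFunction(load_square)
```

Even after the first call, every call still goes through the stub and its `is None` test, which is the overhead a statically tree-shaken build does not pay.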
I am under the impression that JIT compilers do not modify the compiler’s own bytecode. They write generated code into a separate data region and mark that region as executable. If the code needs to be modified, then control transfers back to the compiler, which can mark the region as writable again.
The same can be done for a binary and its own code, but I wonder if it’s used as a signal in antivirus protections if done too often.
See the Elm-to-JS compiler: it does deep unused-code elimination at the function level, so you only ship the code you actually use (or the functions you actually use, even inside libraries).
The Closure Compiler is also pretty good if you prefer writing JS.
If only we started using better languages in our applications, we could improve performance quite a bit.
Dead code elimination has been available in C compilers since... so long ago I can't remember. It's not that compilers can't do it, but, much as with the halting problem, they can't catch all of it.
I can’t tell if this is sarcastic or not. Tree shaking isn’t a silver bullet and really only works under specific circumstances. Better to not include the library to begin with.