As mentioned on the blog post's comments, it's often caused either by storing a pointer in a DWORD variable (very common on 32-bit Windows), or by storing a pointer in a "long" variable (on most common 32-bit and 64-bit platforms, a "long" is at least as wide as a pointer; the main exception is 64-bit Windows).
The correct variable type to store a pointer would be intptr_t/uintptr_t for portable software (and DWORD_PTR for Windows-only software), but the "intptr" types didn't exist until the 1999 C standard (and, as far as I could find, the default Windows compiler didn't add the corresponding header file until around 2010). Software older than that would often assume that a "long" (or a DWORD, which on Windows is a typedef for "unsigned long") was big enough to store a pointer, or perhaps tried to use the C99 types and fell back to "long" if the header wasn't found.
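To make the intptr_t/uintptr_t point concrete, here is a minimal sketch in C. The `legacy_record` struct and the two helper functions are hypothetical names made up for illustration; the only real pieces are `uintptr_t` from `<stdint.h>` and the C11 `_Static_assert`.

```c
#include <stdint.h>

/* Fails the build on any platform where the integer type is too narrow. */
_Static_assert(sizeof(uintptr_t) >= sizeof(void *),
               "uintptr_t must be able to hold a pointer");

/* Hypothetical legacy-style record that keeps a pointer in an integer field. */
struct legacy_record {
    uintptr_t context;  /* older code would have used DWORD or long here */
};

/* Store the pointer as an integer without losing any bits. */
static void store_context(struct legacy_record *r, void *p) {
    r->context = (uintptr_t)p;
}

/* Convert the stored integer back into the original pointer. */
static void *load_context(const struct legacy_record *r) {
    return (void *)r->context;
}
```

The same `_Static_assert` with `long` or a 32-bit type in place of `uintptr_t` would be one cheap way to find out at build time whether old code's assumption still holds on a given target.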
> storing a pointer in a DWORD variable (very common on 32-bit Windows)
Is it? I mean, if you're lucky it won't cause a compiler warning on x86, but the sane thing to do is to use the types which the API docs specify. Which would usually be things like LPVOID or HANDLE.
That's basically how I managed to port a service and several drivers to 64-bit Windows many years ago with almost no changes.
At the time the code was written, the warnings you get now didn't exist. The 32-bit code worked, so nobody needed to change it. Now, with Windows 8, they are seeing for the first time that it doesn't work.
I know because I also worked on a huge codebase where the oldest pieces of the code were written around 1995.
Even if this is why, there is never a valid excuse for blindly truncating a 64-bit pointer to 32 bits. It's just like saying "I'm gonna just go ahead and truncate all bank account balances to 16-bit integers to save memory in my data structures." It may be the reason, but that reason is still incredibly stupid.
> It may be the reason, but that reason is
> still incredibly stupid.
Sometimes there are stupid reasons, but sometimes even reasons which are incredibly stupid from a technical standpoint make sense in a broader context.
For a short-term fix, for example, I can very well imagine to prefer this "trick" forcing sub-4G allocations if the other alternative would be to change 1000 places in undocumented legacy code doing crazy casts... And if it's for a product (or product component) on life support that is only needed for the foreseeable future, to me it makes perfect sense to ask the initial "crazy" question, even if it makes me cringe...
> For a short-term fix, for example, I can very well imagine to prefer this "trick" forcing sub-4G allocations ...
But that's not what this is doing - it's taking an address allocated at an arbitrary place and then pretending that it was allocated sub-4G, whether or not it actually was. If it happens that it wasn't, this will likely cause memory corruption or an access violation.
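A small sketch of what "pretending" means in practice. It works on plain 64-bit integers rather than real pointers so the result is deterministic; `truncate_and_restore` is a hypothetical helper, and the addresses used below are made-up values.

```c
#include <stdint.h>

/* Mimics storing a 64-bit address in a 32-bit DWORD-like field
   and later widening it back to 64 bits. */
static uint64_t truncate_and_restore(uint64_t address) {
    uint32_t stored = (uint32_t)address;  /* upper 32 bits are discarded */
    return (uint64_t)stored;              /* zero-extended on the way back */
}
```

An address such as 0x7FFE0000 survives the round trip, while 0x100001000 comes back as 0x1000, which is a different location that may or may not be mapped.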
Even if your product is nearing end-of-life, I'd still consider such a move a big "f* you" to your customers as you'd basically be tolerating that your product may crash without warning at any time.
From what I understood, what makes this idea so bad isn't truncating to 32 bits - it's truncating to 32 bits while setting the LARGEADDRESSAWARE build flag - which basically tells the OS "I'm fine with all kinds of 64-bit addresses, bring them on!"
The correct fix is outlined at the end of the article - and it does not involve combing through undocumented legacy code:
> If there is some fundamental reason that they have to truncate pointers to 32-bit values, then they should build without /LARGEADDRESSAWARE so that the process will be given an address space of only 2GB, and then they can truncate their pointers all they want.
Perhaps the maintainer of the legacy application can override operator new to use a custom memory allocator that uses a block of memory allocated below 4GB using VirtualAlloc.
HotSpot is really quite subtle and thorough in its strategy. Because of the alignment of Java objects, it can both shift and truncate pointers (reversing the shift when they are actually used), and it can also handle the heap not being allocated in the low end of the address space by storing the base address and adding or subtracting it as needed.
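The shift-plus-base idea can be sketched roughly like this in C. This is not HotSpot's actual code: `heap_t`, `compress`, `decompress` and the 8-byte alignment are assumptions chosen for illustration, and it ignores details such as null handling.

```c
#include <stdint.h>

enum { ALIGN_SHIFT = 3 };  /* assumes every object is 8-byte aligned */

/* Hypothetical heap descriptor: only the start address matters here. */
typedef struct {
    uintptr_t base;
} heap_t;

/* Turn a pointer into a 32-bit value: subtract the heap base, then drop
   the low bits that are always zero because of alignment. */
static uint32_t compress(const heap_t *h, const void *p) {
    return (uint32_t)(((uintptr_t)p - h->base) >> ALIGN_SHIFT);
}

/* Reverse the steps: restore the shifted-out bits, then add the base back. */
static void *decompress(const heap_t *h, uint32_t c) {
    return (void *)(h->base + ((uintptr_t)c << ALIGN_SHIFT));
}
```

With a 3-bit shift, a 32-bit value can address 32 GB of heap relative to the base, and nothing depends on where in the address space the heap actually landed.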
There is a ton of C/C++ code that uses `int` for every kind of integer, including array indices and stored pointer values. It's a relic of the time when everything was 32-bit, and it's common on Unix systems as well. When you compile a C/C++ library or a Python C extension, there are usually many warnings about integer truncation.
Maybe to get more compact data structures? But in that case they shouldn't rely on low addresses being available; they should detect at runtime whether they actually got them.
E.g. the HotSpot JVM tries to map memory in the low end of the address space so it can compress pointers, but falls back to full pointers when that fails.
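A runtime check along those lines might look like the sketch below. `try_narrow` is a hypothetical helper, not anything from the JVM; the caller is expected to fall back to storing the full pointer when it returns false.

```c
#include <stdbool.h>
#include <stdint.h>

/* Writes the 32-bit form of p to *out and returns true only when no
   address bits would be lost; otherwise leaves *out untouched. */
static bool try_narrow(const void *p, uint32_t *out) {
    uintptr_t bits = (uintptr_t)p;
    if (bits > UINT32_MAX) {
        return false;  /* address is above 4 GB: keep the full pointer */
    }
    *out = (uint32_t)bits;
    return true;
}
```

The point is that the narrow representation is an optimization that gets verified per value (or once per allocation region), rather than an assumption baked into every cast.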