I feel like you're probably right, but I can't figure out the exact reason.
The multiplication (n-1)(n-2) is already widened to 64 bits. It has to be: even when the loop accumulator doesn't overflow while counting up to n(n-1)/2 in the abstract machine, (n-1)(n-2) will be larger (and possibly overflow 32 bits) for most n. (The highest n whose sum still fits in an unsigned 32-bit accumulator is 92682, I think.)
So once you're doing that, I'd think n(n-1)/2 could be done in 64 bits just as easily.
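
To make that concrete, here's a rough C sketch. It assumes an unsigned 32-bit accumulator and a loop that sums 0..n-1; the function names are made up, and the `(n-1)(n-2)/2 + (n-1)` shape is only my guess at the split form under discussion, not actual compiler output.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: the loop as written, with a 32-bit accumulator. */
static uint32_t sum_loop(uint32_t n) {
    uint32_t acc = 0;
    for (uint32_t i = 0; i < n; ++i)
        acc += i;               /* wraps modulo 2^32 once the true sum is too big */
    return acc;
}

/* Assumed split form: (n-1)(n-2)/2 + (n-1), product widened to 64 bits. */
static uint32_t sum_split(uint32_t n) {
    if (n == 0)
        return 0;
    uint64_t wide = (uint64_t)(n - 1) * (n - 2);   /* for n == 1 this is 0 * x == 0 */
    return (uint32_t)(wide / 2) + (n - 1);
}

/* Direct form: n(n-1)/2, product widened to 64 bits in the same way. */
static uint32_t sum_direct(uint32_t n) {
    uint64_t wide = (uint64_t)n * (n - 1);
    return (uint32_t)(wide / 2);
}

/* Checks the 92682 bound and that (n-1)(n-2) already needs more than 32 bits there. */
void check_bound(void) {
    assert((uint64_t)92682 * 92681 / 2 <= UINT32_MAX);
    assert((uint64_t)92683 * 92682 / 2 >  UINT32_MAX);
    assert((uint64_t)92681 * 92680     >  UINT32_MAX);
    assert(sum_split(92682) == sum_loop(92682));
    assert(sum_direct(92682) == sum_loop(92682));
}
```

Both closed forms take exactly one 64-bit multiply and agree with the loop modulo 2^32, including for n past the bound where the accumulator wraps, so at least in this sketch the direct form is no harder to compute.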