It's nonsense. The function is external, called from a separately compiled file. The pointer comes in as a char *. The code checks its alignment before assuming it can be cast to a block. There is no way it could be "miscompiled".
The ivec argument could in fact have come from an object that is of type aes_block_t. The only thing which might reveal that it didn't is wrong alignment. In other regards, there is no way to tell.
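A minimal sketch of the pattern being described, with a hypothetical aes_block_t standing in for the real OpenSSL type (which differs), and _Alignof (C11; older code would hard-code the alignment):

```c
#include <stdint.h>

/* Hypothetical stand-in for OpenSSL's AES block type. */
typedef struct { uint64_t w[2]; } aes_block_t;

/* Returns 1 if p is aligned well enough to be treated as a block.
   This is the kind of runtime check performed before the incoming
   char * is cast to the block type. */
static int block_aligned(const char *p)
{
    return ((uintptr_t)p % _Alignof(aes_block_t)) == 0;
}
```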
Lastly, any cross-compilation-unit optimization which could break code of this type is forbidden, because ISO C says that semantic analysis ends in translation phase 7.
I'm looking at C99, not the latest, but I think it's the same.
In translation phase 7 (the second-to-last), "The resulting tokens are syntactically and semantically analyzed and translated as a translation unit." Note the "semantically analyzed": semantic analysis is where the compiler tries to break your code due to strict aliasing.
In translation phase 8: "All external object and function references are resolved. Library components are linked to satisfy external references to functions and objects not defined in the current translation. All such translator output is collected into a program image which contains information needed for execution in its execution environment."
No mention of any more semantic analysis! So unless the mere resolution of external symbols can somehow break OpenSSL's AES, I don't see how anything can go wrong.
One thing I would do in that code, though, is make sure that it doesn't use the original ivec pointer. In the case where "chunking" goes on, it should just cast it to the block type and put the result of that cast in a local variable. All the memcpy's and load/store macros would be gone, and the increments by AES_BLOCK_SIZE would just be + 1.
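The suggested cleanup might look like this sketch (hypothetical names and block type, not OpenSSL's actual code), assuming the alignment check has already passed:

```c
#include <stdint.h>
#include <stddef.h>

#define AES_BLOCK_SIZE 16

/* Hypothetical block type; the real one differs. */
typedef struct { uint64_t w[2]; } aes_block_t;

/* XOR len/AES_BLOCK_SIZE whole blocks of in into out. The char
   pointers are cast to the block type once, up front; after that,
   all stepping is "+ 1" on block pointers instead of
   "+ AES_BLOCK_SIZE" on char pointers plus memcpy. */
static void xor_blocks(char *out, const char *in, size_t len)
{
    aes_block_t *o = (aes_block_t *)out;
    const aes_block_t *i = (const aes_block_t *)in;

    for (size_t n = len / AES_BLOCK_SIZE; n != 0; n--, o++, i++) {
        o->w[0] ^= i->w[0];
        o->w[1] ^= i->w[1];
    }
}
```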
Citing the translation phases in the standard as evidence that undefined behavior is ok, as long as it's divided between two translation units, strikes me as wishful thinking.
Undefined behavior is the absence of requirements: there not being any requirements for some situation.
Suppose a document tells you that for some special situation X, there is an absence of requirements. However, suppose that some other general rules elsewhere in that document in fact imply a requirement for that situation.
That just means that the claim that situation X has no requirements is incorrect.
For instance, ISO C says that two struct types appearing in separate translation units are only compatible if they have the same typed members in the same order ... with the same names.
This says that if you do aliasing with otherwise identical structures that don't have the same names, the behavior is undefined: i.e. that there is no requirement that it work.
But, we can infer that it must work by the logical fact that during the semantic processing of one translation unit, the translator has no clue what the names of struct members are in another translation unit. They disappear at translation time and turn into offsets.
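The "names turn into offsets" point can be illustrated with a toy pair of structs (member names here are made up for the example). This check is itself fully defined behavior; the thread's dispute is over whether *accessing* one type through the other is:

```c
#include <stddef.h>

/* Same typed members in the same order, but different member
   names -- so by the letter of ISO C the two types would not be
   compatible across translation units. */
struct point_a { int x, y; };
struct point_b { int h, v; };

/* Returns 1 if the two layouts coincide field-for-field. After
   translation, the names are gone; only these offsets remain. */
static int layouts_match(void)
{
    return offsetof(struct point_a, x) == offsetof(struct point_b, h)
        && offsetof(struct point_a, y) == offsetof(struct point_b, v)
        && sizeof(struct point_a) == sizeof(struct point_b);
}
```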
I mean, we can fwrite a struct to a file, right? We can send that file over a network. According to ISO C, every program (or at least every C program) will have to use a structure with the correct names to fread that area of the file! Ridiculous!
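A sketch of that round trip, with a byte buffer standing in for the file and invented struct/member names: the "writer" and "reader" structs have different member names but identical layout, as if defined in two different programs.

```c
#include <string.h>

/* The writer's view of the record ... */
struct wire_out { int id;  float val;    };
/* ... and the reader's view: same layout, different names. */
struct wire_in  { int key; float amount; };

/* Serialize out through a byte buffer (the "file" -- fwrite side),
   then deserialize into the differently-named reader struct
   (the fread side). Both copies are plain memcpy, which is
   well-defined regardless of the member names. */
static struct wire_in roundtrip(struct wire_out out)
{
    unsigned char buf[sizeof out];
    struct wire_in in;

    memcpy(buf, &out, sizeof out);  /* fwrite side */
    memcpy(&in, buf, sizeof in);    /* fread side */
    return in;
}
```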
Suppose we take ISO C and add a statement to it like, "the consequences are undefined if one of the operands of the + operator is the integer 42". The rest of the document would still be exactly what it is, and if we strike out that sentence with a black marker, nothing has changed. The rest of the document continues to inform us that adding 42 to something is well-defined (in the absence of overflow, or overflow-like issues with pointer displacement and so on).
Basically, it's a contradiction: the document gives a description which adds up to some requirements, but then some sentence tries to take them away.
In such a situation, we can just proceed as if the requirements apply. That is to say, when a requirement conflicts with the claim that there is no requirement, just let the requirement prevail.
(In a situation where conflicting requirements are asserted, it's a different story, of course.)
You can't infer shit, because of the way current compiler writers interpret the standard today. At one point in the 90's it was obvious to the entire planet, including compiler authors, that some undefined behaviors did not apply for a given target architecture; so obviously obvious that nobody would even require it to be spelled out in the compiler documentation (still nice to have, but you would not be too angry if it did not appear).
Now they just add their "optimizations" at the highest levels, so of course they apply even to targets where they make no sense, without asking for your permission, even by default, and even for some they themselves consider "aggressive". So either you have provisions to avoid all that, like having a guy disable all the new ones each time you upgrade your compiler, and you had better have some defense in depth. And I agree with you that using TUs as boundaries is also a good idea, if your compiler + build system have an option to NOT do WPO.
But it is just not in any way guaranteed by the ISO standard, and is still just an implementation detail from its point of view. And honestly that's a problem. Because compiler writers will take more drugs and come up with new imaginative ways to break your code more, in the name of their "strict conformance and nothing more" crazy ideal.
Translation phases are all fun and games, but WPO can still break your code thanks to the as-if rule, because nothing prevents alias analysis from being performed regardless of TU boundaries. And compilers are doing it.
> nothing prevents alias analysis from being performed regardless of TU boundaries
Standard conformance does. You know, that principle in the name of which the alias-breaking optimizations are done in the first place.
The "as if" principle (there is only one) means that optimized code produces the same results as the abstract semantics (under a certain set of requirements of what it means for the abstract and actual semantics to agree).
The separation between translation phase 7 and 8 is part of that abstract language semantics.
> And compilers are doing it.
GCC currently only does optimizations across translation unit boundaries when it is told to via special options, and only for the .o files which are designated as participating in it. This is no different from using __attribute__ or __asm__.
Well, in the model, there is no such thing as effective types of objects losing their power at TU boundaries, and otherwise no relationship is defined between effective types and linkage. An implementation that tags all accesses to dynamically check/enforce the effective-type rules (by killing a faulty program or doing otherwise nasty stuff) would not lose its conformance because of that. The default behavior of an implementation used under a particular and completely unspecified build system is irrelevant to the fact that aliasing analysis is allowed across TU boundaries in the model; the fact that it does not happen in some cases is just an implementation detail, one that, if not guaranteed by other means, can always be changed without even a warning by compiler vendors.
Now that is the situation with regard to the standard. Whether we should be happy with it is another story.
Oh, that's nice. :/