Defining interfaces in C++ with ‘concepts’ (C++20) (lemire.me)
73 points by pjmlp on April 19, 2023 | 73 comments


> So what are concepts good for? I think it is mostly about documenting your code.

Emphasis mine. While concepts are somewhat good as documentation, a static_assert with a custom diagnostic message is sometimes better for this purpose.
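For instance, a minimal sketch of that alternative (count_all and the integral-element requirement are made up for illustration):

  #include <type_traits>

  template <typename T>
  void count_all(const T& container) {
    // Fails with a readable, hand-written message instead of a wall
    // of template diagnostics.
    static_assert(std::is_integral_v<typename T::value_type>,
                  "count_all requires a container of integers");
    // ... generic algorithm ...
  }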

The hidden power of concepts/constraints is the way it shapes overload sets. You can have something like:

  template <forward_range R>
  void foo(R&& r) {
    /* some generic algorithm */
  }

  template <random_access_range R>
  void foo(R&& r) {
    /* optimized for random access */
  }
and it will work if you pass a random access range: the compiler knows that a random access range is more constrained than a forward range and resolves the ambiguity in favor of the more specific overload. Prior to concepts, writing code that did this was far more cumbersome.
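For example (a sketch assuming the two overloads above, the std::ranges concepts in scope, and <vector>/<forward_list> included):

  std::vector<int> v{1, 2, 3};
  std::forward_list<int> fl{1, 2, 3};
  foo(v);  // vector models random_access_range: the more constrained overload wins
  foo(fl); // forward_list is only a forward_range: the generic overload is chosen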


Indeed. I haven't found concepts to be very good at generating error messages. But they are great for documentation and to get rid of SFINAE hacks for overloading.
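For comparison, the overload pair from the parent comment would have looked roughly like this pre-C++20, dispatching on iterator categories with std::enable_if (one sketch among several possible approaches):

  #include <iterator>
  #include <type_traits>

  // Selected when the iterator category derives from random_access_iterator_tag.
  template <typename It>
  std::enable_if_t<std::is_base_of_v<std::random_access_iterator_tag,
      typename std::iterator_traits<It>::iterator_category>>
  foo(It first, It last) { /* optimized for random access */ }

  // Selected otherwise; the failed condition removes the overload above (SFINAE).
  template <typename It>
  std::enable_if_t<!std::is_base_of_v<std::random_access_iterator_tag,
      typename std::iterator_traits<It>::iterator_category>>
  foo(It first, It last) { /* generic algorithm */ }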

The shorter template syntax is a bonus.

edit: concepts should also allow for better IDE tooling, for example proper completion inside template functions; although it is supposed to work, I haven't noticed it firing in clangd yet.


They are also great for poor man's reflection: you can combine concepts with C++ traits (not the same as Rust's) and constexpr if to check whether a given type supports some specific capabilities.
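For example, a minimal sketch of that pattern (describe and name() are invented for illustration):

  #include <iostream>

  template <typename T>
  void describe(const T& value) {
    // The ad-hoc requires-expression probes for a capability; constexpr if
    // then compiles only the branch that applies to this particular T.
    if constexpr (requires { value.name(); }) {
      std::cout << value.name() << '\n';
    } else {
      std::cout << "<unnamed>\n";
    }
  }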


In Qt Creator I can see the interface defined by a concept on autocomplete


Rust requires trait bounds (its equivalent of concepts) to type-check the template code at definition site, when it's written, not at the instantiation site where it's used. This results in much better error messages. The downside is that trait bounds for numeric code are awfully verbose, and you can't sneak in a printf without declaring it as a requirement.


> In Go, I found that using an interface was not free: it can make the code slower.

The Go version that was presented isn't equivalent though. In Go you are accepting an interface directly, which hides the value behind a fat pointer for dynamic dispatch; in C++ you are using generics to monomorphise the function for specific types. If you want to compare the implementations fairly, you should've used Go generics:

  func Count[T IntIterable](i T) (count int) {


Fair criticism, though I do wonder if it'd really make that much of a difference. Go doesn't really monomorphize generics either, and would end up with an equally expensive, if not more expensive, lookup for the correct generic function at runtime.

Some reading: https://github.com/golang/proposal/blob/master/design/generi... https://planetscale.com/blog/generics-can-make-your-go-code-...


I don't know why I thought Go generics also do monomorphization, must've misremembered or it was an earlier proposal? Thanks for the correction!


That's true at the moment, but still an implementation detail. I think I remember early versions of C++ compilers doing the same thing with templates.

Considering the progress the Go compiler has gone through, I think it's reasonable to expect that optimized implementations will come a few versions down the road.


C++ templates have never used runtime dispatch


I assume you've checked the version control history of every C++ compiler in existence?


Not the OP, however I have programmed in C++ since 1987 across many different operating systems and hardware platforms, and I've literally never heard of a compiler that implements template stuff using runtime dispatch. CFront3, which was I think the first real template implementation that most people used, certainly never did it that way; neither did any version of gcc, Visual Studio or Sun Workshop, which are the compilers I used the most from that period. Dug out my old copy of Coplien[1], which is from the early 90s, and it discusses runtime dispatch in depth in the context of vtables and virtual function pointers and the cost of these things, so the concept was well understood but not a cost anyone was paying with templates.

[1] https://archive.org/details/advancedcbsprogr00copl "Advanced C++ Programming Styles and Idioms" aka the first programming book that genuinely kicked my ass when I first read it and made me realise how good it was possible to be at computer science.


It would be extremely hard to implement templates with dynamic dispatch while maintaining the correct semantics.


Right. For starters, from the very beginning C++ has supported function templates which take native types, so you don't even necessarily have any kind of pointer you could add a vtable to even if you wanted to. Then add to that the guarantee[1] that POD types are directly compatible with C, which, as you say, I don't see how it would be possible to preserve.

[1] which has always been strong even before there was an actual ISO/ANSI standard


templates don’t exist after the front end. there is no ABI that allows them to exist in any object file. there is no object file format they could be embedded in, sans a string representation of the source they came from.

extremely hard is underselling it somewhat :)


I have used C++ since before templates were a thing (I added C++ to my toolbox in 1993) and never ever saw a compiler that did otherwise.


Really good video by Conor Hoekstra on the differences between C++ Concepts vs. Haskell Typeclasses vs. Rust Traits vs. Swift Protocols.

https://youtu.be/E-2y1qHQvTg

It is a good intro overview of the subject. Unfortunately the video has far too much filler, so you need to skip a bit, e.g. start at 22:30, when the video gets into examples.


> In Go, I found that using an interface was not free: it can make the code slower. In C++, if you implement the following type and call count on it, you find that optimizing compilers are able to just figure out that they need to return the size of the inner vector.

I'm surprised at this; do Go interfaces really introduce much overhead? Of course this depends on the level of performance you care about, but surely, Go being a statically typed language, lots of the same optimisations are available.


Interface methods in Go are like virtual methods in C++. In principle, C++ compilers, when statically compiling everything, can often remove virtual method indirection for some objects, but I think in general this optimization is still uncommon and difficult to coax out of a compiler. Go definitely does not do this, even though in principle it should be easier.

Basically, "interfaces" in Go and C++ actually refer to quite different language features. (Or at least, the author is using the term to describe quite different language features.)


This is common when doing LTO; without it there is no guarantee that there isn't some dynamically loaded code that would be broken. This is one area where JIT-focused languages have an advantage.


Indeed, you need LTO for generalized devirtualization, but guarded devirtualization, static classes and final can still help even without LTO.
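For example, a minimal sketch of the final case: because no class can derive from Derived, the compiler knows the dynamic type exactly and may devirtualize without whole-program knowledge.

    struct Base { virtual int f() const { return 1; } };
    struct Derived final : Base { int f() const override { return 2; } };

    int call(const Derived& d) {
        // Derived is final, so this virtual call can be replaced with a
        // direct (or inlined) call to Derived::f, no LTO required.
        return d.f();
    }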


There is overhead in interface indirection because the Go runtime needs to perform dynamic dispatch to determine which method to call at runtime.

C++ can optimize interface indirection away because it supports static polymorphism, which allows the compiler to generate specialized code for each concrete type used with a generic interface, eliminating the need for dynamic dispatch.
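In C++ terms, something like this sketch (loosely in the spirit of the article's count example, not its exact code):

    #include <cstddef>
    #include <vector>

    // A separate count<T> is generated per concrete T, so for a
    // std::vector the loop can typically be optimized down to size().
    template <typename Container>
    std::size_t count(const Container& c) {
        std::size_t n = 0;
        for (const auto& item : c) { (void)item; ++n; }
        return n;
    }

    std::size_t demo(const std::vector<int>& v) { return count(v); }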


Go is pretty conservative about that kind of thing (namely, compiler optimizations). It generally abides by a “what you write is what you get” philosophy, especially when it comes to “non-local” optimizations, and is generally opposed to anything “clever.” (Just my feeling as someone who uses Go pretty often and who respects the choice they've made on that spectrum.)


it’s actually to keep compile times fast

and for the implementation of the compiler to remain simple


Yep, I think those things are all related.


The first one is doable in more complex languages, e.g. D, Ada, Delphi,...


Interfaces in Go always put the value on the heap, so yes, you can take some simple code that could easily live on the stack and make it slower by wrapping it in interfaces.


I would argue Java interfaces are very different from Go interfaces and C++ concepts, because the former is nominally typed and the latter are structural.


That's something I have been pondering for some time.

I believe it's a false dichotomy.

My thought is still that structural supersedes nominal.

A nominal interface is just another constraint added to the list of constraints of an underlying structural interface?


In a nominal type system, a method x() is part of the interface X, while in a structural one it's part of the implementor of said interface. In Go there's a Human.HasOrgan(), not an AbstractBody.HasOrgan().

A consequence of this is that in Rust, which has a nominal system, you can implement two traits that contain a method with the same name and are required to disambiguate at the call site. In Go you can't do that, since the method is part of the concrete type.


Fair. That's not really in contradiction either.

The additional naming constraint added to a structural interface would form a sort of namespace for methods.

I think in the comments below someone likens this to tags in C++.


A nominal type system is not a more constrained version of a structural one. That statement would imply that any program written for the former would work using the latter as well, which is false. Name collisions would simply not resolve.

For it to work, you need to add a namespace to all the colliding methods (a simple one would be a prefix like people do in C).

A nominal system is a more constrained structural system in some ways, but the opposite is true as well, so it's not as simple as 'nominal is subset of structural'.


Hmmh. You seem to be restating what was said above.

A nominal type system is still superseded by a structural type system.

The difference is in how a type is defined, or, said otherwise, in what kind of constraints are entailed.

An interface enforces constraints. The difference here is merely that current implementations only have one or the other of these kinds of interfaces. So for the structural type system, all methods are in the global namespace, somehow.

That's all. Because our current languages are this way doesn't mean that the two concepts cannot be reconciled or that one is just better than the other.


Yeah I think our arguments overlap in some ways.

> That's all. Because our current languages are this way doesn't mean that the two concepts cannot be reconciled or that one is just better than the other.

I don't think I agree with this though, I believe they're fundamentally different. The whole point of structural constraints is that they don't need the type to be aware of them. The point of nominal constraints though is that they require the type to explicitly acknowledge them.

In an ideal situation, everyone names and types things the same ('logical') way, so structural constraints 'just work'. A type implements has_organ, and an interface requires has_organ, and the type is automatically compatible with the interface.

A nominal system is the opposite though; the type explicitly understands what a specific interface implies and formally states it.

I just can't see how there's a subset-superset relationship, or how they can somehow be reconciled.


One way to see it is that, in a nominal type system, a type has a given method located in a given namespace.

A nominal type system doesn't necessarily enforce semantics either.

It just enforces the location of a method definition.

Seen that way, because the relation is dual, one could indeed claim that a structural interface is a nominal interface where the name constraint is elided.

But just as in subtyping, one less constraint also means a bigger set.

Of course if one were to decide that an object satisfying a nominal interface doesn't satisfy the structural interface obtained by ignoring the namespace, then I'd agree as well, these concepts would be disjoint.

I don't think they are though but I don't know of a language that ever mixed both either.


AFAIK Python's [optional] type system supports both. The nominal types are the "common" types, while the protocols [1] are structural. It's quite cool, actually :)

[1] https://peps.python.org/pep-0544/


Nominal interfaces can still be useful though, as they convey a stronger sense of intent than structural. For example, java.io.Serializable is a completely empty interface that classes “implement” to signal that they are safe to serialize. As a structural interface, it’d be useless.


Structurally you can do something similar by adding some tag (in the form of a constant or nested type) to your class. For example:

  template<class T> concept serializable = requires { typename T::is_serializable_tag; };

  void serialize(serializable auto x) {...}

  struct NotSerializableClass { ... };
  struct SerializableClass { using is_serializable_tag = void; ... };

  serialize(NotSerializableClass{}); // error
  serialize(SerializableClass{}); // all good
In C++, specializing a trait is also an option (sketched below). So, while nominal and structural interfaces are not the same, sometimes the lines are blurred.
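For instance (is_serializable here is a made-up trait, not a standard one):

  #include <type_traits>

  // Conformance is opted into from outside the class, by specializing a trait.
  template <class T> struct is_serializable : std::false_type {};

  struct Document { /* ... */ };
  template <> struct is_serializable<Document> : std::true_type {};

  // A variant of the concept above, driven by the trait instead of a tag.
  template <class T> concept serializable2 = is_serializable<T>::value;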


Well it depends on a runtime/compile-time distinction. A nominal type is a structural type with a compile-time constraint.

If you have compile-time only constants you can model nominals with structural,

    type Square
        static const IsSquare = true
        
        var length = 10
You can kinda hack-in subtyping,

    type Shape
        static const Shape = true

    type Square
        import static from Shape
        static const IsSquare = true
        
        var length = 10
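A rough C++ rendering of the idea above, with the compile-time constant checked by a concept (names made up):

    struct Square {
        static constexpr bool IsSquare = true; // the nominal "tag"
        int length = 10;
    };

    // Satisfied only by types that declare IsSquare as a true constant.
    template <class T>
    concept SquareLike = requires { requires T::IsSquare; };

    static_assert(SquareLike<Square>);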


This is routinely done in C++ with tags (for example iterator_tag). Tag inheritance is also a thing.


In practice C++ Concepts don't do what you're suggesting.

The C++ 20 Standard Library provides numerous concepts which have a very different semantic requirement than the syntax they're checking. If you violate the syntactic requirement, of course that'll earn you a compiler error, but if you violate the semantic requirements that's silently an ill-formed C++ program: it has no meaning whatsoever and might do absolutely anything if run.

If these were nominal, we could say, well, nobody should have deliberately implemented this inappropriate Concept, similar to an unsafe Rust trait, the act of implementation is a promise to others. But C++ Concepts aren't nominal and so there was no opportunity to do that and so in practice such deviations are likely very common despite the potentially drastic consequences.
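A contrived sketch of the problem: the type below satisfies std::equality_comparable syntactically, but its == isn't even reflexive, so the concept's semantic requirements are violated without any diagnostic.

    #include <concepts>

    struct Weird {
        // Never equal, not even to itself: syntactically fine,
        // semantically a broken equality.
        bool operator==(const Weird&) const { return false; }
    };
    static_assert(std::equality_comparable<Weird>); // passes: only syntax is checked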


I have been programming in C++ for almost 20 years [1] and I don't remember ever being bitten by accidental concept conformance. So I object to the "likely very common" description. Implicit conformance was very much an explicit design goal.

[1] yes, concepts as an explicit language feature are new, but C++ has had de-facto concepts since Stepanov's work on the original STL in the 90s.


Since I know better than to suggest C++ programmers might be more capable of making mistakes than they realise, let's try a different question: how do you spot this mistake when reviewing other people's code? Do you memorise a list of all the semantic requirements of each concept so that you can mentally check that the concept's requirements are satisfied appropriately by what was written each time?


This isn't something I look for in code reviews because it's just not something I've ever seen be the source of a bug. There are a million bugs that I've eventually tracked down to some subtle C++ thing, but I've never had one come down to a type which appears to conform to one of the standard library's concepts but actually doesn't.


I expect other people to write tests (including compile time tests).


If you were worried about behavioural problems, including UB, tests would help.

But alas the problem here is IFNDR [Ill-Formed, No Diagnostic Required], so the compiler can't help you. All semantic constraints are your problem as the programmer; C++ decided that it's not the compiler's concern whether the program meets semantic constraints. Testing doesn't necessarily help at all, which is probably surprising.


As always, very informative and a perfect "snippet" size to quickly read and learn a new thing or two. Thanks!

Meta: I think the very first sentence is the victim of some drive-by editing, and needs one more pass. I'm not a native speaker, but I still suggest changing

In an earlier blog post, I showed on the Go programming language allow you to write generic functions once you have defined an interface.

Into perhaps

In an earlier blog post, I showed how the Go programming language allows you to write generic functions once you have defined an interface.

Considering the audience and author, I would seriously consider omitting the explanation of what Go is, but that's just polish. :)


There are quite a few little language mistakes where I couldn't figure out if it was a language thing or just a typo.

> Of course, it also limits to the tools that I use to program: they cannot much about the type I am going to have in practice within the count function.

I think dropping the 'me' is often a feature of those whose native language is Eastern European/Russian. The second (possibly) missing 'know' seems to point to simple editing mistakes. The structure of both makes me think of Portuguese for some reason!

To avoid going off topic on HN, something... something... ChatGPT


    uint32_t next() { index++; return array[index - 1]; }
I do like

    uint32_t next() { return array[index++]; }
Shame it's kinda unintuitive.


does this imply that type erasure via base classes will become a thing of the past?


In C++20, concepts don't really provide any new semantic feature that wasn't already available before. Just a nicer and cleaner syntax (often considerably so).


i see, ok thanks! so, the concept-model idiom is still (pretty) useful...


concepts don't change codegen in any way, they just mean your error messages become actually sane :D

Essentially, in the past you would have your template code, say something dumb like:

    template <typename T> T halve(T t) { return t / 2; }
and if you instantiated with the wrong type, say:

    halve("foo")
you get an error message pointing to the t/2 in the halve implementation. As your templates become less trivial, so do the error messages. So we could apply concepts:

    template <typename T> concept Halvable = requires (T t) { t / 2; };
    template <Halvable T> T halve(T t) { return t / 2; }
Now our call to halve("foo") will complain that const char* is not Halvable. It will also produce an error similar to the original saying that we don't conform to Halvable because the `t / 2` expression fails. In this case it's not super valuable, but if you were instantiating a type or method that was more complicated instead of getting dozens or hundreds of errors in the instantiated body of the template, you just get an early error saying "you aren't conforming to X for these reasons:...".

Unfortunately this does not stop you writing a template that depends on things that your concepts don't guarantee. For example, if we can halve something, we must be able to double it, right? (silly example to demonstrate the issue)

    template <Halvable T> T double_it(T t) { return t * 2; }
Note, Halvable doesn't ensure that * is available, but this template is still "correct".

Now we can do `double_it(1)` and that will work, but `double_it("foo")` will say the error is at the t * 2 in the body, when we probably want the error to actually be at the point where we try to call double_it.

Preventing this kind of error is non-trivial (possibly actually impossible?) given how concepts are defined. A concept is literally just a list of statements and expressions that need to be valid for a type. But going from a list of "these statements and expressions are valid" to "is this specific expression or statement valid in a template" is at best nontrivial. This is a core limitation of the entire feature.


Arguably, pointing to t * 2 (in the second case) is the correct behaviour, because (if concepts are being used) it is the function (double_it) that is mis-specifying its requirements. Even better would be to point at the function definition as well and say that there is nothing in the concept that allows what this function does.

I think the core problem (don’t know if this is still the case, haven’t actually used concepts) is that templates raise errors at the point of expansion, rather than at the point of definition, because of how they are specified/implemented.

I think it’s possible that the compiler could do something smart by auto-creating a temp type that is the minimal possible implementation of the concept and attempting to compile the function. Any errors that result, should be flagged at the concept specified in the function.


[flagged]


Yeah, I agree, let's just rewrite all our system software in Javascript. Much better.

(Well, except for your Javascript interpreter, that will still be written in C++, obviously.)


Tsk, JS is already the old thing. All the cool kids are on Typescript now.


Why use an interpreter? Just go down the FJCVTZS route and make the CPU accept Javascript/WASM as assembly code at this point.


It is only a matter of implementing it on an FPGA, actually.


It's just like Java or JavaScript or any other language people actually use for a long time. And before you know it, Rust will be like that too. And every other language you love as well, if it manages to gain significant traction.


No different from any programming language with several decades of evolution.


Hard agree. As others remarked, it's almost like the unwritten rule over time. Do programmers end up with such an attachment to a particular language that they prefer to pretend this kind of over-iterating isn't happening rather than simply address it?

It's a shame, as these things often begin well. I.e., if the lesson were learned instead, people might stick to providing features in a rich ecosystem rather than endlessly feature-creeping the core value proposition into oblivion.


Rust is a serious contender in this space, and closing the gap quickly.


Including introducing new features on a 6-week basis; just wait until Rust also gets 40 years of history.


It's true, improvements to Rust ship on a six-week cycle; the next will be Rust 1.69. Nice. I was inspired to improve a compiler diagnostic earlier this year†, and I benefit from that improvement already in the stable compiler today. Whereas if you "miss the train" with standard C++ you've got three years to wait each time, and of course the Powers That Be can ensure that, oops, you just missed the train again...

Of course Rust's improvements are actually compatible, not only by fiat, but because Rust's automation extensively tests each of these six-weekly releases against the vast field of Free Software out there written in Rust. Now maybe this is secretly happening for C++ and they're just very bad at it. Or, as seems more likely, it's not done; the results are the same either way: new C++ versions require extensive manual testing to upgrade your software before you can take advantage without too much fear.

† Rust knows that characters like 'A' aren't necessarily one byte, and it deliberately doesn't coerce them to fit in a byte; you'd need to convert them, so let ch: u8 = 'A'; won't compile. But ASCII characters can fit in a byte, so there is syntax to write that: b'A'. My change means that the compiler will explicitly suggest you modify that earlier mistake to let ch: u8 = b'A'; which works. However, it knows not to recommend nonsense like let ch: u8 = b'£'; the pound currency symbol isn't in ASCII, so you keep the same diagnostic, just explaining what's wrong, with no suggestion.


Again, wait until Rust gets 40 years of deployment history, distributed from the tiny 8 CPU to HPC workloads and FPGAs, or stuff running on Mars.

I doubt very much that Rust editions and backwards compatibility history will be able to survive 40 years with such diverse use cases, without introducing accidental complexity and corner cases along the way.

This is assuming that we can still use Rust and not Crab, as if Rust doesn't also have its own share of politics.


Rust is not designed for 8-bit CPUs like Tiny8. The smallest that usize is allowed to be is 16 bits.

In practice on these very tiny devices high level languages are total overkill. Grace's original "compiler" concept makes sense, but today's assemblers are more than sufficiently capable. You can literally memorise what all the individual memory locations (actually Tiny8 just admits they're registers, it's not as if it would make sense to also have registers when you only have 256 bytes of RAM) are used for which means even the idea of variable names is of doubtful value.

I don't know if it's practical to write a conforming C++ "freestanding" compiler for Tiny8, but I can't imagine it'd be any more useful than Rust would be if you did.

The reason there isn't stuff on Mars running Rust is mostly that it takes a long time both to get stuff approved for that kind of application and to send things to Mars. Still I'm sure in 40 years there will have been Rust on Mars because why not and I doubt it'll have significant impact on Rust syntax.

There already are inelegant decisions which cannot (for compatibility) be revoked, but they're much less numerous and egregious at this point in Rust's life than similar problems were in standard C++. If you want one to point at, for some reason, I suggest comparing ASCII predicates like char::is_ascii_lowercase(&self) -> bool with the non-ASCII ones like char::is_lowercase(self) -> bool

Because char is Copy, the latter design would be more elegant, and allows e.g. "C++".contains(char::is_uppercase) which is true, whereas the ASCII variant means we need the more awkward looking "C++".contains(|c: char| c.is_ascii_uppercase()) going via a lambda but alas the way we got here didn't allow that to happen.


Wait, maybe you meant one of the other "8-bit" CPUs which actually have a 16-bit address bus? That's kinda cheating, but yes, now we might actually want a programming language: we've got all this RAM to play with, we can make a stack, we can invent data structures; sure, Rust is fine with that setup. Or well, it's crippled, but not in any surprising ways you care about.

But it doesn't seem like there are interesting lessons here? Running the compiler on this sort of hardware was torment (I know, I'm old; I wrote my first software in the 1980s for a Commodore Vic 20, and my program source code didn't fit in RAM, so my parents had to buy a RAM expansion), but we just wouldn't do that today; we can cross compile from, say, a Raspberry Pi, or even a real computer.


You know this is a red herring. Frequency of the release cycle is orthogonal to the amount of changes or even how long the changes are in development.


Just wait until Rust gets 40 years old.

Pity I will no longer be around to check on it, given average human life expectancy.


I can't wait for C++64.

But "just wait until Rust will repeat C++'s mistakes" is just pure speculation. Language evolution doesn't have to make the language worse. Java, JS, C#, or Ada are pretty old now, and have been doing fine. Rust is well prepared for a 40-year lifespan with its edition system.


You are quite clearly unaware of the evolution pain points of moving past Java 8 and .NET Framework 4.8, of how the Java community embraces Java 20 or the C# one sees C# 12, and of the rate at which they are adding new features.

As for JS, everyone knows the mess of the Web ecosystem and frontend development.

Ada is doing just fine, as most vendors are still adopting Ada 2012. AdaCore and PTC are the only ones, out of the 7 remaining vendors, with the latest version.


But the ecosystem lagging years behind the latest version is a separate problem, and one that ironically the 6-week release cycle of Rust helps with: there are no major upgrades to fear, and small frequent releases make the ecosystem move with the compiler instead of having time to ossify and choose to stay on an old version (the same way nobody chooses to stay on an old Chrome, but people used to stick to good ol' versions of IE and Netscape).


Lagging behind is only one issue. I explicitly mentioned the drama of newer updates that make many uncomfortable, given the rate at which changes for the sake of it are now coming; just go read the comments on the C# 12 feature announcements for a taste of it.



