Most of the comments so far are negative, so I'll add something positive. One thing I love about the Haskell community is how they are always questioning their assumptions and genuinely seeking the best way to do things (often drawing upon or contributing to computer science research). The core concepts behind Haskell are clean and simple (essentially something between System F and System Fω), but there's a lot of baggage on the surface that obscures that underlying elegance. We should encourage people taking an introspective look into their tools and asking how they can be better, even if certain proposals are unrealistic or controversial.
I think the author is just writing down their opinions (which are worth discussing!) and not seriously trying to start a new GHC frontend. Personally, I think Haskell is approximately stuck in a local maximum, which can only be escaped by embracing dependent types (which are actually simpler than where Haskell has been heading) rather than building increasingly complex approximations of them. Once you try a dependently typed language like Agda, Lean, Coq, or Idris, it's hard to go back to the complexity of having two (or more!) separate languages for types and programs.
Regarding the proposals in the article, the most interesting to me is (2), although I'm not sure about some of the specifics. In general, I think Haskell needs a way to graduate (or retire) language extensions, rather than having them accumulate unboundedly. It's harder to talk about Haskell when everyone is using a different flavor of it.
> Personally, I think Haskell is approximately stuck in a local maximum, which can only be escaped by embracing dependent types
To give a counterpoint, there seem to be lots of small wins that aren't taken, possibly in part because people seem to wait for the big ideas.
Take partial functions like head: You can do what Haskell and Java do and throw an exception. Or you can have dependent types and statically prevent the function from being applied to an empty list. But the obvious and easy solution in the current language would be to return Maybe, which isn't done because there's a feeling that it's not a big enough step to be worth the effort, and dependent types will eventually solve this anyway.
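For concreteness, a total head is a two-line definition (and, as pointed out elsewhere in the thread, base already ships this as Data.Maybe.listToMaybe):

```haskell
-- A total alternative to Prelude's partial 'head'.
-- base already provides this function as Data.Maybe.listToMaybe.
safeHead :: [a] -> Maybe a
safeHead (x:_) = Just x
safeHead []    = Nothing

-- >>> safeHead [1, 2, 3]
-- Just 1
-- >>> safeHead ([] :: [Int])
-- Nothing
```

The whole debate is about the type signature, not the implementation: `[a] -> a` is a lie for empty lists, while `[a] -> Maybe a` is honest.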
> But the obvious and easy solution in the current language would be to return Maybe, which isn't done because there's a feeling that it's not a big enough step to be worth the effort, and dependent types will eventually solve this anyway.
That's not why it's not done. listToMaybe already exists[1] and you can't change the type of head without breaking everyone's code, so head in the next version of base will come with a warning[2] and that's about as much as you can do whilst still maintaining backwards compatibility.
That's an interesting counterpoint, but I think it's a bit of a different tradeoff. Adding, say, dependent types can be done as a gradual process with opt-in (via language extensions) and "only" requires heroic effort on the part of the compiler writers... whereas changing 'head' requires 10,000 Hackage packages to update their code.
This is all tied into how inflexible the base/Prelude story is currently, etc. If the Prelude were a fully independent library where you could just depend on any old version you like, then there'd be no problem changing the signature. (Other than the usual diamond dependency problems.) Of course you can choose NoImplicitPrelude and go from there, but then you're already in a place where changing 'head' doesn't matter to you, only the maintainer of your alternative Prelude.
People are working on both of these aspects, and that's got me really excited about where Haskell is going these days!
(I've always loved Haskell as a language, but there have been undeniable ecosystem issues.)
In cases where head is used, though, you clearly just want to unpack the first element of a list. So if you replace head with a function that returns a Maybe, all you're going to do is pattern match on its result to check if it's a Just. So I don't see a benefit over matching x:_ against the list straight away.
(I do see how a function like that is useful, but I don’t think it’s a good replacement for head.)
> all you’re going to do is pattern match on its result to check if it’s a Just
Something is eventually going to pattern match on it. But it might well be some handler way way up at the top of my program, and all I want to do right now is `map` and `bind` as normal.
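A small sketch of that style — the function names here are illustrative, not from any library: you compute over the Maybe with fmap/bind as usual, and the eventual pattern match lives in whatever handler sits on top:

```haskell
import Data.Maybe (listToMaybe)

-- No local pattern match: just map over the Maybe and pass it along.
firstWordLength :: String -> Maybe Int
firstWordLength s = length <$> listToMaybe (words s)

-- The "eventual" match can live far away, e.g. at the top of the program:
report :: Maybe Int -> String
report = maybe "empty input" (\n -> "first word has " ++ show n ++ " letters")
```

Only `report` ever inspects the Just/Nothing; every intermediate step stays oblivious to the empty-list case.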
Then you're clearly not using head. For this you can use listToMaybe.
Don't change the meaning of an identifier to something meant for a completely different use. The alternative change involving dependent types or liquid types doesn't break existing code. Changing head so that it returns a Maybe breaks code, because its use case is different.
I'm not using head, because I know Haskell well. But someone new to the language is never, ever, ever going to intuit that you get the head of a list with `listToMaybe` and `head` actually means `unsafeHead` and should almost never be used.
> Changing head so that it returns a Maybe breaks code
Of course. So did `Applicative`. So did `Foldable` and `Traversable`. So will removing (/=) from Eq, whenever that actually gets merged - and that was imo a far, far less problematic wart than having partial functions in Prelude.
Breaking changes need to be made very carefully and very slowly: simplified subsumption was terribly handled, if not an outright mistake. But refusing to make them at all is how you get C++.
> But someone new to the language is never, ever, ever going to intuit that you get the head of a list with `listToMaybe`
They don't have to. Pattern matching is one of the first things they will learn, so they can use that instead.
> and `head` actually means `unsafeHead` and should almost never be used.
They don't need to intuit it. The next version of base will have a warning on `head` telling them to use a safer alternative instead.
By the way, the Haskell community is currently struggling to accept that it should make fewer breaking changes, in order to become a more stable platform for industry use. We're definitely not going to start breaking more code!
I recently got into an argument with someone who insisted that since “undefined” could be considered “non-terminating” there was nothing wrong with the current behaviour.
> I think Haskell needs a way to graduate (or retire) language extensions, rather than having them accumulate unboundedly. It's harder to talk about Haskell when everyone is using a different flavor of it.
That is what the standardization process is for. I don't think that the parties that could write a new Haskell standard have the time, resources, or bandwidth to work on one. That's why for now we're stuck with 98, 2010 and a bunch of extensions.
I really like the Haskell that goes easy on the dependent-type sort of stuff, and I say that as someone who has been on-call for Sigma (which is, like, a top-3 industrial use case).
It’s just really, really expressive without trying to solve the halting problem via some unification failure mode.
Type classes and GADTs already give you 10x the canvas you can get without trouble anywhere else, and if you keep it tight, you have the tooling to really tune it up.
Idris is awesome; do that if you're doing that. Haskell is changing all the time: it became a serious production tool, and I'd really like for it to stay one.
Haskell's current dependent typing is expressive, but also painful to use and almost impossible to debug. The main thing I want out of dependent Haskell isn't more power (though I wouldn't turn it down), it's the ability to write type-level Haskell like Haskell instead of some early Prolog prototype.
> embracing dependent types (which are actually simpler than where Haskell has been heading) rather than building increasingly complex approximations of them
Very much second that! Haskell has band-aid type-level programming that is so awful, but sometimes so useful, that it's really hurting a lot. I would just really like to deprecate type families, and as a first step I would like to see some non-dependently-typed type-promoted functions. There's a great, pragmatic proposal I would really like to see implemented. I, personally, don't need dependent types that much, but type families are so unergonomic and really a dead end, I think. It's just stupid.
It's sad to see that dependent Haskell progress is losing steam and, as is my impression, maybe getting a bit lost in the weeds?
My goal with this tutorial was to introduce the core aspects of the language (dependent types, tactics, etc.) from first principles. If you're fascinated by proof assistants like Coq or Lean and want to understand how to use them, this tutorial is written for you.
I've done Logical Foundations and part of Programming Language Foundations, and some of the stuff in your tutorial is new to me (especially dependent types).
Your tutorial is very interesting, but in comparison to the Software Foundations series you have relatively few exercises and they're all at the end of the chapter. I found it helpful to do exercises as I went along, in many cases testing or exercising my knowledge of a new concept or Coq feature right away. So I would suggest writing more exercises and sprinkling them closer to where some of the material is introduced.
Also, some of your exercises in some chapters have a pretty steep ramp-up in difficulty, so again it might be nice to have exercises that gradually build in difficulty.
These are probably important considerations when readers are going to be working through this material entirely on their own and it might be their first exposure to Coq. (I think your tutorial says it was previously presented in-person, which has got to be a lot easier, since people can ask questions!)
Asymmetric fields are in a temporary transition state to/from required. So it would seem a bit odd to me for the transition state to be the default.
However, I think I see your reasoning: you don't want to accidentally introduce a required field without first making it asymmetric, and having asymmetric as the default would effectively prevent that.
You're considering an alternative behavior for asymmetric fields in choices, but you need to consider the behavior of optional fields in choices too.
In particular, the following duality is the lynchpin that ties everything together: "asymmetric" behaves like optional for struct readers and choice writers, but required for struct writers and choice readers.
From that duality, the behavior of asymmetric fields is completely determined by the behavior of optional fields. It isn't up to us to decide arbitrarily.
So the question becomes: what is the expected behavior of optional choice fields?
Along the lines you proposed, one could try to make optional choice fields behave as if they had an uninhabited type, so that writers would be unable to instantiate them—then you get exactly the behavior you described for asymmetric choice fields. Optional fields are symmetric, so both readers and writers would treat them as the empty type. This satisfies the safety requirement, but only in a degenerate way: optional fields would then be completely useless.
So this is not the way to go.
It's important to take a step back and consider what "optional" ought to mean: optionality for a struct relaxes the burden on writers (they don't have to set the field), whereas for a choice the burden is relaxed on readers (they don't have to handle the field). So how do you allow readers to ignore a choice field? Not by refusing to construct it (which would make it useless), but rather by providing a fallback. So the insight is not to think of optional as a type operator (1 + T) that should be dualised in some way (0 * T), but rather to think about the requirements imposed on writers and readers.
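That requirements-based view can be modeled with plain types. This is a toy sketch with made-up names, not Typical's generated code: an optional struct field relaxes the writer (readers see a Maybe), while an optional choice case relaxes the reader (the writer must pair the new case with a fallback built from cases readers are required to handle):

```haskell
-- Toy model of reader/writer obligations (illustrative names only).

-- Struct: an optional field means the writer may omit it,
-- so the reader sees a Maybe and must handle Nothing.
data Profile = Profile
  { name :: String        -- required: writer must set it
  , bio  :: Maybe String  -- optional: writer may omit it
  }

-- Choice: an optional case means the reader may ignore it, so the
-- writer must supply a fallback that readers already understand.
data ApiError
  = Timeout                   -- required case: every reader handles it
  | RateLimited Int ApiError  -- optional case: payload plus a fallback
  deriving Show

-- A reader that only knows the required case can chase fallbacks:
handleError :: ApiError -> String
handleError Timeout           = "timed out"
handleError (RateLimited _ e) = handleError e  -- unrecognized: use fallback
```

In the real framework the fallback substitution happens in the deserializer rather than in user code, but the obligations are the same: optionality moves a burden off one side by placing a compensating burden on the other.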
You're right to note the duality between sums and products and initial and terminal objects, and indeed category theory had a strong influence on Typical's design.
I can't speak with authority about ATD, but the following might be helpful:
Aside from algebraic data types, the big selling point of Typical is asymmetric fields. That's the crucial feature that distinguishes Typical from every other framework. Without asymmetric fields, there is no safe way to introduce or retire required fields. People using other frameworks fear required fields (rightly so), whereas Typical gives you the tools to embrace them.
Avro has no equivalent of Typical's asymmetric fields. In Avro:
1. Record types can have optional (but not asymmetric) fields, just like in most IDLs. Avro implements this by taking the union of the field type with a special `null` type, but in practice it's equivalent to having optional fields.
2. Avro doesn't support proper sum types, but it has two approximations of them: unions (but not tagged unions) and enums. Unions have no support for adding/removing new cases safely. Enums can have a default value that is used if the case is not recognized.
From the "Typical perspective", both of these are problematic:
1. If you are trying to introduce a required field in Avro, you first introduce it as optional, and then at some point in the future when you have manually confirmed that the field is always being set by every writer, you can promote the field to required. Typical's asymmetric fields offload the burden of having to do that manual confirmation onto the type checker.
2. For unions, Avro offers no equivalent of optional or asymmetric cases. If you add a new case, you better not use it until all readers can handle it, and the type checker won't enforce that. For enums, the solution of having default values is unsatisfactory, because not every type has a suitable default. In practice, this usually means adding a catch-all default that is only used to signal an unrecognized input, but then what is a reader supposed to do with that? With Typical, if a writer introduces a new optional or asymmetric case, it must then provide an appropriate fallback to use when the new case isn't recognized by readers. For example, if a new specific type of error is introduced (e.g., with a stack trace), the fallback might be a more general type of error (e.g., with only an error message) that readers already know how to handle. If the new case is asymmetric, then you know readers can handle it, so once it's rolled out you know it's safe to subsequently promote the case to required (so that writers no longer need to provide a fallback).
I don't really understand your summary for record types; it doesn't match with my experience of Avro schema evolution.
Avro's schema resolution rules [0] for record types seem to implement asymmetry almost exactly. It's just that this is done by having separate reader and writer schemas, which is less ergonomic and clear.
To add a new required field, you add it as a *required* field to the writer's schema. To do this asymmetrically, you can make it an *optional* field in the reader's schema, as a union with null. Once all clients are compliant, you can change the reader schema to replace the union with a plain type.
To remove a required field, you can go through this same dance. You can also always delete a field from the reader side, and any written data under that tag will be ignored.
Regarding union and enums: a nit is that unions are tagged, at least on the wire [1]. The writer-specified fallback is an important difference though, I definitely agree with that, and it seems like a genuinely novel improvement.
> If I have a single language codebase, why should I prefer the first approach?
Probably the most compelling reason is that a single language codebase might not be a single language codebase forever. But, as you suggested, the switch to a language-agnostic framework can be deferred until it becomes necessary.
However, there's a reason to use Typical specifically: asymmetric fields. This feature allows you to change your schema over time without breaking compatibility and without sacrificing type safety.
If you ever expect to have newer versions of the codebase reading messages that were generated by older versions of the codebase (or vice versa), this is a concern that will need to be addressed. This can happen when you have a system that isn't deployed atomically (e.g., microservices, web/mobile applications, etc.) or when you have persisted messages that can outlive a single version of the codebase (e.g., files or database records).
An embedded DSL could in principle provide asymmetric fields, but I'm not aware of any that do.
> Typical isn't focused on JSON, so it doesn't seem like it is optimized for web.
It just makes different trade-offs than most web-based systems, but that doesn't make it unsuitable for web use. We have comprehensive integration tests that run in the browser. Deserialized messages are simple passive data objects [1] that can be logged or inspected in the developer console.
You're exactly right about other frameworks appealing to the lowest common denominator, whereas Typical isn't willing to make such compromises.
Languages without proper sum types are at a disadvantage here, but it's possible to encode sum types with exhaustive pattern matching in such languages using the visitor pattern. That approach requires some ergonomic sacrifices (e.g., having to use a reified eliminator rather than the built-in `switch` statement), and people using those languages may prefer convenience over strong guarantees. It's an unfortunate impedance mismatch.
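The visitor trick is essentially a Church encoding: replace the sum type with a function that demands one handler per case, so exhaustiveness is enforced by the type of the eliminator. Sketched here in Haskell (where you would normally just use a real sum type, which is exactly the point):

```haskell
{-# LANGUAGE RankNTypes #-}

-- A real sum type with two cases...
data Shape = Circle Double | Rect Double Double

-- ...and its visitor/Church encoding: a ShapeV is anything that, given
-- one handler per case, produces a result. The caller must supply every
-- handler, so "matching" is exhaustive by construction.
newtype ShapeV = ShapeV (forall r. (Double -> r) -> (Double -> Double -> r) -> r)

circleV :: Double -> ShapeV
circleV radius = ShapeV (\onCircle _ -> onCircle radius)

rectV :: Double -> Double -> ShapeV
rectV w h = ShapeV (\_ onRect -> onRect w h)

-- The reified eliminator replaces the built-in case/switch:
area :: ShapeV -> Double
area (ShapeV match) = match (\r -> pi * r * r) (\w h -> w * h)
```

In a language like Java or Go the same shape becomes an interface with one `visit` method per case, which is the ergonomic sacrifice mentioned above: the eliminator is an object instead of a `switch`.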
I was wondering if you've considered having an alternative, human readable encoding (either your own syntax or a JSON-based schema)?
I find it quite useful to be able to inspect data by eye and even hand-edit payloads occasionally, and having a standard syntax for doing so would be nice.
(More generally, it's a shame JSON doesn't support sum types "natively" and I think a human readable format with Typical's data model would be really cool).
It's a good question! The binary format is completely inscrutable to human eyes and is not designed for manual inspection/editing. However:
1) For Rust, the generated types implement the `Debug` trait, so they can be printed in a textual format that Rust programmers are accustomed to reading.
2) For JavaScript/TypeScript, deserialized messages are simple passive data objects that can be logged or inspected in the developer console.
So it's easy to log/inspect deserialized messages in a human-readable format, but there's currently no way to read/edit encoded messages directly. In the future, we may add an option to encode messages as JSON which would match the representation currently used for decoded messages in JavaScript/TypeScript, with sums being encoded as tagged unions.
> it's a shame JSON doesn't support sum types "natively"
You can describe it using JSON Schema though[1], using "oneOf" and "const". Though I prefer the more explicit way of using "oneOf" combined with "required" to select one of a number of keys[2].
It seems like the safety rules are buggy because some assumptions are missing?
The safety rules say that adding an asymmetric field is safe, and converting asymmetric to required is safe. If you do both steps, then this implies that adding a required field is safe. But it’s not. As you say, it’s not transitive.
But lack of transitivity also means that a pull request that converts a field from asymmetric to required is not safe, in general. You need to know the history of that field. If you know that the field has always been asymmetric (unlikely) or all the older binaries are gone, then it’s safe. A reviewer can’t determine this by reading a pull request alone.
Maybe waiting until old binaries are gone is what you mean by “a single change” but it seems like that should be made explicit?
No IDL that supports required fields can offer the transitivity property you're referring to.
Typical has no notion of commits or pull requests in your codebase. The only meaningful notion of a "change" from Typical's perspective is a deployment which updates the live version of the code.
When promoting an asymmetric field to required (for example), you need to be sure the asymmetric field has been rolled out first. If you were using any other IDL framework (like Protocol Buffers) and you wanted to promote an optional field to required, you'd be faced with the same situation: you first need to make sure that the code running in production is always setting that field before you do the promotion. Typical just helps more than other IDL frameworks by making it a compile-time guarantee in the meantime.
We should be more careful about how we use the overloaded word "change", so I'm grateful you pointed this out. Another comment also helped me realize how confusing the word "update" can be.
That's reasonable but it's a different notion of what a safe change is than I remember from using protobufs. I believe they just say adding or removing a required field isn't backward compatible.
The word "safe" seems worth clarifying. There are some changes that are always safe, like reordering fields, because they are also transitively safe.
Removing an optional field is safe as long as you remember not to reuse the index. Sometimes the next unused index is informally remembered using a counter in a comment.
Other changes, like converting asymmetric to required, seem like they're in a different category where it can be done safely but requires more caution. It's more like a database migration where you control all the code that accesses the database. There is a closed-world assumption.
> That's reasonable but it's a different notion of what a safe change is than I remember from using protobufs. I believe they just say adding or removing a required field isn't backward compatible.
Safe just means the old code and the new code can coexist (e.g., during a rollout), which requires compatibility in both directions. Not just backward compatibility.
This is true for Protocol Buffers as well, except they have no safe way to introduce or remove required fields. So the common wisdom there is to not use required fields at all.
I think our misunderstanding is really about use cases.
Sometimes Protocol Buffers are used to write log files, and the log files are stored and never migrated. To read the oldest logs, you need backward compatibility all the way back to the first production code that was released. This means transitive safety is needed and the changes you can make to the schema, which is used as a file format, are pretty limited.
This isn't just a limitation of the Protocol Buffer format. Safety rules are different when you do long-term persistence. If Typical were used that way, you could only trust safety rules that are transitive. Asymmetric fields could be added, but the fallbacks never go away.
(Also, a rollback doesn't get rid of any logs that were generated, so it's not a full undo. As you say, both forward and backward compatibility are needed.)
Serialization isn't just used for network calls, and even when it is, sometimes you don't control when clients upgrade, such as when the clients get deployed by different companies, or as part of a mobile app. So it seems worth clarifying the use cases you have in mind when making safety claims.
I think you're right, and now I understand why the rules seemed buggy to you but not to me. You're considering persisted messages that need to be compatible with many versions of the schema, whereas the discussion and rules are formulated in the context of RPC messages between services which only need to be compatible with at most three versions of the schema: the version that generated the message, the version before that, and the version after. The README could do better to clarify that.
In the persisted messages scenario, there is one change to the rules: you can never introduce a required field (since old messages might not have it). Not even asymmetric fields can be promoted to required in that scenario.
To expand on this, a way to think about it is that there are some changes that are always safe and others that depend on what data is still out there (or that’s still being generated) that you want the code to be able to read.
“What writers are out there” isn’t a property of the code alone, though maybe you could use the code to keep track of what you intended. The releases deployed to production determine which writers exist, and they keep running until stopped and perhaps upgraded.
In some cases a serialization schema might be shared in a common library among multiple applications, each with its own release schedule, making it hard to find out which writers are still out there.
It’s much easier when the serialization is only used in services where you control when releases happen and when they’re started and stopped.
Thus, asymmetric fields in choices behave like optional fields for writers and like required fields for readers—the opposite of their behavior in structs.
So if you have a schema change which adds an asymmetric field to both a struct and a choice, it seems both writers and readers need to be updated in order to successfully transmit to each other?
If you add an asymmetric field to a struct, writers need to be updated to set the field for the code to compile.
If you also add an asymmetric field to a choice, readers need to be updated to be able to handle the new case for the code to compile.
You can do both in the same change. The new code can be deployed to the writers and readers in any order. Messages generated from the old code can be read by the new code and vice versa, so it's fine for both versions of the code to coexist during the rollout.
After that first change is rolled out, you can promote the new fields to required. This change can also be deployed to writers and readers in any order. Since writers are already setting the new field in the struct, it's fine for readers to start relying on it. And since readers can already handle the new case in the choice, it's fine for writers to start using it.
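The struct side of that two-step rollout can be modeled as versions of the generated types. These are hypothetical types for illustration, not Typical's actual output: the asymmetric step splits the writer's and reader's views, and promotion to required collapses them again:

```haskell
-- Hypothetical generated types across schema versions (illustrative only).

-- v0: the struct before the change.
data PointV0 = PointV0 { x0 :: Int }

-- v1: 'y' added as asymmetric. Writers must set it,
-- but readers must still tolerate its absence.
data PointV1Write = PointV1Write { xW :: Int, yW :: Int }        -- writer's view
data PointV1Read  = PointV1Read  { xR :: Int, yR :: Maybe Int }  -- reader's view

-- v2: once v1 is fully rolled out, every in-flight message carries 'y',
-- so promoting it to required on both sides is safe.
data PointV2 = PointV2 { x2 :: Int, y2 :: Int }
```

Each deployment only ever has to interoperate with its immediate neighbors: v0 messages satisfy v1 readers (the Maybe absorbs the missing field), and v1 writers already satisfy v2 readers.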
Ah, I was thinking in terms of the actual messages, not compilation. So it should read "Fields are required for compilation by default"?
This leads me to versioning. Imagine you have some old code which won't be upgraded, say an embedded system. Either you don't promote to "required", or you do versioning.
Given the lack of any mention of versioning, I take it that's to be dealt with externally? I.e., as a separate schema, detected before handing the rest of the data to the generated code?
Do you think you could ever generate types for go? The protobuf implementation of oneof in go is pretty rough to look at, and not fun to type over and over.
Yes! We have comprehensive integration tests that run in the browser to ensure the generated code only uses browser-compatible APIs. Also, the generated code never uses reflection or dynamic code evaluation, so it works in Content Security Policy-restricted environments.