
I remember playing with Alpaca a few years ago, and it was fun, though I didn’t find the resulting code to be significantly less error-prone than when I wrote regular Erlang. It’s inelegant, but I find that Erlang’s quasi-runtime-typing with pattern matching gets you pretty far, and it falls into Erlang’s “let it crash” philosophy nicely.

Honestly, and I realize that this might get me a bit of flak here and that’s obviously fine, but I find type systems start losing utility with distributed applications. Ultimately everything being sent over the wire is just bits. The wire doesn’t care about monads or integers or characters or strings or functors, just 1s and 0s, and ultimately I feel like imposing a type system can often get in the way more than it helps. There’s so much weirdness and uncertainty associated with stuff going over the wire, and pretty types often don’t really capture that.

I haven’t tried Gleam yet, but I will give it a go; it’s entirely possible it will change my opinion on this, and I am willing to have my mind changed.



I don’t understand this comment. Yes, everything going over the wire is bits, but both endpoints need to know how to interpret that data, right? Types are a great tool for doing this. They can even drive the exact wire protocol, and verification of both the data and the protocol version.

So it’s hard to see how types get in the way instead of being the ultimate toolset for shaping distributed communication protocols.


Bits get lost; if you don’t have protocol verification you get mismatched types.

Types naively used can fall apart pretty easily. Suppose you have some data being sent in three chunks. Suppose you get chunk 1 and chunk 3 but chunk 2 arrives corrupted for whatever reason. What do you do? Do you reject the entire object since it doesn’t conform to the type spec? Maybe you do, maybe you don’t, or maybe you structure the type around it to handle that.

But let’s dissect that last suggestion; suppose I do modify the type to encode that. Suddenly pretty much every field more or less just becomes Maybe/Optional. Once everything is Optional, you don’t really have a “type” anymore, you have a runtime check of the type everywhere. This isn’t radically different than regular dynamic typing.
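
Concretely, the shape I’m describing ends up looking something like this (a rough Haskell sketch, field names invented):

    -- once any chunk can be missing or corrupt, every field ends up wrapped in Maybe
    data Chunked = Chunked
      { header  :: Maybe String
      , body    :: Maybe String
      , trailer :: Maybe String
      }
    -- every consumer now matches on Just/Nothing, much like a null check in a dynamic language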

There are more elaborate type systems that do encode these things better like session types, and I should clarify that I don’t think that those get in the way. I just think that stuff like the C type system or HM type systems stop being useful, because these type systems don’t have the best way to encode the non-determinism of distributed stuff.

You can of course ameliorate this somewhat with higher level protocols like HTTP, and once you get to that level types do map pretty well and you should use them. I just have mixed feelings for low-level network stuff.


> But let’s dissect that last suggestion; suppose I do modify the type to encode that. Suddenly pretty much every field more or less just becomes Maybe/Optional. Once everything is Optional, you don’t really have a “type” anymore, you have a runtime check of the type everywhere. This isn’t radically different than regular dynamic typing.

Of course it’s different. You have a type that accurately reflects your domain/data model. Doing that helps ensure you know to implement the necessary runtime checks correctly. It can also help you avoid implementing a lot of superfluous runtime checks for conditions you don’t expect to handle (and to treat those conditions as invariant violations instead).


No, it really isn’t that different. If I had a dynamic type system I would have to null check everything. If I declare everything as a Maybe, I would have to null check everything.

For things that are invariants, that’s also trivial to check against with `if(!isValid(obj)) throw Error`.


Sure. The difference is that with a strong type system, the compiler makes sure you write those checks. I know you know this, but that’s the confusion in this thread. For me too, I find static type systems give a lot more assurance in this way. Of course it breaks down if you assume the wrong type for the data coming in, but that’s unavoidable. At least you can contain the problem and ensure good error reports.


The point of a type system isn’t ever that you don’t have to check the things that make a value represent the type you intend to assign it. The point is to encode precisely the things that you need to be true for that assignment to succeed correctly. If everything is in fact modeled as an Option, then yes you have to check each thing for Some before accessing its value.

The type is a way to communicate (to the compiler, to other devs, to future you) that those are the expected invariants.

The check for invariants is trivial as you say. The value of types is in expressing what those invariants are in the first place.
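
A small sketch of what I mean (the NonEmptyName type is invented for illustration): the invariant is expressed by the type, the check lives in one constructor, and everything downstream relies on it without re-checking.

    newtype NonEmptyName = NonEmptyName String

    -- the only way to obtain a NonEmptyName is through this check
    mkName :: String -> Maybe NonEmptyName
    mkName "" = Nothing
    mkName s  = Just (NonEmptyName s)

    -- callers can rely on the invariant; no re-validation needed here
    greet :: NonEmptyName -> String
    greet (NonEmptyName n) = "hello, " ++ n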


You missed the entire point of the strong static typing.


I don’t think I did. I am one of the very few people who have had paying jobs doing Scala, Haskell, and F#. I have also had paying jobs doing Clojure and Erlang: dynamic languages commonly used for distributed apps.

I like HM type systems a lot. I’ve given talks on type systems, and I was working on trying to extend type systems to deal with these particular problems in grad school. This isn’t meant to be a statement on types in general. I am arguing that most type systems don’t encode for a lot of the uncertainty that you find when going over the network.


You're conflating types with the encoding/decoding problem. Maybe your paying jobs didn't provide you with enough room to distinguish between these two problems. Types can be encoded optimally with a minimal-bits representation (for instance: https://hackage.haskell.org/package/flat), or they can be encoded redundantly with all the default/recovery/omission information. What you actually do with that encoding on the wire in a distributed system, with or without versioning, is up to you and doesn't depend on the specific type system of your language, but a strong type system offers you unmatched precision both at program boundaries, where encoding happens, and in business logic. Once you've got that `Maybe a` you can (<$>) in exactly one place at the program's boundary, and then proceed as if your data had always been provided without omission. And then you can combine (<$>) with `Alternative f` to deal with your distributed system's silly payloads in a versioned manner. What's your dynamic language's null-checking equivalent for that?
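
To make the boundary point concrete, a minimal sketch (record and field names invented): the omissions are resolved once, and everything downstream works with a total value.

    data RawUser = RawUser { rawName :: Maybe String, rawAge :: Maybe Int }
    data User    = User    { name :: String, age :: Int }

    -- the Maybes are collapsed in exactly one place, at the program's boundary
    fromWire :: RawUser -> Maybe User
    fromWire r = User <$> rawName r <*> rawAge r
    -- past this point, code takes a User and never checks for omission again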


With all due respect, you can use all of those languages and their type systems without recognizing their value.

For ensuring bits don't get lost, you use protocols like TCP. For ensuring they don't silently flip on you, you use ECC.

Complaining that static types don't guard you against lost packets and bit flips is missing the point.


With all due respect, you really do not understand these protocols if you think “just use TCP and ECC” addresses my complaints.

Again, it’s not that I have an issue with static types “not protecting you”, I am saying that you have to encode for this uncertainty regardless of the language you use. The way you typically encode for that uncertainty is to use an algebraic data type like Maybe or Optional. Checking against a Maybe for every field ends up being the same checks you would be doing with a dynamic language.

I don’t really feel the need to list out my full resume, but I do think it is very likely that I understand type systems better than you do.


Fair enough, though I feel so entirely differently that your position baffles me.

Gleam is still new to me, but my experience writing parsers in Haskell and handling error cases succinctly through functors was such a pleasant departure from my experiences in languages that lack typeclasses, higher-kinded types, and the abstractions they allow.

The program flowed happily through my Eithers until it encountered an error, at which point that was raised with a nice summary.
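
Roughly that style, as a small sketch with made-up parsing logic: the first Left short-circuits the rest and surfaces as the error summary.

    data Config = Config { port :: Int, host :: String } deriving Show

    parseConfig :: [(String, String)] -> Either String Config
    parseConfig kv = do
      p <- maybe (Left "missing port") Right (lookup "port" kv)
      h <- maybe (Left "missing host") Right (lookup "host" kv)
      Right (Config (read p) h)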

Part of that was GHC extensions, though they could easily be translated into boilerplate, and that only had to be done once per class.

Gleam will likely never live up to that level of programmer joy; what excites me is that it’s trying to bring some of it to BEAM.

It’s more than likely your knowledge of type systems far exceeds mine—I’m frankly not the theory type. My love for them comes from having written code both ways, in C, Python, Lisp, and Haskell. Haskell’s types were such a boon, and it’s not the HM inference at all.


> ends up being the same checks you would be doing with a dynamic language

Sure thing. Unless a dev forgets to do (some of) these checks, or some code downstream changes and the upstream checks become gibberish or insufficient.


I know everyone says that this is a huge issue, and I am sure you can point to an example, but I haven’t found that types prevented a lot of issues like this any better than something like Erlang’s assertion-based system.


When you say “any better than” are you referring to the runtime vs compile-time difference?


While I don’t agree with the OP about type systems, I understand what they mean about Erlang. When an Erlang node joins a cluster, it can’t make any assumptions about the other nodes, because there is no guarantee that the other nodes are running the same code. That’s perfectly fine in Erlang, and the language is written in a way that makes that situation possible to deal with (using pattern matching).


Interesting! I don't share that view at all — I mean, everything running locally is just bits too, right? Your CPU doesn't care about monads or integers or characters or strings or functors either. But ultimately your higher level code does expect data to conform to some invariants, whether you explicitly model them or not.

IMO the right approach is just to parse everything into a known type at the point of ingress, and from there you can just deal with your language's native data structures.
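
As a rough sketch of what I mean (using aeson; the type and field names are invented), the uncertainty is handled in one decode at ingress:

    {-# LANGUAGE DeriveGeneric #-}
    import Data.Aeson (FromJSON, eitherDecode)
    import GHC.Generics (Generic)
    import qualified Data.ByteString.Lazy as BL

    data Order = Order { orderId :: Int, customer :: String }
      deriving (Show, Generic)
    instance FromJSON Order

    -- raw bytes in, either an error or a native Order out
    ingress :: BL.ByteString -> Either String Order
    ingress = eitherDecode
    -- past this point the rest of the program only ever sees Order, never raw bytes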


I know everything reduces to bits eventually, but modern CPUs and memory aren’t as “lossy” as the network is, meaning you can make more assumptions about the data being and staying intact (especially if you have ECC).

Once you add distribution you have to encode for the fact that the network is terrible.

You absolutely can parse at ingress, but then there are issues with that. If the data you got is 3/4 good, but one field is corrupted, do you reject everything? Sometimes, but probably not; network calls are too expensive, so you encode that into the type with a Maybe. But of course any field could be corrupt, so you have to encode lots of fields as Maybes. Suddenly you have reinvented dynamic typing, but it’s LARPing as a static type system.


I think you can avoid most issues by not doing what you're describing! Ensuring data arrives uncorrupted is usually not an application-level concern, and if you use something like TCP you get that functionality for free.


TCP helps, but only to a certain extent; it only guarantees the ordering of bytes within its session. Suppose you have to construct an object out of three separate transmissions, like some kind of multipart style thing. If one of the transmissions gets corrupted or TCP errors out, then you still fall into that Maybe trap.


so you need transactions?

I get what you’re saying, but can’t you have the same issue if you instead have 3 local threads that you need to get the objects from? One can throw an exception and you only receive 2; same problem.


Sometimes, but I am arguing that you need to encode for this uncertainty if you want to make distributed apps work correctly. If you can do transactions for what you’re doing then great, not every app can do that.

When you have to deal with large amounts of uncertainty, static types often reduce to a bunch of optionals, forcing you to null check every field. This is what you end up having to do with dynamic typing as well.

I don’t think types buy you much in cases with extreme uncertainty, and I think they create noise as a result.

It’s a potentially similar issue with threads as well, especially if you’re not sharing data between them, which leaves you with much the same problems as a distributed app.

A difference is that it’s much cheaper to do retries within a single process compared to doing it over a network, so if something gets borked locally then a retry is (comparatively) free.


> static types often reduce to a bunch of optionals, forcing you to null check every field

On one end, you write / generate / assume a deserialiser that checks whether incoming data satisfies all the required invariants, e.g. that all fields are present. On the other end, you specify a type that has all the required fields in the required format.

If deserialisation fails to satisfy the type’s requirements, it produces an error, which you can handle by e.g. falling back to a different type, rejecting the operation, or re-requesting the data.

If deserialisation doesn't fail – hooray, now you don't have to worry about uncertainty.

The important thing here is that uncertainty is contained in a very specific place. It's an uncertainty barrier, if you wish: before it there's raw data, after it it's either an error or valid data.
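
A tiny sketch of such a barrier (everything here is invented for illustration): raw input goes in, and out comes either an error or data the rest of the program can trust.

    data Point = Point { x :: Int, y :: Int } deriving Show

    -- the barrier: the only place that deals with raw, untrusted input
    barrier :: String -> Either String Point
    barrier raw = case map reads (words raw) of
      [[(px, "")], [(py, "")]] -> Right (Point px py)
      _                        -> Left ("could not parse point: " ++ raw)

    -- callers branch exactly once on the result: reject, fall back, or re-request
    handle :: String -> String
    handle raw = either ("rejecting: " ++) show (barrier raw)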

If you don’t have a strict barrier like that, every place in the program has to deal with uncertainty.

So it’s not necessarily about dynamic / static. It’s about being able to set barriers that narrow down uncertainty and grow the number of assumptions you can safely make. The good thing about an ergonomic type system is that it allows you to offload these assumptions from your mind by encoding them in the types and letting the compiler worry about them.

It’s basically automation of assumption bookkeeping.


Why couldn't you fix this by validating at the point of ingress? If one of the three transmissions fails, retry and/or alert the user.


But your program HAS to have some invariants. If those are not held, simply reject all the data!

What the hell is really the alternative here? Do you just pretend your process can accept any kind of data, and just never do anything with it??

If you need an integer and you get a string, you just don't work. This has nothing to do with types. There's no solution here, it's just no thank you, error, panic, 500.


You handle that in the validation layer, like millions of people have done with dynamic languages in the past.


> Honestly, and I realize that this might get me a bit of flak here and that’s obviously fine, but I find type systems start losing utility with distributed applications. Ultimately everything being sent over the wire is just bits.

Actually, Gleam somewhat shares this view: it doesn’t pretend that you can do type-safe distributed message passing (and it doesn’t fall into the decades-running trap of trying to solve this). Distributed computing in Gleam would involve handling dynamic messages the same way any other response from outside the system is handled.

This is a bit more boilerplate-y, but IMO it’s preferable to the other two options of pretending it’s type-safe or not existing.


> handling dynamic messages

The dynamic messages have to have static properties to be relevant to the receiving program; the properties are known upfront, and there’s no “decades-running trap of trying to solve this”.


> there's no "decades-running trap of trying to solve this".

I’m not as certain. The fact that we’ve gone from ASN.1 to CORBA/SOAP to protobuf to Cap’n’web and all the million other items I didn’t list says something. The fact that, even given a very popular alternative in that list, or super tightly integrated RPC like sending terms between BEAMs, basic questions like “should optionality/absence be encoded differently than unset default values?” and “how should we encode forward compatibility?” have so many different and unsatisfactory answers says something.

Not as an appeal to authority or a blanket endorsement, but I think Fowler put it best: https://martinfowler.com/articles/distributed-objects-micros...

It absolutely is a decades old set of problems that have never been solved to the satisfaction of most users.


> I’m not as certain. The fact that we’ve gone from ASN.1 to CORBA/SOAP to protobuf to Cap’n’web and all the million other items I didn’t list says something.

> It absolutely is a decades old set of problems that have never been solved to the satisfaction of most users.

ASN.1 wasn’t in the same problem space as CORBA/DCOM; both CORBA and DCOM/OLE were heavily invested in a general-purpose, non-domain-specific object model representation that would support arbitrary embeddings within an open-ended range of software. I suspect that is indeed the unsolvable problem, but I also believe it’s not what you meant with your comment either, since all the known large-scale BEAM deployments (the comment I originally replied to implied BEAM deployments) operate within bounded domain spaces such as telecom and messaging, where the distributed properties of the systems are known upfront: there are existing formats and protocols of exchange, a finite number of valid interactions between the entities/actors of the network, and embeddings that are either non-existent or limited to a finite set of media such as static images, videos, maps, contacts, etc. All of these can be encoded by a compile-time specification that gets published to all parties upfront.

> basic questions like “should optionality/absence be encoded differently than unset default values?”

However you like, any monoid would work here. I would argue that [a] and [] always win over (Option a) and especially over (Option [a]).
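
A quick sketch of that point (the Profile type is invented): absence and emptiness collapse into a single value, and merging needs no Nothing/Just [] case split.

    data Profile = Profile { tags :: [String] } deriving Show

    -- a missing tags field simply decodes to []; the list Monoid handles the rest
    mergeProfiles :: Profile -> Profile -> Profile
    mergeProfiles a b = Profile (tags a <> tags b)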

> and “how should we encode forward compatibility?”

If you'd like to learn if there's a spec-driven typed way of achieving that, you can start your research from this sample implementation atop json: https://github.com/typeable/schematic?tab=readme-ov-file#mig...


Interesting. Them being honest about this stuff is a point in their favor.

I might give it a look this weekend.


You seem to have a fundamental misunderstanding about type systems. Most (the best?) type systems are erased. This means they only have meaning at compile time, and they make sure your code is sound and preferably without UB.

The “it’s only bits” thing makes no sense in the world of types. In the end it’s machine code, which humans never (in practice) write or read.


I know, but a type system works by encoding what you want the data to do. Types are a metaphor, and their utility is only as good as how well the metaphor holds.

Within a single computer that’s easy because a single computer is generally well behaved and you’re not going to lose data and so yeah your type assumptions hold.

When you add distribution you cannot make as many assumptions, and as such you encode that into the type with a bunch of optionals. Once you have gotten everything into optionals, you’re effectively doing the same checks you’d be doing with a dynamic language everywhere anyway.

I feel like at that point the types stop buying you very much; your code doesn’t end up looking or acting significantly different from the equivalent dynamic code, and the types are just noise.

I like HM type systems, and I have written many applications in Haskell, Rust, and F#, so it’s not like I think type systems are bad in general or anything. I just don’t think HM type systems encode this kind of uncertainty nicely.


> When you add distribution you cannot make as many assumptions

You absolutely can express all the assumptions relevant to the handling/dispatching logic at the type level.

> and as such you encode that into the type with a bunch of optionals.

Not necessarily; it can be an `Alternative f` of non-optional compound types that define the further actions downstream.

> Once you have gotten everything into optionals, you’re effectively doing the same checks you’d be doing with a dynamic language everywhere anyway.

Not true; your business logic can be dispatched based on a single pattern match on the result of the `Alternative f`.
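
A rough sketch of what that can look like (the payload types and decoders are invented), using Maybe’s Alternative instance to try the newest wire version first and then dispatching once on the result:

    import Control.Applicative ((<|>))

    data PayloadV1 = PayloadV1 String     deriving Show
    data PayloadV2 = PayloadV2 String Int deriving Show
    data Msg = V1 PayloadV1 | V2 PayloadV2 deriving Show

    -- hypothetical decoders for two wire versions
    parseV1, parseV2 :: [(String, String)] -> Maybe Msg
    parseV1 kv = V1 . PayloadV1 <$> lookup "name" kv
    parseV2 kv = V2 <$> (PayloadV2 <$> lookup "name" kv <*> (read <$> lookup "age" kv))

    -- newest version first; Maybe's Alternative picks the first decoder that succeeds
    parseMsg :: [(String, String)] -> Maybe Msg
    parseMsg kv = parseV2 kv <|> parseV1 kv

    -- a single dispatch point for the business logic
    handle :: Msg -> String
    handle (V2 (PayloadV2 n a)) = "v2 handler: " ++ n ++ ", " ++ show a
    handle (V1 (PayloadV1 n))   = "v1 handler: " ++ n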



