> That's reasonable but it's a different notion of what a safe change is than I ...

skybrian · on May 22, 2023

I think our misunderstanding is really about use cases.

Sometimes Protocol Buffers are used to write log files, and the log files are stored and never migrated. To read the oldest logs, you need backward compatibility all the way back to the first production code that was released. This means transitive safety is needed, and the changes you can make to the schema, which here doubles as a file format, are pretty limited.

This isn't just a limitation of the Protocol Buffer format. Safety rules are different when you do long-term persistence. If Typical were used that way, you could only trust safety rules that are transitive. Asymmetric fields could be added, but the fallbacks would never go away.
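As a rough illustration of a fallback that never goes away, here is a minimal sketch of a log reader. It uses JSON as a stand-in for the real wire format, and the `region` field, its `"unknown"` default, and the `read_log_record` helper are all hypothetical names made up for this example.

```python
import json

def read_log_record(line):
    """Decode one stored log line written by any past release."""
    record = json.loads(line)
    return {
        "id": record["id"],  # present since the first release
        # Hypothetical field added in a later release. Old log lines will
        # never contain it, so this fallback has to stay for as long as
        # those logs are kept.
        "region": record.get("region", "unknown"),
    }
```

The point is only that the `record.get(...)` default cannot be deleted while the oldest logs still need to be readable.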

(Also, a rollback doesn't get rid of any logs that were generated, so it's not a full undo. As you say, both forward and backward compatibility are needed.)

Serialization isn't just used for network calls, and even when it is, sometimes you don't control when clients upgrade, such as when the clients get deployed by different companies, or as part of a mobile app. So it seems worth clarifying the use cases you have in mind when making safety claims.

stepchowfun · on May 23, 2023

I think you're right, and now I understand why the rules seemed buggy to you but not to me. You're considering persisted messages that need to be compatible with many versions of the schema, whereas the discussion and rules are formulated in the context of RPC messages between services, which only need to be compatible with at most three versions of the schema: the version that generated the message, the version before that, and the version after. The README could do a better job of clarifying that.

In the persisted messages scenario, there is one change to the rules: you can never introduce a required field (since old messages might not have it). Not even asymmetric fields can be promoted to required in that scenario.
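To make the difference between the two scenarios concrete, here is a small sketch. It is not Typical's actual API: schemas are modeled as plain dicts from field name to rule, the `v1`/`v2`/`v3` schemas and the `email` field are invented for illustration, and it assumes (as in this discussion) that writers always set required and asymmetric fields. It only checks the reader side, that is, whether a new reader can decode what older writers produced.

```python
# Rules under which a writer is assumed to always set the field.
ALWAYS_WRITTEN = {"required", "asymmetric"}

def can_read(reader_schema, writer_schema):
    """True if every field the reader requires is always set by the writer."""
    return all(
        writer_schema.get(name) in ALWAYS_WRITTEN
        for name, rule in reader_schema.items()
        if rule == "required"
    )

def safe_to_deploy(new_schema, writer_schemas):
    """True if the new reader can decode data from every listed writer."""
    return all(can_read(new_schema, w) for w in writer_schemas)

# Hypothetical schema history: `email` is introduced as asymmetric,
# then promoted to required.
v1 = {"id": "required"}
v2 = {"id": "required", "email": "asymmetric"}
v3 = {"id": "required", "email": "required"}

rpc_window = [v2]     # only the previous version is still sending messages
persisted = [v1, v2]  # messages from every past version are still stored

print(safe_to_deploy(v3, rpc_window))  # True
print(safe_to_deploy(v3, persisted))   # False: v1 messages lack `email`
```

With only adjacent versions in play, promoting `email` to required passes the check. Once messages written under `v1` still have to be read, the same promotion fails, which is the rule change described above.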

skybrian · on May 23, 2023

Okay, great! Hope that helps.

To expand on this, one way to think about it is that some changes are always safe, while others depend on what data is still out there (or is still being generated) that you want the code to be able to read.

“What writers are out there” isn’t a property of the code alone, though maybe you could use the code to keep track of what you intended. The releases deployed to production determine which writers exist, and they keep running until stopped and perhaps upgraded.

In some cases a serialization schema might be shared in a common library among multiple applications, each with its own release schedule, making it hard to find out which writers are still out there.

It’s much easier when the serialization is only used in services where you control when releases happen and when they’re started and stopped.
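If you did want to keep track of which writers are out there, the bookkeeping might look something like the sketch below. Everything here is hypothetical: the `WriterRegistry` class, its method names, and the release names are invented for illustration, and in practice this information would come from your deployment tooling rather than from the serialization code.

```python
from dataclasses import dataclass, field

@dataclass
class WriterRegistry:
    """Hypothetical record of which schema versions have written data."""
    running: dict = field(default_factory=dict)  # release name -> schema version
    retained: set = field(default_factory=set)   # versions with stored data still read

    def start(self, release, schema_version, persists_data=False):
        self.running[release] = schema_version
        if persists_data:
            self.retained.add(schema_version)

    def stop(self, release):
        # Stopping a writer does not delete what it already wrote,
        # so `retained` is deliberately left untouched.
        self.running.pop(release, None)

    def versions_to_support(self):
        """Schema versions a reader deployed now must still be able to decode."""
        return set(self.running.values()) | self.retained

registry = WriterRegistry()
registry.start("api-release-1", 1)
registry.start("logger-release-1", 1, persists_data=True)
registry.start("api-release-2", 2)
registry.stop("api-release-1")
registry.stop("logger-release-1")

print(registry.versions_to_support())  # {1, 2}
```

Even after every version 1 writer has stopped, version 1 stays in the set because its logs are still stored. With only RPC services that you start and stop yourself, the set shrinks back to whatever is currently running.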