
First of all, thank you for the amazing read! I thoroughly enjoyed the entire article, and it gave me a new perspective on the feasibility of CRDTs for real-world applications, performance-wise.

Though I am curious now to hear your thoughts on the conflict resolution side of the equation for complex data structures like deeply nested JSON.

The biggest takeaway I got from Martin's talk on the topic from a few years ago was that while CRDTs are theoretically guaranteed to eventually converge, the resulting converged state might not make any sense to applications that need to consume and act on that state [1].

At the time, that seemed to me like a pretty fundamental challenge to using CRDTs to store arbitrary application state. I imagine the field has progressed immensely since then, though, so I would love to hear any insights you might have on strategies to alleviate or at least work around this challenge if I wanted to build a CRDT-based app today.

[1] https://youtu.be/8_DfwEpHE88?t=2232



I’m not sure how much the field has improved; there’s a good chance there are some new papers I haven’t read. But I think it’s pretty doable. For all the talk of concurrent editing, the reality is that in most applications it is incredibly rare for multiple users to edit the same value at the same time. It’s rare enough that concurrent editing is basically broken in most web apps and nobody seems to mind or talk about it. For structured / database data, the best-effort merges of current systems (or simple last-writer-wins) are a fine solution in 95% of applications.
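For anyone unfamiliar with the "last writer wins" idea mentioned above, here is a minimal sketch in Python. It is not taken from Automerge or any other library: `LWWRegister`, `merge_docs`, the integer logical timestamp and the replica-id tie-breaker are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class LWWRegister:
    """A single value tagged with who wrote it and when (hypothetical type)."""
    value: Any
    timestamp: int   # assumed logical clock, e.g. a Lamport timestamp
    replica_id: str  # assumed unique per replica; breaks timestamp ties

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        # The write with the larger (timestamp, replica_id) pair wins.
        # Every replica applies the same rule, so they all pick the same winner.
        if (other.timestamp, other.replica_id) > (self.timestamp, self.replica_id):
            return other
        return self


def merge_docs(a: Dict[str, LWWRegister], b: Dict[str, LWWRegister]) -> Dict[str, LWWRegister]:
    """Merge two flat documents key by key using the register rule."""
    merged = dict(a)
    for key, reg in b.items():
        merged[key] = merged[key].merge(reg) if key in merged else reg
    return merged


alice = {"title": LWWRegister("Draft", 1, "alice")}
bob = {"title": LWWRegister("Final", 2, "bob"), "done": LWWRegister(True, 1, "bob")}

# Merge order does not matter, so both replicas end up with the same document.
assert merge_docs(alice, bob) == merge_docs(bob, alice)
```

The losing write is silently dropped, which is exactly the trade-off: replicas always agree, but nobody is told that an edit was overwritten.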

But ideally we want something like the semantics of ot-json-1 [1], which supports arbitrary move operations. This is necessary if you want to implement a program like Workflowy on top of a CRDT. Martin thinks this is possible in a CRDT by sort of embedding part of an OT system and doing transform, but I don’t feel very satisfied with that answer either.

The other thing I would love to see solved is how you would add git style conflicts into a crdt. The best effort merging strategy of most OT & CRDT systems is fine for real-time editing but it isn’t what you want when merging distant branches.

Automerge today supports arbitrary JSON data, inserts, deletes and local moves. I think that’s plenty for the data model in 99% of software. I think most software that fits well into a classical database model should be reasonably straightforward to adapt.

I’m not sure if that answers your question but yeah, I’m thinking about this stuff too.

[1] https://github.com/ottypes/json1


Thanks for the great post. Indeed, as a former scientist myself, I can say you have to take everything you read with a grain of salt. I've seen inside the sausage factory, and concluded that YMMV.

> The other thing I would love to see solved is how you would add git style conflicts into a crdt. The best effort merging strategy of most OT & CRDT systems is fine for real-time editing but it isn’t what you want when merging distant branches.

I found this comment very interesting. I have been playing with the idea of 3-way merging CRDTs, similar to the git approach. Have even used this type of branching in commercial software I work on for handling concurrent changes to files.
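To make the contrast with best-effort merging concrete, here is a rough sketch of a git-style three-way merge over flat key/value documents. This is a plain three-way merge, not a CRDT, and it is not the code from the commercial software mentioned above; `three_way_merge` and `Conflict` are hypothetical names, and using `None` to stand for "key absent" inside a `Conflict` is a simplification for this sketch.

```python
from dataclasses import dataclass
from typing import Any, Dict

_MISSING = object()  # sentinel meaning "key not present in this version"


@dataclass(frozen=True)
class Conflict:
    """Both branches changed the same key differently (hypothetical type)."""
    base: Any    # None here stands for "absent" in this sketch
    ours: Any
    theirs: Any


def _plain(value: Any) -> Any:
    return None if value is _MISSING else value


def three_way_merge(base: Dict[str, Any], ours: Dict[str, Any], theirs: Dict[str, Any]) -> Dict[str, Any]:
    merged: Dict[str, Any] = {}
    for key in base.keys() | ours.keys() | theirs.keys():
        b = base.get(key, _MISSING)
        o = ours.get(key, _MISSING)
        t = theirs.get(key, _MISSING)
        if o == t:
            result = o          # both branches agree (including both deleting)
        elif o == b:
            result = t          # only their branch changed this key
        elif t == b:
            result = o          # only our branch changed this key
        else:
            # Both changed it in different ways: surface it, do not guess.
            result = Conflict(_plain(b), _plain(o), _plain(t))
        if result is not _MISSING:
            merged[key] = result
    return merged


base = {"title": "Draft", "owner": "alice", "tag": "x"}
ours = {"title": "Draft v2", "owner": "alice", "tag": "x"}
theirs = {"title": "Final", "owner": "bob"}

merged = three_way_merge(base, ours, theirs)
assert merged == {"title": Conflict("Draft", "Draft v2", "Final"), "owner": "bob"}
```

The difference from a best-effort merge is the last branch: instead of picking a winner, the merge result carries an explicit conflict for a person (or the application) to resolve.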

Be very interested to know if any efforts are being made on this in the CRDT community. (I'm more of an interested onlooker. I use a lot of the same concepts in my software, but not rigorously.)



