
Local First – How To Build Software Which Still Works After the Acquihire


Summary

Alex Good discusses the fragility of modern cloud-dependent apps and shares a roadmap for "local-first" software. By leveraging a Git-like DAG structure and Automerge, he explains how to move from brittle client-server models to resilient systems where data lives on-device. He explores technical implementation, rich-text merging, and how this infrastructure simplifies engineering workflows.

Bio

Alex Good works full time as a maintainer of the open source Automerge library. He has spent a large chunk of his career building and maintaining distributed systems of ever-increasing complexity.

About the conference

Software is changing the world. QCon London empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.

Transcript

Alex Good: What am I going to talk to you about? I'm going to talk about how to build collaborative software which doesn't rely on servers and why that's a thing worth doing. Who am I? I work at Ink & Switch. We're a research lab focused on building software that's good for creative thinking, tools for thought is the buzzword for that sort of thing. At Ink & Switch I work on maintaining a library that helps build local first software. I'm going to get into a bunch of what that means.

Roadmap

What's this talk going to be like? I'm going to start with a little story about how I got into this and why I think collaborative software is fragile, how that story made that clear to me. How I think we could fix it by building some generic infrastructure that we could use for all applications. What it's like to build this kind of software right now, and some challenges and opportunities for the future.

Collaborative Software is Fragile

This is where I started. An irrelevant number of years ago I built a little application for tracking my workouts. It was really simple. It was just a little Android app. It took me a few hours to build just using off-the-shelf components in Android. It was really fun to build because it was a very small amount of code and it was really easy to iterate towards something that you want to use. I gave it to a bunch of different friends of mine who were using it, and then one day a friend of mine says, I want to use this on my laptop as well. I started thinking, how would I implement that? Here you can see I'm saying this is collaborative. I'm using collaborative in a really broad sense of anything which has multiple devices collaborating on the same data. I thought, I'm going to have to design and implement some kind of server and API and a whole other piece of software that runs on the server to do that so that I can synchronize data from one place to another via this server.

Then I'm going to have to handle optimistic updates of the UI and the synchronization with the server in that application. I actually have to set up the server somewhere and make sure it stays running. Then other problems, I'm going to have to figure out some kind of auth 'n auth thing. Suddenly this really simple little application that I've built has become this really complicated job. This bothered me because firstly it feels untidy. Conceptually, I have my phone and my laptop. Why am I building a whole other piece of software that runs on a computer I don't even see just to get data from here to here? It's a lot of work, and this is a fun project. The friends who are using it would now be dependent on the servers that I run to be able to use this application. I didn't do this. It just didn't seem fun.

It got me thinking. Why does this feel like a problem? I think it's a microcosm of a broader problem with a lot of collaborative software, which comes from a couple of things. One is that distributed systems are complicated to develop. It's not just that there's more code to write. There are more design problems to solve, and they're an entirely separate area of expertise from the expertise required to build the code that runs on the client. Already you feel like you have a much more complicated problem to solve. Even when you solve it, you've often stepped backwards in terms of the UX. Here we've got a few problems. Every time I tap a folder in Notion, I have to wait for a loading spinner, even if I just tapped it. Or writing on a plane now means I have to make sure I downloaded the right extensions or enabled the right option, versus editing a document in Word 20 years ago, where I just edit it. I don't think about it. I lose creative privacy.

People frequently have this pattern where they'll copy a Google Doc into a new version of the Google Doc just for them while they make some changes so that everyone else doesn't see those changes until they're ready to give them back. We've stepped backwards a bit. Then users also now depend upon a bunch of servers that they don't necessarily control or even know about, which is how we know we're in a distributed system. Leslie Lamport, who invented, amongst other things, the Paxos protocol, has this quote, "A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable". If I stop paying for the DigitalOcean droplet that this thing runs on, suddenly my friend can't access his workout data. If I create a great startup off the back of this workout tracker and then sell it to Google and my workout tracker ends up on this list, now what do my friends do? We're losing some user agency here.

Then, a broader problem, which I think is related to the ethos I have, is I think we need to inculcate a culture of repair and maintenance in society broadly, but engineering especially. The idea that we don't want to have to throw away applications or throw away things in general. How do we repair this kind of application? The server is a custom piece of software. There's maybe one person or set of people who are experts in it. If that person or set of experts goes away, no one else is standing by to repair it. This is like if you had to repair the plumbing in your house, but every plumber used a different gauge of pipe and didn't use an off-the-shelf boiler. They made some custom device for every house. It would be much more complicated and expensive to maintain houses.

My broader point here is that I think collaborative software is fragile, because we have collaborative applications that all depend on some custom-built piece of software that has to stay running. I think this has structural costs. It means that the cost of developing collaborative software is high, so you can't scale it down to smaller communities or groups of people for whom you can't find a decent-sized market that would buy the software. Then the people making that software have an undue amount of control and power over the users that they have. It's also really inefficient. We have these powerful computers in our hands and in our homes, and we're not really using them. Instead, we're spinning up giant clusters of computers elsewhere that do all of the actual work, and the local computers are just displaying things. We have a major performance cost as well.

Building Generic Synchronization Infrastructure

I didn't make this application, but it did get me thinking: what I really want is something like this. I want there to be some generic protocol and set of servers. These things in the cloud, or wherever they are, I don't want to care about them. I want to think of them as a commodity that I can just buy. If one goes away, I can point my application at another one. I want my application to just have a URL in it that says, where's your sync server?

Ideally, I want that to be optional. The application works without it, and the apps should be able to talk to each other directly if possible. I went looking around the internet doing a bunch of research, and I ran across this essay by some of the folks at Ink & Switch about local first software. They had a bunch of principles that I've put out here for what they think local first software should be like to use. No spinners, the data should most of the time be on your device. You shouldn't have any latency to switch folders. It should be multi-device. This is stuff we expect from cloud collaboration software, but for local-only software, multi-device is really hard. I have to email Word documents around or stick them on file sharing servers or whatever. Unlike cloud collaboration software, the network should be optional.

If I turn the Wi-Fi off, my Word processor shouldn't stop working. Collaboration should be seamless. Again, something we expect from cloud collaboration software. I shouldn't have to be manually copying changes from one file to another if I'm collaborating with someone. That should just be built into the application and easy to do. There's a few other points here that I think are important but are less what I'm going to talk about, except towards the end. You can recast a lot of these requirements as local first is a design principle in which the data on the user's device is the primary source of truth and other copies on other servers or other devices are secondary.

This all sounded really good to me. I was like, yes, this is what I want to build. How? That seems like a difficult problem. You're suggesting that this would be a way we could do things. How do we actually do it? The really interesting thing to me about the essay and the work that Ink & Switch have been doing is they had been building a few tools to do this kind of thing that were quite general. On the face of it, it seems like I'm saying things should be more complicated than they currently are.

I'm going to dive in now into a technical idea of how these kinds of systems can work, mainly because that's what I can speak to with authority, but also because I think it's really interesting. I want to convince you that there's something here. The way I'm going to do this is by talking about something that a lot of us are already really familiar with, unfortunately. We all use Git. Git is local first in the sense I'm talking about here. You have the data on your machine. You can always make progress locally.

Turning off your GitHub server doesn't make any difference to your local machine. You can use any Git server. You don't have to use GitHub. You can use GitLab. You don't even have to use a server, you can run synchronization between two of your own machines if you're really in a pinch. I just want to be clear. I'm not advocating that we make everyone use Git or anyone use Git, really. It does give us this structure. This looks a little bit like the dream layout I gave you earlier. Let's push on this idea a little bit and start thinking, what if we explored a really terrible idea? What if we were to represent our application state as a JSON file stored in a Git repository? What I want you to imagine here is not that the user is interacting with a Git repository. You are writing an application that uses as its storage a Git repository. Every time you make a change, you have to make commits in that Git repository, syncing that Git repository. Here's one way to do it. Every device represents its state as a branch. To sync with another device, you push and pull all the device branches that the other side has, and then merge all those changes into your device branch. I'm going to hand wave a lot here, because this is a terrible idea.

Let's walk through a little bit of an example, so the venerable Todo app. Our application state is this todos JSON file, a single file in a Git repository. Here's some initial state. Let's make some changes to this file. On my laptop, I'm going to delete the first entry. On my phone, I mark the second entry as done. Conceptually, what we end up with is a structure that looks like this. These are the commits in the Git repository after this. They're on separate machines. This is conceptual. We haven't synced it yet. Now we have a couple of problems. There are two I want to talk about here; obviously we have a lot more. We have to choose what order to put those commits in, because of the way that Git works. We're going to get a merge conflict whatever order we choose to put the commits in. Here are two ways we could choose which order to put the commits in. Say the laptop has pulled the phone's device branch to its side, or the other way around. They both have to choose one of these orders. Or there are other things you could do. You could create a merge commit.

The point is that Git requires that the state of the repository at any particular point that it's looking at be a single commit. We have to choose some ordering. Unless we find a way of choosing the same ordering on every device, we're going to end up with merge conflicts every time we sync, regardless of whether something's changed or not. Even if we don't do that, we're still going to end up with merge conflicts here. We're going to end up with a merge conflict that looks something like this. Imagine you're writing an application that has to handle this.

Every time any kind of sync happens, it's going to have to look at the two versions of the file and ask, what's happened here? First off, it's going to have to do a rebase to reorder the commits, and then do this. It's obviously going to be really hard and complicated to build. We can fix it. This is going to start moving towards the structure of a system that's actually, in my opinion, easier to use than a lot of traditional server-client architectures. First, we relax the constraint that we have to have a single commit be the state of the repository. Second, we choose a finer-grained data model for each commit than a snapshot of the filesystem.

Instead of forcing these things to be in some particular order, we just say, no, we just have a single commit DAG that's the structure that the users created at the time. We say, if two commits conflict, then we'll just arbitrarily choose the commit with the lowest hash as the state of that commit. Then we provide an API for the application to use to examine those conflicts. To make that concrete, here we might say, these two commits conflict.

In Git, two commits conflict if they edit the same file. We're editing the same file, and we say, we're going to arbitrarily choose that the winning state be the one on the left. We're going to say, there's some API now that says, give me the other states that were there. The application can then show the user these other states. The nice thing about this is that because each node, each device, produced valid JSON when it made edits, the winning state is always valid JSON that the application can make sense of. It can show you a useful UI. You can examine these other states when the user wants to. The user can continue making progress and then say, now I want to figure out what to do with these conflicts, rather than immediately having to resolve conflicts when you sync. It's still conceptually annoying, because these two commits don't actually conflict. One of them deleted an entry in the todo list, and the other one changed the state of a different entry in the todo list. You can imagine ways that you might fix this. You could put each todo in a separate file.
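The "lowest hash wins" rule can be sketched in a few lines of TypeScript. This is a toy model, not Automerge's actual implementation: commits are plain objects hashed with SHA-256, the winner per field is the candidate with the lexicographically lowest hash, and the conflict API is just a record of the losing values.

```typescript
import { createHash } from "node:crypto";

// One concurrent head commit per device; each records the fields it wrote.
interface Commit {
  parents: string[];
  writes: Record<string, unknown>;
}

function hashCommit(c: Commit): string {
  return createHash("sha256").update(JSON.stringify(c)).digest("hex");
}

// Deterministic default resolution: where concurrent commits wrote the same
// field, the commit with the lowest hash wins. The losing values are kept
// so an application API can surface them to the user later.
function resolve(heads: Commit[]): {
  state: Record<string, unknown>;
  conflicts: Record<string, unknown[]>;
} {
  const byField = new Map<string, { hash: string; value: unknown }[]>();
  for (const c of heads) {
    const h = hashCommit(c);
    for (const [field, value] of Object.entries(c.writes)) {
      if (!byField.has(field)) byField.set(field, []);
      byField.get(field)!.push({ hash: h, value });
    }
  }
  const state: Record<string, unknown> = {};
  const conflicts: Record<string, unknown[]> = {};
  for (const [field, candidates] of byField) {
    candidates.sort((a, b) => a.hash.localeCompare(b.hash)); // lowest hash first
    state[field] = candidates[0].value;
    if (candidates.length > 1) {
      conflicts[field] = candidates.slice(1).map(c => c.value);
    }
  }
  return { state, conflicts };
}
```

The important property is that `resolve` depends only on the set of head commits, not on the order they arrived in, so two devices with the same commit DAG always see the same winning state.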

Then you have to figure out, how do I maintain the ordering of these todos in separate files? That's the idea behind using finer-grained, richer changes than snapshots of the filesystem. In Git, each commit is a snapshot of the whole filesystem. We're not editing text files on a disk. We're editing more general-purpose data structures: lists, maps. These are things we're used to using as programmers. If we represent our change using that kind of structure, on the left this is the Git commit. On the right, we might represent that with something like this. First create a list. Now insert an object into that list. Then set the value of that object's ID to 1. Add the characters, Buy milk, to the text attribute of that object. Set the value of that todo's done to false. It's much finer-grained, and it's richer. What that means is that when we get the two commits that have made some change to these objects, we can see that they don't conflict in this case, because they're finer-grained. We can delete the object at todos/0 and set done on todos/1 to true. Those two things we can apply together, and we don't need to make the user resolve these conflicts. We will sometimes have conflicts that we need to resolve, but this means that we have to do it a lot less.
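The reason these fine-grained operations compose is that they address elements by stable identity rather than by list index. Here's a toy sketch of that; real systems like Automerge use internal operation IDs, so the explicit `id` field here is an illustrative stand-in.

```typescript
// A toy model of fine-grained operations over a todo list. Because the ops
// name a stable id rather than a list index, "delete todo 1" and "mark
// todo 2 done" can be applied in either order with the same result.
interface Todo { id: number; text: string; done: boolean }

type Op =
  | { action: "delete"; id: number }
  | { action: "setDone"; id: number; done: boolean };

function apply(todos: Todo[], op: Op): Todo[] {
  switch (op.action) {
    case "delete":
      return todos.filter(t => t.id !== op.id);
    case "setDone":
      return todos.map(t => (t.id === op.id ? { ...t, done: op.done } : t));
  }
}

// The concurrent edits from the talk: laptop deletes the first entry,
// phone marks the second as done.
const initial: Todo[] = [
  { id: 1, text: "Buy milk", done: false },
  { id: 2, text: "Water plants", done: false },
];
const laptopOp: Op = { action: "delete", id: 1 };
const phoneOp: Op = { action: "setDone", id: 2, done: true };

const merged1 = apply(apply(initial, laptopOp), phoneOp);
const merged2 = apply(apply(initial, phoneOp), laptopOp);
```

Both orders produce a one-item list with the second todo marked done, so no conflict needs surfacing to the user.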

Where does that get us? We've got a system where, like Git, we track changes made to data as a graph of commits. That's the fundamental data structure we're going to sync around a commit graph. We use a much finer-grained model of change. Instead of filesystem snapshots, we have operations on a generic data structure. We provide a default automatic resolution of conflicts. The really important point about that default automatic resolution of conflicts is that every node that has the same commit DAG will reach the same default resolution. Two people who are in sync will be looking at the same thing on their devices. That's a really important feature of collaborative systems.

Otherwise, it gets very confusing to reason about. What this gets us is a kind of mechanical, application-independent sync layer, which we can use for real-time change as well as asynchronous change, which is something we don't have with Git. Which is why, when I want to collaborate with you asynchronously, I send you a pull request, and if we want to collaborate in real time, I have to get on a Zoom call or something. This is a substrate that we can use for both. I'll show you some examples of that. It means that the application developer's role becomes focusing on version control of their domain data rather than managing a distributed system. We get some nice extra capabilities because we have this fine-grained change history.

Building in Local First, Right Now

I want to give an example of what it's like to build in this kind of system. This is going to be based on Automerge, because that's what I'm familiar with. There are other systems out there. The big one in the world is Yjs that also implements similar ideas. I'm going to walk through about 80 lines of TypeScript that implements a todo list application. Then we're going to add a bit of sync over the top of it. We'll start local only and then add sync, so we can see the increment between these two versions. This is our todo list application state. The same thing we were looking at earlier. We're going to store it on the window. We have some initial state here.

Then this is what we're going to render it into. You'll see what this looks like in a bit. It's bad. That's ok. It's a demo. We have a div with an input where we can add the todo description and a button to hit add. Then a list where we actually render the todo items. Then this is where we render them. We're going to do this every time anything changes. We're going to create a list item for each todo item, with a strikethrough for each one that's checked. Then we wire up an event handler. Every time you click add, we grab the value of the input element and push it into the app state. Likewise, we add an event handler for toggling a todo where we just set done to the opposite of whatever it was before. At the end of this, you've got a todo list where you can add a new item or toggle items done or not.
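The state-handling half of that local-only app can be sketched as plain TypeScript, leaving out the DOM wiring; the names and markup here are illustrative, not the talk's exact code.

```typescript
// Local-only todo app state, mirroring the demo: one mutable POJO plus
// the two event handlers and a render function called after every change.
interface Todo { text: string; done: boolean }
interface AppState { todos: Todo[] }

const appState: AppState = { todos: [{ text: "Buy milk", done: false }] };

// The "add" button handler: grab the input's value and push it into state.
function addTodo(state: AppState, text: string): void {
  state.todos.push({ text, done: false });
}

// The toggle handler: set done to the opposite of whatever it was before.
function toggleTodo(state: AppState, index: number): void {
  state.todos[index].done = !state.todos[index].done;
}

// Called after every change: a list item per todo, struck through when done.
function renderTodos(state: AppState): string {
  return state.todos
    .map(t => `<li>${t.done ? `<s>${t.text}</s>` : t.text}</li>`)
    .join("\n");
}
```

In the real demo these handlers are wired to `click` events and `renderTodos` writes into the list element; the logic is the same.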

How do we add synchronization to this? This is the first part. We're going to create a repository. The interesting parts here are the network and storage components. The network component has an adapter that connects via WebSocket to a public sync server that we run. That sync server could be one that you run. It doesn't have to be a sync server at all. We do this: you could run a broadcast adapter to use the same mechanism for broadcasting changes to other tabs in the same browser. Or you could use a PeerJS adapter to talk to other devices peer-to-peer. Anything that can provide a stream of bytes, you can implement this thing with.
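The shape of that adapter idea can be sketched like this. This is not the actual automerge-repo adapter API, just the idea: anything that can move bytes between two peers will do, shown here with two in-memory adapters wired back-to-back where a WebSocket, BroadcastChannel, or peer-to-peer transport would normally sit.

```typescript
// The minimal contract a network adapter has to satisfy: send bytes,
// and hand received bytes to whoever registered a handler.
interface NetworkAdapter {
  send(msg: Uint8Array): void;
  onReceive(handler: (msg: Uint8Array) => void): void;
}

// Two adapters wired directly to each other in memory. Swapping this pair
// for a WebSocket-backed pair wouldn't change anything above this layer.
function inMemoryPair(): [NetworkAdapter, NetworkAdapter] {
  let toA: (m: Uint8Array) => void = () => {};
  let toB: (m: Uint8Array) => void = () => {};
  const a: NetworkAdapter = {
    send: m => toB(m),
    onReceive: h => { toA = h; },
  };
  const b: NetworkAdapter = {
    send: m => toA(m),
    onReceive: h => { toB = h; },
  };
  return [a, b];
}
```

Because the sync layer only ever sees this interface, the backend really is fungible: the application code doesn't know or care what carries the bytes.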

Likewise, we have an IndexedDB storage adapter here so that once you've created the todo list or synced it, re-reloading will bring it back. It's local first. You don't need the network. Again, if you're not in a browser, you can use the filesystem for this. It's a generic interface. This is a little bit of boilerplate. Every document in the repository has an ID. We have to either create a document or look up the document that we want to collaborate on. You've sent me a document ID, I want to collaborate on it with you. The way we do that is we stick it in the hash of the URL. That means that there's a document URL in the hash. We look and see if it's there. If it is, try and find that. That's this repo.find call. Otherwise, we create a new document. Now we have a repository set up and we have what we call a document handle that points at some way of changing or listening to changes to that document. We can get into the interesting bit of actually making the application itself collaborative.

First off, rather than rendering the window app state, we're awaiting handle.doc, which is saying, wait until we've got this thing. If you've sent me a URL, it's going to fetch it from the sync server or whoever else we're connected to and say, do you have this document? Then, instead of directly twiddling the app state in our addTodo method, we're going to wrap the mutation of the state in this docHandle.change function, which captures any changes that we want to make. I think an important point to note here is that the logic here is the same as if you were working with a local POJO. We're just twiddling some JavaScript. We're just wrapping in something that captures that change, creates a commit from it, and sends that over the network, stores it locally, does all of the synchronization infrastructure for you. This is the toggling part. Same deal. Wrap the state mutation in a little function that captures that state change.

One thing to notice about both of these is that in the first version, we called render todos at the end of any changing method because we needed to tell the frontend that something changed. Here, we don't do that because docHandles have a generic way of saying react to changes to this thing. This gives us a bit of reactivity for free. Here, we're saying any time it changes, either locally or because of something happening over the network, just re-render the UI. This is a demo of that working. I copy the URL from the left, paste it into the right, and then we're going to do a few concurrent changes.
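The handle pattern itself can be sketched in a few lines. This is a simplification, not the real automerge-repo `DocHandle`: the real one also captures each change as a commit, persists it, and syncs it over the network; here only the mutation wrapper and the listener notification are shown.

```typescript
type Listener<T> = (doc: T) => void;

// A stripped-down docHandle: `change` wraps a plain mutation of the
// document and notifies every listener afterwards. That is where the free
// reactivity comes from: the UI re-renders on any change, whether it came
// from a local handler or arrived over the network.
class DocHandle<T> {
  private listeners: Listener<T>[] = [];
  constructor(private current: T) {}

  doc(): T {
    return this.current;
  }

  change(mutate: (doc: T) => void): void {
    mutate(this.current); // same logic as twiddling a local POJO
    // (a real handle would capture this as a commit, store it, and sync it)
    for (const l of this.listeners) l(this.current);
  }

  on(event: "change", listener: Listener<T>): void {
    this.listeners.push(listener);
  }
}
```

A renderer subscribes once with `handle.on("change", renderTodos)` and never needs to be called explicitly from the event handlers again.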

The important point about this was that it was a very small incremental change over a simple local app. If we go back to my demo with the workout tracker, this is what I wanted. I just have to sprinkle a little bit on top and I get synchronization, but in a network-optional fashion, with a backend that doesn't care about my application and is fungible. I could use any server I want. Here's a slightly more complex change, just for fun. This is a rich text editor. I'm going to disconnect these to show you how this looks when you receive changes slightly out of order. On the left, I'm making a span bold. On the right, I'm adding a new list item. When we merge, we get something predictable.

The point of showing all this is that we can have generic infrastructure that allows us to simply add new features, sync and collaboration, without giving up the benefits of local software. This works for simple applications where the merge behavior is obvious. Most of the time, that's not going to be the case. Most applications are more complex than a simple todo list or a single piece of rich text. They have invariants that you want to maintain. In these contexts, it's important to think about asynchronous change. That's what I'm saying here. In real-time scenarios, it often does make sense to just merge everything, because you're usually on a call talking to each other. You can talk about what's happened. You notice the todo list item changed when someone overwrote your change. That's fine. You can just say, did you just change that?

A lot of the time, you don't want to be doing real-time, merge-everything things. Sometimes you're just on a plane and can't be in real time. Sometimes you want to do a bunch of work on your own and then ask people to review it, because it's a big change and it wouldn't make sense if you just merged it anyway. For more complex changes like this, we can take that commit DAG and reintroduce the idea of branches. Here I've got a branch for my laptop and a branch for my phone. A branch here is not like a Git branch, which has to point at a single commit. It can be a set of commits. I've got a few examples here of what this kind of thing allows.

Again, it's an incremental change over the local-only version. Because we now have all of this rich change history, we can build more complex review UIs without having to do a whole ton of extra work to now store history that we weren't storing before. We've got the history. Here's an example of me editing my favorite RFC. I've decided I want to make a big change to it, so I'm going to create a branch. You see this little dropdown on the top. Here we go. I've added an extra paragraph. I can see the changes that I'm going to make. Maybe I want to get someone else to review those changes. I send it to them. Robin, you can see up here on the right, he's added a little comment saying it looks good. I can merge that. Now on the right, I have a nice history of what's happened. There's a history of creating a branch and merging it. Because this is generic infrastructure, this is all pretty ho-hum. This is text. We're used to doing this workflow with text because we do it with source code all the time.

This is generic infrastructure that works on complex data types. We can do the same for drawings, for example. This is tldraw. It's an off-the-shelf drawing application. It's open source. We took that and modified it to store things in Automerge documents rather than the JS state it was using. Because we've already built all of the infrastructure for change management and tracking, that gives us, for free, the ability to build this diff visualization stuff on top. The work of the application developer here was to tell the application how to visualize the difference between two versions of the data structure in question. They didn't have to do any work to figure out how to track change history, how to represent the different points in history, or how to synchronize those around.

The point of me showing all this is not to be like, look, you can add collaboration. It's more that this gives us generic infrastructure. If we recast the idea of application development as version control for domain data rather than as the frontend for a single source of truth that's stored somewhere else, we get generic infrastructure, which protects user agency, gives us all of those nice benefits of local first software I was talking about earlier. I believe it also reduces the development complexity because you don't have this extra distributed system you have to manage.

My experience of it has also been that once you take the storage and networking out, you can move much faster on developing the frontend that you care about. We also get a bunch of new capabilities because we have these better change management features that come from tracking a rich history. This isn't a panacea. It's a trade. It works really well for documents, for anything that can quite easily be thought of as some kind of media editing application. It's not going to work very well for an e-commerce application, because, fundamentally, that is about consensus on what's available in your store and whether you bought it. Often, that trade is worth making, I think. But for the kinds of applications that I most want it to work for, it doesn't happen. I still get angry about the spinner in Notion.

Challenges and Opportunities

There are a bunch of challenges still to be addressed here. What I've shown you so far works really well until you want to add authentication and authorization, which it turns out people want quite a lot. The reason this is hard is that the way people manage it right now is to layer it over the top. You have that sync server adapter I showed you. People will layer an auth layer over the top so you can only push and pull from sync servers that you're authenticated to. Then that server needs to do a bunch of application-specific work, the whole thing I was saying we didn't want to do: examining the changes and making sure that only the kinds of changes that server wants to allow get through. That's often to do very simple things, like making sure that a user is not writing to other users' todo lists, say. We are doing a whole bunch of work on this front right now. The idea is to represent users as cryptographic key pairs, because we can do a lot of fancy cryptography here which allows us to turn all of our problems into key management problems. Famously easy to solve. We think we've got some things coming here.

Right now, we don't have a generic layer that anyone can use. In practice, everyone has to run their own little private sync server. Indexing and partial synchronization: for lots of datasets, you don't actually want to have everything locally. You want to have the documents that you were working on, maybe, in the last month. Or maybe you're part of a large team, several hundred people, and you don't want to download everything that everyone in that team has worked on that you have access to. Schema enforcement and evolution: the operations in the commits that I've shown you here are fairly generic. They all enforce that the document is valid JSON, but not that a price is below a certain amount or that a counter never goes above some number.

Those invariants are easy to handle with review workflows if they're over slow-moving human data like text, where you're expecting to do those kinds of reviews, but much harder to enforce automatically. This becomes especially problematic in a local first context, where you don't necessarily even have the same schema running in every version of the application. You've updated the application, you've added a new field, and you're synchronizing with someone running the old version of the application. This is a tough problem. All of these put together, I think, are why we don't have this kind of generic infrastructure right now. There's research ongoing on all of them, but right now the state is that you run your own sync servers and perform some schema enforcement, some access control, and some indexing outside of that sync server. I still believe that it's simpler for a lot of document editing use cases than the centralized version, but it's extra complexity.

The last thing I wanted to finish up on is that I think there's an opportunity here, because LLMs are really powerful, but they're very unreliable. They don't produce outputs that you can trust off the bat, generally speaking. In local first software, we have to build all of this generic version control infrastructure for arbitrary application types anyway, in order to do synchronization. You need this for LLMs as well.

If you solve both of these problems at once, you get applications that are more powerful than they would otherwise have been: it would have taken a lot of work to build change management on top of a centralized application, and you wouldn't have been able to trust an LLM-based workflow without that change management. We've done a little bit of research on this at the lab, and I just wanted to show a screenshot of it, really. Here's that same On Consensus and Humming RFC. Here, I've got a little bot tab on the right. You can see I've just said, on this line, make this paragraph a bit more friendly. All that's doing is sending off a request to your configured LLM with a little description of how to make changes to the document in question and a prompt for what to actually do. Then it uses all the same built-in tools that we've got for reviewing human-edited documents to show the changes that the LLM has made. Here you can see, I asked it to make it a bit more casual. It's done the LLM thing of adding a bunch of adjectives.

Questions and Answers

Participant 1: I love the idea of local first; that's how it was supposed to be from the very beginning, and going through servers is just wasteful. But there's a good counterpoint: in the examples you showed, the server part was redundant, a very sophisticated backup, but quite often the server delivers extra functionality. It contacts third parties and delivers documents; it could be, as you mentioned, a checkout in a store. You mentioned that this is part of the challenge and that it can be solved. Which parts are solved? Personally, I can think of two approaches: storing only customer-related data, with storage per customer, or adding cryptography on top so that different users have different keys, though that introduces the complexity of data validation. Are there any other approaches?

Alex Good: Just to clarify, I think you're asking about functions that you still want to be able to perform on a server.

Participant 1: Because I don't have a choice.

Alex Good: Yes, and you want to have some kind of authorization and access control which is still generic.

Participant 1: Yes. I don't want to allow users to push certain data. They don't control the full storage, they only control a certain part of it.

Alex Good: For that specific problem of what users can write where, the approach we're working on is a project called Keyhive. There are some notes on the lab website about this. The idea is to represent each device as a cryptographic key pair, and then build group management on top of that. It's almost like a little commit DAG of its own that manages the groups you're part of. You might have a group called the QCon organizers group.

Then you would have a group that represents you, and a group that represents the devices you own, and you can use those to build up a graph. Let me try to describe it more simply: you want a situation where authorization and authentication are themselves a data structure that you synchronize. That's what we're talking about here. You have the key pairs that make up your devices and the key pairs that make up a group, and a commit DAG that represents the whole thing. Then all you need to decide whether someone is allowed to do something is the commit DAG for your auth data, rather than having to ask a particular server.
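The idea above, authorization as a replicated data structure, can be sketched roughly like this. Everything here is an illustrative assumption, not the real Keyhive design or API: devices are stand-in key strings, the "hash" is a toy, and real operations would be signed by the adding member's key.

```typescript
// Illustrative sketch: group membership as a small hash-linked log of
// operations that any peer can replay locally, instead of asking a server.
type DeviceKey = string; // stand-in for a public key

interface MembershipOp {
  parent: string | null; // hash of the previous op, forming a chain/DAG
  action: "add" | "remove";
  group: string;
  member: DeviceKey;
}

// Toy hash; real systems use a cryptographic hash of the encoded op.
function opHash(op: MembershipOp): string {
  return JSON.stringify(op);
}

// Replay the log to materialize current membership, the same way you
// would materialize any other synced document.
function members(log: MembershipOp[], group: string): Set<DeviceKey> {
  const result = new Set<DeviceKey>();
  for (const op of log) {
    if (op.group !== group) continue;
    if (op.action === "add") result.add(op.member);
    else result.delete(op.member);
  }
  return result;
}

// Build a hypothetical "QCon organizers" group from a chain of ops.
const log: MembershipOp[] = [];
log.push({ parent: null, action: "add", group: "qcon-organizers", member: "alice-laptop" });
log.push({ parent: opHash(log[0]), action: "add", group: "qcon-organizers", member: "alice-phone" });
log.push({ parent: opHash(log[1]), action: "remove", group: "qcon-organizers", member: "alice-phone" });

console.log(members(log, "qcon-organizers").has("alice-laptop")); // true
console.log(members(log, "qcon-organizers").has("alice-phone"));  // false
```

Because the log is an ordinary synced data structure, any peer that has it can answer "is this device allowed?" offline, which is the point of the design.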

Participant 1: What if the client is [inaudible 00:36:39] client that pushes raw data? How do we manage after that?

Alex Good: You've always stored all of the data, all of the history. If a client pushes something bad, you just say, ok, that's bad, ignore it, and go back. Where this doesn't work is situations where you're taking real-world action. You've sent an email: you can't unsend the email, or go to everyone who received it and say, please forget that email. In those scenarios, you're going to need a special server that controls access to the email sending and does some access control. For me, the idea is to make those things smaller and more constrained; they don't have to cover the whole application state, just the specific side-effecting operations that you want to control.
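A minimal sketch of the "keep all history, ignore bad pushes on replay" idea follows. The shapes here (`Change`, `isValid`, `materialize`) are hypothetical, not a real sync-server API; the point is that nothing is ever deleted, and validity is decided when state is materialized from the log.

```typescript
// Sketch: a bad push stays in the log but contributes nothing to the state.
interface Change {
  author: string;
  delta: number; // e.g. an adjustment to a shared counter
}

// A validator applied at read time; here, reject implausibly large deltas.
const isValid = (c: Change) => Math.abs(c.delta) <= 10;

// Materialize current state from the full history, skipping invalid changes.
function materialize(history: Change[]): number {
  return history.filter(isValid).reduce((acc, c) => acc + c.delta, 0);
}

const history: Change[] = [
  { author: "alice", delta: 3 },
  { author: "mallory", delta: 9999 }, // bad push: kept in history, but ignored
  { author: "bob", delta: 4 },
];
console.log(materialize(history)); // 7
```

Contrast this with a side-effecting operation like sending an email: there is no replay step at which you can retroactively ignore it, which is why those operations still need a gatekeeping server.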

Participant 2: In the data model, the initial simple data model of just hierarchical data in the filesystem, how do you handle relations between the different elements of data? I can imagine that during a merge, some external constraint has been broken and now the application doesn't know what to do.

Alex Good: This is part of the schema management problem I was talking about. An example might be: in your todo list application, you have a list of important todos in a side table, rather than as attributes on the todo items themselves. Each entry in that list holds the ID or index of a todo item you care about. How do you make sure the items that list refers to are actually in the document? Right now, you fudge it. A lot of the time this is less severe if you handle it optimistically: you say, I have a list of todo item IDs that are important, and if an ID doesn't correspond to an item in the document, I just ignore it. For the long-term future, I think what we need is custom operation types that encode those invariants: this thing must exist earlier in the ordering of the commit DAG, so you never receive a change without the thing it depends on.
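The optimistic workaround described above can be sketched in a few lines. The document shape (`todos` plus an `importantIds` side table) is a hypothetical example, not a prescribed schema: readers simply filter out IDs whose target was deleted by a concurrent edit, instead of treating the dangling reference as an error.

```typescript
// Sketch: tolerate dangling references produced by concurrent merges.
interface Todo { id: string; title: string }

interface AppDoc {
  todos: Todo[];
  importantIds: string[]; // side table referencing todos by id
}

function importantTodos(doc: AppDoc): Todo[] {
  const byId = new Map(doc.todos.map(t => [t.id, t]));
  // IDs whose todo was removed by a concurrent peer are silently skipped.
  return doc.importantIds.flatMap(id => {
    const t = byId.get(id);
    return t ? [t] : [];
  });
}

const doc: AppDoc = {
  todos: [{ id: "a", title: "Write talk" }],
  importantIds: ["a", "b"], // "b" was deleted by a concurrent peer
};
console.log(importantTodos(doc).map(t => t.title)); // [ 'Write talk' ]
```

This keeps the app functional after any merge, at the cost of the invariant being enforced at read time rather than guaranteed by the data model.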

Participant 3: You mentioned briefly that versioning of the software is going to be difficult. Can you elaborate a little on how you would distribute versions of the software, and who holds the authority?

Alex Good: First off, I think it's fine for the software to still be distributed by a central entity. For me, one of the key things is that if you have a local first application which depends on a generic server, the application should continue working when you don't have the server. If you have the application, it's to some extent under your control; the server is not, it's a thing you just depend on. If you do want fully distributed publishing for the application, I think that's a separate problem. The ATProto ecosystem has the beginnings of a good public data infrastructure, I think. You could imagine publishing an application to a cryptographic ledger, not a blockchain, something like a PLC, more of a certificate transparency approach. I think it's an orthogonal problem.

Participant 4: In a case where you have multiple devices, some of them powerful and some of them not at all powerful, do you consider a scenario where the weak devices actually have to stay online because they can't do the computation themselves, while the other ones are local first?

Alex Good: The term I use in my head, by analogy to Git, is a shallow clone approach: you want a client which doesn't grab all of the history and reify it. It just asks some other peer, what's the latest state I should care about? Likewise, rather than creating its new changes as full sets of operations, it just says to the peer, based on the latest state you gave me, here's a new change, and that peer does the work of integrating it into the commit DAG and publishing it. We haven't done that work yet, but other systems have. If you're interested in it academically, Eg-walker is the name of one algorithm that supports this quite nicely: its whole data structure is built around always having the latest state and only reifying the history when you need it, so devices that can't, don't. I think that's a promising direction.
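Here's a toy sketch of that division of labor, by analogy to a Git shallow clone. The `FullPeer` class and its methods are hypothetical, not Automerge's sync protocol: the weak device holds only a snapshot and delegates all history bookkeeping to a full peer.

```typescript
// Sketch: a weak device edits against a snapshot; a full peer keeps the DAG.
interface Commit { parent: string | null; text: string }

// Full peer: holds the whole commit chain and does the integration work.
class FullPeer {
  private log: Commit[] = [{ parent: null, text: "" }];
  private hash(c: Commit): string { return JSON.stringify(c); }

  // Answers the shallow client's "what's the latest I should care about?"
  latest(): { head: string; text: string } {
    const head = this.log[this.log.length - 1];
    return { head: this.hash(head), text: head.text };
  }

  // Integrate a delta made against a known head into the commit chain.
  integrate(baseHead: string, newText: string): void {
    this.log.push({ parent: baseHead, text: newText });
  }

  historyLength(): number { return this.log.length; }
}

// Shallow client: only ever holds the latest snapshot, no history at all.
const peer = new FullPeer();
const snapshot = peer.latest();          // ask for the latest state
const edited = snapshot.text + "Hello";  // edit locally against the snapshot
peer.integrate(snapshot.head, edited);   // the peer does the DAG bookkeeping

console.log(peer.latest().text);    // Hello
console.log(peer.historyLength()); // 2
```

The design choice is that the weak device trades some autonomy (it depends on a peer being reachable to publish) for not having to store or replay the full history.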

Participant 5: I guess this makes it really easy to build systems where the server has no idea about, and actually can't see, the data?

Alex Good: The thing I'm currently working on is exactly that. The commit graph I showed you only cares about the structure of the graph; the server doesn't care about the contents of the commits. You can actually have those end-to-end encrypted, and then do a bunch of nice optimizations on top of that. The main point is, yes, the server can be blind.

Participant 5: I would guess that the commit history becomes really long and quite large very quickly, especially in something like a text editor. Have you had to optimize that in any way?

Alex Good: In all real systems that do this, the main technical problem being solved is compressing the change metadata. What Ethan's referring to here is that if you've got a text editor creating a commit for every single key press, with a commit hash for that key press and a reference to the previous commit hash, you can imagine the metadata overhead: you've got at least 64 bytes of overhead per keystroke.

Most real systems that do this use run-length encoding and take advantage of patterns in the data. When I'm typing, it's very normal for one character to follow another; very few people type in reverse, for example. So we can compress a whole run of text that you've entered. Technically it's still a commit per key press, but I can just say: begin here, end here, there are 10 characters in between. You can recalculate all the hashes if you need to verify them, but you don't need to put all of that metadata on the wire. The actual payload of a compressed range of changes is basically start hash, end hash, and a bunch of characters.
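The compression scheme described above can be sketched as follows. The hash chain here is a toy stand-in for real commit hashes, and the shapes are illustrative, not Automerge's columnar wire format: a run of typing is shipped as (start hash, end hash, characters), and the per-keystroke commits are re-derived only if a receiver wants to verify them.

```typescript
// Sketch: run-length encode a typing burst instead of one commit per key.
interface KeyPress { afterHash: string; char: string }

// Toy hash chain standing in for cryptographic commit hashes.
function chainHash(prev: string, char: string): string {
  return `${prev}+${char}`;
}

// Expand a compressed run back into explicit per-keystroke commits,
// e.g. for verification; normally this never goes on the wire.
function expand(startHash: string, chars: string): KeyPress[] {
  const out: KeyPress[] = [];
  let h = startHash;
  for (const char of chars) {
    out.push({ afterHash: h, char });
    h = chainHash(h, char);
  }
  return out;
}

// The compressed wire payload: begin hash, end hash, characters in between.
function compress(startHash: string, chars: string) {
  const commits = expand(startHash, chars);
  const last = commits[commits.length - 1];
  return { start: startHash, end: chainHash(last.afterHash, last.char), chars };
}

const run = compress("h0", "hello");
console.log(run.chars.length); // 5 keystrokes in one small payload
```

Instead of roughly 64 bytes of metadata per keystroke, the payload carries two hashes plus the raw characters, and the intermediate hashes are recomputable on demand.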

 


 

Recorded at:

Apr 08, 2026
