The Data-First movement: <https://duncancragg.sub...
# thinking-together
d
AKA "app-free computing"
Data-First Principles:
• End-user computing: easy as a spreadsheet
• App-free: an OS or a browser only
• Data-focused: viewing, editing, structuring, laying out and navigating
• Live data: the system or someone else may change it while you're looking
• Local-First: no silos; generic data and update servers connecting local data
apologies for the logo - it was an AI experiment that I dropped into the article and now it's all over the place!
t
I mentioned this elsewhere, but I am skeptical that multiplayer is an important concept for data-first, personal computing apps. Even in big B2B companies, genuinely concurrent, real-time editing is quite rare; it's only an important concept in instant messaging, so I don't think you should over-index on it.
other than that I love the concept!
(the problem with multiplayer is it is directly at odds with other things and mandates an intermediate server)
g
Rather than multiplayer as one of the principles of data-first, I would prefer to have “write-only” (or append-only, or historical). You can build multiplayer on top of that. The idea here is that if you have your one-person write-only local store, and you ingest values from someone else’s local store because you want to present a multiplayer store built out of some parts of both, then you have all the information you need to present the multiplayer store; the rest is left as an exercise for the reader. Also, you might want to do this in different ways. If you separate the multiplayer concern from having the data, you can support different syncing strategies, presentations of history, etc.
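A minimal sketch of the idea above, under stated assumptions: each person has an append-only local store, one store ingests selected entries from the other, and a shared view is derived afterwards by whichever strategy you pick. The names `Entry`, `AppendOnlyStore`, `ingest`, `last_writer_wins` and `per_author` are hypothetical, and the integer `timestamp` stands in for whatever clock a real system would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    author: str
    timestamp: int  # assumption: a simple logical clock, not wall-clock time
    key: str
    value: str


class AppendOnlyStore:
    """A single-person store: entries are only ever added, never changed."""

    def __init__(self, owner):
        self.owner = owner
        self.entries = []

    def append(self, timestamp, key, value):
        self.entries.append(Entry(self.owner, timestamp, key, value))

    def ingest(self, other, keys):
        """Copy entries for the chosen keys from another store, unmodified."""
        for entry in other.entries:
            if entry.key in keys and entry not in self.entries:
                self.entries.append(entry)


def last_writer_wins(entries):
    """One possible shared view: the latest entry per key wins."""
    view = {}
    for e in sorted(entries, key=lambda e: (e.timestamp, e.author)):
        view[e.key] = e.value
    return view


def per_author(entries):
    """Another possible view: keep each author's latest value side by side."""
    view = {}
    for e in sorted(entries, key=lambda e: (e.timestamp, e.author)):
        view.setdefault(e.key, {})[e.author] = e.value
    return view


alice = AppendOnlyStore("alice")
bob = AppendOnlyStore("bob")
alice.append(1, "title", "Draft")
bob.append(2, "title", "Bob's take")
bob.append(2, "private", "secret")
alice.append(3, "title", "Final")

# Alice pulls in only the parts of Bob's store she wants to share a view of.
alice.ingest(bob, keys={"title"})

lww_view = last_writer_wins(alice.entries)
side_by_side = per_author(alice.entries)
```

Because the ingested history is kept intact, the choice between `last_writer_wins` and `per_author` (or any other presentation) can be made later, separately from the question of having the data.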
c
@Tom Larkworthy I’m not sold that we should throw away multiplayer because of that. It’s almost more motivation to figure out how multiplayer fits into a decentralized, local-first vision. We have a couple of POCs for folk canvas exploring what infrastructureless WebRTC looks like. “Infrastructureless” in the sense of there being no need to set up your own servers for WebRTC signaling, using physical, local protocols (e.g. sound, QR codes) or piggybacking off open networks and protocols (e.g. gundb, MQTT, BitTorrent trackers). There are still questions about reliability, but it feels promising so far. https://folk.systems/canvas/#network (still really early on)
g
I love multiplayer. But I think we could accept single player data first apps as legitimate if they have the other things. For me, the point is to escape the poor local maximum that SQL gave us. To have a world where we can wield the relational model freely. Multiplayer is lovely and should be more common. But I think it’s a separate concern.
t
I would like to think my single-file, local-first, unbundled Observable notebook thing with integrated dataviz and reactive spreadsheet coding would qualify as data-first. It's literally a local-first port of a reactive dataviz platform; it ticks every box except live multiplayer. I do not intend to add live multiplayer. I think it's not so useful in comparison to async git-style merging, which I would like to have.
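To make "async git-style merging" concrete, here is a rough sketch of a three-way merge over flat key/value documents. It is an illustration only, not how git or any notebook platform actually merges; the function name `three_way_merge` and the convention that `None` means "key absent or deleted" are assumptions made for the example.

```python
def three_way_merge(base, ours, theirs):
    """Merge two edited copies of a dict against their common ancestor.

    Returns (merged, conflicts). A key goes into `conflicts` only when both
    sides changed it in different ways. Assumption: None is never a real value,
    so it can stand for "absent or deleted".
    """
    merged, conflicts = {}, {}
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:        # both sides agree (or neither changed it)
            value = o
        elif o == b:      # only they changed it
            value = t
        elif t == b:      # only we changed it
            value = o
        else:             # both changed it differently
            conflicts[key] = (o, t)
            continue
        if value is not None:
            merged[key] = value
    return merged, conflicts


base = {"title": "Sales", "chart": "bar", "note": "draft"}
ours = {"title": "Sales 2024", "chart": "bar", "note": "draft"}
theirs = {"title": "Sales", "chart": "line"}  # they deleted "note"

merged, conflicts = three_way_merge(base, ours, theirs)

# A second case where both sides edit the same key differently.
_, title_conflict = three_way_merge(
    {"title": "Sales"}, {"title": "A"}, {"title": "B"}
)
```

Nothing here needs both people to be online at once: each side works on its own copy and the merge happens whenever the copies meet.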
g
Exactly! I think for some purposes, having a single-player only-mine store with explicit syncing would be a nice choice to have, rather than must-have-multiplayer. Users of git should likely agree with me.
(albeit I’m not smart enough to like git…)
d
I'm still processing all of this, but for now I'd like to say that the primary, or indeed sole, purpose of the article was to stimulate discussion, not to get all abstract and definitive about what is or isn't Official. So ... "achievement unlocked"! I've been having some extremely intense and stimulating email conversations with many of the leads of the projects I listed, and would love for the insights we're digging up to be made more public. This article is by far the most popular I've written, so it's hit a chord (or a nerve!) with quite a number of people. I'm really excited at the possibility of more open discussion (i.e. not in Closed Proprietary Slack or my closed off email threads!) A public forum of some kind would be an obvious outlet for all of this energy.
Obviously, many of the leads are right here: @osnr @Alexander Obenauer @Andrés Cuervo @jonathoda @Guyren Howe @Federico Pereiro @Clemens @Christopher Shank @Orion Reed @Peter van Hardenberg @Joshua Horowitz @Paul Sonnentag @Geoffrey Litt - so it's a common ground amongst many future-of-computing folk. In other words, the community and the realisation that so many have such similar drivers and motivations is more important than the details of whether it's this or that in detail on the network. In fact, why not simply settle for "something that feels like Local-First" as having a common "vibe" in that department?
@Tom Larkworthy I must admit I'm still struggling with the idea of non-collab subsets of a Data-First philosophy... so you have use-cases where you yourself are literally the only and sole and forever-only consumer of your own work? This is hard for me to grasp. But maybe that's just me.
And git as an alternative to Local-First is something I really need to process on. I'm finding all this, well, surprising!
Chris Gebhardt has got me thinking about immutable content on the net, which seems to be related to @Guyren Howe’s comments above. The Noosphere project had something along these lines also. Again, I'm processing.... 🤔
@Christopher Shank I've moved Folk Canvas up with the others in the article in the light of the network stuff you linked to. I'm really excited to see what you come up with in that area!
g
I think economic incentives in the development of technology have strongly favoured big business. We want to write large apps with lots of users that can run efficiently on AWS. But computing is staggeringly cheap. If we are willing to entertain “inefficiency”, we can make small business and individual user software in very different ways to what we do now.

This and other aspects of the economics of all this lead me to believe that our default when storing data should be a write-only store. The “current” version of a row is the one with the latest timestamp. It is easy enough to roll event sourcing into this. We already “store” incoming requests, in the stupid text log file if nowhere else. If instead we store the full, structured inputs to each request coming into a system in a database table (because we’re all about relations), then we arrive at this: the results of an input to a system are the results of triggers on that input. Step back, and consider the larger picture: every state the system was ever in can be reviewed. Every input to the system is recorded, and every state transition.

Now, we circle back to small business software, and to my other bugbear: we don’t make software for non-developers to solve their own problems. But if you put a FileMaker-like interface in front of that write-only store, and you think about augmenting that UI with tools to explore its history, I think you really have something.

Future of programming? Give me this system, with a Datalog query interface, and I can replace most of what I do in a traditional programming language with queries. Traditional programming is relegated to side-effecting or efficiency-concerned stuff.
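One way to picture the store described above, as a small sketch using Python's built-in `sqlite3` module. The table, trigger and view names (`requests`, `customer_versions`, `apply_request`, `customers_current`) and the helper `state_as_of` are all hypothetical, and an auto-incrementing `version` column stands in for the "latest timestamp" so that the ordering is unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Every structured input to the system is recorded, never updated.
CREATE TABLE requests (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    customer  TEXT NOT NULL,
    new_email TEXT NOT NULL
);

-- Rows are only ever appended; old versions stay around.
CREATE TABLE customer_versions (
    version    INTEGER PRIMARY KEY AUTOINCREMENT,
    customer   TEXT NOT NULL,
    email      TEXT NOT NULL,
    request_id INTEGER NOT NULL REFERENCES requests(id)
);

-- The result of an input is the result of a trigger on that input.
CREATE TRIGGER apply_request AFTER INSERT ON requests
BEGIN
    INSERT INTO customer_versions (customer, email, request_id)
    VALUES (NEW.customer, NEW.new_email, NEW.id);
END;

-- The "current" row is simply the latest version per customer.
CREATE VIEW customers_current AS
SELECT customer, email FROM customer_versions AS v
WHERE version = (SELECT MAX(version) FROM customer_versions
                 WHERE customer = v.customer);
""")


def state_as_of(conn, version):
    """Reconstruct the state of every customer as it was at `version`."""
    rows = conn.execute(
        """
        SELECT customer, email FROM customer_versions AS v
        WHERE version = (SELECT MAX(version) FROM customer_versions
                         WHERE customer = v.customer AND version <= ?)
        """,
        (version,),
    ).fetchall()
    return dict(rows)


for customer, email in [
    ("ann", "ann@old.example"),
    ("bob", "bob@example.com"),
    ("ann", "ann@new.example"),
]:
    conn.execute(
        "INSERT INTO requests (customer, new_email) VALUES (?, ?)",
        (customer, email),
    )

current = dict(conn.execute("SELECT customer, email FROM customers_current").fetchall())
earlier = state_as_of(conn, 2)
history_length = conn.execute("SELECT COUNT(*) FROM customer_versions").fetchone()[0]
```

The application code only ever inserts into `requests`; the current state and any past state are both just queries over what has been recorded.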
t
I must admit I'm still struggling with the idea of non-collab subsets of a Data-First philosophy... so you have use-cases where you yourself are literally the only and sole and forever-only consumer of your own work?
Sharing is extremely important to me too. You build insights with data, and you take data-backed actions. Some actions affect other people, so it's good to construct and share data-based arguments. It's one way of communicating clearly. The web and email are great for sharing. But I don't personally see a need for live collaboration for the construction of data-based arguments. I do think that if you receive a read-only data-based argument, e.g. a blog post with attached computation, then it would be nice to be able to construct a retort by forking that blog post and adding your own data/computation. That then becomes distributed async collaboration. I don't think data-first would need to insist that's a principle; that's just my personal taste of ideal collaboration. Anyway, I think collaboration is an orthogonal concern: there are many useful and different ways of collaborating, and it doesn't make sense to be prescriptive. I only bring it up because the original blog has it as the litmus test for being "data-first". I love the overall idea, especially about empowering people to derive data insights.
g
Multiplayer gets into authorisation issues, if nothing else. Sometimes, you want a different sort of sharing.
j
I suppose I should weigh in here since I triggered Duncan with my call for Data-first software in my LIVE keynote talk.
I defined Data-first software as:
• Like a spreadsheet but for more general data models
  ◦ Disintermediate the user’s data from code and coders
  ◦ Plus usable version control of data
• Code is metadata: small, readable, and hackable
  ◦ Like formulas in spreadsheets, “view source” in Web 1.0
  ◦ Interop with legacy stack via web APIs: HTTP is the new FFI
  ◦ Plus trustworthy source-only supply chain management
• App as WYSIWYG doc: reify opaque actions into visible state
  ◦ e.g. a submittable form is a view of the request log
I suggested a research agenda:
• UI metaphors (document, outline, canvas, graph,…)
• DB sufficiency
• Types & schema change
• End-user version control of data
• Trustworthy source-only supply chain management
• Modularity/malleability
• App as WYSIWYG doc: updatable views & reified execution
• Big data, big code, performance, scalability (not!)
There is some discussion on this thread about multiplayer collab. My opinion is that this may be a necessary feature, but it would actually be better to have a version-control-like async collaboration model.
A great way to further this conversation would be to submit demos of your Data-first or Substrate ideas to the LIVE Programming Workshop, which will be online this year.
g
This! It is natural for “data first” to suggest having lots of “data about your data”. Its history is the obvious starting point — hence, my suggestion of append-only stores. Another natural thing is diffing and merging. Once you have that, it will be natural to want branching, as in version control. Properly presented, branching and merging will be natural for non-developers to use. Really, you’re talking about supporting multiple models. Meaning, first-class models. Imagine an accounting tool that supported this. A user could look at their entire system, subject to different assumptions, and run all their models. What if sales had been higher last quarter?
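A small sketch of what "first-class models" with branching and diffing might look like for the accounting example. Everything here is made up for illustration: the figures, the `branch`, `profit` and `diff` helpers, and the use of `collections.ChainMap` to layer a branch's assumptions over its parent without copying or changing it.

```python
from collections import ChainMap

# Hypothetical figures for a tiny "accounting system".
base = {"q1_sales": 100_000, "q2_sales": 120_000, "cost_percent": 60}


def branch(parent, **assumptions):
    """Create a branch: overrides layered on top of the parent, which is untouched."""
    return ChainMap(assumptions, parent)


def profit(model):
    """The same model runs unchanged against any branch."""
    revenue = model["q1_sales"] + model["q2_sales"]
    return revenue * (100 - model["cost_percent"]) // 100


def diff(a, b):
    """Keys whose values differ between two branches, as (a_value, b_value)."""
    keys = set(a) | set(b)
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# "What if sales had been higher last quarter?"
what_if = branch(base, q2_sales=150_000)

actual_profit = profit(base)
what_if_profit = profit(what_if)
changes = diff(base, what_if)
```

The point of the sketch is that the what-if branch is just more data about the data: it can be diffed against the original, kept alongside it, or discarded.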
f
Coming super late to the party, but I'm very excited by what's being spoken about on this thread. Although we're coming at it from different angles, it feels like we want to converge onto the same thing.
d
That would be nice, Federico! My goal was only to get a conversation going, not to get all divided!