In Unix <everything is a file> which makes allows ...
# thinking-together
m
In Unix everything is a file, which allows simple, reusable tools like `ls`, `cd`, `cat`, etc. to be used across the whole system. However, files have their drawbacks as well: primarily, it is difficult to merge them or detect changes to them, both of which are useful for syncing, collaborating, and/or subscribing to data. Has anyone seen any alternatives to this paradigm? My startup is currently exploring the idea of append-only logs as our core primitive instead of files. We're not building a new kernel, but we're attempting to create a new programming environment with collaboration and reactivity as core tenets. Would love to hear other perspectives on the subject!
d
@Matthew Linkous some of this thread is likely relevant to you: https://futureofcoding.slack.com/archives/C5T9GPWFL/p1636714161079200
But in general, there are many great systems out there that have gone past the teletype/terminal and punch-card stack/file model of programming and dealing with information. Smalltalk is one example, or NLS for a more historical one.
m
Ah yes Smalltalk is a great example. Having the file not be the single source of truth but instead an optional artifact of your program is an interesting concept.
I think my focus is less on the representation of a program and more on the representation of data or system state. For example, if you wanted to create a global source of a user's mouse position in a Unixy system, you might designate a file that is continuously overwritten by the kernel and that other processes could stream in. However, if you had multiple mice (somewhat contrived, I know), then you must either deal with locks, create multiple files, or make the file append-only. The same is true for a lot of multi-writer applications, especially collaborative text editing. So it makes me wonder: can/should files be completely replaced with append-only logs?
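To make the multi-mouse case concrete, here is a minimal sketch of an append-only log with several writers. The names (`AppendOnlyLog`, `latest_positions`, the `mouse-a`/`mouse-b` IDs) are hypothetical and the log lives in memory only; it just illustrates that writers never overwrite each other and that current state can be derived by replaying entries.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class AppendOnlyLog:
    """Hypothetical in-memory append-only log; entries are never overwritten."""
    entries: List[Tuple[str, Any]] = field(default_factory=list)

    def append(self, writer_id: str, value: Any) -> int:
        """Add an entry tagged with its writer and return its offset."""
        self.entries.append((writer_id, value))
        return len(self.entries) - 1

    def read_from(self, offset: int) -> List[Tuple[str, Any]]:
        """Return every entry at or after `offset`; each reader tracks its own offset."""
        return self.entries[offset:]


def latest_positions(log: AppendOnlyLog) -> Dict[str, Any]:
    """Replay the whole log and keep the last value seen for each writer."""
    latest: Dict[str, Any] = {}
    for writer_id, position in log.entries:
        latest[writer_id] = position
    return latest


log = AppendOnlyLog()
log.append("mouse-a", (10, 20))
log.append("mouse-b", (300, 40))
log.append("mouse-a", (12, 25))
```

A reader that has already consumed offsets 0 and 1 can call `log.read_from(2)` to pick up only what changed, which is the change-detection property that a continuously overwritten file lacks.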
d
Unclear to me what you mean by "log" here. If you are thinking of building an interactive, interpreted environment, then the static flat-file model/metaphor is only going to make it an uphill battle. You can imagine a message-based system like Smalltalk, or an actor one in the Carl Hewitt sense (in your mouse example, you can have each mouse be an actor sending a message to some hardware manager actor, with no need for files or logs of any kind), or you can come up with another model like it or, better yet, another metaphor.
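The actor version of the mouse example could be sketched roughly like this. It is a single-threaded toy, not a real actor runtime: `HardwareManager`, `Mouse`, and `process_mailbox` are hypothetical names, and the mailbox is drained by hand rather than by a scheduler.

```python
from collections import deque
from typing import Deque, Dict, Tuple

Position = Tuple[int, int]
Message = Tuple[str, Position]


class HardwareManager:
    """Hypothetical manager actor: owns its state and only changes it via messages."""

    def __init__(self) -> None:
        self.mailbox: Deque[Message] = deque()
        self.positions: Dict[str, Position] = {}

    def send(self, message: Message) -> None:
        """Enqueue a message; the sender never touches `positions` directly."""
        self.mailbox.append(message)

    def process_mailbox(self) -> None:
        """Handle queued messages one at a time, in arrival order."""
        while self.mailbox:
            mouse_id, position = self.mailbox.popleft()
            self.positions[mouse_id] = position


class Mouse:
    """Hypothetical mouse actor that reports movement to the manager."""

    def __init__(self, mouse_id: str, manager: HardwareManager) -> None:
        self.mouse_id = mouse_id
        self.manager = manager

    def move(self, x: int, y: int) -> None:
        self.manager.send((self.mouse_id, (x, y)))


manager = HardwareManager()
left, right = Mouse("left", manager), Mouse("right", manager)
left.move(1, 1)
right.move(9, 9)
left.move(2, 3)
manager.process_mailbox()
```

Because the manager handles one message at a time, two mice never need a lock; the trade-off is that each mouse has to know the manager's address.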
m
Yeah, I see what you mean. I think my concern is around persistence and syncing. So I guess if we think in terms of an actor that is in charge of receiving messages/updates from each mouse actor and can then respond to messages that request the mouse state, then the internals (whether it's in memory, stored in a SQL DB, a file, etc.) don't matter because it's a black box, which for the most part is a good thing. However, I'm thinking more in terms of composable data structures. So if you instead think of each mouse as a stream, then to have multiple mice you would just interleave your streams. I.e., I think what I'm getting at is building on top of an event bus rather than addressable messages: it decouples the source and the recipient.
My experience with the actor model is more in the Erlang world than Smalltalk, and I have always enjoyed that each process/actor is completely isolated and could be its own complete computer. However, I've always found the idea of publishing to streams or channels more intuitive and ergonomic for shared state than exchanging messages. For example, in this mouse example, if I had a drawing app then I would likely need to poll the "mouse agent" every 16ms, which seems much more awkward than piping mouse changes into the application, e.g.
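As a rough illustration of "just interleave your streams", here is a sketch that merges two per-mouse event streams by timestamp. `MouseEvent` and `interleave` are made-up names, and the sketch assumes each input stream is already ordered by time.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class MouseEvent(NamedTuple):
    timestamp_ms: int
    source: str
    x: int
    y: int


def interleave(*streams: Iterable[MouseEvent]) -> Iterator[MouseEvent]:
    """Lazily merge streams that are each already sorted by timestamp."""
    return heapq.merge(*streams, key=lambda event: event.timestamp_ms)


mouse_a = [MouseEvent(0, "a", 1, 1), MouseEvent(16, "a", 2, 2)]
mouse_b = [MouseEvent(8, "b", 5, 5), MouseEvent(24, "b", 6, 6)]

merged = list(interleave(mouse_a, mouse_b))
```

The consumer of `merged` sees one ordered stream and does not need to know how many mice produced it, which is the decoupling of source and recipient described above.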
mousePos() |> draw()
Then the actual drawing application/process/actor is decoupled from the mouse agent. You could model this with files, but it seems like a different data structure would be more suitable.
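One way to read the pipeline above is as a subscription: `draw` is called whenever a mouse position is published, instead of polling for it. A minimal sketch, assuming a hypothetical `EventBus` class and a `"mouse"` topic name (neither comes from any real library):

```python
from typing import Callable, Dict, List, Tuple

Position = Tuple[int, int]
Handler = Callable[[Position], None]


class EventBus:
    """Hypothetical in-process event bus: publishers and subscribers share only a topic name."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = {}

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, payload: Position) -> None:
        # Push the payload to every handler registered for this topic.
        for handler in self._subscribers.get(topic, []):
            handler(payload)


drawn: List[Position] = []


def draw(position: Position) -> None:
    """Stand-in for real rendering: record the point that would be drawn."""
    drawn.append(position)


bus = EventBus()
bus.subscribe("mouse", draw)   # roughly: mousePos() |> draw()
bus.publish("mouse", (1, 2))
bus.publish("mouse", (3, 4))
```

Here the drawing code never asks for the mouse state; it only reacts to changes as they arrive.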
d
I think Nushell represents everything as a table
m
Yeah I love that idea!
t
"Everything is a file" is really "*everything is a number of named bidirectional streams of bytes*" [1]. This is very general, and you can build pub/sub and basically any protocol on it. Switching to "everything is an append-only log" seems jarringly limiting and underspecified. How could you implement a network adapter over that abstraction? What is the read pattern for these files? Are the contents queryable forever? Where are these files stored? Though in your later messages you seem to be talking about streams, so maybe you have arrived back where UNIX already is? [1] Linus Torvalds himself corrected it again a bit more precisely: "Everything is a stream of bytes".
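As a small illustration of layering a protocol on a plain byte stream, here is a sketch that frames topic-tagged messages as newline-delimited JSON over an OS pipe. The framing and the `publish`/`read_messages` helpers are assumptions made for this example; also note that a pipe is unnamed and one-directional, so it is a simpler case than the named bidirectional streams described above.

```python
import json
import os
from typing import List


def publish(write_fd: int, topic: str, payload: dict) -> None:
    """Write one message as a single JSON line onto the byte stream."""
    line = json.dumps({"topic": topic, "payload": payload}) + "\n"
    os.write(write_fd, line.encode("utf-8"))


def read_messages(read_fd: int, topic: str) -> List[dict]:
    """Read the stream to EOF and keep only payloads for the requested topic."""
    with os.fdopen(read_fd, "r", encoding="utf-8") as stream:
        messages = (json.loads(line) for line in stream)
        return [m["payload"] for m in messages if m["topic"] == topic]


read_fd, write_fd = os.pipe()
publish(write_fd, "mouse", {"x": 1, "y": 2})
publish(write_fd, "keyboard", {"key": "a"})
publish(write_fd, "mouse", {"x": 3, "y": 4})
os.close(write_fd)  # signal end of stream so the reader stops

mouse_messages = read_messages(read_fd, "mouse")
```

The byte stream itself knows nothing about topics; the pub/sub behavior comes entirely from the convention both ends agree on.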