# thinking-together
w
I'm stewing on the idea of a "literate codec" -- the "quite OK" ecosystem feels like a good place to start. Can anyone recommend modern alternatives to CWEB?
k
Two comments on opposite extremes:
1. I quite like the blog post about QOI. Perhaps it doesn't matter that it's not literate?
2. After starting out enamored with LP, lately I think the most important thing is that a program is easy to run. If the iteration loop is right, it's enough to just throw small programs at someone. They can use it if they care about the domain. If it doesn't seem like people read it, it's because descriptions like this are akin to rooms of requirement: people will come to it when they're ready, months or years later. Just make sure it runs then. The how-to-run instructions need to be rock solid. Everything else is gravy and nice to have -- if you meet the preconditions of a receptive reader and an easy-to-run program. So I think classic dead-fish LP emphasizes the wrong things.
Are you familiar with the QOI ecosystem? This is the first I'm hearing about it, and I'm immediately suspicious of the airy "20% better compression" claim in the repo. Do you know whether other people have validated this claim? It would feel more believable if they had claimed "20x faster encode/decode, 20x shorter implementation, 20% worse compression."
w
No one really uses QOI or QOA in production, from what I can tell.
I think the project stems from the desire to make as simple a codec / format as possible while still being somewhat performant. I also have a hard time believing the "20x blah blah" .. I don't think that was evaluated with much rigor. Or at least, it's cherry-picking favorable stats.
As for the LP part of the question, I'm actually less concerned with it being easy to run and more concerned with it being easy to read. This all stemmed from a number of conversations I had at a conference this week about "how do we teach people how codecs work" etc.
k
I'm actually less concerned with it being easy to run and more concerned with it being easy to read.
Yeah, my claim is that this is a false dichotomy. Reading and running both contribute to building a mental model of a program in someone's head. Reading without running runs into all the Bret Victor criticisms we know and love here.
There are a few different LP systems out there. I've built one myself and know of several more built by people in my circle. This page is one list. But they haven't caught on much, and I think it's because conflating code with books pulls in considerations from the publishing industry that don't actually help build mental models in people's heads. Literate programs look like blog posts, and reading them doesn't actually get people to engage actively with them.
If you separate them from the publishing angle with its irrelevant constraints, other form factors seem more promising:
• Textbooks in the context of some formal class, with exercises. (Your link above seems to be in this category.)
• Documentation in the context of a specific program.
These kinds of circumstances are why in the past year I've started to care more about running first, before I even start reading. If you can run it, the reading experience can be more fault tolerant, and it can be more economical to provide.
I get the sense I might be talking past you, so definitely let me know if I'm misunderstanding your question.
I've been trying not to plug my own stuff, but a couple of links might help triangulate where I'm coming from:
• My literate programming approach. I used this for several years.
• A post on why people don't read programs.
k
I very much agree that it takes both reading and running to engage with code, and that's indeed a major issue with traditional LP. Also with Open Source, btw, which suggests that having full access to the code is enough to understand and modify it, even if it's a huge mess and impossible to build.
That said, integrating code with a narrative becomes very relevant when you also add data (via visualizations). It's not code you engage with then, but data, computational models, etc. This is the reason why notebooks have been so much more successful than traditional code-centric LP.
One major weakness of notebooks is the single narrative. What I would like to have is a graph of narratives, code, and data, everything being interactive. I am aware of two real-life systems that enable this: Glamorous Toolkit and Webstrates.
j
+1 on read and run, preferably in a system that allows per-form evaluation to aid in codebase exploration.
a
Computer programs are complex systems so it is impossible to understand them just by reading them.
For example
(x ^ y) % 9 == 0
is easy to understand as code, but when you run it you get something in a completely different domain, with effects and relationships that you couldn't have predicted
[image: image.png]
that's where liveness comes in - connecting code, domain and programmer
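For anyone who wants to run it rather than just read it, here is a minimal sketch. It assumes `^` means bitwise XOR (as in C, JavaScript, or Python) and that x and y are integer pixel coordinates; the function name and grid size are made up for illustration, and it prints an ASCII grid rather than an image.

```python
def xor_mod_pattern(size=32, modulus=9):
    """Return `size` text rows; '#' marks cells where (x ^ y) % modulus == 0.

    Assumption: `^` is bitwise XOR and x, y are non-negative integers.
    """
    rows = []
    for y in range(size):
        # Evaluate the expression once per cell in this row.
        row = "".join(
            "#" if (x ^ y) % modulus == 0 else "."
            for x in range(size)
        )
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Print the grid; try other sizes or moduli to see how the pattern shifts.
    print("\n".join(xor_mod_pattern()))
```

Changing `modulus` or `size` and re-running is a cheap way to poke at the pattern.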
k
Self-referentially, Alex's comment just took us from abstract generalizations about code directly to the domain of code.
But now this thread connects up for me with a recent discussion on Mastodon about what 'understanding' means, and where 'understanding' lies.
Computer programs are complex systems so it is impossible to understand them just by reading them.
It is arguably also impossible to understand most programs today just by running them.
Which now connects up for me with the podcast episode on Programming as Theory-building: a lot of "understanding" a program comes from figuring out which inputs to pass to it. And I don't mean just some static list of inputs captured in a unit test. You learn the broad categories of phenomena you can expect from a domain, akin to the kinds of orbits people have discovered so far for the 3-body problem. I've learned knacks like this from working with others in the past. (Apologies if all this seems too much of a tangent to the original thread. I can start a fresh one if so.)
b
I recall Xiph doing some well-visualized, well-explained posts on their research into an experimental video codec: https://xiph.org/daala/
w
I have not abandoned this thread... just gotten busy. I'll swing back shortly to read, digest, and respond.
and also, yes, I've loved Monty's documentation and videos -- shame they take so much time to produce