<https://x.com/kgrgreer/status/1921988912857141638...
# linking-together
k
I suspect there is some overlap with narrow waists.
m
we've been paying for "code as text files" ever since
k
"Code as text files" isn't the problem. It's "text files as the only view on code". And I suspect that's true for most of the other universalist protocols in computing. I am currently experimenting with transclusions for providing additional views on code, which remains stored in text files as the underlying programming system requires: https://hyperdoc.khinsen.net/
m
I ~disagree. The problem is the same as with SQL: it is text (files) first, AST (abstract syntax/semantic/etc tree) later. Therefore each must roll their own text->ast convertor. Yeah, you get carrying code around for free, and looking at it with your own eyes too (however, text is still bits, and a text viewer is special software converting bits to human-readable "photons"). But you also get "tabs vs spaces" for free: even if you don't care, someone on the project does. You get all the code-formatter and config wars on projects. You get junk diffs with zero semantic awareness and have to go through them when someone does not comply with the formatting style that was there before. Even for top-10 languages with millions of users for decades. You get linters and bugs and CPU cycles wasted on linting, and failed builds requiring attention because the linter is not happy with a space before a comma. The worst part is: it might have been inevitable.
k
> Therefore each must roll their own text->ast convertor.
Parsers for most languages are readily available. Python even has a Python parser in its standard library. It would be nice of course to make the AST even easier to access, but I don't see it as a major obstacle. The other points you list are partly an issue with text files, but mostly an issue with text representations of any kind. If you make ASTs the baseline representation of code, you still have to decide how to render it on screen, so code formatting matters exactly in the same way, just in a different place (the IDE rather than the text file).
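As a rough illustration of the standard-library parser mentioned above, here is a minimal sketch using Python's `ast` and `difflib` modules. The function `area` and both source strings are made up for the example. It parses two differently formatted versions of the same function: a line-based diff reports changes, while the parsed trees compare equal, because `ast.dump` leaves out line and column positions by default.

```python
import ast
import difflib

# Two hypothetical versions of one function: tabs and tight spacing vs. spaces.
compact = "def area(w,h):\n\treturn w*h\n"
spaced = "def area(w, h):\n    return w * h\n"

# A textual diff sees both lines as changed.
text_diff = list(
    difflib.unified_diff(compact.splitlines(), spaced.splitlines(), lineterm="")
)

# ast.dump() omits position attributes by default, so only structure is compared.
same_tree = ast.dump(ast.parse(compact)) == ast.dump(ast.parse(spaced))

print(len(text_diff) > 0)  # whether the text differs
print(same_tree)           # whether the syntax trees match
```

This only shows that formatting can be factored out of a comparison once a parser is at hand. It says nothing about where the formatting decision should live, which is the actual point of disagreement here.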
m
re: https://hyperdoc.khinsen.net/ would benefit from more tiling (more than 1 pane in each next column)
> so code formatting matters exactly in the same way
It would move it to the client (= your IDE), where you could do all you want and write it however you want. "Tabs vs spaces" would never be a thing, ever. Linters would be compiler plugins alerting on semantics, not this "opening { must be on the new line, build failed" BS.
> I don't see it as a major obstacle.
But it is an obstacle, because nobody writes code in DION as far as I am aware.

https://www.youtube.com/watch?v=GB_oTjVVgDc

Look how much of the text-file-related BS is just absent from the Unison ecosystem. https://share.unison-lang.org/
k
Counterexample: https://github.com/khinsen/GT-Zotero/blob/main/src/Zotero/ZtApiClient.class.st This is Smalltalk code, stored in a text file. That's how it's stored in Git repositories, that's what the Smalltalk system loads into its image. But nobody ever looks at it. It's parsed when loaded, and from then on the code lives in the image as data structures. The text files could be replaced by something else, e.g. ASTs, and hardly anyone would notice. All the problems you describe come from people using text files as their preferred shared user interface. In Smalltalk, nobody does.
m
I know it is not impossible. People follow the path of least resistance. Legacy (backward compatibility, existing code, existing tooling, existing projects hiring juniors, existing skills, habits and preferences) makes changing things very hard. Some even start with the explicit goal "existing things must work" (cough, clojure re: java, cough).
k
Existing things must work is a very good principle if you design tech for solving today's problems! You can't expect anyone to build real-life tools on top of prototypes. My point about text files is mostly that you can evolve their use, rather than abolish them and start from scratch. That's very much "existing things must continue to work". Keep text files, but create better views in IDEs that make text formatting issues obsolete.
g
FWIW, aside:
> Therefore each must roll their own text->ast convertor.
Reality: existing tools want to deal with bracketed text. AST == Lisp
> roll their own text->ast convertor.
Overwhelming problem in 1960. Nothing-burger in 2025 (OhmJS, PEG, I'm fooling with text-to-text (t2t)).
m
I mentioned `semantic` in "AST (abstract syntax/semantic/etc tree)" for a reason. `(foo bar baz)` tells you nothing about semantics. `foo` here might behave like:
• `or` – then both bar and baz must exist, but possibly only one will be evaled
• `def` – then bar is a new sym, which might or might not shadow an existing one, and baz must exist
• `list` – then this is just a list of whatever bar and baz eval to
Just having tokens conveniently grouped with parens is almost never enough.
re keeping text files and building on top of them: if you leave access to the "lower level" format, there inevitably will be "out of band" changes and consequences. And I'd argue that tabs-vs-spaces, formatter wars, linters, package management, junky diffs, etc. are very much "today's problems" too.
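To make the `(foo bar baz)` point concrete, here is a minimal sketch of a toy evaluator, written in Python rather than a Lisp. Everything in it is an assumption made for illustration: forms are nested Python lists, symbols are strings, `TracingEnv`, `evaluate`, and `run` are hypothetical names, and `or` uses Python truthiness, which is not the same as Clojure's. The same three-token shape gives three different behaviours depending only on the head symbol.

```python
class TracingEnv(dict):
    """A dict of symbol values that records which symbols were looked up."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.lookups = []

    def __getitem__(self, name):
        self.lookups.append(name)
        return super().__getitem__(name)


def evaluate(form, env):
    """Evaluate a form given as a symbol (str), a literal, or [head, *args]."""
    if isinstance(form, str):
        return env[form]  # symbol lookup; raises KeyError if it does not exist
    if not isinstance(form, list):
        return form  # literal value
    head, *args = form
    if head == "or":
        result = None
        for arg in args:  # stops at the first truthy value
            result = evaluate(arg, env)
            if result:
                return result
        return result
    if head == "def":
        name, value_form = args  # name is bound here, not looked up
        env[name] = evaluate(value_form, env)
        return name
    if head == "list":
        return [evaluate(arg, env) for arg in args]
    raise ValueError(f"unknown head symbol: {head}")


def run(head):
    """Evaluate (head bar baz) in a fresh environment where bar=1 and baz=2."""
    env = TracingEnv(bar=1, baz=2)
    result = evaluate([head, "bar", "baz"], env)
    return result, list(env.lookups), env


for head in ("or", "def", "list"):
    result, lookups, env = run(head)
    print(head, result, lookups)
```

With `or`, only `bar` is looked up; with `def`, `bar` is rebound and only `baz` is looked up; with `list`, both are. None of that is recoverable from the parenthesized shape alone.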
k
Semantics matter, obviously. But I am not convinced that they can be usefully integrated with syntax or more generally representation issues. I'd love to be proven wrong, because I see the separation of syntax and semantics as an important source of bugs in my field: programs that do something superficially reasonable, but not what a domain expert would expect them to do from a superficial look at the source code.
g
What is 'semantics'? It seems to me that 'semantics' is a set of operations which can be composed to explain useful concepts, and/or perform useful work. 'Semantic' description of some useful concept can be composed of: {opcodes | scripts-composed-of-opcodes | scripts-composed-of-other-scripts}. To me 'syntax' is just a handy notation draped over such opcodes and scripts. For example, Feynman Diagram syntax consists of squiggly lines, software syntax consists of text draped over fundamental opcodes. I'm fooling with using Python as a semantic substrate and using t2t to drape various syntaxes (DPL and TPL (Diagrammatic PLs and Textual PLs)) over the underlying semantics.
Yep, line-oriented thinking is holding us back. I place Dion in the “projectional editor” cubby hole. Lispers have been dealing with tree-oriented coding for half a century. More comments on tree-oriented programming.
m
By "semantics" I guess I mean several things, all stemming from "what do I need to be able to extract from a sexpr to implement the usual IDE features?", which are at least:
• scope: which tokens are visible to a particular expression, e.g. in `(let [x 1] (+ x 1))`, `(+ x 1)` sees `x` from the preceding vector. Note that `[x 1]` and `(+ x 1)` are siblings, same level, yet one can see the other but not vice versa. We (at least I) by default think of `(+ x 1)` as "being inside of let, nested", but the syntactic reality is that the expression dictating scope is a sibling, not a parent.
• control flow branches: which sexprs are sequential and which are parallel: in `(if foo bar baz)`, bar and baz are parallel = "within a single eval of this if, either bar or baz would be evaled, but not both."
• "special" places, e.g. `foo` after evaling `(def foo 1)` is made available as a global var (in Clojure terminology); in `(let [x 1] )`, x is now a local "var" available for anything "later" inside `let`.
• maybe rendering hints (which is just one case of the previous bullet point, I guess), e.g. in `(cond pred1 foo pred666 bar)`, `pred1` and `pred666` are predicates, and `foo`, `bar` are branches, and maybe I'd like the autoformatter to render those as 2 columns:
(cond
  pred1   foo
  pred666 bar)
but if the first symbol is not cond, then it would be just a list, where "items are interleaved columns" no longer applies, and maybe you want to pack-tile them, or render them as a single column, or a single row:
(kek pred1 foo pred666 bar)
(kek pred1 foo
 pred666 bar)
(kek
 pred1
 foo
 pred666
 bar)
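The rendering-hint idea can be sketched as a formatter that consults a table keyed by the head symbol. This is a minimal Python sketch under stated assumptions, not an existing formatter: `LAYOUT_COLUMNS` and `render` are hypothetical names, forms are flat lists of symbol strings, and only the "N columns" hint is modelled.

```python
# Hypothetical hint table: head symbol -> number of columns for its arguments.
LAYOUT_COLUMNS = {"cond": 2}


def render(form, indent=2):
    """Render a flat form (a list of symbol strings) using per-head layout hints."""
    head, *args = form
    columns = LAYOUT_COLUMNS.get(head)
    if columns is None or not args:
        # No hint known for this head: fall back to a single row.
        return "(" + " ".join(form) + ")"
    # Group the arguments into rows of `columns` cells each.
    rows = [args[i:i + columns] for i in range(0, len(args), columns)]
    # Width of each column is the longest cell that appears in it.
    widths = [
        max((len(row[c]) for row in rows if c < len(row)), default=0)
        for c in range(columns)
    ]
    lines = []
    for row in rows:
        cells = [cell.ljust(widths[c]) for c, cell in enumerate(row)]
        lines.append(" " * indent + " ".join(cells).rstrip())
    return "(" + head + "\n" + "\n".join(lines) + ")"


print(render(["cond", "pred1", "foo", "pred666", "bar"]))
print(render(["kek", "pred1", "foo", "pred666", "bar"]))
```

The first call reproduces the two-column `cond` layout above, and the second falls back to a single row because nothing is known about `kek`. The point is that the formatter needs per-symbol knowledge that the parenthesized tree does not carry.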
I disagree that lispers have been doing this for 50 years. (I am sure someone had it in some systems (Smalltalk? some say Interlisp too?), but now it is not even common knowledge that this is not only a possibility but an actual reality somewhere.) On the scale from line-oriented to tree-oriented I see lisp (again, my main lisp is Clojure) only slightly to the right of everything else, like Python:
|----*--*------------------------|
^    ^  ^                        ^ 
line py lisp                     tree
Simply because, if you are:
- writing lisp in vim/emacs/ms code/idea/notepad as text (allows parse errors, like imbalanced parens)
- looking at diffs in a terminal as `git diff` output
- even using a paredit-like thingy which does not allow invalid trees, but which knows nothing about the semantics I listed above (basically, to syntax highlight, you parse the "tree" you just wrote with another thingy again)
- not to mention that you can just open a file elsewhere and put unparsable junk there. I realize you can corrupt any file/format, but think of it as "another coworker edits your paredit-crafted tree in notepad (usual situation, not a rare act of sabotage)" 🙂
then you are writing lines of text = doing line-oriented programming.
re: smalltalk ENVY: going through these: https://www.youtube.com/@jarober/search?query=envy