# thinking-together
a
Just a random thought that has been rattling around in my head while listening through podcast episodes and Ivan’s visual programming points. I am new here, so apologies if all this has been said before, or covered in podcasts I haven’t gotten to yet. I want to propose a slightly different framing for what it means to say that a representation of a program is visual/spatial or not: I think that textual programming is visual. It’s a series of symbols laid out on a screen in rows, and the spatial/visual relationship between the tokens does actually correspond to aspects of the operation of the program. For example, an assignment statement has the thing being assigned to on the left and the value on the right; parentheses group things together spatially on the screen; and statements that appear above others within a block (usually) execute before the ones below (goto considered harmful breaks that down, which I think you talked about). Good code formatting is mostly about making these visual analogies stronger. Zig plays with this by allowing you to release memory on the line below where you allocated it, which breaks down the time/up-down analogy, since the deallocation actually happens after the lines that follow, in favor of spatially grouping two very semantically related statements in the program. That shows that there’s a lot to be played with within the context of textual programming if you think about it in terms of spatial analogies to your program. To explore this more, though, it’s interesting to think about how a program could be represented completely non-spatially or non-visually, and how these two things might differ.
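For readers who haven’t seen the Zig feature being described (its `defer` statement), here is a rough sketch of the same idea in Python, using `contextlib.ExitStack` in place of `defer`; the `allocate`/`release` functions are hypothetical stand-ins, not real library calls:

```python
from contextlib import ExitStack

events = []

def allocate():
    # hypothetical resource acquisition, standing in for Zig's allocator
    events.append("allocate")
    return object()

def release(buf):
    # hypothetical cleanup, standing in for Zig's free
    events.append("release")

with ExitStack() as stack:
    buf = allocate()
    stack.callback(release, buf)  # written on the line below the allocation...
    events.append("use")          # ...but it runs after everything that follows

# events is now ["allocate", "use", "release"]: the deallocation is
# textually adjacent to the allocation but executes last.
```

The point of the sketch is exactly the spatial trade the message describes: the up-down/time analogy is deliberately broken so that two semantically paired statements sit next to each other on screen.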
You could imagine being read the textual code for some program aloud, without seeing the text at all; that would be a non-spatial and non-visual representation of the program. You could also imagine seeing a single visual element, like a black circle on a screen or something, blinking on and off in Morse code to transmit the code to you. To me that would be a visual representation that is not spatial, since the spatial dimensions of that single visual element don’t matter at all to the information being transmitted. It could be an array of dots, or an image of some crazy person’s Pure Data patch, or anything at all blinking on and off, and the program being transmitted would be the same. So what would a representation of a program look like that is minimally spatial? Maybe you could imagine two dots on the screen that transmit a textual program encoded in Unicode, two bits at a time? They flash on with the first two bits for a second, then the second two bits for a second, blah blah blah (so practical!). Is this spatial? There is a lot to explore here, actually. Technically the particular spatial relationship of the pixels on the screen does need to be considered in order to decipher the program in the two-dot scenario, and that is qualitatively different from the one-dot scenario, where no information is gained from examining any spatial relationships. But I am skeptical of whether this is actually minimally spatial. It seems like we should define a spatial representation of a program to mean that some meaningful part of the program is represented spatially, and I don’t think “a two-bit portion of a Unicode character making up a textual program” meets that criterion. But is this “meaningful vs. non-meaningful part of a program representation” a distinction that can even be clearly drawn? It seems obvious that there is at least a real quantitative difference there, if not a qualitative one.
If the difference is only quantitative, then this two-bit scheme would have to count as a minimally (if highly unsatisfying) spatial representation of a program. But maybe there is some way to formally filter out the non-meaningful parts of a program’s representation, the ones that are only essential to the representation and not to the program itself? As was pointed out in the No Silver Bullet episode, though, there are a huge number of different ways to visualize the same program. I think that we are stuck in a single way of visualizing our programs, which is the code. To me the problem is not that code is bad, but that we are missing out on a huge number of different ways we could visualize and “reason about” (lol) our programs. So I’m not disagreeing with the actual takeaway; it’s more a critique of the hard division between textual and visual programming.
❤️ 4
🍰 4
s
Going to need to take some time to digest this, but I wanted to say I really like thinking about "how could a program be represented non-spatially or non-visually", and the comment that "I think that textual programming is visual". One thing that comes to mind: there have been some esolangs that have explored this idea a bit, though I don't think it was from this particular starting point. Velato (http://velato.net/), for example, encodes programs as MIDI files, so you could have an audio representation of a program...
j
Fwiw, a relevant piece of math lore is the informal distinction between 'algebraic thinkers' and 'geometric thinkers'. All of the most useful mathematical machinery, of course, may be represented either algebraically in terms of symbols, or geometrically in terms of diagrams. So these two modes are isomorphic, but not trivially so. This yields many productive conceptual moves. You've described a situation where every representation, I think, is both useful and arbitrary. So that at the limit you get that 'minimally spatial' vision of the 'essence' from which the universe of diagrammatic, symbolic, etc. forms may arise.
❤️ 5
k
This overlaps a lot with some stuff I've been trying out recently to allow us to create new "characters" in our programming environments that string together more spatially than just a linear sequence. My inspiration for a minimally spatial notation is the mathematical notation for summation: https://en.wikipedia.org/wiki/Summation#Capital-sigma_notation. It's a character, and you have a few places to add characters above and below it (in addition to the right). The direction my thoughts have been going with this is: what if we had a way to just draw new such characters, specify holes around them where other characters go, and then use them transparently in an otherwise textual screen of code? The primary use case in my mind is function calls, making them more visually distinct. Links to my posts, all in #C03RR0W5DGC over the past few weeks: https://futureofcoding.slack.com/archives/C03RR0W5DGC/p1719791347513999 https://futureofcoding.slack.com/archives/C03RR0W5DGC/p1720015836688779 https://futureofcoding.slack.com/archives/C03RR0W5DGC/p1720660021938839 https://futureofcoding.slack.com/archives/C03RR0W5DGC/p1720750642619629
1
❤️ 4
a
@Scott this is a cool esolang! Definitely an example of a completely non-spatial representation, if you were to listen to the MIDI.
👍 1
k
@Scott on the subject of esolangs, one earworm I have never been able to free myself of is, "what if Befunge but readable?" https://akkartik.name/archives/foc/thinking-together/1681698511.039269.html https://akkartik.name/post/2020-03-13-devlog
💡 1
a
@Kartik Agaram this is almost exactly where my thoughts went: that thinking about textual programming visually could give you a path to a sort of incrementalist way of getting more and better spatial metaphors from where we currently are with textual programming. I will have to read through some of these threads; there is a lot of cool discussion to read through in this Slack.
❤️ 1
@Jasmine Otto That divide in math does map to what I am talking about here, I think. I am reading through Sipser’s computation textbook right now, and he gives the formal algebraic and graphical representations for finite automata, pushdown automata, Turing machines, etc., and it’s always easier and quicker to understand the graphs. But the algebraic definitions are still spatial when written down on paper, in my mind, so maybe it’s not exactly the same thing? Not sure though.
🕸️ 1
s
@Kartik Agaram hah! First I loved this line: "Now you can imagine a reduce operation as a series of solar systems that the spaceship visits one by one."...going to have to play with this a bit...I think there might be something here - I've been noodling around about new types of interfaces for AI agents...and a fully 2d/spatial metaphor for building/interacting/observing them seems worth digging in to...
❤️ 1
d
This is really thought provoking. As someone who is a big fan of text, it's really interesting to think of it as a visual medium. It reminds me of a friend who I did uni with, where we took a course programming 6800 microcontrollers in assembly. And of course being the types we were, we decided that we needed to build a better IDE for assembly to make the course tolerable. One of the first things we tried was just changing the lines' background colour at each basic block boundary. That's playing with the visual nature of the text representation. There wasn't anything profound about that, I'm just reminiscing and saying thank you for this post!
❤️ 5
s
Ahh! That’s a great point, that must be what resonated with me about the “text is a visual medium” line…years ago I suggested to a new developer on my team that he turn syntax highlighting off in his IDE for a few months, to start to build the muscle of being able to see the structure of the code he’s working with rather than the details…it’s the inverse of what you’re saying, but it still points to this idea that we do treat text visually without really talking about that aspect of it
💯 1
d
Syntax and code formatting is so often and hotly debated because it deeply affects the visual quality of our textual representation of the program!
🤯 2
a
Very true, it’s a way we can have control over the visualization of the program within the textual paradigm
o
Don't forget syntax highlighting: remove all colors from a piece of code and find yourself disoriented. Quoting the original post: "...meaningful part of the program is represented spatially". I think that is exactly the question: what parts should be visually/spatially represented? The big one in textual code is that top-to-bottom is execution order. We are giving one dimension to basically procedural programming.
🍰 1
k
Not always, which complicates things. `a, b, c` executes from left to right, but `a(b(c))` executes from right to left. As those names grow long, they also take up the vertical axis (top to bottom and bottom to top, respectively).
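A quick sketch of that evaluation order (a minimal Python example; the names `a`, `b`, `c` are placeholders, and `c` is treated as a call for illustration):

```python
order = []

def a(x):
    order.append("a")
    return x

def b(x):
    order.append("b")
    return x

def c():
    order.append("c")
    return None

# Written left to right as a(b(c())), but evaluated inside-out:
a(b(c()))
# order is now ["c", "b", "a"]: the rightmost call runs first.
```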
👍 1
👍🏻 1
d
I had a similar question and reduced it down to "what would an audio-only programming language sound like?" My guess was a lot more subgoals, markers for "filling in later", and fine scoped tests/examples.
👍 2
h
Loving all of the orthogonal thinking prompts throughout this thread. My perspective is that text-based programming is more characterized by the interaction model of essentially being equivalent to typesetting a document. Also worth considering is that many people interact with textual programming strictly non-visually through the use of text-to-speech. In my mind, things move to being more "visual" or "spatial" programming as the document interaction model goes away and the mapping to audio-only interface becomes less clear.
👍 1
a
Ok, so sorry for posting a super long thing again, but what @Jasmine Otto brought up about geometric vs. algebraic representations got me thinking. If you define an “algebraic” representation to be a representation of some formal object that can be expressed as a sequence of symbols, and a “geometric” representation to be one whose symbols can be oriented anywhere in 2D space in relation to one another, then I think you can actually draw a pretty clear distinction between the two. Maybe the distinction is better defined as 1-dimensional vs. 2-dimensional, but I will stick with this algebraic vs. geometric framing. Say the formal object we are representing is the fully connected graph on 3 nodes a, b, c. With a geometric representation, you can have three nodes laid out as an equilateral triangle, with 3 edges connecting them. Very simple. You have 4 different symbols: a, b, c, and <line>, which represents an edge. In the standard algebraic representation, where you have a pair (V,E) of vertices and edges, the representation is ({a,b,c},{(a,b),(b,c),(c,a)}). The symbols used are (, ), {, }, a, b, c, and “,”, which is 8 different symbols, and the total number of symbol instances, counting repetitions, is 29 (if I counted right). To simplify things a bit, let’s say we just want a representation that can distinctly express any graph on just these 3 nodes. For the fully connected graph you can think of a sequential representation that has the same number of symbol types and symbol instances as the geometric representation: “a-b-c-”, where the “-” at the end is interpreted to mean there is a connection back to the first node in the sequence. Or you could simply write “abca”, where two nodes have an edge between them if they are next to each other in the sequence. In this case we have a repeated symbol, “a”, but fewer symbol types than in the graphical representation.
But then you could similarly eliminate the <line> symbol from the geometric representation by saying that nodes are connected if the symbols representing them are touching. (Maybe replace each symbol a, b, c with that symbol inside a circle to imagine this better.) Now I think you reach a point where there is a real difference. In the geometric representation, you can express the fully connected graph on nodes a, b, c with only 3 symbol types and 3 symbol instances. Whereas in the sequential representation, there is no way to do that with only 3 symbol types and 3 symbol instances: with only “abc” you don’t know whether a is connected to c or not. Textual programming is purely sequential in nature. No matter how you format your code visually, the one-dimensional sequence of characters fully determines the program. So you will need either more symbol types or more symbol instances to represent programs textually than we might if we used 2D space!!
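The sequential-encoding half of that argument can be checked by brute force (a sketch; `decode` implements the adjacency rule described above, where consecutive symbols share an edge):

```python
from itertools import product

def decode(seq):
    """Adjacency rule from the post: consecutive symbols share an edge."""
    return {frozenset(p) for p in zip(seq, seq[1:]) if p[0] != p[1]}

triangle = {frozenset("ab"), frozenset("bc"), frozenset("ca")}

# A length-3 sequence yields at most 2 edges, so no 3-symbol string over
# {a, b, c} can decode to the 3-edge triangle:
hits = [s for s in product("abc", repeat=3) if decode(s) == triangle]
# hits == []

# Four symbol instances suffice, matching the "abca" encoding in the post:
full = decode("abca")
# full == triangle
```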
🍰 2
i
Shorter version of the previous: In visual programming, spatial arrangement of objects doesn't have to be meaningful. But it can be. That part is open to design. Now I'll add: In a good system, the programmer would get to choose whether spatial arrangement is meaningful or not. In a good system, the movement of objects through time is just as significant as relative positions, connections, visual representations, etc. There are physical dimensions we haven't explored. If you live in text, you won't feel the pull to explore them.
❤️ 3
o
@Adam Davidson Reading your post... you know, writing down geometric and algebraic problems like this is sort of a prompt for an AI? That makes me realize why I dislike AI for coding: it's like we un-invented algebraic notation and went back to writing prose. Instead, I want to go in the other direction.
👍 3
d
Love this train of thought. I like to think of visual programming as a 'superset' of textual programming. In general the following roots my thinking in a broader context, but beware: this definitely has im14andthisisdeep energy. Perhaps the stark divide between computation and physicality is an illusion, as all computation is physical in nature, with much of it being observable: some through a representation on an interface (the 'visual' component), and some so hidden in patterns of electricity. So we might say minimizing the visuo-spatial element of a computation is either compressing the representation of the code (now you don't need semicolons!) or leaving information out (no more writing explicit types!).
❤️ 1
a
I live for getting into the I’m-14-and-this-is-deep topics personally 😂. Yeah, I do think that the physical/platonic divide is not something that exists in the real world; it’s a divide of human concepts, not real-world stuff, in my opinion. But as long as you recognize we are dividing up words and not real things, I think it’s useful.
❤️ 1
j
the argument for 2d representations having more degrees of freedom than 1d representations falls out of first principles, I agree! now I'm suddenly very curious what the process of compiling a 2d representation to an abstract syntax tree looks like, and I'm sure Ivan has been thinking about this
❤️ 2
k
@Jasmine Otto This was one thought I had even in the recent threads on #C5U3SEW6A:
• https://futureofcoding.slack.com/archives/C5U3SEW6A/p1721317585631879
• https://futureofcoding.slack.com/archives/C5U3SEW6A/p1721317807883659
The links are talking about things like state diagrams and time sequence diagrams, and it's been on my mind to try compiling down some example state diagram or time sequence diagram. My suspicion is that most of the charts those links describe as useful in practice are at a high level of abstraction. You can't generate code from such a diagram in isolation, because high-level pictures by definition abstract/hide a lot. So at best the diagram is augmenting the code. In which case, how do you refer to stuff in textual code inside diagrams, and (harder) refer to stuff inside diagrams in textual code? Would we say things like, "find all the edges coming out of the node labeled `Foo`"? Seems ugly. Perhaps a cleaner alternative is that the text is authoritative, but we can verify that a diagram is in sync with the text. Or even generate the diagram from text. That was something I was playing with a month or two ago before I got side-tracked by other things, but I hope to return to it at some point. Have the compiler's output be not just some executable for the computer but also summaries (at multiple levels of detail) that people can look at. I think the challenge here is finding diagrams that are likely to be simple (<20 nodes or so), because anything more complex is a rat's nest when a computer lays it out automatically. So: scale-invariant properties of a codebase, ones that will remain simple even when the codebase grows complex. cc @Ivan Reese @Joshua Horowitz. I wish there was a way to merge threads.
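The "generate the diagram from text" direction can be sketched in a few lines (a toy, not anyone's actual tooling: it extracts a top-level call graph from Python source with the standard `ast` module and emits Graphviz DOT):

```python
import ast

def call_graph(source):
    """Collect (caller, callee) edges for top-level function definitions."""
    tree = ast.parse(source)
    edges = set()
    for fn in [n for n in tree.body if isinstance(n, ast.FunctionDef)]:
        for node in ast.walk(fn):
            # only direct calls to plain names, to keep the sketch small
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                edges.add((fn.name, node.func.id))
    return edges

def to_dot(edges):
    """Emit the edges as a Graphviz DOT digraph."""
    lines = ["digraph calls {"]
    lines += ['  "%s" -> "%s";' % e for e in sorted(edges)]
    lines.append("}")
    return "\n".join(lines)

src = """
def helper():
    pass

def main():
    helper()
"""
edges = call_graph(src)
dot = to_dot(edges)
```

A real version would need the scale-invariance filtering described above; automatic layout of the raw graph is exactly where the "rat's nest" problem kicks in.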
❤️ 1
j
For sure, I'm really excited about both posts. I wonder if we know who is responsible for the dataflow minimap in Observable? For me it's been a killer feature in diagnosing topological execution. (Even then, a length-five cyclic dependency in the update loop of some interactive diagram is still quite a pickle to deal with.) We've also been thinking about the value of high-level 'speculative visualizations' (as well as the data-driven kind!) over in technical games research, where there's been a great deal of drift between the game analytics literature and the concerns of narrative design. I'm very glad this problem of high-level underspecification comes up in (design-minded) developer support tools more generally. I don't remember if I posted my DendryScope work in its own right? It's an alternative compiler for the Dendry IF authoring language that does the static analysis, basically. After I gave up on box-and-wire diagrams and went with heatmaps, I had more trouble with combinatorial explosion than with layout. Demo: https://jazztap.github.io/DendryScope/ Paper: https://jazztap.github.io/dendry-paper/2023_AIIDE_Dendryscope_Camera_Ready.pdf
💡 1
I agree that 'compile the DSL to both an executable and a diagram' is perfectly feasible. Going the other way, from a diagram to a DSL? Now there's the golden problem of design engineering, I think.
k
I see, so you're imagining that a high-level diagram might compile down to a textual DSL with some interface that lower levels of textual code can hook into as needed? Interesting.. I had not seen your project before! Thank you for sharing it. You should definitely create a thread for it in #CCL5VVBAN. I'm going off now to read more about this Dendry thing I'd never heard of 🙂
🙏 1
j
I mean, the trouble isn't really in building the compiler. It's just graphical bindings once someone has got the right diagram-to-DSL fit. Now you're going back and forth between the algebra (an AST) and the geometry (a diagram), so everything is fine. All I need to do is articulate what a 'good fit' looks like, and why any given project needs a suitably multi-disciplinary design engineer to go and find it. These people seem to be unicorns, from a management perspective, but also a lot of them are here in this community right now. That will be the subject of a full-on Substack post, I imagine.
k
Yeah, I'd be 🏃🏼 to read it. From where I am now, yes the compiler doesn't seem hard. What seems hard is co-designing the source grammar and the target DSL for some domain..
i
@Jasmine Otto
> I wonder if we know who is responsible for the dataflow minimap in Observable?
I believe @Toph Tucker worked on it.
🙏 1
g
@Jasmine Otto
a) Minor technicality: once compiled, it is no longer an abstract syntax tree, it is a concrete syntax tree; the compiler uses an AST internally for pattern matching against all possibilities, then spits out a pruned-down CST to represent what was actually found in the input.
b) I believe that attacking low-hanging fruit is a good first strategy. Many 2D editors save out diagrams in XML and JSON format. Once the information has been compacted down to text, albeit ugly to humans, it can easily be parsed by existing text-parser technologies, like XML parser libraries, or my favourite, OhmJS (a descendant of PEG parsing tech). I use draw.io as a DPL editor. I've used yEd, and I keep eyeing Excalidraw.
c) Parsing is much simpler if you use a technology with backtracking. I use Prolog (SWIPL, to be exact) a lot. I use Nils Holm's Prolog in Scheme a lot (I've ported it to Common Lisp and to JavaScript, and am working on lifting it upwards to be able to spit out just about any 3GL, like Python, WASM, etc.). PEG is more useful than CFGs and regexes, since PEG incorporates backtracking.
d) Simple grade-school math, e.g. `(x,y)`, `intersection`, `containment`, goes a long way towards creating useful semantic info that can form relations in relational languages (Prolog, miniKanren, etc.).
e) Do unto DPLs what was done to TPLs: pare down the input language to something that can be parsed. We use only a tiny sliver of English in most PLs; similarly, we need only use a tiny sliver of 2D graphics to form DPLs, e.g. "rectangle", "arrow", "text", "group". We don't bother to parse all of War and Peace; likewise, we don't need to parse Rembrandt.
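Point (d) is easy to make concrete (a sketch in Python; the rectangle representation and predicate names are my own, not from any particular tool):

```python
from typing import NamedTuple

class Rect(NamedTuple):
    # axis-aligned rectangle: top-left corner (x, y), plus width and height
    x: float
    y: float
    w: float
    h: float

def intersects(a, b):
    """True if the two rectangles overlap (or touch)."""
    return (a.x <= b.x + b.w and b.x <= a.x + a.w and
            a.y <= b.y + b.h and b.y <= a.y + a.h)

def contains(outer, inner):
    """True if `inner` lies entirely within `outer`."""
    return (outer.x <= inner.x and outer.y <= inner.y and
            inner.x + inner.w <= outer.x + outer.w and
            inner.y + inner.h <= outer.y + outer.h)

# Facts like these are exactly the relations a Prolog-style engine would consume:
group = Rect(0, 0, 100, 100)
node = Rect(10, 10, 20, 20)
other = Rect(200, 0, 20, 20)
```

Here `contains(group, node)` holds while `intersects(group, other)` does not; emitting such tuples as `contains/2` and `intersects/2` facts is the bridge to the relational languages the post mentions.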
j
@guitarvydas All good points. Let's say I want to parse a 'tarot spread' (i.e. rects with three distinct orientations) for alignment, ordering, and superposition. The loop I'm seeing here is DOM events (drag & drop) -> SVG output in an Observable notebook -> CST ('scene graph') of (cx, cy, w, h) tuples -> favorite Prolog or ASP to produce fixed-length 'reading orders' -> rank by parsimony heuristic -> splat top three to prose. In this case the diagram is already 1-1 with our scene graph, so step (b) of parsing is bypassed. Upon performing steps (d,e), we additionally have a pile of highly redundant adjacency predicates. Turning those redundant statements into a clear statement of facts is a problem that you or I can solve on the spot, obviously, by exploiting the diagram using spatial cognition. But writing down an equivalent 'pathfinding algorithm' is very interesting and properly challenging. It's a problem I've been meaning to get back to, given the right brain sparkles. Hmm.
Step (c) is the essential bit. Others are more proficient than I. But WASM ports have been massively helpful for me to learn with. I see SWI-Prolog has one of these now!
g
I don’t comprehend what you’re saying; maybe sleeping on it will help. Yet the first thing that jumps to mind is that, instead of figuring out every little bit of the parsing algorithm, maybe a little LLM can help? Kind of like a 1:N decoder chip in hardware, for exploiting the diagram using spatial cognition. As I understand it, you need something that sees a pattern in all of that. Once you know the pattern, does it become easier to write a Prolog-y post-algorithm of what to do with the info? If this seems like an interesting direction, then a test would be to write the pre- and post- code and leave the middle, where the little LLM should go, stubbed out, using manual editing to insert appropriate values. Then see what ideas/understanding that generates. @Jasmine Otto
j
The fault is mine, I'm picking up in the middle of an old project with too much intermediate state. My old frontend code still works and gives me the scene graph, a list of card positions. That's new to me, but a demultiplexer is the right description for going from List[Positions] of size 4 to a List[Relations] of size 20+. But you're right that mocking up the List[Relations] is the way to go - then I can start with high-quality sentences (reading orders), and somehow reverse-engineer the Prolog algorithm.
whoops @guitarvydas
g
@Jasmine Otto my favourite tools for exploring a design space: (1) procrastination, (2) just doing something, and (3) using Common Lisp, because it makes it imaginable to change my mind, throw away all of the code, and begin afresh. [I don't feel that same freedom when I engage in premature type-ing with C and Python etc., or even OOP; the thinking:code ratio is much higher in simple dynamic languages.]
😆 1