# linking-together
s
For those working on graphical, node-based, drag-n-drop interfaces: I recommend reading “Always bet on text” — https://graydon2.dreamwidth.org/193447.html — What do you think?
👍 2
i
I think… computers.
🙂 2
j
Not working on that, but still chiming in: I think they make some great points, but there’s an implicit logical fallacy. Just because text is the best thing we’ve made so far, doesn’t mean it’s the best thing. We’ve only had ~50 years of being able to separate presentation from storage of information, and even less time with interactive media. So, it makes very little sense to me to go from “A has had 5000 years of evolution and refinement, B has had 50 years of evolution and refinement, and A is better than B” to “A will always be better than B” (which is unsaid, but was implicit to me).
💯 2
Actually, “always bet on text” is pretty explicit.
i
There's no best cuisine, no best genre of music, no best style of painting, no best kind of shoe. You could write a snappy blog post like this arguing that French, jazz, impressionism, and sneakers are the best. You could write an article making an equally snappy case that text is the worst form of communication, just by choosing a different basis for comparison. This article is a paean, not a proof.
👍 4
The best part of the article is:
If you can use text for something, use it.
I think the thesis behind a lot of graphical / visual programming languages is: there are things we want to be able to do in programming that we simply can't do with text. (Here's a fun seed. Imagine that all programming was graphical, and there was no use of text at all. Now imagine what people would be imagining when they say: there are things [...] we simply can't do with graphics.)
❤️ 2
d
How about we update the title to something that is more supportable: "Always bet on text, unless a graphical interface is a better solution, then use that instead".
@Ivan Reese Well, there are things you simply can't do with graphics, such as listening to music. Audio is a much better medium for that.
👍 1
😑 1
s
See if you can write some text to convey this
👍 5
Or this:
I think the article evokes a sharper distinction than is useful - it's more helpful to think of the various aspects and overlaps of representations. Consider a plain text blob description vs a spreadsheet - both are 'mostly text', yet there are big differences in how these representations work and are interacted with.
🍰 3
a
I mostly agree with the article, but it has an anti-progress, get-off-my-lawn kind of position. I’d say “don’t underestimate the unreasonable effectiveness of text”. If we’re going to try to replace text-based coding then the replacement had better be really, really good
👍 5
s
Is hypertext text?
i
Is hypercard card?
🗃️ 1
😆 3
e
Adobe makes Photoshop for bitmaps, Illustrator for vector art, InDesign for text, Dreamweaver for creating hybrids out of text and graphics, and for programming they had Flash/Flash Builder. I think all 5 realms these products belong to will still exist 20 years from now. Clearly some merging has happened, where vector art has crept into Photoshop and bitmap features have crept into Illustrator, and programming features are now present in more interactive tools like Adobe XD. When you want vector art a bitmap does not work, but going the other way is an easy transition. So I see the future as combining all of our existing components using easier and easier tools to make things. There are some serious issues with 'diff' and 'grep' in graphical tools, so a graphic-to-text mapping may need to exist. However, this adds greatly to the complexity of the tool. But even in purely text-oriented tools like InDesign, the product is full of graphical interactive palettes and toolbars. So everything is really a hybrid anyway.
y
Also: worth mentioning that what is considered "text" is culturally mediated, and as a definition it is pretty blurry. (Some) UNIX people talk like it's some kind of fundamental force of nature, which I think muddies the water of these discussions a lot.
👍 2
🍰 3
I think it's illuminating to carefully look at old manuscripts, esp. from before typewriters, and see how rich people's conception of writing was, before it was mechanized.
For example, consider Galileo's notes. Which parts are text and which parts are pictures?
👍 1
💯 3
(To anyone curious about expanding their perspective on this topic I recommend reading Unflattening by Nick Sousanis and Understanding Comics by Scott McCloud, and maybe throw in some Edward Tufte.)
👍 8
w
@Andy F I like your sentiment. To generalize: when proposing something different, it's best to keep in mind what makes the same-old same-old good. Do any of those good qualities translate? Are they replaced with contrasting goods? Or is the something different just better at different things?
i
What do you think, @Steve?
f
Text can convey ideas with a precisely controlled level of ambiguity and precision, implied context and elaborated content, unmatched by anything else.
For programming, it's important to distinguish between presentation and storage. In the presentation, you want a personalized experience, taking into account the user's context (e.g. which syntax sugar, libraries or PL features a user knows). For storage, you want an unambiguous representation including all relevant context. PLs typically use the exact same representation for presentation and storage, which requires compromises. It's neither personalized (e.g. displaying `HashMap` instead of `std::collections::HashMap` in my editor) nor unambiguous (e.g. a python script that doesn't specify compatible versions of the interpreter or any of its imports). Text is a great communication medium in a lot of cases. In others, pictures, formulas, etc. are better suited. Providing different presentations is probably better than any single presentation (which is projectional editing, after all). Due to their conflicting requirements, it's necessary to separate presentation from storage! I don't care whether the storage representation is textual, binary, or png-pics as long as it's well-defined, complete and easy to work with.
👍 2
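A minimal sketch of that storage/presentation split, assuming a made-up store of fully-qualified names rendered through per-user aliases (every name here is invented for illustration):

```python
# Storage: unambiguous, fully qualified, with pinned context.
stored_call = {
    "function": "std::collections::HashMap::new",
    "language": "rust",
    "toolchain": "1.70",  # the context a bare script would omit
}

# Presentation: a per-user lens over the same stored fact.
aliases = {"std::collections::HashMap": "HashMap"}  # this user's imports

def render(call: dict, aliases: dict) -> str:
    path, _, name = call["function"].rpartition("::")
    return f"{aliases.get(path, path)}::{name}()"

print(render(stored_call, aliases))  # HashMap::new()
```

Two users could render the same stored program differently without ever touching the stored form, which is the projectional-editing point.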
c
I think the thesis behind a lot of graphical / visual programming languages is: there are things we want to be able to do in programming that we simply can't do with text.
There's also the fact that blog posts, documentation, lectures, personal notes etc, all optimised purely around making programming as easy as possible to understand, tend to be packed with diagrams.
s
@shalabh — Your bridge example is interesting. What does the image convey to you? Because to me, I see no structural definitions, no measurements, no equations, and a hard-to-see use-case (i.e. I'm assuming it's a bridge for cars?). I think the image says a lot, but perhaps more emotion and art than logic and mathematics.
👍 1
Generally speaking, I'm in high agreement with Graydon Hoare's post, specifically around the use of text in logical and mathematical representations of complexity. I did a study on this ~7 years ago and the results confirmed that text was optimal for expressing, sharing, understanding, writing, and reading complex logic. I did this study outside of academia during my younger years as a professional, so the records of it are all but lost on some hard drive in a landfill, but the memory is still very much alive. I do want to run this study again…
However, it's really all about the perspective of application here — UX. Nodes, images and other forms of HCI are powerful; text may be best, but that does not make it exclusive. I would however argue that text is best for many domains, but not all.
👍 2
r
This is probably obvious at this point, but it's worth re-iterating that visual programming languages are extremely popular anytime "programming" is used outside of traditional software development. Here's a quick list of a few examples that I personally use: Max/MSP, Reaktor, Blender, Houdini, and VCV Rack. I would even go as far as to say visual programming languages are actually the default approach outside of traditional software development. This leads me to believe that visual programming languages, in a vacuum, are actually better, outside of one absolutely gigantic feature: Collaboration. Version control is the canonical example of a powerful collaboration feature powered by plain text, but even Stackoverflow, Slack, and any other place you cut and paste code are examples of leveraging plain text for collaboration. Using visual programming languages so much in various non-software development pursuits has convinced me that outside of these collaboration features, they're probably better. But if you are building complex systems that need collaboration between many different individuals, they're much worse.
g
i can’t get over the fact that text-adherents deny all the visual affordances in everyday programming: if all we need is ascii, why indent? why syntax color? why use symbols to represent operators instead of words? what about things like decorators in javascript or python, which are positional? why do we use arrow notation for functions in a lot of cases? what’s the textual semantics of separating keys and values with a colon? what do we think parentheses are?
☝️ 1
👍 1
s
@Garth Goldwater Indeed, I agree some overlook or discount the topics you present — however, they are not universally required in textual representations of logic. For example: "when the door bell rings, text me" — this statement does not need any syntax sugar to have meaning, yet it conveys a respectable depth of complexity, would you agree? The reason indentation, color, etc. are applied is to increase depth of meaning. If we apply colors, indentation and symbols to our doorbell workflow, then they are (at least I hope) only added to improve reading/writing/understanding of the logic, but by no means are they required at this depth of complexity.
👍 1
In this particular example (and I have thousands more), text is best. Drawing a picture of the doorbell ringing and a text message sending is less effective at conveying the required complexities of our intended behavior — in fact it would add a terrible amount of unnecessary complexity: a drawing may include unnecessary emotion, and a diagram unnecessary shapes/lines/structures/visual-space.
@Howard Tang this conversation and topic is very relevant to our recent conversation.
If I may continue: though text may be best in places, it does not need to be exclusively text. Why choose one? Why not choose text when appropriate and visual when appropriate? I would argue the latter here is most optimal for UX.
👍 1
g
@steve you’re working at a different level of text, so i’m not sure the argument applies here. storyscript (as i understand it) aims to have the user communicate in natural language, then conversationally hone in on a mapping of that language to the formal, underlying system, right? i wouldn’t describe that as fundamentally textual (at least for the purpose of this discussion) since “displayed code” gets evolved over the course of the interaction—does that make sense?
s
@Garth Goldwater I intentionally left Storyscript out of my argument 😉 but yes, it has influenced my argument, no question about that. But my intention and argument was more about generalized logic, which has vastly different weights of complexity, ranging from simple Alexa commands to something much closer to what traditional programming supports. I'd rather not bring Storyscript into this conversation and instead focus on different levels of necessary complexity. Generally speaking, I'm arguing that text is best for the majority of complex domains, but not exclusively.
c
One example where I don't think text is better is logic gate circuits. The way text is laid out, it's not obvious what the inputs are and what the overall output is, e.g. "A or B and C or D and E". This is a very simple example, but it already needs brackets, and even then it isn't clear. When you see highly structured text (like law), it tends to essentially be laid out diagrammatically (with indents), and it's still not very clear.
🔍 1
❤️ 1
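To make the bracketing problem concrete, here's a quick Python check; the values are chosen so the implicit grouping and one plausible regrouping disagree ('and' binds tighter than 'or', as in most languages):

```python
# "A or B and C or D and E" silently parses as A or (B and C) or (D and E).
A, B, C, D, E = True, False, False, False, False

implicit = A or B and C or D and E        # True
regrouped = (A or B) and (C or D) and E   # False
print(implicit, regrouped)                # one flat expression, two circuits
```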
Text shines when you have a large vocabulary
👍 1
🍰 1
s
@shalabh — Your bridge example is interesting. What does the image convey to you?
@Steve - it's about the arrangement of beams and trusses - how do you convey that in text? "There are two large trusses. The left truss has N beams. The first beam starts at coordinates (0,0) at an angle of 15%... this is connected to two other beams." Then how do you describe all the connections? You could even make a table with the list of beams with coordinates and connections, but a reader wouldn't get it until they reconstruct the diagram from the table and look at it.
👍 2
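One way to feel that point is to actually write the table out. Here's a hypothetical flat encoding of a tiny three-beam truss (all numbers invented); the data is complete, but the shape isn't visible:

```python
# Every beam as a pair of (x, y) endpoints.
beams = {
    1: ((0.0, 0.0), (4.0, 3.0)),   # left diagonal
    2: ((4.0, 3.0), (8.0, 0.0)),   # right diagonal
    3: ((0.0, 0.0), (8.0, 0.0)),   # bottom chord
}

# The connections are even derivable, so the encoding is unambiguous...
joints = {(a, b) for a in beams for b in beams
          if a < b and set(beams[a]) & set(beams[b])}
print(sorted(joints))  # [(1, 2), (1, 3), (2, 3)]

# ...yet a reader still has to re-plot it to see "a triangle", i.e.
# reconstruct the diagram from the table before the arrangement is legible.
```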
g
Ok—I think I might have a better way of putting this. Articles about the primacy of “text” as a medium frustrate me because they move the goalposts on what they mean by text, and often are used to justify a world where we continue to edit programs character-by-character in order to produce a single array of characters that pass lexing, parsing, and compilation or interpretation checks. This is frustrating. Here’s one of the example quotes from the article:
“Human rights are moral principles or norms that describe certain standards of human behaviour, and are regularly protected as legal rights in national and international law.”
I'm not sure how to convey that graphically, but I'm also not sure how to define even a process to find an appropriate semantic model for that sentence. Some quick bullet points:
• The social context for this sentence matters. You could be reading it at a filibuster, an open mic night, or during a debate around a constitutional amendment. All of those wildly change the intention that the audience (or the "compiler") should intuit from the text.
• The vast majority of the meaning of this sentence is buried in definitions, assumptions, behaviors, cultures, etc—there are a million "why?" questions a naive alien could ask about this sentence—to say nothing of the non-precise definition of every single word used in it—reminding me of a long-lost old tweet: "Words are a scam. If you look them up, they just mean other words. That's how they get you."
• Even humans, who designed the damn language thing, can't agree on definitions or reasonable inference. That's why lawyers and judges and politicians write and argue about what laws mean all the time.
All this is to say that the advantages that authors of these types of articles often cite as belonging to text tend to actually belong to human language, not arrays of ascii or unicode. Sorry—I'm bringing up storyscript again because it's cool and a good example: I find the most appealing part of being able to write the sentence "text me when the doorbell rings" and have it work is the idea that there is a tool or process set up that 1. knows who i am 2. understands that "text me" means "text me `the doorbell has rung`" 3. knows which doorbell I mean when I say "the doorbell"
👍 4
this is the second time I've accidentally sent the message before finishing it so I'm just plowing ahead
s
@shalabh https://blokdots.com/ blends this very well IMO
❤️ 3
Beautiful blend of visual and textual representation of complexity.
@Garth Goldwater I will return to comment soon, have meetings for a while now 😄 — Also DM so we can schedule time to chat and show demos.
👍 1
g
What do all these specialized tools for data entry actually give you? Context, live feedback, and constrained input. And most of them _don't feel like using a language_—the thing about flow states is that they deal with feeling embodied. So a fencer considers their sword an extension of their arm, a driver considers a car the extension of their body, and a musician often considers their instruments an extension of their hands, voice, or speech. There's a reason we put frets on guitars, even though a rational being with an infinite time horizon would be perfectly fine with labeled fretless guitars, or simply perfect pitch.
The thing about all these affordances is that they quickly move from conscious to subconscious attention as people become more familiar with them through use. That's (part of) what generates flow states. So most of what good visual or gestural interfaces provide quickly becomes invisible. The use of carets for editing, highlighting for selection, clicking for activating or selecting, etc, eliminates reams and reams of parameters we'd have to add to functions if we were doing everything with text-based languages—but we don't talk about this when we talk about text and programming languages.
That's the power I see in phrases like "the doorbell". It's an implicit selection or context that doesn't require you to hop out of your subconscious way of thinking ("*that* doorbell, which is close at hand, or in my mind") and into machine language ("what's the UUID of that doorbell? what's the order at the call site of the function? did I import the right module? is it objectName.data or objectName.value?"). And pointing—whether with your eyes, explicit context from associating things with your username, using a mouse, or using the history of a conversation—makes this much, much more ergonomic. And none of those options have to do with the array of characters that you hope defines a parseable program. They all have to do with directions to the interpreter or whatever about how to look up and interpret symbols.
Those three numbered points are all examples of embedding context in the environment interpreting a set of commands. I think the more useful way of looking at the bare reality of how computers get used is that we send them instructions. The tradition is to batch them up in code files and pass them somewhere, then run them by clicking or invoking them on the command line. One major frustration I have is that the input for these commands tends to be constrained to appending, concatenating, inserting, and deleting either characters or regular expressions over characters on arrays (really ropes, if you want to nerd out about probable data structures—but the user experience is indistinguishable from arrays) of text that will eventually get parsed. That sucks:
• There's clearly a level of direct inefficiency here: we edit text files, so that they can be parsed to data structures (ASTs)—and then attempt to edit those data structures by editing our text files—and we maintain a relationship between those two pieces of state by maintaining a model in our brains, and spot-checking it repeatedly against a parser, typechecker, and manual and automated tests (see the sketch just after this list). I'd argue that storyscript is actually a conversational interface over a structural editor, that uses a lot of NLP and some clever UX to make the experience very pleasant. Plus all the stuff about connections to services and everything else it's doing.
• there are demonstrably more efficient or ergonomic ways of entering many types of data (interactively or not): dropping pins on a map, using a color picker or eyedropper, drawing a box for a screencap or crop, clicking an element on the screen rather than specifying its exact position or some sort of structural selector, etc.
• In fact, some of these may be more useful for data types that have direct analogues in code and ASTs: do you think there are more tables or maps in all the excel documents in the world or all the code in the world? How many cars are there whose speed is controlled by the user typing in a percentage delineating the openness of a throttle?
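A minimal sketch of that text-to-AST-to-text loop, using Python's stdlib ast module (ast.unparse needs Python 3.9+; the program and names are just a demo):

```python
import ast

source = "total = price * quantity"
tree = ast.parse(source)  # text -> data structure (AST)

# Edit the structure directly instead of the characters:
# rename one identifier by walking the tree.
for node in ast.walk(tree):
    if isinstance(node, ast.Name) and node.id == "price":
        node.id = "unit_price"

print(ast.unparse(tree))  # AST -> text again: total = unit_price * quantity
```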
Sorry, COVID isolation + unemployment = my high pressure programming rant chamber required venting
🍻 1
j
That’s an interesting point about how most good interfaces fade into the background. I find text does not do that to the same extent. Perhaps part of what is interesting about being a programmer is the novelty of how much can be accomplished by learning to edit text files. Maybe text isn’t good; it’s just conspicuous. Even if you’re doing more, better, you don’t feel like you’re programming when you’re using a spreadsheet (or I don’t).
👌 2
Programming is just conspicuous computing.
s
@Garth Goldwater There is a lot to digest in your statements, I’m honestly not sure where to start…
g
@Steve no worries! I apologized because I realized it had evolved into more of a blog post haha
🤣 1
s
If I may continue: though text may be best in places, it does not need to be exclusively text. Why choose one? Why not choose text when appropriate and visual when appropriate? I would argue the latter here is most optimal for UX.
I agree with this. If we think of 'information' as an artifact, text is a UI. The same information can have other UIs too - other text or visual representations, or mixtures of course.
I think storing the text as the essential representation of information is where the trouble starts (which is what I think Garth is referring to). First, you have an incredibly denormalized representation where you have to normalize the links every time you load it (e.g. repetitive use of the same word/variable name). Second, you've embedded display and style information (80 columns?) into your information. Third, you often embed text delimiters to represent structure, and now parsing is a whole giant subject. This can go on.
Really I think we should be moving off this pattern to a more hybrid and layered approach: the innermost representation only stores the essential elements of the information (kinda closer to a graph with nodes having ids?). Then the outer layers map this into various forms, adding text, graphics, style etc. This is what I mean by 'universal hypermedia'. Imagine a view that shows "when the doorbell rings, ...": what does 'doorbell' refer to? What does 'ring' refer to? If I hover over the words, I should see highlights on all currently visible UI representations of the same entity, e.g. an icon in a different pane. If I click, I should be able to navigate to other entities connected to the 'doorbell' - provenance, other rules, etc. The entities are part of the information model here, which is more unified than fragmented, and we have these kinds of 'lenses' to look at the information. Yes, various expressions in the UI may use text, but I think 'always bet on text' misses the potential power of an information/hypermedia system. You should be able to switch to other views of the same information, even multiple text views in fact.
👍 5
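A tiny sketch of that 'innermost graph, text as one lens' idea, with every entity, id, and field invented for illustration:

```python
# Innermost representation: entities with ids and links; no layout,
# no delimiters, no 80-column anything baked in.
entities = {
    "e1": {"kind": "device", "label": "doorbell"},
    "e2": {"kind": "action", "label": "text me"},
}
links = [("e1", "on-ring", "e2")]

# One lens over the graph: render the rule as English-ish text.
def text_lens() -> str:
    src, _rel, dst = links[0]
    return f"when the {entities[src]['label']} rings, {entities[dst]['label']}"

print(text_lens())  # when the doorbell rings, text me
# A node-and-wire lens or a provenance browser could read the same graph,
# and hovering "doorbell" in one view could highlight e1 in every other view.
```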
i
the innermost representation only stores the essential elements of the information
I've never seen a case — including formal mathematics — where this can be taken to a perfect, full realization. You always have to draw a "good enough" line somewhere. Eg: 2+2 is not a minimal encoding of that operation. There is no one true minimal encoding. Things get even worse when you start looking at (eg) Clifford algebras. Or you could just draw a "good enough" line at the cross product, since that (comparatively) minimizes the number of CPU operations you need to burn in order to do your workaday 3d vector algebra. So if leaking some aspect or another of inessential complexity is inescapable, I argue that you (the designer of some innermost representation) should choose a healthy number of inessential elements you want to have handy and insist that all viewers understand, and not feel the pressure to scrub away as much as possible. For textual source code, we've chosen that indentation is worth storing, but highlighting is not. I think most of us can agree that these are bad choices — but that doesn't mean that the best choice is as little choice as possible. (Solidly agree with everything else you say, though!)
👍 2
s
Yeah I've struggled in rabbit holes looking for minimal schema models 😄. It's kind of like finding the minimal repr of the rectangle - is it center and half-widths, or four corners, or ..? It appears we can never really share information, we can only share views of information. So whatever the medium of the view is - it will introduce artifacts into our representations (~indentation). The thrust of my argument kinda missed what I really wanted to emphasize, which is that with computers we now have the ability to store deeply interlinked views of the same things, and we should engage more in this direction. Rather than trying to think in terms of non-computer ideas (dead text), we should think more in terms of computer-ish ideas (many views of the same intertwingled things). I'll see if I can phrase this better at some point.
💯 4
i
One of the open problems down the road of hyperlink-it-all is how you make all the views computable. I have that drawing of the bridge - how does that turn into useful information for me to work with without losing the semantics the drawing is actually providing me? There are lots of deep holes to fall into along the path: notions of identity, the dream of semantically preserving transforms, conceptual implicitness with execution stability, and so on. In the end I think you run up against the curse of dimensionality no matter what you do and at some point you have to accept there will always be representations that you can't preserve. Our best efforts remain lossy (computers aside, we still have no idea how to convey information losslessly across formats between people). But that's ok.
I'm firmly in the camp that when it comes to basically anything related to thought, the answer is always somewhere in the intersection of an effectively uncountable number of axes. To suggest that any one way is superior is to forget the simple question "in what context?" And trying to answer that question is an exercise in attempting, and failing, to capture all the external forces that apply. As Ivan said, at some point we have to stop claiming X is better and instead start saying "this is useful in some context that we potentially share." If a system captures enough of those useful cases, it eventually overcomes all the ways in which it is inadequate and we get the thing we complain about but we use anyway because it's "good enough."
I've always found looking at the extremes useful - what if everything was text? what if nothing was text? what if everything was the most domain-relevant representation for itself? There's a lot to learn from each of them and they're all awful in their own ways - even the last. But I think there's something great waiting for us in the middle ground 🙂
👍 7
❤️ 3
☝️ 2
s
My reply went super off tangent 😄 so I'll TL;DR it but leave the text below. TL;DR: it's not about total correspondence, but about creating a virtual world where 'views' are a reified notion, and so are the partial correspondences between various views.
Long tangent: Those are kind of the holes I'm stumbling around in. What we do presently is computation within various views. These views are related but the links are not reliably or uniformly maintained. E.g. we compute the intersection of two lines within a 'data structure view' and map the result back to the 'canvas view'. This is a partial mapping: some structures in the first view may map onto points on the canvas but other structures won't. Computation may not even exist in the 'canvas view world' - all views don't have to be computable. When we run a program and step through the debugger, we again map objects in one view (compiled running code) onto objects in another view (source). Most of these connections across views are severed by default. Only if the debug symbols were injected, and you have a program that can interpret those, and the right version of the source around, will you be able to follow the correspondence between objects. The 'filesystem' is yet another view that we map a lot of objects into and later try to extract by reversing the process.
My search here is not to find some kind of total-correspondence meta model that can be used to drive all other models. I don't think that is fruitful, because we want to have many new models and new views that may not fit completely within old ideas. What I think is missing is the very notion of 'views' with 'objects', where there is a partial correspondence between objects across different views. This model should be the fundamental nature of the virtual world. We've made the virtual world much like the physical world (~objects with severed links, the idea of 'just data'). So instead of the traditional idea of "let's create a program that operates on data (e.g. database, paint app)", we only ever create "views with exposed objects", and the system lets us connect these objects to others outside.
To take the example of the drawing of the bridge: say I receive this drawing; I should be able to pick up various view objects I have lying about, and apply them one at a time to the drawing. Much like picking up various lenses to look at a butterfly. Say I find a suitable view that shows a list of all corner points in the drawing. Now I can leave this view attached and on top apply other functions/programs I have lying about that can work with the 'point view'. Now I've modified the 'point view' and my change gets mapped back onto the drawing via the view. This creates a new drawing (attached to the original one for provenance, and attached also to other paraphernalia I have used) which I can then share. One idea here is that we haven't sent around a 'dead drawing' but something more like a live object. Another is that we have stable references to versions of objects, and perhaps even subobjects.
👍 1
💯 3
k
I guess this kind of discussion has been going on for a long time (for a 90-year-old contribution, see https://en.wikipedia.org/wiki/The_Treachery_of_Images), and is not going to stop. Any analysis, by whatever means, depends on the choice of a point of view, and there are always other points of view that require a different analysis. Computing doesn't change this, it merely makes it more obvious by multiplying the number of points of view enormously. And by driving wedges between similar but not quite identical points of view - which I think is the problem we are really discussing in this thread.
💯 2
🤯 1
s
Wow Konrad got right to the point ☝️
Computing doesn't change this, it merely makes it more obvious by multiplying the number of points of view enormously. And by driving wedges between similar but not quite identical points of view
Yes exactly! 'Views' are points of view, and we have a bajillion of them in computing - each showing only a slice of what is, in its own language.
d
The article makes a non-point: text is great for communication. But that doesn't mean you can't Base-64 encode an image. Format conversion isn't a choice, it's a given. And the author wasn't talking about text programming languages versus visual ones, which is what we're interested in here. As programmers, it's already true that nothing is text. You don't edit a stream of characters, you edit a bunch of trees, or more broadly DAGs, or actually a graph of inter-related semantics over time and space (functions, files, git history, DB structures, HTML, etc, etc). Presenting that graph in 2D, storing it in memory as an actual graph or serialising it for exchange and version control, or printing it in a book, or compiling it for a machine, are all incidental medium transformations which hope to maintain most of its semantics.
💯 2
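A one-minute illustration of the "format conversion is a given" point: any image bytes can ride a text channel and come back intact (the bytes below are just a stand-in, not a real image):

```python
import base64

png_bytes = b"\x89PNG\r\n\x1a\n fake image payload"  # stand-in image data

as_text = base64.b64encode(png_bytes).decode("ascii")  # image -> text
back = base64.b64decode(as_text)                       # text -> image

assert back == png_bytes  # a lossless, incidental medium transformation
print(as_text)
```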
t
As the author of a visual coding tool, I'm not optimizing for whether a program can be read 5000 years from now, nor for file size, nor the ability to convey the meaning of "human rights". I'm optimizing for the understandability of logical relationships. Also, I see nothing in the laundry list of 'advantages' at the end of the article that's exclusive to text or inapplicable to graphical editing (and in fact many/most of them are easier if you're working with structured data rather than a byte stream).
c
"Text / natural language" is the lowest common denominator and I believe will always be the glue (or the documentation) for unique systems. And rich media like a pasted image, a DWG/CAD file for a bridge plan, a music file, or a SQL command are simply other DSLs that can bridge the gap when you consider "rich text" with "embedded information" as the middle ground.
👍 2
s
@tbabb You say you're optimizing for "understandability", but when you look at a visual coding tool you are always translating the visual into textual/vocal, be it in your head or aloud when collaborating. I would love to join a call with you to see how you summarize/explain something created in LynxTool, as an experiment regarding this topic — my hypothesis is the textual summary will be easier to understand than the visual. The visual will translate into: "first we get data from this source, then we scrub it, then we save it here, if this is true… etc." — Then we can show a third party the visual and the textual summary to see which is more understandable. Source <=> Understanding is more optimal than Source <= Translate => Understanding. For clarity, I'm not saying your approach is invalid or poor — I'm more interested in the hypothesis and the relative truth here. The statement you expressed about optimizing for 5,000 years from now is cherry-picked, and you may have skipped over some obvious benefits of text that I would like to explore with you.
d
@Steve I challenge the statement: "_when you look at a visual coding tool you are always translating the visual into textual/vocal_".
• There are different styles of cognition. To use conventional programming languages, you use a symbolic thinking style that is slanted towards text/syntax/hierarchy. Visual thinking is different from this. This is the style of thinking used by graphic/visual artists, but it can also be a powerful mode of thinking for architects and engineers.
• It is not a given that a visual coding tool must force you to translate the visual into the symbolic in order to think about what you are doing. A visual coding tool should allow you to think visually.
Here are some quotes from Scott Kim, "Viewpoint: Toward a computer for Visual Thinkers".
It struck me as odd, and deeply wrong, that we were building tools for visual artists in a programming language that was utterly symbolic.
So I set about wondering what a visual programming language might look like. If computers had been invented by artists and visually oriented people, instead of by mathematicians and engineers, how might they write programs? It seemed to me an important question, but one that hardly bothered most computer scientists. I read about a few attempts to build visual programming languages, and decided there was something fundamental I needed to understand.
My journey took me deep into the foundations of computer science, where I asked fundamental questions like "what is programming" and "what is a user interaction" — questions that often get passed over in computer science (any definition of "programming" that starts with "a sequence of symbols that…" is not deep enough to encompass visual programming languages).
http://www.scottkim.com.previewc40.carrierzone.com/viewpoint/index.html
❤️ 1
s
@Doug Moen Classical programming languages need translation too; this is why we put in comments and name variables the way we do — we try to help readability, i.e. understanding, by reducing the barrier of translating logic from the digital space to our own personal understanding. Let's not take my argument as one encompassing all domains; I only argue that visual representations require translation into inner dialogue, which IMO is the source of many of their downsides. For example, a box around some logo and a line to another box around another logo — that means nothing until you "think" about it and say internally "OK, first this happens and it passes data to this other thing." Then you see a clock logo and a 5 and think "OK, now it waits for 5 minutes before doing the next step" <= All that is translation, which needs to be learned by understanding the meaning of the visual relationships of data.
😑 1
t
@Steve I wouldn't really choose "describing a fully-functioning program over the phone" as a meaningful measure of whether my system is effectively helping the programmer solve their problem, but we could be optimizing for different things— that test seems a bit constructed to favor language. When working with a program, the user is going to be faced with questions like "why doesn't this result look the way I expect?" or "what information went into this calculation?" or "what happens when I change this value?" or "what's the input or output value for this part of the calculation?" I am trying to build something that makes those questions easy to answer, so for my purposes, a test against those would be more meaningful. Probably the most effective and reliable way to answer whether any particular design succeeds is to build it, and try to use it to solve a problem. The good news is that we're both doing that, so no extra effort required! ;) You've made an interesting assumption with "you're always translating the visual into textual/vocal in your head"! There seems to be a bifurcation in the population of visual-thinkers and language-thinkers, and each side is often unaware of the other side's existence/experience. For me, it is (with few exceptions) the opposite of what you say: Words are continuously translated into mental pictures, the "native format" of thought for me. This is true for many people! That being said, I don't think Lynxtool will be useful only for visual thinkers, since the ability to easily answer some of the problem-solving questions in the second paragraph would be useful to anyone. My particular solution involves manipulable diagrams, but that doesn't imply there's no other way. And optimizing for a different measure of success will lead you to a different solution. What would you say you're optimizing for?
💯 1
😄 1
s
I'm not saying text is better, I'm not saying visual is bad; I'm simply putting emphasis on the fact that the medium chosen to express complex logic must be deliberate, as it influences the entire product above that single decision. Take for example the work happening at Neuralink and similar products. They have, more or less, a direct brain-to-computer interface. Will you be thinking in boxes and wires? No. You don't think in boxes and wires; you think in problems, in dialog, in natural language. These next-gen HCI products should not make the mistake of understanding our thoughts and turning them into diagrams, unless the domain requires it (like architecture, civil engineering, etc.) — but for mathematics and workflow logic/automation, IMO text is best.
👎 1
Visual vs textual thinking
💯 I agree, I personally may be a blend… IDK lol — I'm not convinced this matters, however, because there is still some buffer of consuming content and internalizing it. I find that most visual representations of logic require textual explanations. Books are the best example here: I've read a fair share of books and they are nearly 99% text to convey complex concepts, with sprinkled-in visuals to help where necessary — it's not the opposite. Even a book on mastering bonsai trees was mostly text with some random images just to make it pretty. Is the domain of creating software much different?
Optimizing for
We are optimizing for HCI (human-to-computer interface) — reading then writing. I want to optimize for future technologies like direct-brain interfaces to express, query and create complex logic — like Jarvis in Iron Man (don't take this comparison literally, but we are building an AI assistant).
e
I took a workshop years ago from Fernando Flores. Mr. Flores was secretary of state under Allende, and put in prison for many years. Anyway he worked with Stafford Beer (the author of "Platform for Change", one of the most important books of the 20th century), and was a renowned expert on the german philosopher Heidegger. Evidently, Heidegger believed that all thinking was linguistic in origin, and that thinking is speaking to yourself. Certainly the human race got nowhere until we invented written language.
t
Is the domain of creating software much different?
I'd say that a book on bonsai is about as similar to a computer program as a sculpture is to a steam engine. To be slightly more literal: Unlike a program, a book on bonsai is a static collection of ideas, with unchanging relationships. A program is a dynamic thing, with data flowing through it, constantly changing, with new relationships materializing and dissolving. And the program itself moves too, as you rearrange its parts and add new ones. The two kinds of things will necessarily be designed and built quite differently.
😂 1
@Edward de Jong / Beads Project See also: The Whorf-Sapir hypothesis (now regarded with some skepticism— or at least less literally true than first proposed). I think no one disputes that language is critically important to civilization/communication, but to a fraction of the population the notion that "language" and "thought" are equivalent is self-evidently untrue. 🙂
💯 3
d
When I think, I'm using three bits of my brain in parallel:
• the animated model: probably visual; it allows me to, for example, visualise a ring buffer being filled and emptied by two processes
• the narrative: "so as long as this process is only responsible for writing data, and this process for reading, I probably won't need to lock anything"
• the feeling: "wub wub wub .. uneasy .. got hurt before .. doubt .. are you sure about that?"
I'm sure I'm not the only one that thinks this way, but I can imagine some have a greater emphasis on the model, some on the narrative. And we all get that nice feeling when tests pass or the crash stops crashing.
🍰 2
d
@Steve Here's another approach to understanding visual thinking: https://www.lifehack.org/275993/7-things-only-visual-thinkers-will-understand The author self-identifies as a visual thinker, and explains that it is easier for her to process node-and-wire diagrams than text.
❤️ 1
👍 1
c
Most diagrams have a poor sense of time, whereas text/language (laid out normally) has a strong one. This makes text better for encoding things with a strong sense of time. People have also learnt an enormous vocabulary, which is a type of caching that gives (a potentially misleading) density. This covers most "processes", including programming code.
For information without a strong primary "time" - for example, the London tube map, which is a tangle of bi-directional lines - the visual representation is far superior to any description in language. Can you imagine trying to verbalise the tube map? Even when people give verbal directions in London ("go to Oxford Circus, change to the Jubilee Line..."), what you think about internally is not a mental image of trains and stations, but a mental image of the lines and colours of the map.
Similarly, if a diagram does maintain a strong sense of time, it can also be superior. For example, a multi-track video/audio editing program has a timeline diagram that is very clear. To describe it over the phone would be clumsy ("then back up on the top track there's a load of little snippets, all with a small gap, at the same time on the bottom track there's three longer...").
s

https://youtu.be/XWYaoAoijdQ

d
Any chance of a TL;DW Stefan? I don't know how anyone has the time to watch videos longer than 20m (and I watch those on double speed!)
d
Lakoff wrote "Women, Fire and Dangerous Things", on embodied cognition and language. Here is some text: https://pdfs.semanticscholar.org/9c85/d2dd7e6d924a1078fb93cac9baaa8a850d3e.pdf
s
Text is still visual, unless read aloud (another benefit of text is that it can be read aloud; diagrams cannot). When I read text (both books and code), I form my own emotions and graphs mentally. The Harry Potter series was famous for sparking powerful visuals while reading. Harry Potter had no images, no diagrams, yet the readers created a world around them. When I read an OSS library that I did not write, I navigate through the code like a story, building my own visual understanding of the domain and usage. Harry Potter turned into a movie, which may have cut the imagination of the original plot in half. If you diagram a code base, from my experience, you lose more than half of the understanding/context, which then needs to be put behind menus, tooltips and other UI complexity.
@tbabb you say the bonsai book was far from programming… yet it has conditions, variables, error catching and other logical concepts very well articulated in text. It's an analogy, and I think a pretty good one. You'll love my Harry Potter one above 😉
On the topic of visual thinkers — it's naive to think that anyone can create the same visual constructs that every visual person understands. The way I visualize complex logic is not the same as how you visualize it. Just as we have a hundred programming languages focusing on different complexities, visuals are the same. There are great depths and domains to tackle — there is no universally understood graph/visual. I'm deeply concerned for those building visual tools for that reason alone: the way they see the visual space is not ubiquitous, and I would argue more unique than text. My argument is based on research and discussions with individuals in this FoC as well as leaders in the industry. I won't name them, but there are several companies that understand their visual representations of logic must change, as their product is just as confusing as programming is today… it's just shinier.
s
@Duncan Cragg Well, that ~90 min video (just ~60 mins of it are his lecture, although I wouldn't miss the Q&A) is still better than reading the 632-page Women, Fire, and Dangerous Things. @Doug Moen, this lecture touches on some of the more recent findings about x-nets, frames, and neural modeling of metaphor, which came after the book.
The gist is that we have quite interesting models and theories of how the brain does its thing that are well researched (it's not just Lakoff, by far; he provides tons of references) and which show that any discussion about either text or visual representation is just arguing about whether you fall off the edge of the flat earth or not. Science has moved on, and while we still don't know everything, we know a lot more about how brains think. If you are genuinely interested in learning more about how the brain works, I can only encourage you to spend some time with Lakoff, Johnson, Bergen, Nunez, etc. I linked to this particular video because it has both a great example of spatial metaphors that should resonate with those who like to be more on the visual side, as well as a detailed walk-through of how the brain interprets letters, which should offer an interesting perspective on @Steve's arguments about text. Ultimately, both language and visuals bring a lot to the table and you can easily find situations in which one works better than the other, as we frequently do here. I spent way too much time in this rabbit hole, but I definitely got a lot out of it that changes the way I think about modeling systems and designing tools for thought and/or programming.
To offer a specific example: one of many very basic spatial metaphors is more-is-up, which structures a lot of language as well as visuals: if you talk about a stock price going up or look at a graph depicting that, it only makes sense because that metaphor is deeply embedded in your reasoning and has been learned very early in your childhood from basic activities like recognizing that if you fill a glass with more water the level rises, or if you pile up legos the pile grows upwards, etc. (https://metaphor.icsi.berkeley.edu/pub/en/index.php/Metaphor:MORE_IS_UP). These metaphors structure our understanding and are several levels more fundamental than the much higher-level text/visual debate. Both language and visual representations make use of these building blocks to enable sensemaking and reasoning. I find that extremely interesting and so far I couldn't find much work that transfers these already several decades-old insights from cognitive science to our field — which is why I'm bullish on there being some interesting ideas ripe for experimentation.
💡 1
👍 3
y
> On the topic of visual thinkers — It’s naive to think that anyone can create the same visual constructs that every visual person understands. The way I visualize complex logic is not the same as how You visualize it. Sure, we all have different internal representations, and that’s cool too. But this doesn’t stop us from sharing representations. In fact, we are really good at it:
👍 2
(Maybe we can start a mirror thread on this topic where we can only debate with pictures and gestures)
s
That's an excellent point that aligns with the research on metaphorical structuring. Many of these structures are learned and therefore differ between cultures; however, there are also many lower-level structures that rely on embodiment and are universal across all languages and cultures.
i
(resisting the urge to all-caps) — Like the fact that visual representations can be animated literally, whereas text can only be animated _figuratively_: https://www.lunchboxsessions.com/materials/counterbalance-valves/counterbalance-vs-pilot-operated-check-valve-simulation
👍 1
(I'm reminded too of a tweet where Dan Piponi, if memory serves, asked Wolfram Language to rotate a matrix... and it produced a picture of the matrix table at an askew angle.)
🤣 2