# share-your-work
i
Future of Coding • Episode 69 Mary Shaw • Myths & Mythconceptions 𒂶 https://futureofcoding.org/episodes/069 In the spirit of clearly communicating what you're signing up for, this podcast episode is nearly three hours long, and among other things it contains a discussion of a paper by author Mary Shaw titled Myths & Mythconceptions which takes as an organizing principle a collection of myths that are widely believed by programmers, largely unacknowledged, which shape our views on the nature of programming as an activity and the needs of programmers as people and the sort of work that we do as a sort of work, and where by acknowledging these myths the three of us (Mary Shaw primarily, and by extension Jimmy and I, those three people, that's it, no other people appear on this podcast) are able to more vividly grip the image of programming with our mind's eye (or somesuch) and conceive of a different formulation for programming, and in addition to these myths this paper also includes a number of excellent lists that I take great pleasure in reading, beyond which I should also note that the paper does a good job of explaining itself and that hopefully you'll find I've done a similar job, that's the spirit, please enjoy.
As usual, most of the action is in the show notes • MUMPS (the medical thing, not to be confused with mumps the medical thing) is used by Epic (the software company, not to be confused with Epic the software company).
e
MUMPS is the only old school programming system I’ve ever learned that’s actually come in useful at work!
j
A mythconception that chaps my bottom: I think programming, software engineering, and computer science are three different things, but people insist on using the terms interchangeably. 😞
i
Yeah. I see the distinction, and I'm guilty of sloppily using the terms. Good reminder.
e
I see distinctions but am not sure I’m landing at a similar set of distinctions as other folks — do you all feel that there is a sort of “happy path” or “right meaning” to each term?
j
Just finished the paper, looking forward to this monster of an episode. Thinking about the diff between vernacular and professional, and how one might move between those roles and the contexts in which that would happen.
l
@Jack Rusher together we are unstoppable. the coding trinity
j
@Eli Mellen there’s definitely a specific way that I think about them, but it doesn’t conform to any standard understanding, so… 🤷🏻‍♂️
j
programming, software engineering, and computer science are three different things
Yeah that is a good myth 😉
p
Well, I was listening to the episode and then I went and watched the Cell Pond talk and during the questions I learned about the game Baba is you which I then spent a lot of time figuring out how to give to some younger people I know. Eventually I will pop the stack and finish the episode, but it's been fun so far!
j
In my field there is a language called Catala, written in a functional language, designed so that one step of the transpilation process from Catala to other languages can be proved correct. One step. Why? As far as I can tell, academic novelty. It doesn't solve any real world problem in the domain it is supposed to serve, and the downsides of encoding laws this way are considerable. That is an example of mathematical tractability for its own sake. The language was designed by a person getting a PhD in formal methods. There are levels of formality, but that level just isn't helping. Do they believe the myths? No, probably not. But they might benefit from reading the paper anyway.
j
Perhaps worth noting that there are a few small subdomains where formal methods are both important and helpful. Two that come to mind are cryptography and synchronization primitives.
On an adjacent topic, this talk by John Regehr about the economic aspects of correctness in compilers is a nice twist and generally pretty impressive.

https://youtu.be/tMYYrR-hazI?si=Lt6who6-xv1ZVPBe

I haven’t finished the podcast yet (3 hrs = multiple sessions!), but while listening I was wondering about bringing fuzzing to vernacular programming. At its heart it is a simple technique—list some things that ought to be true and then let the computer burn cycles double-checking your work—and one that systematically saves time and catches bugs for professional programmers (or at least for those who bother to use it). And fuzzing is almost an anti-formal-method: just throw lots of stuff at a program and see what breaks, practical engineering trade-offs everywhere you look. :)
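The "list some things that ought to be true and let the computer burn cycles double-checking your work" idea fits in a few lines of Python, no fuzzing framework needed. A minimal sketch, with `sort_numbers` and `fuzz` as invented stand-ins for whatever code and properties you care about:

```python
import random

def sort_numbers(xs):
    """The code under test -- a stand-in for anything you want to check."""
    return sorted(xs)

def fuzz(trials=1000):
    """Throw random inputs at sort_numbers and double-check simple properties."""
    for _ in range(trials):
        xs = [random.randint(-100, 100) for _ in range(random.randint(0, 20))]
        out = sort_numbers(xs)
        # Property 1: the output is ordered.
        assert all(a <= b for a, b in zip(out, out[1:]))
        # Property 2: the output is a permutation of the input.
        assert sorted(xs) == sorted(out)

fuzz()
print("all trials passed")
```

Real fuzzers add smarter input generation and shrinking of failing cases, but the core loop really is this small, which is part of why it feels plausible for vernacular programmers too.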
a
The distinctions between them are rarely discussed, but they're deeply orthogonal
Quasi-related and illustrative: I find being able to type really fast is helpful in my career. It's almost never load-bearing but sometimes I know it helps a lot that I can get stuff out "on paper" very quickly
p
Alan Kay once said science is what's real, math is what's true, and engineering is what's possible. I'm not convinced that computer science is science. Here's one amusing discussion about this. https://wiki.c2.com/?IsComputerScience I also love the first minute or two of this SICP lecture where Hal Abelson talks about the term computer science. (This was before he sold his soul in the Aaron Swartz investigation, so I like to believe that the person giving this lecture was a good man at the time.) https://m.youtube.com/watch?v=-J_xL4IGhJA&list=PLE18841CABEA24090&index=1&pp=iAQB I'm also not convinced that software engineering is engineering. Here's another good read from C2. https://wiki.c2.com/?SoftwareEngineering Personally, I just use whatever words the people around me are using. They probably aren't working from a rigorous definition anyways.
a
If you haven't watched/read it yet Hillel Wayne has been doing great work exploring the question of real engineering vs. bullshit software engineering (my preferred term for what I do 😉) ... in part by interviewing quite a few real engineers who also write software: • blog post 1/3 (2021): https://www.hillelwayne.com/post/are-we-really-engineers/ • talk (2023):

https://www.youtube.com/watch?v=CmIGPGPdxTI&ab_channel=GOTOConferences

◦ slides: https://onedrive.live.com/view.aspx?resid=7E5188E64FF71F3A%21457159&authkey=!ABog_uhJFm6grFk
m
A wonderful paper by Mary Shaw, one of my great inspirations. Looking forward to your insights. One of the really puzzling things about her earlier software architecture work was the insistence that although connectors deserve first class status, programming languages and architecture description languages were considered completely different. Though they are clearly analogous and the architecture side a generalization of the programming side. And then the whole CMU gang went off the SEI end… To me, this paper and the talk were Mary coming back to true form, a tour de force with many crucial insights. My favorite one of course was the one about the myth that programming is writing procedures. The Gentle Tyranny of Call/Return one might call it 🙂.
p
My favorite one of course was the one about the myth that programming is writing
procedures. The Gentle Tyranny of Call/Return one might call it
Oh it's not gentle! Tail calls enable entire architectures that are impossible when everything needs to be hierarchical.
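One way to make that concrete: a state machine written in the "each state tail-calls the next" style, where control passes sideways forever instead of nesting. A toy sketch in Python (which has no tail-call elimination, so a trampoline stands in for it; all names here are invented for illustration):

```python
def trampoline(thunk):
    """Run a chain of tail calls without growing the stack:
    each step returns the next thunk instead of calling deeper."""
    while callable(thunk):
        thunk = thunk()
    return thunk

# Two mutually recursive "states". With real tail calls these would
# just call each other; here each returns a thunk for the next step.
def even(n):
    return True if n == 0 else (lambda: odd(n - 1))

def odd(n):
    return False if n == 0 else (lambda: even(n - 1))

# A million hops between states in constant stack space.
print(trampoline(lambda: even(1_000_000)))  # True
```

With guaranteed tail calls, the trampoline disappears and the two functions are the architecture; without them, anything shaped like this must be contorted back into a loop or a hierarchy of returns.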
j
On the idea of measuring "good enough", I think we need to look to design for advice. We need to empathize with the users, and design toward improving their subjective experience. A user told me once that my tool made him feel like he had super powers. The tool was (and remains) buggy as hell (a block-based visual DSL), but it was good enough, because that feeling motivates use. Use generates feedback, and feedback will help you find and fix incorrectness. So the software doesn't need to even be good enough right away. The process just needs to be self-sustaining and tend toward good enough. So I would argue new ways of measuring good enough may still be too correct a goal. The real thing is fuzzier. Have a user you care about. Fuzzy. Try to make them feel something good. Fuzzy. Learn with them. Fuzzy. Repeat. Correct is a myth, but do the fuzzy thing, and good enough will eventually emerge.
p
Well, I'm 2 hours into the podcast and I've read a smattering of pages from different parts of the paper. My sense is that the author is actually criticizing modern academic programming language researchers, but falsely generalizing her criticisms to all programming language researchers and implementers in all parts of history. This paper seems to have been written in a world full of languages like Pascal, ML, and Haskell where all professional programmers work top down from a specification through a process of stepwise refinement as advocated by folks like Wirth and Dijkstra.

It seems to completely ignore the massive amounts of casual, exploratory, and bottom up program design that have been a major force within the Lisp, Logo, Smalltalk, Forth, SNOBOL, Perl, Python, Ruby, Unix, Emacs, and awk communities for decades. (And I almost certainly missed at least a few others.) It also ignores older academic and industry work focused on writing programs that are easy to modify in the face of changing specifications (Parnas, SICP, Dijkstra, Brodie, etc.).

It also ignores the long history of Unix programmers saying "the environment IS the IDE." During the editor wars, it was common for the Emacs fans to point out that VI was very limited by comparison unless it was calling out to external programs. The VI fans would counter that calling out to external programs was the proper Unix way to do things rather than reinventing the world within your editor. Unix programmers have been hyper aware of their development and execution ecosystem, and not just their programming language, for the better part of 50 years.

It also ignores escape hatches within vernacular programming systems. Does someone stop being a vernacular programmer when they decide that Excel formulas are no longer enough and they need to use some VBA?
I think there may be some validity to this paper as a criticism of mathematical navel-gazing within the ivory tower, but I'm not sure it says much about the industry as a whole or even programming language designers in general.
m
[The Gentle Tyranny of Call/Return] …is not gentle! Tail calls enable …
Maybe I should have titled the paper “The Subtle Tyranny of Call/Return” instead? As in you are trapped by it even when you think you’ve escaped…
falsely generalizing her criticisms to all programming language researchers and implementers in all parts of history.
I don’t think that is warranted. She is always clear that what she describes and criticizes is “typical”, not “all”. For example: “Traditional general-purpose programming languages, or at any rate the ones of most interest to programming language researchers…“. Next: “Mainstream languages are usually designed…“. And so on and so forth.
p
I don't think that is warranted. She is always clear that what she describes and criticizes is "typical", not "all". For example: "Traditional general-purpose programming languages, or at any rate the ones of most interest to programming language researchers...". Next: "Mainstream languages are usually designed...". And so on and so forth.
Perhaps I should have said "most," but I think my point still holds. The most popular programming languages among programmers are not of much interest to today's academic programming language researchers, and are not particularly mathematically tractable.
Well, I finished the episode and when Jimmy started asking "who actually believes these myths?" I was right there with him. I'm not even sure that most academic language researchers believe them. I think they might just act like they believe them because their attention is naturally drawn to things that they can write mathematical looking papers about because mathematical looking papers appear to be more "serious" research.
j
An anecdote on that front: I'm not on the Go team, but I contributed very heavily to the toolchain and have a fair amount of visibility. When Go was adding generics, it went through multiple years of deliberation over clarity, usability, usefulness, and compile time. Everything was focused on the developer experience, with lots of concern given to approachability for less experienced programmers. Only once the team was reasonably happy on that front did they go find some type theorists to double-check that what they'd created wasn't going to blow up on them later. It felt (from the outside) more like a box-ticking exercise.
k
I did a lot of cooking yesterday, which gave me the time to finish this long episode 🙂 I had read the paper a while ago, and liked it overall. I saw the criticism directed specifically at academic CS, and PL research in particular. It is, after all, a paper written by an academic and published in an academic journal. So it was interesting to hear how it is interpreted by... may I call you "software practitioners", to avoid the labels under discussion elsewhere? One aspect that takes a lot of room in the paper and in the episode is the PL research focus on correctness and mathematical rigor. Neither the paper nor the episode points to the historical roots of this focus. When CS emerged as an academic discipline in the 1960s and 1970s, computers were rare, expensive, and available to few specialists. The dominant applications at the time were the "COBOL branch" (banks, insurances, ...) and the "Fortran branch" (science and engineering). In both branches, correctness is very important and mostly well-defined. The interface between computation and the messy real world is handled by humans. That changed in the following decades, as computers became commodities accessible to everyone, and were increasingly used to process sensory data and control physical devices, in direct contact with the messy real world.
e
Something I find myself thinking about listening to this episode and reading this paper is what different readings of “power” reflect about a programming system, and the culture it supports. The paper mostly posits a relationship where a mathematically tractable general purpose language is “more powerful” than a DSL or similar. What does that suggest about where the root of power is? Why?
I think it may relate to something Jimmy mentioned that is totally missing from the paper — any mention of the lived reality of working within capitalism. A general purpose language is, generally, these days, usually open source and outside the direct control of a specific company (laughs in Microsoft owning the entire stack, GitHub, npm, VS Code, and TypeScript) whereas a DSL or other sort of “less” powerful tool is typically bound up with a service subscription or licensing fee. I think Racket maybe makes a move towards an alternative — but trades the control of capital for academia, mostly.
l
I thought I'd share this message from Dave Ackley (from the T2 Tile project discord)
k
A general purpose language is, generally, these days, usually open source and outside the direct control of a specific company (laughs in Microsoft owning the entire stack, GitHub, npm, VS Code, and TypeScript) whereas a DSL or other sort of “less” powerful tool is typically bound up with a service subscription or licensing fee.
Real programmers have agency over their tools. Second-class (vernacular) programmers have to live by the rules defined by (capitalism | real programmers). Interestingly, before Open Source, general purpose languages were already outside the direct control of a specific company, through standards bodies.
e
I’d push back against the idea that programmers necessarily have agency over their tools. They have more open tools, but that openness may not equate to agency. For instance I now work adjacent to the medical industry; while I use larger general purpose languages I have no real say in the configuration of those tools. Likewise if you develop for a platform like iOS you use what the platform demands with very little room for wiggle. I’d posit that the differences between general and vernacular are waaaaay more culturally constructed than they are rooted in the tools themselves.
k
Oops, I wasn't clear enough. I didn't mean to say that individual real programmers have agency over their tools, only as a collective. In other words, software tools for real programmers are written by other real programmers. In contrast, spreadsheet users don't write spreadsheet engines. At best, they get a questionnaire for providing feedback.
d
One aspect that takes a lot of room in the paper and in the episode is the PL research focus on correctness and mathematical rigor.
The historical context is important. Partially because it directly impacts the vernacular programmer who comes to practice and starts wondering what’s the difference between a function, a method, and a procedure. The terms of art are directly situated in these (often long) historical contexts. I watched Shaw’s presentation at HOPL and I was surprised that she didn’t mention terminology as a barrier to tooling. Thus the podcast also didn’t discuss it. In my experience teaching, I can get pretty far with a REPL but the terminology creates immediate obstacles. The promise of associativity is a form of mathematical rigor that we all enjoy. But I’ve found folks grasping `map` well before they really understand why associativity is actually important. At least mathematical rigor gives me (the teacher) a precise definition. But take the term function into the programming world and sometimes it means that I must always return one value, sometimes it means that I can return multiple values, or sometimes it means I might return no values while modifying some other value in some other part of the system.
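Those three meanings of "function" are easy to put side by side. A small Python sketch (all names invented for illustration):

```python
# Three things that all get called "functions" in practice.

def area(r):
    """Mathematical sense: one value in, one value out, no side effects."""
    return 3.14159 * r * r

def div_mod(a, b):
    """'Function' that returns multiple values (really one tuple in Python)."""
    return a // b, a % b

log = []

def record(msg):
    """'Function' that returns nothing and mutates state elsewhere --
    what older terminology would call a procedure."""
    log.append(msg)
```

A learner who has only met `area` has no vocabulary for why `record` behaves so differently, which is exactly the terminology barrier described above.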
b
I didn't mean to say that individual real programmers have agency over their tools, only as a collective. In other words, software tools for real programmers are written by other real programmers. In contrast, spreadsheet users don't write spreadsheet engines. At best, they get a questionnaire for providing feedback.
This gives me interesting perspective on Spreadsheets in Boxer paper 🤔 I first read it as deflecting the implicit criticism "spreadsheets' data-flow model is a proven road to give users computational agency; too bad Boxer doesn't include that" by showing "look, it's easy to build one inside Boxer, just write a re-compute loop doing `change cell.value tell cell formula` (p. 7) & if you really want attach a `modified-trigger`" (p. 18). And I felt they completely missed the point. There is a deep difference between having a language-level reactive model where you can just assume derived data will auto-update, vs. an imperative model where you have to think about having to recompute, e.g. install `modified` callbacks... (and indeed the simplistic "spreadsheet" they built in the paper has limitations, p. 18; a single top-down left-to-right pass won't deal with other dependency orders...) I still feel an ideal computational medium ought to support reactive behavior built-in. But I guess I should re-read it because I see now they did not argue "being able to build it means we don't need it built-in". They merely used spreadsheets as a complex example to show off other strengths:
Writing a spreadsheet is not a toy problem, and yet in Boxer it becomes a reasonable exercise.
and more relevantly here, their goal in building your own spreadsheet is not so much to use one but to _understand what makes it tick_:
Boxer (1) supports a close connection between a conceptual model of a spreadsheets and a working implementation,
(2) provides a concrete, visible, working model that can be inspected, explored, and modified and
(3) eliminates distracting display and editing issues and concentrates cognitive load on the fundamental CS issues
Hmm, I see now this is continuation of the SICP programme 😍 — that some core ideas in CS about structure & computation are worth learning, and that a great way to learn them is to understand and modify a substrate of computation. It's hardly realistic to teach school kids the later parts of SICP: how a LISP interpreter works and how to, say, tweak it to support non-determinism with backtracking. It is more realistic to show them how to build a (simplistic) spreadsheet, still teaching some great ideas about computation! 👏 With this goal, weaknesses like not cascading dependencies, not dealing with cycles etc. are not bugs — they are opportunity to do some "computational thinking"!
Third, they show off how being able to modify the implementation of a spreadsheet gives you agency e.g. make a spreadsheet operating on vectors, or graphs, or music... These are arguably more of a weakness of typical spreadsheets (support rich data types in cells). And yet, they do touch on @Konrad Hinsen’s point? I'd say no, it's not pragmatic for even "real" programmers to modify a spreadsheet engine or build a custom one. The inter-operability concerns make that rarely economic. And yet, the experience tweaking one gives one courage and awareness that your tools are malleable. 💡IMHO a non-negligible part of the agency we gain as "real" programmers over "end-users" is that courage, and it's not easily bought. I can't just give end-user programmers a motivational talk; it takes a lot of well-built, scaffolded, experiences like this one? [I'd even argue that a central benefit of Free/open source was not directly "anyone can modify their OS" 😆 but indirectly, opening as many on-ramps to grow as a developer as possible. We're lacking on the thoughtful educational front though... Mark Guzdial is worth reading on the latter.]
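In that "reasonable exercise" spirit, the whole recompute machinery really does fit in a few dozen lines. A toy sketch in Python (class names invented; this is not the Boxer code from the paper) that sweeps to a fixed point, so it handles any acyclic dependency order rather than only a single top-down left-to-right pass:

```python
class Sheet:
    """A minimal spreadsheet engine: named cells, formulas over other cells."""

    def __init__(self):
        self.values = {}    # cell name -> current value
        self.formulas = {}  # cell name -> (function, dependency names)

    def set(self, name, value):
        self.values[name] = value
        self.recompute()

    def define(self, name, deps, fn):
        self.formulas[name] = (fn, deps)
        self.recompute()

    def recompute(self):
        # Naive fixed-point sweep: keep passing over all formulas until
        # nothing changes. Unlike one left-to-right pass, this settles
        # regardless of definition order; a cycle would loop forever.
        changed = True
        while changed:
            changed = False
            for name, (fn, deps) in self.formulas.items():
                if all(d in self.values for d in deps):
                    new = fn(*(self.values[d] for d in deps))
                    if self.values.get(name) != new:
                        self.values[name] = new
                        changed = True

s = Sheet()
s.define("c", ["a", "b"], lambda a, b: a + b)
s.set("a", 1)
s.set("b", 2)
print(s.values["c"])  # 3
s.set("a", 10)        # the derived cell updates "reactively"
print(s.values["c"])  # 12
```

The "reactive" feel comes entirely from hiding the recompute call inside `set` — which is exactly the difference under discussion between a language-level reactive model and an imperative recompute loop you must remember to run.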
I just saw

https://www.youtube.com/watch?v=5rg7xvTJ8SU

yesterday and it's a beautiful use of spreadsheets for "explorable explanation" of how a CPU works. But it also did something I never thought to do — "misusing" a spreadsheet to model circuits with cycles, and flipflops, directly 👏 To do that, he had to go into Excel's settings and customize how it deals with cyclic dependencies — recompute exactly once on every F9 press. That is, even as "end-user of a spreadsheet" [though a real programmer for certain, by the classic silly definition of knowing how CPU assembly works], he needs to sometimes think about the computational model underlying the spreadsheet, give up the "magic reactive" model and use an escape hatch to turn off the autopilot 🕹️ and take manual control, pressing F9 as many times as he needs.
☝️ This is a delightful reversal, as arguably he could have been "professional chip designer" and "end-user of spreadsheets" at the same time; the fact chips underlie all computing is accidental. That's not the case; he knows a lot about software [https://github.com/InkboxSoftware] but I'm certain experienced vernacular Excel developers with no formal software training do learn at some point how Excel handles cycles and how to tweak that. I, as industry programmer with CS degree, didn't think of that because I'm not a sufficiently experienced Excel developer 😜 (I would have done some miserable purely-functional "tableau" of past states, like they teach in Turing machines course. Stretching the pure data-flow model to be so awkward it's not worth building the thing.)
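That escape hatch is easy to model: once cycles are allowed, each "F9" recomputes every formula exactly once from the previous pass's values. A hypothetical sketch in Python (not Excel's actual algorithm; `f9` and the cell names are invented), where a single self-referencing cell becomes a one-bit flip-flop:

```python
def f9(cells):
    """One recompute pass, like a single F9 press with iterative
    calculation set to 1: every formula sees the *previous* values."""
    old = dict(cells)
    cells["q"] = 0 if old["q"] else 1  # cell formula: q = NOT q
    return cells

state = {"q": 0}
for _ in range(3):
    f9(state)
    print(state["q"])  # toggles 1, 0, 1 on successive presses
```

Giving up the "recompute until settled" magic is precisely what makes time appear: each press is one clock tick, which is why cyclic circuits suddenly become expressible.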
k
Thanks @Beni Cherniavsky-Paskin for a pointer to this paper! I agree that the main interest of such a spreadsheet implementation is the pedagogical value and the experience of building a tool for one's own needs. As an illustration of the pedagogical value:
I still feel an ideal computational medium ought to support reactive behavior built-in.
That question has been debated in the computational notebook community for years. Mostly among developers, much less among users, who often don't quite see the trade-off. Reactive behavior provides better feedback, but it's better only as long as update times remain reasonable. The question thus entangles UX, computational complexity, and hardware performance. It would help a lot if power-users-but-not-developers had a good understanding of these issues, if only for better communicating their preferences to the real programmers that write their daily-use tools.