# thinking-together
w
A big issue in programming is that a program is an extremely lossy record of history. Even with good comments, it’s hard to look at a program and know:
• What are the alternatives that were considered, but ignored? Tried, but later discarded?
• What order was this program written in? Is a particular line or function the focal point of this code?
• What is the set of resources the author used to write this code?
• How weathered is this code? How many bugs have happened in this code? How scared should I be to change it?
What are some ways in which programming environments could help us record this info / answer these questions without requiring additional effort from the author?
💡 3
k
👍 2
c
Unit tests help with some of this. They encode the constraints of the program and the expected behavior - assuming they are well written! In particular the last point - how scared should I be to change it? - can be answered by how robust the unit tests are. That is their great value; refactoring becomes much easier.
💯 1
o
Maybe one way to address some of your points is to store all the code editing history. One missing piece would be tracking the resources and the alternatives that were considered but not tried. I guess environments that use CRDTs could get this for free: a CRDT used as a log of editing actions can show what happened at what "time". And adding other non-code artifacts (resources and records of not-tried alternatives) to the environment might also help cover all your points.
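A minimal sketch of the editing-actions-log idea (class and method names are hypothetical; a real CRDT would add stable character ids so concurrent edits merge, which this single-user version omits):

```python
import time

class EditLog:
    """Append-only log of (timestamp, position, chars_deleted, text_inserted).
    Replaying a prefix of the log reconstructs the buffer as it existed
    at any point in time."""

    def __init__(self):
        self.edits = []

    def edit(self, pos, delete, insert, ts=None):
        # Record one editing action; timestamps default to wall-clock time.
        self.edits.append((ts if ts is not None else time.time(),
                           pos, delete, insert))

    def at(self, ts):
        # Buffer contents as of timestamp ts, by replaying the log prefix.
        buf = ""
        for t, pos, delete, insert in self.edits:
            if t > ts:
                break
            buf = buf[:pos] + insert + buf[pos + delete:]
        return buf
```

`log.at(t)` answers "what did this file look like at time t", and the log itself is the raw material for showing what happened when.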
j
I end up using git annotate as a proxy for some of this. Its limitations are being line-based and showing only the most recent thing to have happened to each line. I’m picturing something that turns the section of code I’m looking at into a series of slices that let me step through commits while seeing what has been stable and what has been turbulent.
h
i think one big area that's possible right now, without crazy advances in PL design, is revealing production behaviour in the editor during development. it doesn't help that much with designing business logic and answering the questions you asked, but for managing all the accidental complexity of actually running code in production, there's a lot of data generated by logs, tracing systems, etc. these days that i wish editors surfaced much more easily
💯 3
think of hovering a name in your editor and getting a popover that shows the values that name held across various production traces, or the values it tends to hold for successful requests vs. failed requests, etc.
❤️ 3
pretty sure some advanced Java development setups allow for this kind of thing with remote debuggers running in prod, but it doesn't seem to have taken off
it really does sound nice to work with a medium that reveals all the variations it could be shaped into next, or all the variations it used to be in the shape of, instead of just the current shape. but i also think maybe our feeble brains can't deal with yet another dimension of abstraction when working with code. the concreteness of the code in front of you - the lossiness - might actually be valuable in service of making programming possible in the first place
k
One thing I've wished for is the ability to ask, "do we have any tests where this variable has value _?" Often when I have a bugfix to add I'll first comment out existing business logic, just to figure out the best place to put the new test.
💯 2
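Not a real product, but one way an environment could answer "which tests ever see this variable hold value X?" today, sketched with CPython's sys.settrace (function and variable names are made up for illustration): run each test under a tracer that records the values a given variable takes.

```python
import sys

def record_values(var_name, fn, *args):
    """Run fn(*args) while recording every value `var_name` takes in any
    traced frame. A test runner could do this per test to answer
    "do we have any tests where this variable has value X?"."""
    seen = []

    def tracer(frame, event, arg):
        # On each line event, sample the variable if it exists in that frame.
        if event == "line" and var_name in frame.f_locals:
            v = frame.f_locals[var_name]
            if not seen or seen[-1] != v:   # drop consecutive duplicates
                seen.append(v)
        return tracer                        # keep tracing nested frames

    sys.settrace(tracer)
    try:
        fn(*args)
    finally:
        sys.settrace(None)                   # always uninstall the tracer
    return seen
```

Tracing every line is slow, so a real tool would likely do this offline over a test suite and index the results for querying.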
j
This is an area I have thought a lot about. My own personal programming project is aimed at helping with this. But I've also been whiteboarding through ideas on how to help with this in the current world. Not sure I could summarize well the ways I want to actually help this problem here, but I do want to at least contribute a few meta thoughts. First, I think one really important thing to consider is whether we can in fact recover the important parts of a program's history at all. Peter Naur, in his paper "Programming as Theory Building", claims that re-establishing the theory of a dead program is strictly impossible. I'm sure some people might question this, but I think it is actually very important to consider. If we aim at trying to save a history, we have to consider our aims. If our goal is to recover the theory behind the program, that could be a fool's errand. Instead maybe we should be thinking about: what parts of a program's history are important for helping us establish a new theory? Does that history have to be an actual history, or could it be something like what is proposed in the talk "Idealized Commit Logs" (

https://www.youtube.com/watch?v=dSqLt8BgbRQ

)? Second, I think we should really consider learning more from historians, particularly historians of artifacts (as in art history). There is a great book on art history methodology called Patterns of Intention by Michael Baxandall. One of the points he makes is that when we are describing art, we are converting something visual into something linguistic. This is a lossy and biased format; one easy example he gives is that paintings don't actually have a beginning and an end point. We have to choose how to describe a painting and where to start. When we are studying artworks, we are really studying them under some description of them. (He makes this much clearer for those of us who are not artistically inclined by starting with a history of the Forth Bridge. His history of this artifact acts as a really interesting example of what goes into a historical explanation.) I think these same things hold true for programs. In capturing a history, we can't just be recording facts in a database; we are interpreting those facts, selecting the relevant ones, and determining an order. What sorts of facts, in what sorts of orders, can help people gain a new theory? Will different interpretations click better with different people? What things do we exclude by selecting the facts in this way? Is the loss worth the payoff?
What are some ways in which programming environments could help us record this info / answer these questions without requiring additional effort from the author?
I think this is particularly interesting. We all know that getting people to provide documentation is hard. It is even harder to make that documentation good. Short of making it completely automatic, though, what if we could make it easier? What if we could make it more part of your current flow? What if you didn't have to leave your editor, but you also didn't have to mess up the code by leaving some big huge long comment in the middle of things? What if you could decide that something was a good example for later and save it off with a single button press? What if you could make tours through your codebases, or document in those codebases the various ways the data flows? What if you could make an interactive introduction to a codebase? I think these are the sorts of starting points we should consider. Once we can make these things work, and work well, manually, then we can start plugging in automations and making curation automatic.
❤️ 3
l
in a big codebase there might be some utility in using ML. For example, applying frequent-itemset mining to change sets could identify chunks of code that are changed together more frequently than expected. That might be suggestive of risk, or of refactoring opportunities.
❤️ 2
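A toy version of that idea, assuming commit data has already been extracted into lists of files changed together (plain pair-counting rather than a full frequent-itemset miner like Apriori; all names hypothetical):

```python
from itertools import combinations
from collections import Counter

def co_change_scores(commits, min_support=2):
    """commits: each commit as a list of files changed together.
    Score file pairs by lift: observed co-change count vs. what the
    files' independent change rates would predict. Lift well above 1
    suggests hidden coupling (or a refactoring opportunity)."""
    n = len(commits)
    file_counts = Counter()
    pair_counts = Counter()
    for files in commits:
        files = set(files)
        file_counts.update(files)
        pair_counts.update(combinations(sorted(files), 2))
    scores = {}
    for (a, b), c in pair_counts.items():
        if c < min_support:        # ignore pairs seen too rarely
            continue
        expected = file_counts[a] * file_counts[b] / n
        scores[(a, b)] = c / expected
    return scores
```

In practice the commit lists would come from something like parsing `git log --name-only`, and the scores would feed a "these files usually change together" warning in the editor.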
w
@Jimmy Miller the reference to idealized commit logs is really interesting. I’ve also been exploring a tool that uses dynamic program slicing for program comprehension (https://github.com/willcrichton/inliner), although it’s a little more general in that it uses several source-to-source compiler techniques for simplifying programs, not just dead code elimination. The idea of using cumulative tests to construct diffs between slices is pretty cool; I hadn’t thought of that before.
j
Looks awesome @Will ! That is the kind of thing I'm aiming to support in my language. But ideally it should take like 10 lines of code and be something you can just write on the fly. :)
w
@Jimmy Miller as in a goal of your language is to support high-level program transformations? Do you have an example?
k
I've been thinking lately about how our minds have infinite levels of conceptual hierarchy, but our tools are inevitably limited to some finite number. That discrepancy inevitably leads to loss of information and entropy.
j
I talk a bit about the idea at the end of my talk here.

https://youtu.be/9fhnJpCgtUw

I've got a prototype I'm working on, but only so much free time to work on it. Basically the idea is to base the semantics on term rewriting, so the whole language is just data transformation. Then you can have a notion of meta-execution, which is rules that match on the execution data. So you can write a rule that just says: tell me the current expression, what it transforms into, and which rule and clause matched to cause that transformation. In my prototypes that means I can make a stepping debugger in 3-5 lines of code, and then a time-travel debugger in only a few more. It should soon be able to do what we are talking about above as well. Basically, in order to program without a blindfold we need access to the unfolding of the execution of our programs, so why not let it be data that we can match on, query, etc.?
👍🏼 1
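This is not Jimmy's actual language, but a tiny Python sketch of the shape of the idea: semantics as term rewriting, with the execution trace exposed as plain data you can iterate and query (which is all a stepping debugger needs):

```python
def reduce_once(rules, term):
    """Rewrite the innermost-leftmost redex. Rules are
    (name, matches, transform) triples. Returns (new_term, rule_name),
    or None when the term is already in normal form."""
    if isinstance(term, tuple):
        for i, sub in enumerate(term):
            r = reduce_once(rules, sub)
            if r is not None:            # a subterm reduced first
                new_sub, name = r
                return term[:i] + (new_sub,) + term[i + 1:], name
    for name, matches, transform in rules:
        if matches(term):
            return transform(term), name
    return None

def trace(rules, term):
    """The whole execution as data: a list of (before, rule, after) steps.
    A stepping debugger just walks this list; a time-travel debugger
    indexes into it."""
    steps = []
    while (r := reduce_once(rules, term)) is not None:
        new_term, name = r
        steps.append((term, name, new_term))
        term = new_term
    return steps

# Toy rules: factorial as rewriting over nested tuples.
rules = [
    ("fact-zero", lambda t: t == ("fact", 0), lambda t: 1),
    ("fact-step",
     lambda t: isinstance(t, tuple) and t[:1] == ("fact",)
               and isinstance(t[1], int) and t[1] > 0,
     lambda t: ("mul", t[1], ("fact", t[1] - 1))),
    ("mul-ints",
     lambda t: isinstance(t, tuple) and t[:1] == ("mul",)
               and isinstance(t[1], int) and isinstance(t[2], int),
     lambda t: t[1] * t[2]),
]
```

Because each step carries the term before, the rule name, and the term after, "tell me which rule fired" is just a filter over the trace rather than a special debugger feature.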
d
@Kartik Agaram can you expand a bit more on this:
I've been thinking lately about how our minds have infinite levels of conceptual hierarchy, but our tools are inevitably limited to some finite number. That discrepancy inevitably leads to loss of information and entropy.
k
@Jimmy Miller I just watched the idealized commit logs talk. Very well done, and it takes me back, because my PhD thesis involved program dependence graphs and constructing tiny slices out of very large programs that computed precisely the hard-to-cache memory accesses, so that they could be prefetched on a separate thread (computer, actually). Slices certainly have lots of applications, but there's a reason why there's no empirical evidence that they're valuable for comprehension: they aren't actually the most elegant ordering, as he claims. The reason: data transformations. Halfway through the evolution of a program, someone redid all the data structures to organize by one axis rather than another. Slices get killed there. In general, slices focus on code, but the compiler techniques we have (as of 12 years ago) are forced to approximate data access. This loop accesses field x in some node of this linked list, so let's assume it accesses it in all nodes. That sort of thing.
Regarding the specific idea he proposes: it doesn't actually result in a very idealized commit log, because a) you still have to provide a heuristic test sequence, and that's non-trivial; b) lots of times you get a more comprehensible result if you combine sets of tests into a single 'idealized commit' (but that blows up your search space for a) even more); and -- most important! -- c) no program has complete test coverage. If you focus only on tests you lose valuable insight along the way.
Me, after spending 8 years trying to treat programs as black boxes and apply tools to them, I go back to the thing he dismissed at the start. There's no way to understand programs efficiently when the author didn't design for it from the start. If a program had multiple authors, it's as easy to read as the author least interested in comprehensibility made it. Programs have to be designed for comprehensibility. So toss out the modern social organization and its incentives for creating programs primarily as black boxes for people to use. (I'll share my approach to idealized commit logs for the umpteenth time, just in case somebody hasn't already seen it: http://akkartik.name/post/wart-layers)
@Daniel Garcia I don't recall the context at work where it came up, but I'm increasingly noticing myself constructing larger refactorings out of what my Java IDE provides, like renaming variables. I renamed a variable here, another variable there, did a few other things, and the end result was that I split up a class that handled some input space into two classes that partition the space between themselves. It would be nice (in a first-world-problems sort of way) if that was obvious in the diff. This isn't a strong opinion, just a random idea.
👍 3
c
Interesting thread. My requirements are for storing code points to capture serendipity. In my tool, I have the beginnings of a system which stores code deltas in the project while live coding. I want to enable users to capture that moment when a visual effect or sequence of sound is great and rewind to that point afterwards (i.e. when not on stage giving the actual performance). This is easier in a live coding environment because things are more constrained in a single session/tool. Good live coders often experience that moment where they are in the zone, and a perfect combination of audio/visual has occurred, but then it is lost forever behind the complex edits they used to get to that point.
w
@Kartik Agaram one interesting part of the idealized commit log idea is that it doesn’t actually use slicing, just code coverage (which is trivial to compute with sufficient runtime instrumentation). Also, because it’s dynamic, there’s no need for conservative approximations. I agree that the heuristic ordering of tests is problematic. Another alternative might be to think more statistically. Think about each test as a document and each line of code invoked as a term. Then a test’s focus would be the lines of code with the highest TF-IDF score (i.e. lines that occur in the test but happen less frequently in other tests).
❤️ 1
c
Just watched that video - really interesting, since I’d never heard of code slices before. My only reaction, though, is that it comes back to how well the unit tests are written, as noted above. It’s probably unrealistic to expect tests to cover even half the code base in most projects. I think I still like the idea of a complete record of every program edit with a scrubber. Perhaps highlighting the code lines that are still in the TOT version, showing heat maps for areas that are changed often, etc. Perhaps I could deep dive into a region of code and see how it evolved, with optional hiding of low-code-coverage areas...
👍🏼 1
j
@Kartik Agaram Really interesting to hear more about the limitations of slicing. Not 100% sure I followed why data transformation causes an issue. But I definitely agree that slices don't solve the whole problem. As for the test suite idea, I definitely agree that it has some issues. I do think that taking an existing test suite and applying this wouldn't be super useful. But you can imagine making your own test suite, specifically designed to exercise various parts of the system. You can imagine taking some Lisp interpreter, feeding it simple expressions, getting the trace, and then moving on to more and more complex ones, giving you in the end something that you can follow. I think your work on wart is really interesting and actually really aligns with what I was aiming to talk about. What we need is not a factual, uninterpreted history of how the program changed. We do in fact have the facts; they are recorded in source control. What we need is an interpretive, idealized history. You are providing exactly that in wart. Though I will say, I'm not sure it goes far enough, but that is mostly related to this:
There's no way to understand programs efficiently when the author didn't design for it from the start. If a program had multiple authors, it's as easy to read as the author least interested in comprehensibility made it. Programs have to be designed for comprehensibility. So toss out the modern social organization and its incentives for creating programs primarily as black boxes for people to use.
I'm tempted to agree. But I think the key word here is efficiently. I have worked in codebases that were clearly not meant to be understood (one was decompiled source), where the code was absolutely insane. Understanding them was very, very difficult and took a ton of work. But what I found in the end was that the tools I really needed to capture this information weren't there. I could of course write docs, and I did. But that wasn't enough. Nor was my coming to an understanding of the code aided (much) by tools. Of course, a slice wouldn't have told me everything, but it would have helped me start with a simplified case. Being able to ask counterfactual questions of my program would have aided my understanding. In general, I can think of a number of things that computers are fully capable of that would have condensed my journey of understanding. And then, once I understood, I could convey some of that to people by helping them get into the right states where they too could repeat my learnings. Designing programs for comprehensibility from the start is definitely a goal. But I just don't think it can be our end state.
💯 1
This post resonates so much with me and how I feel working on codebases. Thought others might enjoy it. (good despite being on medium) https://medium.com/@wm/the-generation-ship-model-of-software-development-5ef89a74854b
❤️ 1
🤯 1
k
I realize you can put either side in the foreground, like a Necker cube:
* Designing ahead of time for comprehensibility is definitely a start, but it can't be our end state.
* Tooling for comprehension is definitely a start, but it can't be our end state.
w
Also worth observing: we have no idea what program comprehension is or how it works. A theory of comprehension would likely be an essential component of designing for comprehension at the tool or program level.
k
This is probably a subjective opinion, but I think we tend to do things first, and theories catch up over time as lots of people try out lots of things, and definitions converge last of all. Given that "ready, fire, aim" worldview, I try above all to focus on rewritability. If rewriting seems risky, that feels like the first problem to try to fix. That way the things you build can benefit from future theories, and you can be disrupted by yourself rather than external Johnny-come-latelies.
j
Personally I think Naur's "Programming as Theory Building" and "Intuition in Software Development" are good starting points. In general, I think we have a lot to gain from incorporating insights from philosophy into software development (I have a talk at SPLASH this year advocating for exactly that), something I think Naur does particularly well. I do agree with @Kartik Agaram that our practice generally outstrips our theories. But I think we can gain a lot by reflecting on practice and understanding. For example, as software engineers, we do a lot of what is called "Conceptual Engineering" in the philosophy world. Understanding what we do in light of that framework, and how it relates to other things in the world, can help us quite a bit. https://www.amazon.com/Fixing-Language-Essay-Conceptual-Engineering/dp/0198814712
w
Naur’s article is a good worldview, but (IMO) the devil is in the details. We have no idea how programmers build theories, what the shape of those theories are, what language constructs promote or hinder theory construction, and so on. We’ve had 50 years of distributed practice in programming, but I don’t think that’s been sufficient to meaningfully understand the psychological nature of programming beyond intuition. You can see this failure most acutely in intro to CS classes, where our top universities will fail to teach even the most basic skills like debugging or decomposition in a meaningful way.
j
Curious @Will to hear your view of how we'd go about starting that project. Would we solve these issues by doing empirical studies and finding statistical results? Or would it be more a matter of reflecting and exploring our own processes?
w
This is the subject of my research, so my views evolve rapidly over time. But right now, my sense is: there’s a lot of inspiration we can draw from cognitive psychologists up through anthropologists/sociologists on how to study human behavior, particularly for skilled tasks. There are a lot of theories that could be applied to programming (e.g. I’ve been thinking about working memory recently; just submitted a paper to CHI about that). And there are a lot of methodologies that can be used to study programmers. For example, Barbara Tversky’s book Mind in Motion is a really good overview of how simple experiments can build a really rich model of how people build mental models of space and action, and how action informs perception.
Theory-building inevitably involves both controlled lab experiments (whether protocol analyses or quantitative studies) and data mining (of software repositories, education statistics, etc.). But I think an underexplored area is experiment design: what are interesting situations in which we can put programmers, and what methods of observing them reveal patterns in their behavior?
❤️ 1
k
@Jimmy Miller Naur's paper is one of my two or three "anchor" influences, so absolutely relevant. However, the "theories" in it are for understanding individual codebases or domains, as I recall. A general theory of comprehension is far more ambitious.
j
Yeah, but he was basing his work on Ryle and James, who probably took themselves to have something like that more general theory. (I don't particularly agree with them.) But yes, I agree. I think Naur offers constraints that have largely been ignored. It'd be interesting to know your other anchor influences.
🤔 1
k
I'd say Christopher Alexander and Richard Gabriel (the latter for making the former accessible to me). But I have a longer list at the bottom of https://github.com/akkartik/mu
j
@Will This is an area I have trouble expressing my thoughts on. I skimmed chapters 1 and 2 and read chapter 3 of Mind in Motion. Even taking the studies as a given, the conclusions drawn from them seem a bit off to me. But that said, I have very bad spatial awareness, have aphantasia, and in general see abstract thought as a retreat away from the spatial/visual world that I feel mostly uneasy in. I see some parallels for sure, but I sure hope my abstract thought isn’t based on my spatial abilities, because I like to think I’m better at the former than the latter. I do think this more experiment-based side of things is something I need to read more on, though. I have a pretty large bias in this area. Perhaps it comes from reading one too many papers declaring free will to be disproved because of some fMRI scan 🙂. I will try reading a bit more of Mind in Motion with an open mind. Feel free to drop any other works you think are worth reading :) Thanks @Kartik Agaram for the references.
w
@Jimmy Miller IMO the two most influential pieces that synthesize cogsci ideas are T. R. G. Green’s “Cognitive Dimensions of Notations” and Bret Victor’s “Learnable Programming”. Not sure if you’ve read them, but both are excellent. Some other fun papers:
• Don Knuth invents the term “profile” while studying FORTRAN programs in the wild: http://doi.wiley.com/10.1002/spe.4380010203
• Plan composition shows us how programmers combine abstract program templates to create individual programs, and often fail in the process: http://portal.acm.org/citation.cfm?doid=6592.6594
• A really good summary of a lot of experimental work on programming psychology in the 70s/80s: https://telearn.archives-ouvertes.fr/hal-00190531/document
j
Thanks @Will I have read Victor, but none of the rest of them. Have added them to my reading list. Thanks
k
Listening to Tudor Girba today reminds me of this thing I said here a few days ago:
our minds have infinite levels of conceptual hierarchy, but our tools are inevitably limited to some finite number.
Perhaps Smalltalk's object model + GToolkit is able to scale here.