Max Krieger
01/13/2021, 8:07 PM

Max Krieger
01/13/2021, 8:08 PM

Srini K
01/13/2021, 9:48 PM

Emmanuel Oga
01/13/2021, 9:51 PM

Chris Laffra
03/05/2021, 8:20 AM

Max Krieger
03/06/2021, 11:19 PM
Watching this program execute (c / ctrl-c) at adjustable speed (ctrl-t / alt-t), it becomes clearer that the process of printing an image is basically a pipeline that goes: IDCT → Y′CbCr to RGB → decimate → sharpen → block render.
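A rough sketch of that pipeline as plain function composition, just to make its shape concrete. The stage functions below are trivial placeholders named after the steps above, not code from the program being watched:

```python
# Placeholder stages: each stands in for the real IDCT, color conversion, etc.
# The point is only the shape of the per-block pipeline.
def idct(coefficients):       return coefficients    # frequency domain -> samples
def ycbcr_to_rgb(block):      return block           # color-space conversion
def decimate(block):          return block           # shrink to display resolution
def sharpen(block):           return block           # compensate for decimation blur
def render_block(screen, b):  screen.append(b)       # "paint" the block

def print_image(blocks, screen):
    for coefficients in blocks:                       # one block at a time
        block = idct(coefficients)
        block = ycbcr_to_rgb(block)
        block = decimate(block)
        block = sharpen(block)
        render_block(screen, block)
```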
Max Krieger
03/06/2021, 11:21 PM

Kartik Agaram
"Has anyone worked on something similar?"

I'm inspired to write out as much of the idea maze as I'm able to glean.

The problem of chunking complex systems into comprehensible units seems to get attacked in two common ways: by chunking static source code, or by chunking dynamic run-time information. I tend to place each approach along two axes: whether it's general-purpose or specific to one project, and its level of effectiveness (a subjective metric based on my hope for how far we can get).

Chunking static source code: in this direction lie all HLLs, the original advance of chunking machine code into higher-level operations. They're general-purpose (there are lots of general-purpose HLLs), and they get customized by individual programmers' decisions about how to organize HLL code into hierarchical layers of abstraction. I conservatively put these at around 10% effectiveness: for a newcomer, any large codebase has at least 10x noise obscuring the architectural features in the neighborhood of main(). (Personally I'd put it at 0.1% and 1000x respectively, but I'm giving it the benefit of the doubt.) Then again, I'm not very experienced with Haskell or OCaml. It's possible one or both are the correct answer [1] and we just haven't come around to them yet.
An alternative direction is APL. It's also general-purpose, ostensibly [2]. However, it eschews abstraction and relies on the language itself for most chunking duties. The communal goal is to minimize the number of function names and instead learn to recognize phrases of operations. Compare Forth as an exemplar for HLLs, where the goal is short functions [3]. I don't know much APL, so I'm not sure what its effectiveness score would be. It's possible APL is the correct answer.
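To make the contrast concrete without writing actual APL or Forth, here's a toy Python illustration (entirely my own example): the same computation written once as an inline "phrase" of primitives, and once factored into small named definitions.

```python
# Task: mean of the positive values in a list.
data = [3, -1, 4, -1, 5]

# "Phrase" style (APL-ish): no new names; the reader learns to recognize the
# filter-then-average idiom as a single chunk.
positive = [x for x in data if x > 0]
mean_positive = sum(positive) / len(positive)

# "Short named functions" style (Forth/HLL-ish): chunking happens by factoring
# the same computation into tiny, named definitions.
def positives(xs):
    return [x for x in xs if x > 0]

def mean(xs):
    return sum(xs) / len(xs)

mean_positive_named = mean(positives(data))

assert mean_positive == mean_positive_named == 4.0
```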
The second major approach is chunking run-time information. Most approaches here focus on visualizing a single run. Imagine a processor emulator that emits a stream of instructions, potentially with registers and memory state after each instruction. This is inefficient, but more seriously it's also noisy. (I worked at this level in grad school.) Debuggers are one approach to addressing both problems. They chunk at the level of the HLL, so I tend to give them the same effectiveness score. Logs are a bit better, just because they're more programmable than debuggers. A program with well-cultivated logs can provide a lot of sign-posts to aid insight. Maybe 15% effectiveness.
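A toy illustration of the two extremes (hypothetical code, only to make the difference in granularity visible): an instruction-level trace line versus a couple of hand-placed log sign-posts.

```python
import logging

# Extreme 1: an emulator-style trace. Complete but noisy -- one line per
# instruction, thousands of them for even a short run.
def trace_instruction(pc, opcode, registers):
    print(f"{pc:08x}  {opcode:<12} {registers}")

# Extreme 2: well-cultivated logs -- a handful of hand-placed sign-posts that
# name the phases a reader actually cares about.
log = logging.getLogger("decoder")

def decode_image(blocks):
    log.info("decoding %d blocks", len(blocks))
    for block in blocks:
        ...  # per-block work, deliberately not logged
    log.info("decode finished")
```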
The state of the art in visualizing run-time information seems to be Smalltalk, particularly with @Tudor Girba's Glamorous Toolkit. It applies to arbitrary programs and allows arbitrary visualizations using the facilities of the whole language at run-time. Maybe 30% effectiveness? I don't know enough. It's possible this is the correct answer.
Cacophonia is an interesting point of comparison. It's extremely specific to a single codebase, but in exchange, based on one video, I'm inclined to give it 60% effectiveness. In principle GToolkit could do everything it does, but GToolkit visualizations seem, anecdotally, to be less ambitious. On the other hand, with GToolkit you don't have to restart the object under test. Like I said, GToolkit could be the correct answer.
But wait, this is all for a single run. All these systems still leave it entirely up to the reader to synthesize a mental model of a program's state space from many interactive runs: how the complex system behaves in different regimes of inputs or environments. I'm not aware of any work here so far. Maybe types would help; maybe Haskellers are careful to use category theory to ensure their programs' state spaces don't need to be run to be comprehended. Alternatively, if run-time information is needed, tests seem like a promising direction. Each test is a run that you can apply visualization to, and chunking all tests into groups with "similar" runs feels like a fruitful direction to explore.
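One way the test-grouping idea might be prototyped (my own sketch, nothing more): record which functions each test touches, then greedily group tests whose traces overlap.

```python
import sys

def record_called_functions(test_fn):
    """Run one test under a tracer and return the set of function names it hit."""
    called = set()
    def tracer(frame, event, arg):
        if event == "call":
            called.add(frame.f_code.co_name)
        return tracer
    sys.settrace(tracer)
    try:
        test_fn()
    finally:
        sys.settrace(None)
    return called

def group_similar_runs(tests, threshold=0.5):
    """Greedily group tests whose call traces have Jaccard similarity above threshold."""
    traces = {t.__name__: record_called_functions(t) for t in tests}
    groups = []
    for name, trace in traces.items():
        for group in groups:
            representative = traces[group[0]]
            overlap = len(trace & representative) / max(1, len(trace | representative))
            if overlap >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups
```

Whether Jaccard similarity over called-function sets is the right notion of "similar runs" is exactly the open question; the sketch only shows that the raw material is cheap to collect.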
That's my idea maze. I'm actually skeptical of all these approaches because of their shared desire to work on arbitrary programs. I tend to observe a pattern where roads built to fix traffic jams only help perpetuate them, and suppressing wildfires or bubbles only makes the rare ones that do occur more destructive. Normalized risk stays the same or may even go up.
I think there's an underlying problem with the way people write software, and better tooling alone feels unsustainable as a way to attack it without some effort to also influence how people grow programs. Which leads me to a question: given what you've learned about Eclipse and the effort needed to build Cacophonia, is there anything you'd want to tell its programmers if you could travel back in time? What could they have done to make it easier for people to comprehend its big picture? Assume they have infinite resources and virtuous incentives 🙂
[1]
[2]
[3] https://merveilles.town/@akkartik/105697290330678192

Tudor Girba
03/07/2021, 7:56 AM