# linking-together
m
Sonification of eclipse's internals :o https://github.com/laffra/cacophonia
🤯 1
😹 2
🆒 1
s
omg this is freaking amazing
e
sounds a bit like Lalo Schifrin
🤔 1
c
Hi All. The author here. I think we should work on making complex systems more insightful by showing what is happening in the system when it runs. Cacophonia is a concrete example. Has anyone worked on something similar?
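To make the general idea concrete, here is a toy sketch (this is not how Cacophonia is actually implemented, just an illustration of mapping run-time events to sounds; the `onEvent` hook and the plugin names in `main` are hypothetical):

```java
import javax.sound.midi.*;

// Toy sonification sketch (not Cacophonia's actual code): map an event's
// source name to a MIDI pitch and play a short note whenever it fires.
public class ToySonifier implements AutoCloseable {
    private final Synthesizer synth;
    private final MidiChannel channel;

    public ToySonifier() throws MidiUnavailableException {
        synth = MidiSystem.getSynthesizer();
        synth.open();
        channel = synth.getChannels()[0];
    }

    /** Called from whatever instrumentation you have (hypothetical hook). */
    public void onEvent(String sourceName) throws InterruptedException {
        int pitch = 48 + Math.floorMod(sourceName.hashCode(), 24); // stable pitch per source
        channel.noteOn(pitch, 80);
        Thread.sleep(60);                                          // crude note length
        channel.noteOff(pitch);
    }

    @Override public void close() { synth.close(); }

    public static void main(String[] args) throws Exception {
        try (ToySonifier s = new ToySonifier()) {
            for (String plugin : new String[] {"org.eclipse.ui", "org.eclipse.core.jobs", "org.eclipse.ui"}) {
                s.onEvent(plugin);
            }
        }
    }
}
```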
😍 2
m
@Chris Laffra one of them I really, really admired recently was https://justine.lol/blinkenlights/index.html . Kinda a gdb TUI but way more fun.
Watching this program execute (c / ctrl-c) at adjustable speed (ctrl-t / alt-t), it becomes clearer that the process of printing an image is basically a pipeline that goes: IDCT → Y′CbCr to RGB → decimate → sharpen → block render.
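For reference, the Y′CbCr → RGB step in that pipeline is just a small per-pixel linear transform. A minimal sketch in Java using the standard full-range JFIF constants (not code taken from blinkenlights itself):

```java
// Standard full-range JFIF Y'CbCr -> RGB conversion (illustrative, not blinkenlights' code).
final class YCbCr {
    /** Convert one pixel; y, cb, cr are in 0..255. Returns packed 0xRRGGBB. */
    static int toRgb(int y, int cb, int cr) {
        double r = y + 1.402 * (cr - 128);
        double g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128);
        double b = y + 1.772 * (cb - 128);
        return (clamp(r) << 16) | (clamp(g) << 8) | clamp(b);
    }

    private static int clamp(double v) {
        return (int) Math.max(0, Math.min(255, Math.round(v)));
    }

    public static void main(String[] args) {
        System.out.printf("%06x%n", toRgb(128, 128, 128)); // mid-gray -> 808080
    }
}
```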
❤️ 5
k
Thanks for the bump! I'm watching the Cacophonia video linked from OP (https://www.youtube.com/watch?v=-lYNQ7SXGKY), and finding it very thought-provoking.
> Has anyone worked on something similar?
I'm inspired to write out as much of the idea maze as I'm able to glean. The problem of chunking complex systems into comprehensible units seems to get commonly attacked in two ways: by chunking static source code, or by chunking dynamic run-time information. I tend to categorize both approaches along two axes: whether they're general-purpose or for a specific project, and the level of effectiveness (a subjective metric based on my hope for how far we can get).

Chunking static source code: in this direction lie all HLLs, the original advance of chunking machine code into higher-level operations. They're general purpose (lots of general-purpose HLLs), and they get customized by individual programmer decisions around how to organize HLL code into hierarchical layers of abstraction. I tend to think of these as conservatively at around 10% effectiveness. For a newcomer, any large codebase has at least 10x noise in exposing architectural features in the neighborhood of `main()`. (Personally I'd actually put it at 0.1% and 1000x respectively. But I'm giving it the benefit of the doubt.) Then again, I'm not very experienced with Haskell or OCaml. It's possible one or both are the correct answer [1] and we just haven't come around to them yet.

An alternative direction is APL. Also general-purpose, ostensibly [2]. However, it eschews abstraction and relies on the language itself for most chunking duties. The communal goal is to minimize the number of function names and instead learn to recognize phrases of operations. Compare Forth as an exemplar for HLLs, where the goal is short functions [3]. I don't know much APL, so I'm not sure what the effectiveness score would be. It's possible APL is the correct answer.

The second major approach is chunking run-time information. Most approaches here focus on visualizing a single run. Imagine a processor emulator that emits a stream of instructions, potentially with registers and memory state after each instruction. This is inefficient, but more seriously also noisy. (I worked at this level in grad school.) Debuggers are one approach to addressing both. They chunk at the level of the HLL, so I tend to give them the same effectiveness score. Logs are a bit better just because they're more programmable than debuggers. A program with well-cultivated logs can provide a lot of signposts to aid insight. Maybe 15% effectiveness.

The state of the art in visualizing run-time information seems to be Smalltalk, particularly with @Tudor Girba's Glamorous Toolkit. It applies to arbitrary programs and allows arbitrary visualizations using the facilities of the whole language at run-time. Maybe 30% effectiveness? I don't know enough. It's possible this is the correct answer.

Cacophonia is an interesting point of comparison. It's extremely specific to a single codebase, but in exchange, based on one video, I'm inclined to give it 60% effectiveness. In principle GToolkit could do everything it does, but GToolkit visualizations seem anecdotally to be less ambitious. On the other hand, you don't have to restart the object under test. Like I said, GToolkit could be the correct answer.

But wait, this is all for a single run. All these systems still leave it entirely up to the reader to synthesize a mental model of the state space of a program from many interactive runs: how the complex system behaves in different regimes of inputs or its environment. I'm not aware of any work here so far. Maybe types would help here; maybe Haskellers are careful to use category theory to ensure their programs' state spaces don't need running to comprehend. Alternatively, if run-time information is needed, tests seem like a promising direction. Each test is a run that you can apply visualization to, and chunking all tests into groups with "similar" runs feels like a fruitful direction to explore (a rough sketch of what I mean is below).

That's my idea maze. I'm actually skeptical of all these approaches because of their shared desire to work on arbitrary programs. I tend to observe a pattern where roads created to fix traffic jams only help to perpetuate them. Suppressing wildfires or bubbles only causes the rare ones to be more destructive. Normalized risk stays the same or may even go up. I think there's an underlying problem with the way people write software, and better tooling alone feels unsustainable as a way to attack it without some effort to also influence how people grow programs.
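Here's roughly what I mean by grouping tests with "similar" runs, as a throwaway Java sketch. The trace format and the "signature" heuristic are invented on the spot, not taken from any existing tool:

```java
import java.util.*;

// Hypothetical sketch: group test runs by a coarse "signature" of their call trace.
public class RunClusters {
    /** A run is just the ordered list of (coarse) calls the test exercised. */
    record Run(String testName, List<String> calls) {}

    /** Coarse signature: the set of distinct calls, ignoring order and repetition. */
    static String signature(Run run) {
        return String.join(",", new TreeSet<>(run.calls()));
    }

    static Map<String, List<String>> group(List<Run> runs) {
        Map<String, List<String>> clusters = new LinkedHashMap<>();
        for (Run run : runs) {
            clusters.computeIfAbsent(signature(run), k -> new ArrayList<>()).add(run.testName());
        }
        return clusters;
    }

    public static void main(String[] args) {
        List<Run> runs = List.of(
            new Run("parse_empty",  List.of("lex", "parse")),
            new Run("parse_nested", List.of("lex", "parse", "parse")),
            new Run("eval_add",     List.of("lex", "parse", "eval")));
        // parse_empty and parse_nested land in the same cluster; eval_add in another.
        group(runs).forEach((sig, tests) -> System.out.println(sig + " -> " + tests));
    }
}
```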
Which leads me to a question: given what you've learned about Eclipse and the effort needed to build Cacophonia, is there anything you'd want to tell its programmers if you could travel back in time? What could they have done to make it easier for people to comprehend its big picture? Assume they have infinite resources and virtuous incentives 🙂

[1] https://www.youtube.com/watch?v=fbsCvoWK1bE
[2] https://www.youtube.com/watch?v=hzPd3umu78g
[3] https://merveilles.town/@akkartik/105697290330678192
😍 3
t
This is an interesting writeup. I particularly like how you focus on effectiveness. And you point out that for a narrow case, a dedicated tool outcompetes a generic tool. Now, the thing is that in software, all cases are narrow. We need custom tools for each of them. What's more, even if some tool exists, there will be questions that require building the next level of abstract mental model, which, in turn, requires a new tool.

The Cacophonia video very nicely shows why these questions build on each other. The mapping onto auralization is brilliant and can address different questions; that mapping is chosen through a dropdown. Removing parts from the map is another customization tool that allows us to focus on a narrower scope; presently that happens through clicking. This is interesting as long as that is what you want. Colors emphasize different parts, as long as that decomposition is what you are interested in. It's also very interesting how going from the visualization to the debugger becomes powerful. Of course, the next wish is to have those integrated: why would you even want to look at the stack once you have a more interesting mental model?

Stepping back, all questions, at all levels of abstraction and granularity, benefit from tools. That's why we think that to advance toward the global maximum, we need to focus on how we can systematically build these tools. There are great advancements regarding what can be expressed visually for arbitrary domains and various purposes. Much effort is focused on usability. With our work on Glamorous Toolkit, we show that we can raise the level of abstraction of what we express for the purpose of building mental models. Glamorous Toolkit is not a tool to visualize; it's a language with which to construct the tool to visualize. And Moldable Development is the practice. I am quite certain that other languages like these will be created in the (near) future.
👍 3