Random thought about Things Spreadsheets Got Right...
# thinking-together
s
Random thought about Things Spreadsheets Got Right – I've heard a lot about "no invisible state" etc, but another interesting thing about spreadsheets is that they treat data as a first-class citizen in a way that most programming models don't. In a conventional programming language, you model processes and leave the data implicit; in a spreadsheet, you model data and the processes are implicit. The data is what you see in the cells, and the processes are plumbing that you only see by inspecting the cell that the data lives in. What other programming systems start with the data you want to process before you write the process itself? And maybe spreadsheet programming's lack of implicit state is actually a consequence of this data-first paradigm, rather than an end in itself. In fact, even calling data "state" implies a kind of process-first mindset, only conceptualising data through its relation to a process. And this could be a key blind spot for programmers creating systems for non-programmers, or rather process-first people creating systems for data-first people.
o
Algorithms and Data Structures. These are the yin and yang of computer engineering. I always felt that this pairing is missing the actual data. We have a trinity: Algorithms + Containers + Data. In a way, first data engineering, then big data, and now LLMs are computer science discovering data.
k
In Glamorous Toolkit, I often work mostly data-first.

This short video shows how this works. Once you have the raw data in your system, you write views to display it in a domain-specific way, and then you go on writing the code to process. Full data-first, like in a spreadsheet, is possible only if your programming system provides entry forms for exactly the data structure you work on. E.g. a grid of cells in a spreadsheet. In any other scenario, you first need some code to get your data into the system.
j
Again, imho, worth thinking about: what if your data consists of millions of rows? The data-first aspect gets a twist then.
i
^ also, worth ignoring. There’s a popular rejection of ideas that don’t have line-of-sight to use “at scale”, which manifests as a pressure to validate ideas based on how they scale and reject other kinds of validity. It’s quite harmful in situations like the one Sam raised where we’re interested in human factors, because considering scale is often a consideration of the machine factors (or business factors). I don’t want to disregard scale entirely, but just caution that it’s too often given too much weight (especially here in FoC circles).
On the other hand, I hope some people see it as a design challenge and go do cool human-centered things that scale to massive datasets. That’d be interesting. But it’s not necessary, and often misleading.
j
It is worth thinking about and gaining perspectives. Some thoughts are then worthy - or not, it depends. For instance, for me it is clear that data-first has certain limits, and that e.g. model-first thinking is a more powerful approach. If a cell in a spreadsheet has a lot of data, there is basically no practical way of inspecting it. If it has a lot of functions, there is little chance of making sense of it, which is at least possible with programming languages.
k
Size is a distraction, in my opinion. The two most relevant aspects of data are (1) its representation in the computer and (2) its representations suitable for humans, both for inspection and for manipulation. The relation between (1) and (2) is highly domain-dependent, and that makes it difficult to come up with general principles.
d
feels bigger than data vs process.. spreadsheet models space and leaves time implicit, vs etc
j
PANE elevates data above process in a softer way, doing a “data on nodes, processes on wires” thing.
k
@Joshua Horowitz How does PANE manage structured/complex data? An address book, a protein structure, a CAD blueprint, ...
j
@Konrad Hinsen It doesn’t do anything particularly special to visualize complex values. It shows nested boxes by default, and lets the user (tediously) define custom visualizers if they like; see 13:42 in the video for an example of that. My point was about the way the overall interface prioritizes viewing data over viewing code – spreadsheets hide code behind values, PANE relegates code to edges rather than their typical prominence inside nodes.
i
"relegates" / "prominence" — this sort of dismissive talk is why I became a staunch advocate for edges. They get such short shrift, and I've had enough of it! EDGES HAVE THE EDGE!
j
ok
k
@Joshua Horowitz The reason for my question was to explore how general PANE's approach is. Custom visualizers are code, but that code is not on the edges if I understand the video correctly. And for both data and code, there's the question of what size and complexity this approach (or in fact any visual approach) scales to in practice. Spreadsheets are made for very shallowly structured data and short code snippets. Which is fine, there are lots of use cases. But as someone working with complex data and complex algorithms, I wonder where the limits of visual techniques are.
g
The core virtue of spreadsheets is kinda RESTful: every value has an address. From there: the addresses have a 2-dimensional structure, which leads to various ways of treating a block of addresses as a table and so on. The addresses support navigation. This, again, is a RESTful sort of idea.
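To make the "every value has an address" idea concrete, here is a rough sketch, assuming a sheet is just a dict from A1-style addresses to values. The helper names (`parse_address`, `format_address`, `block`, `neighbor`) are made up for illustration and don't come from any spreadsheet library.

```python
import re

# All names here are hypothetical, chosen only for this illustration.

def parse_address(addr):
    """Turn an A1-style address such as 'B2' into zero-based (row, col)."""
    match = re.fullmatch(r"([A-Z]+)([1-9][0-9]*)", addr)
    if match is None:
        raise ValueError(f"not an A1-style address: {addr!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters:
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1

def format_address(row, col):
    """Inverse of parse_address: zero-based (row, col) back to 'B2' form."""
    if row < 0 or col < 0:
        raise ValueError("address is outside the sheet")
    letters = ""
    n = col + 1
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return f"{letters}{row + 1}"

def block(sheet, top_left, bottom_right):
    """Read a rectangular range of addresses as a list of rows (a table)."""
    r0, c0 = parse_address(top_left)
    r1, c1 = parse_address(bottom_right)
    return [
        [sheet.get(format_address(r, c)) for c in range(c0, c1 + 1)]
        for r in range(r0, r1 + 1)
    ]

def neighbor(addr, d_row=0, d_col=0):
    """Navigate from one address to another by a relative offset."""
    row, col = parse_address(addr)
    return format_address(row + d_row, col + d_col)

# A sheet is assumed to be a plain mapping from address to value.
sheet = {
    "A1": "name", "B1": "qty",
    "A2": "apples", "B2": 3,
    "A3": "pears", "B3": 5,
}
table = block(sheet, "A1", "B3")
```

The 2-dimensional structure lives entirely in the addresses: `block` gets a table just by enumerating a rectangle of them, and `neighbor` is navigation by address arithmetic.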