# thinking-together
k
Inspired by https://futureofcoding.slack.com/archives/C5U3SEW6A/p1672756060085649 and https://futureofcoding.slack.com/archives/C5U3SEW6A/p1672816690782479, plus my daily work with a Smalltalk system, I started thinking about high-level architectures of information processing systems. Spreadsheets are two-layer systems, with a data grid on top and a code grid below it. That's a good architecture for dealing with heterogeneous grid-shaped data and shallow computation. For homogeneous grid-shaped data (arrays, data frames) you'd prefer to compute with the grid as a whole, and for complex/deep computation, you want a more code-centric architecture. You can of course prepare the complex code in some other architecture and just call it from a spreadsheet. High-level architectures can be composed. Dataflow graphs, of which Data Rabbit is an amazing implementation, have nodes containing code and data flowing through the edges. They can deal with irregularly shaped data, even messy data, but, like spreadsheets, they are limited to shallow computation. A Smalltalk image is a code database built on top of an unstructured "object lake". It's great for dealing with complex code, but has no high-level structure for data. You can, and have to, roll your own. From this point of view, a Smalltalk image is the perfect complement to a spreadsheet or a dataflow graph, having opposite strengths and weaknesses. So... are there more such high-level structures that have proven useful in practice? Is there just a small set whose elements can be combined, or should we expect a large number of unrelated architectures, each good for specific purposes? Note that I am thinking about "good for", not "applicable to". All Turing-complete systems are equivalent in principle, but that doesn't make them good tools for all purposes. My question is ultimately one of human-computer interaction.
d
I don't watch videos normally, because I can scan text in a tenth of the time, so I'm not entirely sure what you're asking here: do you mean are there architectures that structure both code and data in the UX?
k
Could be UX, as in spreadsheets or in Data Rabbit, both of which target end users. But in a scenario where developers and users are distinct, it would be the top level at which developers work when assembling a system.
g
Have you looked into the pandas Python library? It has a DataFrame class that operates kind of like the data layer of a spreadsheet.
j
The Smalltalk "object lake" (love the term) is, crucially, organized as a tree. There's a similar structure to the objects in CLOS. I would say that a tree of things that inherit (for some definition of "inherit") traits from their parents is one of these "mother sauce" sorts of things.
j
I’ve actually been thinking a lot about this topic as we’ve been building a new spreadsheet web app. What we came up with was to nudge folks to input their data properly and separate data from their analysis (all the math and functions). So that’s why our app looks like table grid blocks on an endless canvas, vs the more traditional spreadsheet editor, which is just one big table grid.
k
@Jason Chan Separating data from analysis is very much the mainstream approach. What's interesting to me is how they end up being put together in the end. The mainstream there is having the code manage the data in an ad-hoc fashion, through the OS' file system. So how does this work in your new spreadsheet? You have the data grid as in traditional spreadsheets. How is it hooked up with the code?
@Jack Rusher Yes, tree structures are important. But I don't agree that Smalltalk or CLOS objects form a tree. Cycles are permitted, and actually quite frequent, at least in Smalltalk. So at the lowest level, it's just objects pointing to each other. BTW, I didn't make up the term "object lake", I saw it somewhere, but don't remember where!
@George Pandas works much like other data-frame-based languages and libraries (R, ...): the data takes the form of tables (homogeneous in rows, heterogeneous in columns), and the code operates on the aggregate (table, row, column), not the individual data item as in spreadsheets.
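To make the "code operates on the aggregate" point concrete, here is a minimal sketch, assuming pandas is installed. The table contents and column names (`item`, `quantity`, `unit_price`, `total`) are made up for illustration.

```python
import pandas as pd

# Hypothetical sample table: each column is homogeneous,
# but the columns differ in type (strings, ints, floats).
df = pd.DataFrame({
    "item": ["apple", "pear", "fig"],
    "quantity": [3, 5, 2],
    "unit_price": [0.5, 0.8, 1.5],
})

# One expression over whole columns; no per-cell formulas
# as there would be in a spreadsheet.
df["total"] = df["quantity"] * df["unit_price"]

# Aggregate over an entire column.
grand_total = df["total"].sum()
```

In a spreadsheet the same computation would be one formula per row, copied down the column, plus a separate sum cell.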
j
I'm talking about the tree (which can reticulate, of course) represented by (in the Smalltalk context) the class browser. The objects themselves relate to that tree-shaped thing in specific ways, &c. There's similar stuff going on in CLOS, but with different affordances for navigating those relationships.
s
Hey @Konrad Hinsen. I work on a reactive architecture library called Behavior Graph. https://github.com/yahoo/bgjs It has the interesting dataflow dependency properties of spreadsheets but is just normal code. We use it heavily in a complex production codebase. I love talking about it, so let me know if you want to hear more.
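For readers who want a feel for "dataflow dependency properties of spreadsheets, but just normal code", here is a rough Python sketch. It is not the Behavior Graph API: the `Cell` class and everything in it are hypothetical names made up for illustration, and the propagation is deliberately naive.

```python
class Cell:
    """A plain value, or a formula over other cells (hypothetical sketch)."""

    def __init__(self, value=None, formula=None, inputs=()):
        self.formula = formula
        self.inputs = list(inputs)
        self.dependents = []
        # Register this cell with every cell it reads from.
        for cell in self.inputs:
            cell.dependents.append(self)
        self.value = value
        if formula is not None:
            self.value = formula(*(c.value for c in self.inputs))

    def set(self, value):
        """Change an input cell and update everything downstream."""
        self.value = value
        self._propagate()

    def _propagate(self):
        # Naive depth-first recomputation: a cell reachable along several
        # paths is recomputed more than once, and cycles are not detected.
        for dep in self.dependents:
            dep.value = dep.formula(*(c.value for c in dep.inputs))
            dep._propagate()


price = Cell(10)
quantity = Cell(3)
total = Cell(formula=lambda p, q: p * q, inputs=(price, quantity))

quantity.set(4)  # total.value is recomputed from the new quantity
```

A real library would presumably order updates so each node runs once per change; this sketch skips that to stay short.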