# thinking-together
k
Last night I spent some time documenting an almost-trivial state machine that I keep getting wrong: https://docs.google.com/document/d/1EbK4AxDCDWonMa8KyGJFX4jllXXLew0qBsGxsmqoYqk State machines are hard!
• We usually have one in our head when we program.
• The abstract state in my head is usually made up of multiple concrete variables in the program.
• Mutations happen to variables, but it's not obvious at each step how the abstract state is changing.
• The state machine in my head often evolves, which makes documentation challenging to keep up to date. Which is why I think nobody builds documents like these.
Tell me how immutability or State Charts will solve all my problems 🙂
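(To make the kind of machine under discussion concrete: a minimal Python sketch of a press-button, poll-for-plot-file, show-image loop. The state names and the injected `file_exists` helper are invented for illustration; this is not the code from the doc.)

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    WAITING = auto()   # polling for the plot file every frame
    SHOWING = auto()

class PlotMachine:
    def __init__(self, file_exists):
        self.state = State.IDLE
        self.file_exists = file_exists  # injected so the sketch is testable

    def on_button(self):
        # pressing the button (re)starts the wait, whatever state we're in
        self.state = State.WAITING

    def on_frame(self):
        # called once per frame; only the WAITING state does any work
        if self.state is State.WAITING and self.file_exists():
            self.state = State.SHOWING
```

Even this toy version shows the doc's pain point: the "abstract state" is one enum here, but in real code it tends to be smeared across several variables.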
i
A visual programming system ... with state charts ... and good control over execution ... will solve all your problems. 🪄 🎩 🐇
c
What about "visual programming system only of state charts" – (too hard to learn or start something new with?)
m
Something like xstate? Although it is javascript focused which might not suit everyone. In my own project I want to combine statecharts with more regular nodes and wires, hope to demo something "soonish" 😊
k
@Maikel yeah for this thread I'm willing to try any tools or languages. I'm just curious to see examples of how others might organize the state machine in my doc to be easier to maintain.
k
I'd love to see what the TLA+ solution for this looks like. It wouldn't be directly executable, right? But it's one step more rigorous than the document, is that the right way to think about it?
m
executing it means proving it's a well-formed specification
There's a big list of specifications here: https://github.com/tlaplus/Examples/tree/master/specifications
Lamport's TLA page has a video course that I found really useful
k
For me, executing it means when I click on the button I get a plot. And it doesn't hog disk reads too much; it's a good citizen of my computer. If the state machine lives outside the program that executes it, how can I get an automated alert when I add a state to the state machine but forget to change the code? That's basically the problem statement. My sense is that formal methods aren't quite here yet. Which is fine, not a criticism. I just want to focus this thread on the best possible solution for a single example. We like programming by example here, right? For this example and these design goals, what is the best representation we can think of that's "hard scifi" (doesn't have to be working code yet, but also doesn't require solving hard unsolved research problems)? And I want to see the (pseudo)code 😛 Around the same level of detail as the five 10-line versions I wrote in the doc.
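(One cheap, no-research-required way to get the "I added a state but forgot to change the code" alert: keep the state set and the handler table both in code and check them against each other at startup. A hedged Python sketch; the states and handler bodies are invented placeholders.)

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    WAITING = auto()
    SHOWING = auto()

# one handler per state; adding a State member without a row here should fail fast
HANDLERS = {
    State.IDLE: lambda: "wait for button",
    State.WAITING: lambda: "poll for file",
    State.SHOWING: lambda: "draw image",
}

def check_exhaustive():
    # fails loudly the moment a new State member lacks a handler
    missing = [s for s in State if s not in HANDLERS]
    assert not missing, f"unhandled states: {missing}"

check_exhaustive()
```

Languages with exhaustive pattern matching (Rust, OCaml, TypeScript with discriminated unions) give you this check at compile time instead of at startup.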
i
I like programming by hand-waving at things that don't exist yet in the hope that someone else builds them.
k
Go away, Ivan 😛 <obi wan>This is not the thread you are looking for.</obi wan>
m
One idea I would love to have time for is a parser for TLA+ that draws it in some way and allows you to step through the execution of different "traces"
Same for quickcheck-style property-based testing specifications
j
I have yet to see a trace visualizer for a logical language that was remotely legible, or a way of visualizing the execution of logical code using its source. I'm desperate to use them if they exist, so let me know. The closest I have seen is a system that maintains a set of successful search routes, then turns each into a tree-structured justification where each node of the tree and its immediate children correspond to a rule clause. There might be things you could do to link those trees to the source code more explicitly in an interface. I've done something similar in Blawx where elements of the tree are linked to source legal text. But that's not execution tracing, it's answer justification.
I have been obsessed with making logical code easier to use for a few years, now, and I don't even know what I would WANT out of a trace visualizer. The only things I can imagine would work, like a zoomable, color coded, annotated search tree, seem like they would be of limited utility unless what you wanted to know was why your code ran so slowly. And you pretty much don't care about that when you are writing things like TLA+ specs.
For the record, I haven't actually used TLA+, but I have taken a lot of inspiration from Hillel Wayne's promotion of it, and of formal methods more generally.
j
Do I understand the state machine correctly in that it starts checking for the file at <interval> when the button is pressed, but never stops if an error occurs in the image generation process?
k
Yes indeed! 🤔
w
Over the years, I've confronted and reflected on this very sort of problem more times than I'm happy to recount. In confrontation, I always hack things together and always have subtle bugs and race conditions. Likewise, pretty much every GUI I've used glitches or breaks under specially-timed sequences of events. In reflection, the difficulty comes from a simple, ideal sequence of events (press button, fill image) running into a messy temporal reality (delay between button press and image being ready), vastly increasing the "what if" state-space. (See also validating input, and dealing with parse errors.) Half of the "what if" could be handled in a principled way; half is more tricky. How much of the UI should freeze up while waiting? Is there any reasonable way to cancel the operation? Add to that the simple fact that the APIs/frameworks/tools for building UIs are full of the same kind of glitches.
k
Is the use of a state machine a requirement here? Personally, I'd prefer dealing with your situation in terms of events, both for the code and for the documentation. I guess one can always convert event handling into a state machine, with each event becoming a state transition, but states are not my preferred way of thinking about this.
j
One common way to make this sort of thing suck less is concurrency. When you hear the Go people making a distinction between concurrency and parallelism, it's because being able to have multiple threads of execution (which, again, needn't run in parallel) allows you to write code like this in a more straightforward way. You can, in some sense, think of the state machine you're writing as something like manual loop unrolling: you play compiler in your head to maintain the state of multiple tasks, rather than letting the language/runtime do it. Here's some Clojure code to do something like this using `future` to start a second task:
(def image (atom nil))

(defn plot [thing]
  (future
    (reset! image (if-let [filename (plot-thing-return-filename-or-nil thing)]
                    (load-image filename)
                    error-image)))) ; presumably a sad computer face :(

(defn application-loop []
  ,,,) ; call plot when they push the button
m
Feeling the urge to do the whole TLA+ course for no reason other than this
c
As I was reading through your refinements I was reminded of Ian Horrock's description of "the bottom-up approach to user interfaces":
Typically, each event handler starts with a basic sequence of actions that will be executed in response to an event. And because an item may respond to an event in different ways, depending on the context in which it is used, conditional statements are added to determine the context in which the event has occurred and therefore which actions to execute.
- Constructing User Interfaces with Statecharts (1999)
It seems like the LOVE game engine is nudging you to reason about the behavior of this UI directly in its event listeners (e.g. `load`/`update`/`render`). Specifically, I don't fully understand why it's necessary to check whether the file exists on every frame (i.e. `update`) and defensively guard against it? Feels like complexity is creeping in from that. While a statechart is not going to solve all of your problems, I do think there are compelling reasons it could be useful, at the very least to reason through the behavior of this UI.
machine Rplot {
  on PLOT -> .querying do resetImage

  initial state idle

  state querying {
    spawn R_query

    from R_query {
      on IMAGE_READY -> drawing do setImageFromFile
    }
  }

  state drawing {
    on RENDER do drawImage
  }
}
This example is of a little language called StateML that I'm currently building out. It enables us to reason about behavior higher up the ladder of abstraction and provide "named holes" to fill in implementations when we pull it into the host language later on.
• The `PLOT` event causes a transition no matter what state the UI is in.
• In the `querying` state we defined that an `R_query` activity is spawned. The lifetime of an activity is bound to the state it is spawned in, and the statechart can react to events sent directly from that activity. The implementation of `R_query` is left open; the only behaviorally important aspects are that it emits an event that tells us when the image file is ready, and has logic to cancel/ignore the query if another `PLOT` happens. If you want to implement that with polling or some other method it's up to you.
• Once the query finishes we enter the `drawing` state and draw the image on each frame.
• Even for this simple example there are cases that are not covered, like if the image fails to generate, or maybe takes too long: what should we do? We can easily specify that additional behavior:
machine Rplot {
  on PLOT -> .querying do resetImage

  initial state idle

  state querying {
    spawn R_query

    from R_query {
      on IMAGE_READY -> drawing do setImageFromFile
      on ERROR -> idle do displayError
    }

    after 30s -> idle do showTimeoutError
  }

  state drawing {
    on RENDER do drawImage
  }
}
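(StateML is an in-progress DSL, so here is one speculative way the machine above could flatten into a plain transition table in a host language, with the global `PLOT` rule expanded into a row per state. Action names are carried over from the statechart; the timeout is modeled as an ordinary event.)

```python
# (state, event) -> (next_state, action)
TRANSITIONS = {
    ("idle", "PLOT"): ("querying", "resetImage"),
    ("querying", "PLOT"): ("querying", "resetImage"),
    ("drawing", "PLOT"): ("querying", "resetImage"),
    ("querying", "IMAGE_READY"): ("drawing", "setImageFromFile"),
    ("querying", "ERROR"): ("idle", "displayError"),
    ("querying", "TIMEOUT_30S"): ("idle", "showTimeoutError"),
    ("drawing", "RENDER"): ("drawing", "drawImage"),
}

def step(state, event):
    # unknown (state, event) pairs are silently ignored, as in most statecharts
    return TRANSITIONS.get((state, event), (state, None))
```

The flattening also shows what the statechart notation is buying you: one `on PLOT` line in StateML became three rows here, and every new state would add another.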
g
Roughly, this is how I would solve the problem, as I understand it. I’ve created the code for the control flow portion of this and have left hooks for the non-control-flow stuff like image rendering functionality. I include the way that I would test this, too. https://github.com/guitarvydas/kartik/blob/dev/kartik-RPlot-black-box.drawio.svg Deep in the bowels of this thing, I draw a State Machine something like https://github.com/guitarvydas/kartik/blob/dev/image-cache.svg code repo: https://github.com/guitarvydas/kartik
Screenshot 2023-08-15 at 10.24.57 AM.png
l
I'm just curious to see examples of how others might organize the state machine in my doc to be easier to maintain.
By drawing lots of pictures (on paper, post-it notes, napkins, and sometimes tldraw). Here's the state machine we drew when we were building 'cursor chat':
A week later, we decided to change the state machine, and I was able to communicate this by adding to the drawing. (spot the extra blue arrow)
We also found a bug, which we were able to communicate by using the drawing:
Doesn't have to be tldraw of course, just any drawing tool, physical or on-screen. But that's how we usually document state machines, and our changes to them :)
k
I got some very nice suggestions, thanks everyone. Keep them coming! Sorry I haven't responded here in a bit. Life has intervened. But I'm not going to forget it.
w
I stumbled on an absolutely golden example (link is cued up)

https://youtu.be/XZuKakxAQ9o?t=848

m
Since we are in game-dev land – and even if limited to animations/interactions – I find the way in which Rive allows users to build state machines to be so explicit my brain doesn't have to hurt thinking about them. This is an intro to it:

https://www.youtube.com/watch?v=r17OBczdK4c

j
If two different "states" can be active at the same time, then it's not a state machine any more, right? It's a petri net, or a nested state machine, or a process diagram, or something else?
l
If you're referring to the diagrams I posted: A state chart! But really, only one child node can be active at once. We just save ourselves drawing lots and lots of arrows in the diagram, but we could draw each one individually if we wanted to. It's just short-hand.
k
One reason I've been slow to respond is I've been reading https://www.state-machine.com/doc/PSiCC.pdf I didn't really understand state charts at all until now. Particularly the hierarchical aspect. The first few pages of this book have already helped me a lot.
j
The charts seem straightforward enough. And nested state machines are a thing. But Rive, for example, uses the word "state" to refer to the "currently running animation". If you want two animations to start at the same time, or overlap, all of which Rive allows, then they aren't "states" really, are they? I find there is a correlation between how understandable a state machine diagram is, and how "not really a state machine diagram" it is. 🙂
There might be a state machine you could find, at some layer of abstraction. But if your "states" have things like "duration", then they aren't really states, right? Or am I out to lunch.
I ran into this problem just yesterday, with the standard method of conceptualizing and visualizing logic programming traces, which is a "state machine" in which each "state" is a predicate with two transitions in and two transitions out. Byrd boxes, they're called. The conceptualization and visualization is maybe helpful, but describing it as a state machine is violence against human thought.
k
if your "states" have things like "duration", then they aren't really states, right? Or am I out to lunch.
I tend to think of this as state machines with more or less accidental complexity. But maybe there's some canonical definition of a state machine that I'm missing..
l
Ah yeah, I mean "state" is a really overloaded word in programming :) At tldraw we have a State class and library, which is an entirely different thing from our State library 😅 In the example you mention, it could be referring to the state of some data, simply. Or it could mean that there are multiple & separate state machines going on at the same time (which I often do). Another misunderstanding I often see is people thinking that all state machines need to work the same way. At the end of the day they need to solve a problem that you have. tldraw uses a state chart because it allows us to keep the editor's tools separate (which are just state machines of their own).
k
Oh, so tldraw supports state charts? Or is implemented using state charts? I wasn't aware they were available in production somewhere.
l
Its tools are implemented as part of a big state chart. It's structured something like this:
There's more info on the canary docs site, but it's undergoing a lot of changes at the moment (so there are a lot of broken links): https://canary.tldraw.dev/docs/tools
j
My understanding of a canonical state machine is that nodes represent all the interesting and possible point-in-time conditions of the system, for whatever part of the system you are modeling, and the edges represent the possible transitions between them. It's the temporal mutual exclusivity of the nodes that Rive seems to violate. If you're interested in formally modelling something including the passage of time, you need something more like LTL.
Arguably, "each state is an animation" is marketing speak, and you can justify calling it a state diagram by saying "each state is the animation or animations that are currently running and what will happen next, and when and why." So the node and the details on the outgoing edge would be the actual "state." But that wouldn't help anyone understand what will happen. So the more accurate it is as a canonical state machine, the less useful it is as a visualization. QED. 😉
(Ignore me, I'm cranky.)
Jason's advice in a nutshell: draw helpful pictures, and don't worry about whether or not what you drew is a state machine.
l
"Sorry to interrupt. I've just realised that this isn't an example of a state machine. It's actually a..."
i
Seems like my first comment actually captured where @Jason Morris landed. Good stuff. Glad I just hung out on the sidelines and let y'all come around on your own 😉
k
Yes @Ivan Reese, you are out of the penalty box and now permitted to speak. I'm not sure I follow who is responding to whom 😄 but I tend to agree with you @Jason Morris that multiple active states don't a state machine make. If the set of states is partitioned you could say you have two state machines running independently. If there's overlap the state machine metaphor is a net liability for understanding. On the other hand, I think attaching attributes like a duration is fine? And an animation by itself is fine; clock ticks feel like fine events to base state transitions on. Going back a bit:
The charts seem straightforward enough. And nested state machines are a thing.
I've been seeing these state chart pictures all my life, and I've even tried reading the papers a few times. But I totally miss (or quickly forget) one key property. In a state chart enclosure is backstopping. If a state encounters an event it can't handle, it passes it up to the enclosing state. It seems obvious now that I write it out like that. And it's isomorphic with my previous understanding: that hierarchies are a way to extract common transitions and simplify the state diagram. But it still felt surprising. I think I have some theoretical knowledge but zero "fingertip feel" for state charts based on personal experience. Which makes all my opinions on it suspect..
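(The backstopping property is small enough to sketch. A toy Python version, with invented state and event names: each state handles what it knows and passes everything else up to its enclosing state.)

```python
class HState:
    """A state that may be nested inside an enclosing (parent) state."""

    def __init__(self, name, handlers, parent=None):
        self.name = name
        self.handlers = handlers   # event -> next-state name
        self.parent = parent

    def handle(self, event):
        if event in self.handlers:
            return self.handlers[event]
        if self.parent is not None:
            return self.parent.handle(event)   # backstop: bubble up
        return None  # unhandled everywhere: stay in the current state

# the enclosing state owns the "common" transition once,
# instead of every child repeating it
root = HState("root", {"RESET": "idle"})
drawing = HState("drawing", {"RENDER": "drawing"}, parent=root)
```

This is exactly the "extract common transitions" view: `RESET` is written once on `root` rather than drawn as an arrow out of every child.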
Going back a bit, @Konrad Hinsen I want to probe at your position in support of events. It seems to be the opposite of @Jack Rusher's suggestion of using concurrency, which I'm more sympathetic to, and which has a long line of recommendations, from Unix pipes and Avi Bryant's use of closures in Seaside, to JavaScript callbacks and async/await syntax, Erlang, Golang, etc. Simon Tatham made a compelling case long ago that code tends to be simpler if you can write it as the caller, in a top-down, outside-in manner, rather than as the callee. Events seem to go against that recommendation. So I'm curious to hear more about why/when you find events simpler. With hindsight, I'm inclined to blame some of the complexity of my original document on LÖVE's event-driven model. Yes, events can sometimes obviate the need for reasoning in terms of a state machine, when the problem goes with the grain of the events available to you. But the set of events you have available to you is not typically in your control.
j
Note that “several state machines operating in parallel that can send each other messages that cause state changes” is both how much of biology works and Kay’s original vision for OO.
k
@Jack Rusher I really liked this talk by Rich Feldman which asks, "how is it helpful to turn every program into a distributed program?"

https://www.youtube.com/watch?v=6YbK8o9rZfI

j
Agree that putting an actual network in the loop is seldom a good thing for most purposes, though essential for certain kinds of reliability (e.g. wide-area disaster recovery).
k
@Kartik Agaram I see events more as a way to think about and visualize state and state changes. Event handling in the form of frameworks and callbacks is something I have used so little that I don't really have an opinion about it. For implementing an event-based model I'd use messages, if necessary with concurrency.
g
IMO, Events are not bad. Unstructured use of Events is bad. “Complexity” is not caused by Events, but, by the unstructured use of Events. The Holy Grail of Programming Language design is the encouragement/enforcement of the use of Nesting. It is folly to treat, both, control flow composition and data composition in the same way. Pub/Sub, CPS, FP, UNIX pipelines, etc. are unstructured. ... more ... https://publish.obsidian.md/programmingsimplicity/2023-08-23-Structuring+of+Events
k
Thanks again, I really appreciate everyone's comments. Here's an extremely abstracted set of notes from this thread that I'm hoping will help me not forget lessons learned: http://akkartik.name/post/20230826-state-machines. It may not be obvious (and I'd love to answer questions if anything isn't clear) but everything in there came from all of you in this thread (and @guitarvydas's Discord, where I asked the same question).
g
• I find that just drawing the diagrams, even using cave-man technologies, makes code "fall off the bone" (i.e. easy-to-write). Is that your conclusion, too?
• Every time I feel unsatisfied about code that I'm writing, I scribble a diagram of my intentions.
• The {cellpond, coding, drawing} diagram immediately makes me think of Harel's Statecharts. Are you aware of StateCharts? If not, here is my reading of Harel's original paper https://guitarvydas.github.io/2020/12/09/StateCharts.html
• Thoughts such as "It looks quite complicated..." can be tamed in the same way that good PowerPoint presentations are tamed. What I call The Rule of 7. No more than a few items on any one slide. Elide, defer, hashtag in a structured manner. In Programming, I conclude that The Rule of 7 translates into Nesting (goal: Russian Dolls for code (aka Matryoshka dolls)). If I squint in just the right way, I see Lambdas as being Nesting applied to textual code, and, Structured Programming is Nesting applied to textual code, and, "{...}" is the ASCII Art form of Nesting applied to textual code. Global variables and free variables are Anti-Nesting. https://en.wikipedia.org/wiki/Matryoshka_doll
l
- I like "fall off the bone" as an analogy (although I'd love a vegetarian version 😄). I do find that diagrams like this help with it a lot. Not only state machines - any diagram helps. But usually, instead of drawing first, and coding after, I prefer to do them roughly at the same time, bit by bit. Like breadcrumbs through a maze! Or afterwards, so I can communicate how it works if I'm working with a team.
- I used to be a teacher. We tried to encourage kids: "If you don't understand a maths problem, draw a picture". I think it's been drilled into me as well.
- Yes I'm aware of state charts :) We use it in tldraw. It helps to organise different 'tools' a bit better. Thanks for the link though, I'll check it out! Cellpond would have been much easier to make if I had modelled it as a state chart. I ended up rewriting a lot of code, and it became hard to make changes to it. But for my own personal projects, I've never landed on an abstraction for state charts that feels nice. I like my state machines to be quite loose and easy to mess around with. State charts sometimes feel more complex and ingrained. And I've never seen an implementation that really makes the most of the nesting. I'm sure there's a way of achieving it though. Not enough time!
- The Rule of 7 seems to apply here :)
j
I typically use pencil and paper as a “tool for thought” during the design of… almost anything. 🙂
g
I disagree with Feldman's conclusion in https://www.youtube.com/watch?v=6YbK8o9rZfI. At about 36:00, he repeats a tired fallacy and uses it as a basis for his conclusion. Contrary to what is commonly believed, Smalltalk Message-Passing is not the same as Message-Passing in distributed systems. There is a huge difference between synchronous Message-Passing and asynchronous Message-Passing. The over-use of synchronous Message-Passing begs for Accidental Complexity (aka ad-hoc workarounds, aka make-work, aka epicycles). Smalltalk's synchronous Message-Passing is just function-calling using a case-on-type dispatch mechanism with named parameters. I believe that the phrase "Message Passing" should only be used for describing asynchronous message passing.
k
@guitarvydas Is your point that Smalltalk is more complex than distributed computing because it's sync? Or that it's simpler?
k
I suspect that the confusion comes from (1) Smalltalk and its jargon being older than most of today's widely known technology and (2) most people who talk about Smalltalk having only a superficial knowledge of it. But fundamentally, the common language notion of "sending a message" does not imply "and wait for a reply", so Smalltalk's use of the term is not helpful.
g
Sync programming is complicated. Async programming is complicated. But, they are not the same. I argue that adding the meme of little networks to your toolbelt can result in better solutions, and, can, at least sometimes, be more efficient. Using the wrong meme to solve a problem causes complication (Accidental complexity, workarounds, epicycles). Choose your weapons, don't have them forced upon you. Don't approach every problem using only a single meme. Sync is good for building fancier calculators, e.g. ballistics calculations for cruise missiles, cryptography, chatbots that hallucinate and create B.S., etc. Async is good for GUIs, internet, robotics, blockchain, p2p, etc. I claim that async is a superset of sync. Async can be thought of as a bunch of sync nodes sending async messages to one another. Feldman claims that FoC will come about through the vehement avoidance of the little network meme. He bases his conclusion on the faulty reasoning that the sync meme is the only way to create little networks. I claim exactly the opposite. FoC must evolve towards little networks - networking in the small. Maybe closures+queues plus a way to compose solutions using multiple memes. Much more here: https://publish.obsidian.md/programmingsimplicity/2023-08-28-Sync+Programming+vs+Distributed+Programming
k
@guitarvydas I don't think you have enough evidence to make your claims quite so strong. Is it really fine for a little network to squirrel away little bits of state all over the place? Is it really just the fault of all those dang synchronous returns? You need a lot more evidence for both sides of that claim. Why is it only synchronous returns that couples all parts of a system together? Why does async magically make them decoupled? Why does state bear zero responsibility for coupling?
Async-based senders don't wait for a result by default, and sometimes - but only sometimes - need to bother with sync'ing later.
This seems completely at odds with my experience. The number of times in my life that I've made an async call truly fire-and-forget is at most 2. With a generous margin for forgetfulness. Invariably I want to check for errors. At best there is syntax to hide the error checking -- but it does that for sync calls as well. So I fail to see what the fuss is about. So far, sync vs async seems like an implementation detail. I don't think calling things epicycles or tired or fallacious or vehement is very persuasive by itself. At best you can claim, "look, here's an alternative prognosis for these symptoms that I'm working to falsify." And tell a story about how that might be the case. So far I haven't seen either enough evidence or a compelling alternative narrative.
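(The "invariably I want to check for errors" point in miniature, as a hedged Python sketch with invented function names: the async caller still awaits the result and handles the failure, so it reads almost exactly like the sync version.)

```python
import asyncio

async def flaky_query(ok):
    # stands in for a hypothetical plot/query operation that can fail
    if not ok:
        raise RuntimeError("query failed")
    return "plot.png"

async def caller(ok):
    try:
        # "async" in name, but we still wait and still check for errors,
        # just like a synchronous call would
        return await flaky_query(ok)
    except RuntimeError:
        return "error-image"
```

True fire-and-forget would be `asyncio.create_task(flaky_query(ok))` with the task never awaited, which is exactly the rare case described above.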
m
Seems relevant here https://stately.ai/
j
Probably a fine tool, but also rolling my eyes at the “.ai” domain for a thing that seemingly has no AI component… 🙄
l
Hey DavidKPiano (mr stately) is a lovely guy, he bought me a drink, he gets a pass :)
k
An interesting reference I came across on Mastodon today: https://scholar.social/@surabax@mastodon.ie/111020010181091959
HTTP was design[ed] as a distributed realization of the Objective C (originally Smalltalk) message passing infrastructure [...]
Uniform Resource Locator is just the result of squeezing the term object reference through the IETF standardization process.
j
“Concurrent programming in this style is interesting for reasons not of efficiency but of clarity.” (A nice writeup on what I mentioned above) https://swtch.com/~rsc/thread/
m
tldraw team was reading this thread? 😄 https://twitter.com/tldraw/status/1725083976392437894
i
I believe this was @Lu Wilson (who works at tldraw and AFAIK is their 🧑‍🌾 of tweets) But (getting ahead of the inevitable tangent) if we're gonna start talking "make real" we should spin off a new thread.
m
Oh ahah!
l
hey. that one was Steve Ruiz's tweet! it's around 50:50 me and him on the account :) feel free to make a new top-level post if you have any other questions/comments