# thinking-together
f
Imagine a finished system that runs (in the sense that there are no runtime bugs) and has lots of threads/routines interacting through all kinds of message queues/channels. I am wondering what kind of visualization tools exist that help me understand where possible bottlenecks are, which parts are active, and where there is congestion. The closest I can think of is VisualVM, but it is still fairly basic: it only gives visual access to things like fine-grained memory usage, performance of subparts, and which threads are active, idle, or parked, and it does not give me much insight into how the different system parts interact with one another. I admit that I don't know exactly what I am looking for and therefore don't know what I should be typing into my favorite search engine. I am sure somebody has built more elaborate tools in that direction, and I would like to hear your thoughts and suggestions on what I should be looking at. As an example, let's take a simple producer/consumer model on a channel, where the producer is producing more than the consumer can take from the channel. One could imagine a node (consumer/producer) and line (channel) visualization where one sees how active the producer/consumer is (via a red to green spectrum); maybe one could also zoom in to see which subpart is active, and how full the channel is (maybe also via a red to green spectrum). That is just an example that came to mind, in which the channel would become dark red over time as it becomes "congested". Keen to hear your thoughts/ideas.
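To make the producer/consumer example a bit more concrete, here is a rough Python sketch of the data such a visualization would need. It is only an illustration under assumptions: the names `fill_color` and `simulate`, the tick-based rates, and the capacity of 10 are all made up for this example, and a real system would sample a live queue instead of a simulation.

```python
from collections import deque


def fill_color(ratio):
    """Map a fill ratio in [0, 1] to an (r, g, b) tuple: green when empty, red when full."""
    ratio = max(0.0, min(1.0, ratio))
    return (int(255 * ratio), int(255 * (1 - ratio)), 0)


def simulate(capacity, produce_per_tick, consume_per_tick, ticks):
    """Step a bounded channel and record its fill ratio once per tick.

    Hypothetical model: rates are whole items per tick, and a producer
    that finds the channel full simply stops for that tick (it "blocks").
    """
    channel = deque()
    history = []
    next_item = 0
    for _ in range(ticks):
        # Producer pushes until its rate is spent or the channel is full.
        for _ in range(produce_per_tick):
            if len(channel) >= capacity:
                break
            channel.append(next_item)
            next_item += 1
        # Sample after the producer ran, i.e. at the fullest point of the tick.
        history.append(len(channel) / capacity)
        # Consumer pops until its rate is spent or the channel is empty.
        for _ in range(consume_per_tick):
            if not channel:
                break
            channel.popleft()
    return history


# Producer (3 per tick) outpaces consumer (2 per tick), so the channel fills up.
history = simulate(capacity=10, produce_per_tick=3, consume_per_tick=2, ticks=12)
for tick, ratio in enumerate(history):
    print(tick, f"{ratio:.1f}", fill_color(ratio))
```

The color of the "line" drifts from mostly green to full red as the fill ratio climbs and then stays red once the producer is throttled by the full channel.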
m

https://www.youtube.com/watch?v=KVbTjlZ0sfE

https://www.youtube.com/watch?v=MYHf_BXWuOc

f
Do I understand correctly that you need to instrument your code for this to work? It's not something that is sort of built into the lang/env?
m
it's just the visualization part
t
Thinking of a generic visualization can be interesting, but I would suggest starting from your context and the hypotheses you already have. Chances are that you rely on one or more frameworks, some of which might be custom made. Maybe you even have naming conventions for different processes or queues. To get value from the analysis, you will want to start from those and work your way up. A challenge will be the scale. However, before you tackle the scale, I suggest tackling a small scope. For example, focus on a single message. Can you trace that through the system? If not, you might want to change the system to ensure you can, perhaps through logging. Once you have that information, you will find that it’s much easier to aggregate it across multiple queues. You might also find that it’s cheaper to create a tool that works for you than to adapt to a generic tool that might not fit, regardless of how fancy the look and feel is. A tool is most effective when it allows you to (in)validate your hypothesis.
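A minimal sketch of the "trace a single message, then aggregate" idea, in Python. Everything here is hypothetical and only meant as an illustration: the `trace_id` field, the `enqueued:<queue>` / `dequeued:<queue>` stage names, the `orders` queue, and the hard-coded millisecond timestamps (a real system would take them from a clock and write them to its log).

```python
import itertools
from collections import defaultdict

_ids = itertools.count(1)
trace_log = []  # one (trace_id, stage, timestamp_ms) entry per hop


def new_message(payload):
    """Attach a unique trace id so the message can be followed through the system."""
    return {"trace_id": next(_ids), "payload": payload}


def record(msg, stage, now_ms):
    """Log that a message reached a stage at the given time."""
    trace_log.append((msg["trace_id"], stage, now_ms))


def trace_of(trace_id):
    """Return the path of a single message as (stage, timestamp_ms) pairs."""
    return [(stage, ts) for tid, stage, ts in trace_log if tid == trace_id]


def queue_waits():
    """Aggregate across messages: average time spent waiting in each queue."""
    enqueued_at = {}
    waits = defaultdict(list)
    for tid, stage, ts in trace_log:
        kind, _, queue = stage.partition(":")
        if kind == "enqueued":
            enqueued_at[(tid, queue)] = ts
        elif kind == "dequeued" and (tid, queue) in enqueued_at:
            waits[queue].append(ts - enqueued_at.pop((tid, queue)))
    return {queue: sum(w) / len(w) for queue, w in waits.items()}


first = new_message("hello")
second = new_message("world")
record(first, "produced", 0)
record(first, "enqueued:orders", 100)
record(second, "enqueued:orders", 200)
record(first, "dequeued:orders", 2500)
record(first, "consumed", 2600)
record(second, "dequeued:orders", 4200)

print(trace_of(first["trace_id"]))
print(queue_waits())
```

Once the per-message trace is available, the per-queue aggregate falls out of the same log, which is the kind of number a congestion view could then color.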
d
It sounds a lot like the Erlang observer. It is not a graphical display but can give you an idea of what information is important
https://zipkin.io could also be interesting to you