# two-minute-week
k
I'm slowly making Mu safer and more testable, but it's now at the point where I'm also starting to have some fun every week. Hopefully y'all agree. https://archive.org/details/akkartik-2min-2020-07-01
Mini rant on prototypes. While a prototype like this would possibly be less work to build on, say, Python, I think building it atop Mu provides some significant advantages. Research prototypes are not entirely throw-away; the goal with them is to accumulate learning even while the codebase churns. Because it relies on so few dependencies, Mu is more robust to bitrot. This makes it easy for someone else to reproduce an experiment, even relatively far in the future. I really hope we can converge on a shared platform something like this for research prototypes. It does still need graphics and sound, though.
👍 2
i
"Oh, you're still here?" Kartik, I am extremely here.
❤️ 1
What's the thinking behind only having hex numbers? That `0x5f5e100` might as well be a Roman numeral, compared to the adjacent `100 ms` comment.
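For anyone checking the arithmetic, here is a quick Python sketch (not Mu code) that decodes the constant. The nanosecond unit is an assumption made for illustration; it is what makes the value line up with the `100 ms` comment.

```python
NS_PER_MS = 1_000_000  # assumption: the constant is a duration in nanoseconds


def hex_ns_to_ms(token: str) -> int:
    """Parse a hex literal and convert it from nanoseconds to whole milliseconds."""
    return int(token, 16) // NS_PER_MS


print(int("0x5f5e100", 16))       # 100000000
print(hex_ns_to_ms("0x5f5e100"))  # 100
```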
g
yeah, i'm never going to stop following this project
❤️ 1
k
Hex started out as just a temporary thing, but changing it now creates cascading consequences.
* It's weird to support decimal in Mu but not SubX. `10` may mean different numbers in one file vs another.
* Supporting it in SubX requires writing more SubX without syntax sugar, including improving error messages in some situations.
* Supporting it in SubX requires thinking about all the 2-digit opcodes in SubX that I've been writing in hex because that's how the Intel manual represents them.
But yeah, it is a gotcha that `10` means 16. There's a reason why I always prefix numbers with `0x` unless they're a single digit. I might revisit this. A PR would certainly put me over the edge..
👍 1
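To make the `10`-means-16 gotcha concrete, here is a small Python sketch. It is not Mu or SubX code; Python's `int()` merely stands in for a hex-only number reader, and `read_hex` is a name made up for this illustration. It also shows why single digits are safe without the `0x` prefix: they denote the same value in both bases.

```python
def read_hex(token: str) -> int:
    """Parse a token the way a hex-only reader would, with or without a 0x prefix."""
    return int(token, 16)


print(read_hex("10"))    # 16: a bare 10 is sixteen to a hex-only reader
print(read_hex("0x10"))  # 16: the prefix changes nothing except what the human sees
print(int("10", 10))     # 10: the value a decimal-minded reader expects

# Single decimal digits mean the same thing in both bases.
for digit in "0123456789":
    assert read_hex(digit) == int(digit, 10)
```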
m
is there a "layer" where you plan to drop specifying the registers?
do you have a plan to support other CPU architectures?
k
The Mu stack minimizes notations, but my plan is to have one more (called 'level 3' in http://akkartik.name/post/mu-2019-1). It's still very much vaporware. The hazy plan is some combination of shell (http://www.oilshell.org/blog/2017/02/05.html), awk (http://www.oilshell.org/blog/2016/11/13.html) and lisp (http://paulgraham.com/bel.html). But it'll stay interpreted so the implementation is easy to understand. To do anything that requires performance I expect people to write small pieces in the existing level 2 (with registers) that can then be glued together using level 3.
I'm not attached to any particular syntax for level 3; I just don't want separate languages for the command line vs scripts. And I don't want the command line to accrete features without a coherent design the way shells historically did.
In the fullness of time I fully expect a MuPy fork, a MuSmalltalk fork, and so on. I'd love to contribute to them if someone else starts them. CPU architectures similarly will need their own fork. From http://akkartik.name/akkartik-convivial-20200607.pdf (section 2.1):
Portability is explicitly a non-goal. Portability guarantees require extra effort to maintain, and can be particularly challenging for newcomers who may not be inclined to ensure their changes work on platforms they don't regularly run. Rather than attempt to fight a losing battle, we retreat from it entirely.
That doesn't mean it's somebody else's problem. I'm very interested in contributing to a RISC-V fork or ARM fork (which has been increasingly on my mind). But it's unlikely to be a priority in 2020 at least.
m
how much do you envision those forks sharing? or once forked they have to be kept in sync "manually"
at some point in the abstractions (I know you don't like that word :P) do you think some upper layer can ignore which of the forks is below?
k
I hope we can come up with tools to reduce/automate the manual burden of merging, particularly if forks have good tests. But I don't plan to mandate some 'standard' that all forks have to follow below some level. I think that gets us to the same situation as today, with creeping complexity. Anybody can do what they want within their fork, and the goal is to make it easy for forks to stay internally consistent and comprehensible. Going between forks is important but secondary.
c
The point of the project is to teach people knowledge from the 'ground up', right? I guess adding abstractions moves further away from that goal? Nice video, as always 🙂
💯 1
r
Next step, a re-implementation of ncurses 😛
I find your choice to stay with hex and the choice to maintain the numerical opcodes instead of mnemonics very interesting. I understand why you made those decisions, but it does seem intuitively "weird" to me. Is it just because it's foreign to me and different from the status quo, or do those choices fundamentally make the language more or less clear in some way?
You already make a distinction between SubX with syntax sugar and SubX without syntax sugar. Is it worth it to add another layer of syntax sugar for translating bases (or translating mnemonics)?
I've been thinking about your mnemonics debate. IIUC, you are worried about ambiguity, but if it's a "separate pass" it could be easily user modifiable, allowing the user to choose names that are less ambiguous to them. You could have "syntax skins".
More layers of syntax sugar arguably go against your principle, "It uses as few notations (languages, syntaxes, intermediate representations) as possible." But it might help with the principle of "It focuses on encoding intention". Or at least encoding understanding. I'm not sure. I very easily fall into the trap of assuming "more productive" is equivalent to "better understanding".
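A rough Python sketch of what such a user-modifiable "syntax skin" pass might look like. Everything here is hypothetical: the names in `SKIN`, the `apply_skin` function, and the flat token format are invented for illustration and are not SubX syntax. Only the three opcode bytes are real x86 encodings.

```python
# Hypothetical skin: user-chosen names mapped to x86 opcode bytes written in hex.
SKIN = {
    "copy-to-rm32": "89",  # MOV r/m32, r32
    "add-to-rm32": "01",   # ADD r/m32, r32
    "return": "c3",        # RET (near)
}


def apply_skin(tokens, skin):
    """Replace every token that names a skin entry; pass everything else through."""
    return [skin.get(token, token) for token in tokens]


# Names and raw opcodes can be mixed freely, since unknown tokens are untouched.
print(apply_skin(["copy-to-rm32", "add-to-rm32", "c3"], SKIN))
# ['89', '01', 'c3']
```

Because the table is plain data, each user could swap in their own names without touching the translator itself. The sketch sidesteps the hard part raised in the thread, which is reporting errors in terms of what the user actually typed.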
k
Funny story on ncurses: I started out with a Lisp prototype of Mu back in 2015 that depended on a wrapper around ncurses. Since then I've gradually moved down the stack to:
* a C program calling ncurses
* a C program calling termbox, which is a simpler replacement for ncurses
* forking termbox to rip out stuff I didn't use, gradually understanding its internals, still sending patches upstream (http://arclanguage.org/item?id=20221)
* dropping my fork of termbox and making syscalls directly. What was interesting here was realizing how few syscalls I need. ncurses actually doesn't do much for many features we associate with it, just prints bytes to stdout.
Bottom line: ncurses is quite old, and a lot of the stuff it does was valuable in a world where everyone had thin terminals and very slow internet connections to their servers. It manages an in-memory copy of the screen just so it can compute the minimum it needs to change and send only those bytes over the wire to the server. But for most of us it now runs locally and there's nothing to send. It's all complexity we pay for but don't need. Another source of complexity is support for a huge database of ancient terminal types, none of which anyone uses anymore.
So check out http://akkartik.github.io/mu/html/103screen.subx.html. It's been quite a journey to boil ncurses down to ~300 static instructions of machine code. And there's exactly one `ioctl()` in there; see if you can spot what it's needed for.
👍 1
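The "in-memory copy of the screen" idea mentioned above can be sketched in a few lines of Python. This is a heavily simplified illustration, not ncurses's actual algorithm: it assumes both screens have the same dimensions, and the function names are made up. The only terminal-specific detail is the standard ANSI cursor-position escape, `ESC [ row ; col H`, with 1-based coordinates.

```python
def diff_screens(old, new):
    """Return (row, col, char) for every cell that differs between two screens."""
    changes = []
    for r, (old_row, new_row) in enumerate(zip(old, new)):
        for c, (old_ch, new_ch) in enumerate(zip(old_row, new_row)):
            if old_ch != new_ch:
                changes.append((r, c, new_ch))
    return changes


def render_changes(changes):
    """Encode each change as a cursor move followed by the new character."""
    return "".join(f"\x1b[{r + 1};{c + 1}H{ch}" for r, c, ch in changes)


old = ["hello", "world"]
new = ["hello", "would"]
changes = diff_screens(old, new)
print(changes)                        # [(1, 2, 'u')]
print(repr(render_changes(changes)))  # '\x1b[2;3Hu'
```

Only seven bytes go out instead of a full redraw, which matters over a slow link and hardly at all when the terminal is local.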
@Ray Imber "Mnemonics debate" sounds like you've been reading more deeply here than most people I talk to. Can you point me at any links to stuff you've read on my site or repo, just so I can respond at the best level for you? Or let me know if you're just going by the paper, and I'll respond.
r
@Kartik Agaram Your journey with ncurses and termbox is fascinating. Thanks for sharing. I've tried both for some toy side projects, including dealing with bindings to other languages. TBH, even at the API level, I came to similar conclusions. They were big dependencies and I ended up having to manage most of my own state anyway.
There was a point where I was looking into creating a terminal emulator, and I ran into the "huge database of terminal types" problem. There is an almost absurd backwards compatibility burden these things impose. It's such a weird artifact of history with seemingly little modern benefit. This is a tangent, but there is a lot of hidden conservative corporate politics that keep that stuff alive. The recent news about Cobol programmers still being needed for maintenance should tell you that these old mainframes and their dumb terminals are still around, and have a surprisingly long shelf life.
Re: `ioctl`
Damn screen size 😛 You could just force a fixed terminal size. It would be like programming a graphing calculator. It adds to the authenticity of the experience /sarcasm. (I'm full of terribly bad jokes. Ignore me and don't actually do this lol.)
Re: the "mnemonics debate"
Your project has really caught my attention! I think it's a very cool idea. I've read two of your blog posts: http://akkartik.name/post/mu-2019-1 http://akkartik.name/post/mu-2019-2 and the github issue here: https://github.com/akkartik/mu/issues/39 I'm about 1/3 of the way through your paper. I'm slowly reading it in my free minutes throughout the day.
😄 1
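For the curious, this is roughly what a screen-size query looks like from Python on Linux or another Unix. It is a sketch of the general technique (the `TIOCGWINSZ` ioctl, which fills in a `struct winsize` of four unsigned shorts), not a transcription of what Mu's SubX code does; the helper names are made up.

```python
import fcntl
import struct
import termios


def unpack_winsize(packed: bytes):
    """Decode struct winsize: ws_row, ws_col, ws_xpixel, ws_ypixel."""
    rows, cols, _xpixel, _ypixel = struct.unpack("HHHH", packed)
    return rows, cols


def terminal_size(fd: int = 1):
    """Return (rows, cols) for the terminal on fd, or None if fd is not a terminal."""
    try:
        packed = fcntl.ioctl(fd, termios.TIOCGWINSZ, b"\0" * 8)
    except OSError:
        return None
    return unpack_winsize(packed)


print(terminal_size())  # e.g. (24, 80) in a terminal, None when output is piped
```

Everything else a text-mode screen needs can be done by writing bytes to stdout, which is why this one call stands out.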
k
I knew it! You've already read almost anything I could say about mnemonics. The only argument left to mention, I think, is that ambiguity complicates error handling, and again this is error handling one must implement fairly close to the metal, in unsafe machine code. https://www.laarc.io/item?id=1945 This, incidentally, is also why I try to minimize syntax sugar: any part of the pipeline that emits error messages can be confusing if the code it's referring to was generated by some syntax sugar it's oblivious to. That said, syntax skins that don't persist in the source code could be interesting. None of this stuff is black and white. SubX does use syntax sugar, as you point out. We just have to take costs into account. I don't yet feel confident that I am estimating costs right in any of my design choices so far. That'll require a newcomer to actually spend time with Mu and tell me how easy it has been for them to learn/retain.
🙂 1
r
Fun tidbit, because it was mentioned in the laarc thread you reference: I first learned assembly with "The Art of Assembly" book and HLA. I have fond memories of it. That probably gives away some bias I have 😄
You have definitely made me question some assumptions I had about assembly language. I intuitively want mnemonics. I think your worry about making the user have to look up extra tables in their mind may not be as big of a problem as you think (especially if you provide good documentation). But you bring up some really good points about the implementation complexity, especially around error messages at this level.
If you do support mnemonics, also supporting a base 10 to base 16 translation seems like a natural next step. The two syntax sugars complement each other. There is probably still value in having a base 10 translation pass without a mnemonics pass, but maybe less so.
"That'll require a newcomer to actually spend time with Mu and tell me how easy it has been for them to learn/retain." lol. I get the hint. Curiosity might get the better of me and convince me to pull the repo and play with Mu. But I make no promises