# present-company
s
What "How to create your own programming language" tutorial/resource would you point somebody to, who is not doing it for the first time, roughly knows what they're doing, and is looking for insights on how to do it well?
w
What kind of programming language are you planning to build?
s
My own recommendation would be https://craftinginterpreters.com, but that's more of a "first time" intro. Although you get to build both an interpreter and a compiler + vm, which I think is more advanced (and more what I'm after here). What I also like is that he explains everything without any dependencies required.
👍 1
@Will Appreciate you asking that — we software people are often quick with offering solutions before really understanding the problem… the thing is, in this case I'm actually looking for a wide variety of different solutions, because I can't describe my problem well. Yet. I'm looking for different perspectives on how language implementors split that problem into different parts and why. Do you write an interpreter or a compiler? Do you hook into something like LLVM? Do you use a virtual machine? Do you explicitly design for multiple stages/passes (tower of interpreters-style)? Maybe "tutorial" is framing it the wrong way and I'm more looking for design documents and field reports; something like https://www.goodreads.com/book/show/1168500.LISP_in_Small_Pieces or https://www.lua.org/doc/jucs05.pdf.
I'm particularly interested in the separation of the "static" and "dynamic" parts and implementations that make conscious design decisions to split them up and have well-defined interfaces between them. So an interpreter-based language would less likely qualify than a compiler+vm approach.
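To make that split concrete, here is a minimal sketch in Python of what I mean, not taken from any of the resources in this thread. The "static" part is a compiler from a tuple-based AST to a flat instruction list, the "dynamic" part is a stack VM, and the instruction list is the only interface between them. The AST shape, the opcode names, and the functions `compile_expr` and `run` are all made up for illustration.

```python
# Hypothetical AST: nested tuples such as ("add", ("num", 2), ("num", 3)).

def compile_expr(node, code=None):
    """Static part: flatten an AST into a list of (opcode, operand) pairs."""
    if code is None:
        code = []
    kind = node[0]
    if kind == "num":
        code.append(("PUSH", node[1]))
    elif kind in ("add", "mul"):
        # Post-order walk: operands first, then the operator.
        compile_expr(node[1], code)
        compile_expr(node[2], code)
        code.append((kind.upper(), None))
    else:
        raise ValueError(f"unknown node kind: {kind!r}")
    return code


def run(code):
    """Dynamic part: a stack machine that only ever sees the instruction list."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {op!r}")
    return stack.pop()


ast = ("add", ("num", 2), ("mul", ("num", 3), ("num", 4)))
bytecode = compile_expr(ast)
result = run(bytecode)
```

The VM never touches the AST and the compiler never evaluates anything, so either side could be swapped out as long as the instruction format stays the same. That interface is the kind of thing I'd like to see design write-ups about.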
w
The best resource I’ve found for the high-level design decisions of programming languages is HOPL proceedings. https://hopl4.sigplan.org/
👍 4
c
You mentioned LLVM - they have a tutorial that walks you through implementing a simple PL ("Kaleidoscope"): https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/index.html I haven't done it yet, but I'm planning to in the next few weeks, so I can report back then.
n
FWIW I’m just planning to translate to Wasm bytecode because it’s far simpler than LLVM, and high quality Wasm compilers and JITs exist (and will be a big focus of the next decade).
So the resources I need are along the lines of “how do I efficiently map my language to the Wasm VM? (Or similar VMs?)”
Because my language is declarative, the challenge is in the serialization of the declared computation to a "sequence of instructions". For the most part I think this is an open problem: there are ideas from existing functional language compilers like "loop fusion" and dataflow analyses but I think many of the compilation approaches will depend on the fine details of the language itself. So I'll probably start by reading the literature on FP compilation and then probably have to invent the rest. (Traditional compiler books spend a lot of time on topics like parsing, lexing, type checking, and register allocation, none of which I need: My language isn't based on text files, it doesn't have static types, and Wasm doesn't have registers.) So that's my current perspective anyway.
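As a rough illustration of the simplest version of that serialization problem, here is a sketch in Python that orders a set of unordered declarations by their data dependencies and emits them as a linear instruction list. It uses the standard-library `graphlib` module (Python 3.9+). The declaration format, the op names, and the functions `linearize` and `execute` are hypothetical, and this is plain topological sorting, not loop fusion or anything Wasm-specific.

```python
from graphlib import TopologicalSorter

# Hypothetical declarative program: order of declaration carries no meaning.
decls = {
    "doubled": ("add", "area", "area"),
    "area": ("mul", "width", "height"),
    "width": ("const", 3),
    "height": ("const", 4),
}


def dependencies(defn):
    """Names a definition reads; constants read nothing."""
    op, *args = defn
    return [] if op == "const" else list(args)


def linearize(decls):
    """Turn unordered declarations into a sequence where inputs come first."""
    graph = {name: dependencies(defn) for name, defn in decls.items()}
    # static_order() yields each node after all of its predecessors and
    # raises graphlib.CycleError if the declarations are cyclic.
    order = TopologicalSorter(graph).static_order()
    return [(name, *decls[name]) for name in order]


def execute(instructions):
    """Run the linear sequence; each instruction writes one named slot."""
    env = {}
    for target, op, *args in instructions:
        if op == "const":
            env[target] = args[0]
        elif op == "add":
            env[target] = env[args[0]] + env[args[1]]
        elif op == "mul":
            env[target] = env[args[0]] * env[args[1]]
        else:
            raise ValueError(f"unknown op: {op!r}")
    return env


instructions = linearize(decls)
env = execute(instructions)
```

The ordering is not unique (here `width` and `height` can come in either order), and choosing well among the valid orders is where I expect the language-specific work to be.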
e
https://www.oreilly.com/library/view/masterminds-of-programming/9780596801670/ is interesting as a high-level view of the decisions taken by the authors of major programming languages
https://www.amazon.com/Language-Implementation-Patterns-Domain-Specific-Programming/dp/193435645X has a nice collection of patterns, the scope is not super wide but I think it is a pretty useful resource anyway to get acquainted with a particular way of solving "language problems"
a
For implementation: Bitwise - a stream series where Per Vognsen implements a software-hardware stack from scratch, including a couple of languages. The languages he's creating are fairly low-level, but he brings up considerations that I think would be relevant for most languages. https://github.com/pervognsen/bitwise/ https://www.youtube.com/playlist?list=PLU94OURih-CiP4WxKSMt3UcwMSDM3aTtX
The Jonathan Blow streams are often good, but past his design streams they're probably a bit too sparse in terms of insight for language development
Hopefully this isn't too flame-war-like 😄, but I'd also want to encourage the notion that "performance is a feature", particularly when you're creating software infrastructure such as a language compiler. (This is more relevant once you have a design/prototype you're happy with and are going to make it available for public consumption.) ESBuild provides an example of the importance of caring about performance in build tooling (pic attached). Speedups of this order of magnitude significantly improve the ways in which you can use the tool.
t
Read SICP to understand how simple a compiler can be and to make your own language with Lisp notation. https://mitpress.mit.edu/sites/default/files/sicp/full-text/book/book.html
d
"Modern Compiler Implementation in Java" (there are also versions written for C, ML, etc.) covers traditional compiler stuff, but delves really deep into specifics, various optimization strategies (e.g. register allocation by graph coloring, loop unrolling, etc.), and modeling different language paradigms at the low level. I enjoyed this book more than others I've read (e.g. the one known as "the Dragon Book"). https://books.google.com/books/about/Modern_Compiler_Implementation_in_Java.html?id=EEyQQgAACAAJ&source=kp_book_description
a
SICP
MIT recently put up the lecture series for this by the book authors: https://www.youtube.com/playlist?list=PLE18841CABEA24090 I'm about to go through it myself
d
I also really enjoyed this one. It builds a language with an imperative (C-family) looking syntax and Lisp semantics, and models execution using continuation-passing style under the hood. You don't see the CPS: it's like a language with async and await syntax that makes callbacks look synchronous, except that it works that way by default with no extra syntax. This allows very interesting semantics that aren't possible without CPS, such as implementing things that break or resume control flow (e.g. returns, exception throwing, continuations) as plain old expressions that aren't baked into the compiler. http://lisperator.net/pltut/
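To show the idea rather than the tutorial's actual code (which is JavaScript and structured differently), here is a small sketch in Python of a CPS evaluator. The tuple-based expression format and the names `evaluate`, `run`, and `callcc` are my own assumptions. The point is that an early "return" is just a call to an ordinary function value that happens to hold a continuation.

```python
def evaluate(expr, env, k):
    """Evaluate expr in CPS: results are passed to continuation k, not returned up."""
    if isinstance(expr, (int, float)):
        return k(expr)
    if isinstance(expr, str):                 # variable reference
        return k(env[expr])
    op = expr[0]
    if op == "lambda":                        # ("lambda", param, body)
        _, param, body = expr

        def closure(arg, k2):
            return evaluate(body, {**env, param: arg}, k2)

        return k(closure)
    if op == "add":                           # ("add", left, right)
        return evaluate(expr[1], env, lambda a:
               evaluate(expr[2], env, lambda b: k(a + b)))
    if op == "call":                          # ("call", function, argument)
        return evaluate(expr[1], env, lambda f:
               evaluate(expr[2], env, lambda arg: f(arg, k)))
    if op == "callcc":                        # ("callcc", function)
        def with_function(f):
            # The captured continuation is an ordinary function value. Calling
            # it drops the caller's continuation and resumes at k instead.
            def escape(value, _abandoned_k):
                return k(value)

            return f(escape, k)

        return evaluate(expr[1], env, with_function)
    raise ValueError(f"unknown expression: {expr!r}")


def run(expr):
    """Evaluate with the identity function as the final continuation."""
    return evaluate(expr, {}, lambda value: value)


# "ret" acts like an early return: the pending (10 + ...) is abandoned.
early_exit = ("add", 1, ("callcc", ("lambda", "ret",
              ("add", 10, ("call", "ret", 5)))))
result = run(early_exit)
```

Here `ret` is not a keyword: it's a variable bound to the continuation, so the evaluator has no special case for returning early. A real implementation also has to deal with stack growth, since a naive CPS interpreter like this one nests Python calls for every step.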
s
Thanks for getting SICP back on my radar. I had read it a long time ago. Turns out just reading the Foreword today tickles a lot more synapses than it did back then…
❤️ 1
Also found this, which is somewhere in between tutorials and what I was asking for, but if anybody interested ever finds this thread again (hello, future me!), probably worth knowing about: https://c9x.me/compile/bib/
b
Maybe LiSP (Lisp in Small Pieces)?
e
Lisp in Small Pieces could use some modernization. Last time I tried to work through it, it was pretty tricky to get a working environment for the book's code, but it is recommended often enough that I'd like to try reading it again 🙂