# thinking-together
g
this might be too much for a text-based medium, but can anyone explain to me why macros (eg in a lisp) can’t just be implemented as functions that read data (a parsed file) and return data (another parsed file, transformed) and pass it to the interpreter by overriding its import references/module dependencies to replace the file on disk with the file post-transformation? like isn’t it just middleware for the interpretation process? checking my mental models here
💯 1
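For concreteness, here is a minimal sketch of the "middleware" idea in the question above, in Racket: a plain function takes a parsed program (a list), rewrites it, and hands the result to the interpreter via `eval` instead of swapping files on disk. The `swap!` form and the helper names are made up for illustration.

```racket
#lang racket
;; A function from data (a parsed program) to data (a transformed program).
;; The only "macro" it knows is a made-up (swap! a b) form.
(define (expand form)
  (match form
    [(list 'swap! a b)
     `(let ([tmp ,a])        ; note: unhygienic -- a real system must avoid capturing tmp
        (set! ,a ,b)
        (set! ,b tmp))]
    [(list parts ...) (map expand parts)]
    [_ form]))

;; "Middleware" for interpretation: expand first, then hand the result to eval,
;; instead of rewriting any file on disk.
(define (run program)
  (eval (expand program) (make-base-namespace)))

(run '(let ([x 1] [y 2])
        (swap! x y)
        (list x y)))
;; => '(2 1)
```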
k
Certainly. Macros are just glorified search and replace. Whether the process happens entirely in memory or via a stage on disk doesn't seem important. What's important is that the search and replace is specified -- with all its generality -- in the same language. There are also a couple of mechanical details to get right:
* Macros are defined in the same files that define functions. The ordering constraints need to make sense.
* The expansion of macros needs to itself be expanded.
g
do you think it’s necessary to define them in the same files? i have this crackpot intuition that you can kind of invert the ‘infinite towers of interpreters’ idea from racket (cf.

https://youtu.be/Ywy_eSzCLi8

) or like invert the idea of fexprs from kernel and just have data describe the transformations you’d like to make to other files containing data, provided that you get the execution order right (maybe this is too abstract)
s
Yeah I've come to believe that a macro is just a function that is applied to the AST, and applied early (at compile time). Other features are just convenience. E.g. they could have a special syntax for AST pattern matching, and being built-in means they're applied at the right time. In fact Nim macros are even described as 'functions' (https://nim-lang.github.io/Nim/manual.html#macros). In systems without an explicit 'program compilation' phase, the macro idea doesn't exist.
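Racket makes this view literal: `define-syntax` binds a compile-time function from one syntax object to another, and the expander calls it at the right time. A small sketch (the `swap!` example is a standard illustration, not anything from Nim):

```racket
#lang racket
;; A macro definition here literally binds a function from syntax to syntax;
;; the expander applies it for us during compilation.
(define-syntax (swap! stx)
  (syntax-case stx ()
    [(_ a b)
     #'(let ([tmp a])   ; hygienic: this tmp cannot collide with a tmp at the use site
         (set! a b)
         (set! b tmp))]))

(define x 1)
(define y 2)
(swap! x y)
(list x y)   ; => '(2 1)
```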
d
Exactly! Honestly there's not even a real need for "the compiler" to be anything more than "just a function", either. So your program would be functions applied to code, and then fed into a compile function. Software makes software, and really it's all just functions and data all the way down. So the boundaries between macro & function, code and compiler, etc., are entirely artificial
👍 2
k
You can certainly define macros in a separate place. But you give up some power by not being able to call regular functions when defining macros. Oh, I am also reminded that f-exprs are an extra level of power and expressiveness, letting you pass macros like functions to higher-order functions.
d
So like, mapping a macro over all the elements in a list?
k
Yup! My lisp used to do this, so kinda shocking that I forgot.
It's a nice abstraction because you're never saying "convert this code and then call it".
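A rough sketch of what that looks like when the "macro" is just an ordinary function over unevaluated code (the fexpr/operative view), so it can be handed to `map` like anything else. Racket here, with made-up names; real fexpr systems such as Kernel go much further:

```racket
#lang racket
;; An "operative" here is just an ordinary function that receives its argument
;; unevaluated (as data) plus an environment, so it can be passed to map like
;; any other function. Names (report-op, ns) are invented for this sketch.
(define ns (make-base-namespace))

(define (report-op expr env)
  (list 'expr expr 'value (eval expr env)))

(map (lambda (e) (report-op e ns))
     '((+ 1 2) (* 3 4) (string-append "a" "b")))
;; => roughly '((expr (+ 1 2) value 3)
;;              (expr (* 3 4) value 12)
;;              (expr (string-append "a" "b") value "ab"))
```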
d
A bit of a tangent: if the interpreter is tiny, couldn't the "compiler" just create a program that contains the compiled interpreter, embeds the rest of the program as actual LISP lists, and then feeds some initial code into the interpreter?
k
That is in fact present in Lisp compilers like SBCL.
The final binary includes the entire compiler.
d
Ha :) I'll bet if one could write such a compiler in LISP, then one could write a LISP program that creates code from code, and feeds the result into a "compile" function, and then there'd be no more distinction whatsoever between macros and functions
k
Except speed/performance..
d
... of compilation? Maybe that's a fair trade for power ... (and I don't imagine that it would be much faster to edit live lists from a running program, because LISP is simple to parse)
k
Sure. But if you don't care about performance you could just stick with an interpreter.
Bundling the entire compiler can certainly be worse than a raw interpreter for certain programs.
d
... unless the point is to replace textual "source code" (and macros) with a living breathing thing that is its own interface for editing itself and ultimately generates (via a compile function) some end product :) In this case, instead of macros, it could provide ... anything else, perhaps something you wouldn't recognize as "code", and programmatically generate code from it
k
If it's its own interface, why does it need a compile function? Doesn't that disqualify it from living and breathing?
I don't really follow your statements in the last couple of days. I had questions on the #CC2JRGVLK thread, but wanted to give the guest of honor some space first..
d
Tying it back to the original topic, my point is that even the compiler & language is non-essential middleware, because it's all just code generating code from other code, so why not just bring it all into a single program where you have full control? In that light, a macro really is just a function. The only distinction is whether a function is embedded in the "output" program, or used to generate it. Any function can be both (e.g. mapping over code, but the same map function is also used post-compile)
k
What constitutes "single program" for you? A single process with a unified address space? Something else? Why not go the other direction like Smalltalk and have lots of tiny processes/objects/tiles cooperating?
d
I mean "single program" just like it sounds. Like a SmallTalk environment made just for a single product or product suite, all embedded in a single executable program or script. Everything in it exists solely to support one specific thing being made: The code for the thing, and for compiling it, and for editing & visualizing it, etc. Again, "thing" could be a set of related things, like a suite of products, or different layers of a tiered application (similar to how Dark erases those boundaries) The general idea is that there'd be something minimal that you clone per use, rather than something like an "operating system".
m
in ClojureScript, macros are commonly defined in a different file, self-hosted ClojureScript being the exception, see https://clojurescript.org/about/differences#_macros
e
Macros are used extensively in Assembler, and one of the key tools is seeing a printout of the expanded code, because complex macros can do unpredictable things. This is the reason why macros, which were present in great abundance 50 years ago, have gradually withered. There is little interest now in seeing printouts of your code, and people work more directly with their source code, so macros are inconvenient because they force a user to expand the macro in their head. This is really a super powerful feature that has fallen into disuse today, because we don't print code out like in the olden days.
k
Probably the most principled approach to macros is the one taken by Racket, which introduces the concept of "phase levels" (https://docs.racket-lang.org/guide/phases.html) which is a generalization of the traditional distinction between "compile time" and "run time". You have a multi-step computation in which each step takes the output of the preceding one as one of its inputs. This point of view also covers related approaches such as code generation.
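A tiny illustration of that phase separation in Racket (`static-square` is a made-up example, not part of any library): the helper function exists only at compile time, and its result is baked into the expanded program before anything runs.

```racket
#lang racket
;; The helper below exists only at phase 1 (compile time); the macro uses it
;; so the computation is finished before the program ever runs.
(require (for-syntax racket/base))

(begin-for-syntax
  (define (square n) (* n n)))          ; an ordinary function, but at phase 1

(define-syntax (static-square stx)
  (syntax-case stx ()
    [(_ n)
     (number? (syntax->datum #'n))      ; accept only literal numbers
     (datum->syntax stx (square (syntax->datum #'n)))]))

(static-square 12)   ; expands to the literal 144 before run time
```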
s
This classic distinction between compile and run time made me wonder if there are other "times", and I like the idea of generalizing it. If you think about it a little differently than what Racket generalizes it for, say for a program lifecycle:
1. You design it (think, write code), where the development environment can support you,
2. Then it gets compiled, where a compiler supports you with error detection and optimization (this can be further divided into smaller, more specific steps, which is what Racket does and what you were discussing above in terms of macros),
3. Then there's configuration time, when you start a process with parameters that are not hard-coded into the binary but also guaranteed not to change over the process' lifetime,
4. And then there's runtime, which of course could also consist of several distinct phases if you designed it that way.
That reminds me a lot of currying in a functional language, where each curried parameter is the current lifecycle stage's "state" in a chain from hard-coded design-time invariants until you get to the runtime parameter for user input in that moment. Ha, game programmers could probably think of each frame as a stage within the broader runtime phase. Or if you don't like the functional perspective, think of a dynamic program that goes deeper into a stack of scopes, where each scope holds the environment for that particular stage and can access all the environments up the chain. Does that sound familiar to anyone? Do you know any papers or talks that think of it in this way?
❤️ 2
k
@Konrad Hinsen I was aware of Racket's notion of phases but hadn't connected it up with code generation. Can you elaborate on that aspect, or give some examples of projects that have used phase separation for code generation? Macros are to code generation as interpreters are to compilers? That feels like a fertile analogy, though I can't seem to do much with it at 4am.
https://futureofcoding.slack.com/archives/CLYCGTCPL/p1572002424061800 feels extremely relevant to this (great!) thread in particular. Perhaps it's romanticists and classicists all the way down. Perhaps my preference for keeping it all in one phase is a lack of architecture that's going to bite me over time.
k
@Kartik Agaram Macros are in-memory code generators. They generate Lisp code from DSLs written in S-expression syntax. Compilers are code generators as well, of course, as are preprocessors and other variants on the same idea. And many compilers have multiple internal phases. So every program is processed by a multi-stage pipeline in which each step is a code generator, and the final output is code for some interpreter (which may be implemented in hardware, called a processor).
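A hedged sketch of that pipeline view in Racket: each stage is just a function from code to code, and the last step hands the result to an interpreter (`eval` standing in for the processor). The stage names are invented for illustration.

```racket
#lang racket
;; Thread a program (as data) through a list of code-generating stages, then
;; hand the final code to an interpreter.
(define (run-pipeline program stages)
  (eval (foldl (lambda (stage code) (stage code)) program stages)
        (make-base-namespace)))

;; Stage 1: expand a made-up (twice e) form into (+ e e).
(define (expand-twice code)
  (match code
    [(list 'twice e)  `(+ ,(expand-twice e) ,(expand-twice e))]
    [(list parts ...) (map expand-twice parts)]
    [_ code]))

;; Stage 2: fold additions of literal numbers ahead of time.
(define (fold-adds code)
  (match code
    [(list '+ (? number? a) (? number? b)) (+ a b)]
    [(list parts ...) (map fold-adds parts)]
    [_ code]))

(run-pipeline '(twice (+ 1 2)) (list expand-twice fold-adds))  ; => 6
```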
@Stefan I see a big difference between the first step, done by humans, and all the others, which are automated.
s
@Konrad Hinsen I can see where you’re coming from and I agree that if you come from a perspective of automation, the design step feels different. I was looking at it more from a perspective of augmentation and interactivity. All steps offer various degrees of tinkering, even a tool chain has different settings, so the programmer can be involved in every step. Especially the first step, design, benefits from a toolset which augments that experience. And that is where IDEs could shine much more than they do today. Thinking in the direction of “moldable tools”. On further reflection, that is where this lifecycle becomes an actual cycle I guess…
Well, I do realize that I went quite off-topic, sorry. Still happy to hear about any pointers to related material. I can’t possibly be the first person with that view.
d
Curv has 4 phases, rather than 2. There is the conventional compile phase (macro expansion happens at this time, but macros are only used internally, they aren't user defined). Then conventional run time. Most Curv programs evaluate to produce a shape value. A shape contains a signed distance function and a colour function. In the third phase, these functions are compiled into GLSL shader code, which is run on the GPU in the fourth phase. During 3rd phase compilation, many compile time constants were actually computed during 2nd phase evaluation. So 2nd phase evaluation could be viewed as a kind of partial evaluation.
Collapsing macro and function call evaluation into a single phase is possible, but there is a huge performance penalty. It means that all arguments are passed to functions as uninterpreted syntax trees. A lexical environment is also passed in every function call. Most functions will just evaluate their arguments using the parent lexical environment, but some functions will do weird stuff. For example, `lambda` is just a function in this system, since macros and functions are the same. `lambda(x, x+1)` is a lambda expression which binds the first argument, an identifier, as a local variable in the environment of the second argument, an expression, and then returns an ordinary function.
Collapsing macros and functions into the same concept also imposes an abstraction penalty: it makes it exponentially more difficult to figure out what a program is doing. You can't assume that `f(4)` and `f(2+2)` are equivalent expressions. Brian Cantwell Smith explored these issues thoroughly in his thesis on "3-Lisp", and it is recommended reading for people reading this thread.
The "reflective tower of interpreters" concept, and the general concept of "reflection", comes from Brian Smith's thesis.
💡 2
d
That's not the paper I read, but it covers the same ground. I remember a more detailed analysis of why it is impossible to understand and debug 3-Lisp programs, this paper only briefly mentions the issue.
I read his 1982 PhD thesis. Definitely a mind-expanding experience. Note it is > 750 pages. https://dspace.mit.edu/handle/1721.1/15961
👍 1
k
@Stefan Your point of view is definitely interesting to me as well. Macros are a rather opaque way to do program transformations, and I have often wondered if a tool-based, human-supervised approach could lead to more understandable code.
💯 1
✔️ 1
g
as long as we’re citing epic papers: https://web.cs.wpi.edu/~jshutt/kernel.html kernel goes into this a lot and the way the red/rebol parser works is very similar
👍 1
i think if we actually lift the notion of evaluation into like user-land (or even IDE-land), plus work like

https://www.youtube.com/watch?v=Ywy_eSzCLi8

we could maintain different levels of interpretation/compilation and move through them during dev, while only shipping a binary during prod
👍 1
🤔 1
s
And by “epic” you clearly mean epic proportions… jeez… another 190 pages.
g
yes that was exactly what i meant lol
guess i wasnt too clear
just realized i shared the same video link twice—sorry everyone. it IS very cool though
k
@Garth Goldwater Lifting evaluation into user land is probably not that useful most of the time, but having it as a debugging option would have helped me a lot in a few cases. Something else that I would like to get out of black-box compilers is optimization. Many times I have tried to out-guess a compiler and write high-level code precisely in the right way to get the optimization I wanted. This would have been much easier at a lower code level. I'd have loved to have an instant compiler that compiles high-level source as it is edited, and displays lower-level versions in parallel for inspection and editing. Not trivial to do, but so useful!
👍 1
g
hmm. i’d argue that you could look at structural editing as lifting evaluation into user land, especially if you can do vim-style macros or an extended version of refactoring tools we see with statically analyzing IDEs
the point there is that you emit a transformed ast (which in the context of this conversation is the same as programming, right?) but you’re doing it like before even compile time
could also imagine doing exactly what you’re describing konrad
d
@Garth Goldwater said: "structural editing as lifting evaluation into user land". @Cyrus Omar’s <hazel.org> is a structural editor, and the LiveLits feature evaluates user code in the IDE in order to visualize graphical values, with the graphics embedded in the source code view. It's a feature that I've also been planning for my Curv project.
g
similar: sketch n sketch lets you edit a picture to edit the code that creates that picture, IIRC. essentially it evaluates the new input from the canvas and then replaces it in the source code
k
@Garth Goldwater That sounds like an interesting point of view: everything is a code transformation, from user interactions in a structural editor to macros, compilers, code generators, etc. The two categories of transformations would then be meaning-preserving (refactoring, optimization) and meaning-changing (new features, bug fixes, etc.).
❤️ 1
☝️ 2
g
thank you for providing a much clearer explanation of what i was trying to say! i’m going to steal that as my elevator pitch for my research
@Konrad Hinsen can i make this my twitter header lol
g
I'm not sure this is relevant but my background in memory-constrained apps with lots of data means that macros have to happen at build time, not runtime. An example macro might get the size of N files, take the max of those sizes, check that that size is under some specified max size, then use the size of the largest file to statically allocate a specifically sized piece of memory. There is no memory free to include any of this logic at runtime, nor could it be done at runtime, since if maxSize(files) > maxSizeAvailable the program will crash. That's something we need to catch at build time and fix the files (take out some data, simplify something, etc.). And even if it could be done at runtime, it would be a bad user experience waiting for the system to get the max size of N files, adding to the wait time for the app to start. So at one level I agree it's code transformation all the way down, but each transformation stage has a purpose.
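A hedged sketch of that build-time sizing idea, transplanted into Racket to stay consistent with the other examples in this thread (the macro name, file names, and 64 KiB limit are all made up): the maximum file size is computed while the macro expands, so an oversized data file fails the build instead of crashing at run time.

```racket
#lang racket
;; The file list and limit below are illustrative only. Sizes are read and
;; checked during expansion, so the error surfaces at build time.
(require (for-syntax racket/base))

(define-syntax (static-buffer-for-files stx)
  (syntax-case stx ()
    [(_ limit file ...)
     (let ([biggest (apply max (map file-size (syntax->datum #'(file ...))))])
       (unless (<= biggest (syntax->datum #'limit))
         (raise-syntax-error 'static-buffer-for-files
                             "largest data file exceeds the memory budget"
                             stx))
       ;; allocate exactly as much as the largest file needs
       (datum->syntax stx (list 'make-bytes biggest)))]))

;; (define buffer (static-buffer-for-files 65536 "a.dat" "b.dat"))
```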
g
i think you’re exactly right: like with partial evaluation, each stage has a different amount of information/invariants available from the host and user
d
I think everyone can agree on the necessity of different phases of processing (compile time, build time, deploy time, burger time, run time, etc.). But (per the original topic), the only real difference is WHEN the processing occurs -- otherwise it's all just code with no fundamental difference. Separating code into different "kinds" of things for each "time" (e.g. functions vs macros) is absolutely NOT intrinsic to that kind of phased processing, although that is certainly one (very convenient) way of doing it. Practicality aside, though, there is no reason the same cannot be done with plain old functions that generate code which is then saved off for later execution.
So for Gregg's example, the need has everything to do with separate phases of execution, and nothing to do with macros per se -- although they may have been the best fit or most convenient way to go about it.
g
right, but i think(?) i’m arguing that by generalizing something similar to the infrastructure required for lazy evaluation, we should be able to draw a dependency graph for what code needs to be evaluated programmatically rather than forcing the programmer to determine the staging order ahead of time in a source file
k
@Dan Cook I completely agree, but I'd also add that convenience matters very much in practice. And there is one fundamental difference between some macro systems (e.g. Scheme-style hygienic macros) and everything else: several phases share the same namespaces for identifiers.
@Garth Goldwater Lazy evaluation is not sufficient to get the equivalent of macros. A lazy function can decide to evaluate an argument or not, but it cannot change the semantics of evaluation. A macro can do anything at all to its arguments. A Lisp macro could implement Fortran with s-expression syntax, for example. You can probably do something similar to JIT for macro expansion, but it needs to be distinct from function evaluation.
s
@Konrad Hinsen - some languages allow passing around 'unevaluated expressions' where the recipient decides even how to interpret the expressions. E.g. lisp fexprs, red/rebol blocks etc. Do you think these are equivalent to macros?
k
@shalabh Yes, those are equivalent to macros, but differ in details.
g
that’s kind of what im describing as a “generalization of lazy evaluation”—the value of the unevaluated code expressions themselves could be created via a function that produces code only when demanded
might be too far in abstract-land, though
d
@Dan Cook: "the only real difference is WHEN the processing occurs -- otherwise it's all just code with no fundamental difference. Separating code into different "kinds" of things for each "time" (e.g. functions vs macros) is absolutely NOT intrinsic to that kind of phased processing" I disagree. The reason that stages exist is that we need different representations of the program at different times. That's why there are different kinds of things in different stages. In early dialects of Lisp, FEXPRs were the original version of macros, and there was no stage restriction. Everything happened in one stage. This meant you couldn't compile Lisp, you could only interpret an abstract syntax tree. In order to compile Lisp into efficient machine code or VM code, the compiler changes the representation of the program to a form that runs faster, but the compiled form doesn't support macro expansion. So it was necessary for Lisp to add a macro expansion stage. When FEXPRs are restricted to running in the macro expansion stage, they are called MACROs. The reason that Curv has so many stages is that a Curv program transitions across 5 different representations, and certain operations on code are only possible on certain program representations: source code -> VM code -> shape value -> GLSL shader program -> GPU machine code.
d
Sure, convenience is definitely good. But just purely to answer the original question: Yes, the boundaries between compile time and run time, and functions and macros, etc., are completely 100% artificial. Super practical, successful, proven, convenient, etc.? Absolutely! But fundamental or inevitable? Absolutely not!
Just watched that video about collapsing towers of interpreters -- great stuff! Thanks for sharing!
❤️ 1
d
I feel that I've given a good basic argument, with several examples of where the boundaries between stages (such as compile time and run time) are fundamental and inevitable. You all don't know Curv, but the FEXPR example is pretty well understood.
d
@Doug Moen - I 100% agree, but with a heavy footnote that "such as" is not the same as "which can only possibly be accomplished in terms of". And that's my entire point: that the separation and phases are necessary, but there's nothing fundamental about how or where that separation is defined, or who or what gets to draw those lines. When you do it yourself in your own code, it's just code transforming code. But if it's mandated by a separate program beyond your control, then it's called "a compiler" (and the hard lines are called a "programming language"). That part is not fundamentally necessary.
❤️ 1
As for different representations (e.g. the "3-lisp" complexity), that sounds like a real mess to have to deal with, so that advice is very appreciated! But also, I see no reason that wisdom cannot be applied in any context (e.g. not just a compiler) where code manipulates / generates other code.
I'm actually thinking that Racket is probably the best compromise at the intersection of all this. It enforces separation through layers of "language", but lets you be in control of defining them how you like. Hypothetically, I bet you could even make your own language/syntax for specifying languages ... though I'm honestly not sure how practical that would actually be.
k
Words like “necessary” and “inevitable” refer to tacit requirements. A Turing-complete language doesn’t need any add-on for doing computations, but additional requirements can make macro-like processing phases necessary. I see such features as a form of modularization. Macros can be considered compiler plugins, for example, but also as tools for implementing DSLs.
d
Good point. I guess we're talking about "necessary" in the context of any ahead-of-time processing, or ahead-of-time code transformation/generation