# thinking-together
n
I've been thinking about an alternative approach to "function calling" (particularly for declarative languages). Instead of having functions, where every invocation must explicitly feed in every argument, what if:
• we define a language that supports only expressions containing free variables, and
• we treat invocation as merely (lazy) syntactic substitution, thus arguments are "retrieved" from the scope where the expression is invoked

For example, I write `let f = x * x` (note the absence of a parameter list) and then later write `let x = 3; let y = f`, which sets y to 9 (sketched more concretely at the end of this post). It's important for the substitution to be lazy, since this retains the semantics of traditional functions, including the ability to make recursive definitions. Given this, I don't believe I'm re-inventing macros, though I welcome enlightenment. You can combine this with the usual nested syntactic scoping rules, so it's definitely not the same as using global variables, or the same as closures (where all the free variables are bound at the definition site). The only practical difference seems to be the UX, and the biggest UX downside is probably readability (which invocations consume which values?), but I'm confident an IDE can make the data flow more explicit.

Advantages of this approach:
• Certain programs can become much more concise. For example, in drawing code, you can set the drawing `color` just once, and it will be automatically consumed by all drawing functions (i.e. expressions-with-free-variables) that are invoked in that scope. This actually obviates the "type classes" feature that functional languages often pursue. (Type classes aim to enable exactly this kind of "implicit argument passing", but achieve it in a more convoluted way.)
• You've removed the concept of a "function call" entirely from the language. You only need substitution (i.e. referencing). Fewer features is better, when all else is equal.

Disadvantages of this approach:
• As mentioned earlier, it may be harder to understand which values are being consumed where. But I'm confident that an IDE can make this clear. (I'm building a language based on ASTs and structured editing anyway, so it's guaranteed to be combined with a sufficiently-smart IDE.)
• You need to worry about "accidental variable capture", where a variable which just happens to be in scope is accidentally used by a function (i.e. expression) invocation. I think this is a real problem in languages where variables are identified by text strings, but FWIW, the language I'm designing is not based on text strings, and so can be designed in such a way that a local variable will never accidentally have the same UUID as a free variable found in a function (i.e. expression). In an AST-based language, every function definition should by default use unique UUIDs for its free variables. Only related functions from the same library should share UUIDs; for example, the `color` variable used in a drawing library should be the same across all drawing functions.

I'd love to hear others' thoughts. Has this been tried before? Are there additional advantages or disadvantages that I've missed? Is it worth a try? Note that I'm thinking specifically about declarative languages here. Imperative languages may add complications (but may not!).
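To make the intended semantics a bit more concrete, here's a tiny interpreter sketch (in Scala, purely for illustration; the real language would identify free variables by UUID rather than by name, and this sketch elides the laziness needed for recursive definitions):

```scala
// A "function" is just an expression tree with free variables; "invocation"
// resolves those free variables against whatever bindings the invoking scope provides.
sealed trait Expr
case class Lit(n: Int)           extends Expr
case class Ref(name: String)     extends Expr // a free variable (a UUID in the real design)
case class Mul(a: Expr, b: Expr) extends Expr

def eval(e: Expr, scope: Map[String, Int]): Int = e match {
  case Lit(n)    => n
  case Ref(name) => scope(name) // "substitution" happens here, at invocation time
  case Mul(a, b) => eval(a, scope) * eval(b, scope)
}

val f = Mul(Ref("x"), Ref("x")) // let f = x * x  (no parameter list)
val y = eval(f, Map("x" -> 3))  // let x = 3; let y = f  ==>  y == 9
```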
🤔 2
k
Is this the same as unhygienic macros? (I love unhygienic macros, but I haven't tried to program with just them.)
n
It shares similarities with macros, but you should still be able to define recursive functions using this approach, and thus substitution needs to be able to occur "lazily". I'm not sure if there are macro systems which can do that; I'm not very familiar with macro systems in general. I added the word "lazy" to the original post.
k
Interesting. One phrase for literature surveys may be "lazy call by name" _Edit_: Apparently call by name as defined in the literature is always lazy: https://en.wikipedia.org/wiki/Evaluation_strategy#Call_by_name. So this may be subtly different in some way I can't brain atm.
i
this is how our english stuff works, since in natural language you often don’t refer to arguments if they’re obvious
the important part is that we show you our explicit interpretation and allow you to correct it in case we picked the wrong thing
👍 1
👏 1
v
There is some similarity with APL trains, hooks and forks, IMHO. Unless I'm not understanding this correctly. There is a bit of a learning curve with APL trains. Note that the analogy to English holds - for most people English has a learning curve of a few years, going from characters to words to sentences.
🤔 1
Wouldn’t this be a problem where `f = x*x` and `f = y*y` are not equivalent?
n
Are you talking about comparing the two expressions under an _equality operator_: `x*x == y*y`? Equality of functions is already undefined in most programming languages.
If you mean that swapping one invocation for the other does not yield the same program, then that's true, but I don't think that will be a problem. The user of the function just has to indicate which variable they want to bind at the invocation site.
And you can always rename the variables by saying `let x = z; let y = z` if you want a set of functions to all use the same variable name, post-hoc.
k
I believe the point might be that constraints on the names of variables could affect composability. If you want to call `f` you have to have a variable called `x`. But now say you have another function that needs `y` for the same variable. Or `x` for a different variable. None of this is insurmountable, but might add noise.
n
I think my above comment + my second dot point under "Disadvantages" addresses those things 🙂 (I just extended the dot point to add more detail)
k
Ok, maybe. Capture typically involves reasoning about a single call in isolation. I'm thinking about situations where each call is fine in isolation but there's impedance in putting them into a single block. So you end up creating extra block boundaries.
Ah, I see your edit. Yeah, that's basically the impedance mismatch I was alluding to. Not a huge deal. Typically when we worry about 'capture' we're concerned about insidious bugs. This is more a case of fixing a new kind of syntax error. I have to deal with some additional constraints akin to these in Mu. It's a cost, but sometimes it's worth paying.
👍 2
w
This concept is pretty similar to dynamically scoped variables. e.g. emacs lisp:
;; Relies on Emacs Lisp's default dynamic binding: x is resolved at call time,
;; so f picks up whatever x the caller has bound.
(defun f () (* x x))
(let ((x 2))
  (cl-assert (= (f) 4)))
See Richard Stallman’s paper about emacs on why dynamic scoping promotes modularity: https://www.gnu.org/software/emacs/emacs-paper.html#SEC17
👀 1
👍 2
Worth noting that while most programming languages use lexically scoped variables, a variety of programming frameworks include dynamically scoped abstractions. Most notably React contexts.
n
Yeah, this will probably "feel similar" to dynamic scoping, though importantly, you can still determine which bindings will be used by an expression using purely syntactic reasoning. I'll read that paper, thanks 🙂 (edit: The arguments Stallman makes for dynamic scope are pretty weak, IMO, and don't overlap with the arguments in my proposal.)
☝️ 1
g
i could see problems with accidentally reusing a variable name and causing unintentional problems.
what about nested for loops
for i in ...:
  for j in ...:
    # eval a function that expects an index to be assigned to i, but you want it to use j
n
I cover that in "Disadvantage" dot point 2. In short: you wouldn't ever be using "i" as a parameter name. You'd be using a unique UUID, and perhaps you'd use "i" as a human-friendly label for it, to be displayed in the IDE. Also, I didn't explicitly mention that I'm thinking about declarative languages (I always make this mistake). It may be the case that imperative languages have specific qualities that introduce additional problems.
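Concretely, a free variable in the AST might be represented something like this (a hypothetical sketch, using Scala just as notation):

```scala
import java.util.UUID

// Hypothetical representation: a variable is a UUID plus a human-friendly label
// that the IDE displays. Identity is the UUID, never the label.
final case class Var(id: UUID, label: String)

val drawingColor = Var(UUID.randomUUID(), "color") // shared by every function in a drawing library
val loopIndex    = Var(UUID.randomUUID(), "i")     // a local "i" can never capture someone else's "i"
```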
💡 1
w
Yeah, I think I missed how your proposal differs from dynamic scoping @Nick Smith.
n
Well, if you put a function definition within another function definition, the inner function should still be able to "use" variables from the outer function, i.e. binding the variable in the outer function might also bind it in the inner function. In other words, you can choose between lexical scoping and dynamic scoping on a per variable basis. Indeed, with every variable usage, you need a means to indicate which parent scope you want the variable to be bindable from; essentially you want every scope to indicate its "parameter list" (free variable list). This puts you somewhere in-between lexical scoping and dynamic scoping. It will require a smart syntax to be comprehensible, of course! It would work similarly to how "quoting" works in Lisp or Julia.
k
To me this proposal sounds a lot like term rewriting, except that typical implementations of term rewriting include more sophisticated pattern matching.
And I like term rewriting exactly for the reasons mentioned.
a
I've considered very similar ideas, down to the example of graphics/color and the idea of library-specific symbols. You might also look into algebraic effects as an underlying formalism for variable lookup. (That should still let you pick which variables are dynamically or lexically scoped, but it's too late for me to be confident.) Also, effects are just really cool in general, which is why these days I lean towards using them to handle this use case, by the way.
🤔 1
n
@Konrad Hinsen Does that perspective yield new insight? I thought all language semantics could be expressed as a term rewriting system.
k
@Nick Smith In the end, everything that's Turing complete is equivalent. But the UX is very different. The particularity of term rewriting, as compared to more popular functional computation frameworks, is that it focuses on terms, which represent data and function applications, over rules, which represent the transformations done by the functions.
n
Yeah sure. I guess term rewriting might be a useful way to phrase the semantics for the purposes of explanation.
b
Hi Nick - have you looked at `implicit`s in Scala? It allows code quite similar to the example `let y = f` if `f` or its parameter is marked as `implicit`. `implicit` can be used at the value and at the type level. At the value level, they are mostly used to reduce boilerplate; for example, passing around a db connection throughout the callstack could be done by marking it as an implicit parameter, and then not having to explicitly pass it on a function invocation. At the type level, it is used to support a number of the advanced Scala language features. I would say it's a very actively used feature of the language. Scala has a new major release coming in which a number of the lessons learned from Scala 2 are being used to refine some areas of the language design, and `implicit`s are getting quite a bit of a rework; so it could be an interesting case study.
💯 1
n
Yup! I’m broadly familiar with Scala implicits (but of course, thanks for suggesting 🙂). I put them in the same bucket as “type classes”, which I mentioned in my original post. Is there some benefit of Scala implicits which you think may not exist in my proposal? (Ignoring the type stuff). Implicits otherwise seem like a much more complicated means to achieve the same thing.
It might be worth investigating the problems Scala users face using implicits, of course. Some of the problems may translate to this proposal.
b
There does seem to be quite a bit of overlap between the two ideas. Building type classes is one of the main ways implicits are used in Scala, and as you say it comes with a fair amount of complexity. But it is also possible to use implicits without creating type classes. As a prototyping idea - you could mark all parameters as implicit - and then have the reduced boilerplate on the function invocation; it would also be a way to explore how often implicit resolution issues start coming up (obv that would be limited to Scala's resolution rules, but it could still be an interesting worked example).
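For example, a minimal sketch of that prototyping idea (a hypothetical drawing API, nothing real):

```scala
// The free variable `color` becomes an implicit parameter, so every drawing call
// in a scope picks up the single binding established there.
case class Color(name: String)

def drawCircle(radius: Double)(implicit color: Color): String =
  s"circle r=$radius in ${color.name}"

def drawSquare(side: Double)(implicit color: Color): String =
  s"square s=$side in ${color.name}"

def scene: Seq[String] = {
  implicit val color: Color = Color("red") // "set the drawing color just once"
  Seq(drawCircle(1.0), drawSquare(2.0))    // both calls consume it implicitly
}
```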
👍 2
n
Good idea 🙂. I’ll have a play around with it!
b
Scala 3 is also adding "implicit functions" - so, as you say, it could be an area to draw "lessons from the trenches" from, and possibly even some extra inspiration as well. https://www.scala-lang.org/blog/2016/12/07/implicit-function-types.html
m
It does indeed sound a little like implicit arguments. As well as Scala you could compare with Agda and Idris.
I've come to think of implicitness as being "term inference" by analogy with type inference.
🤩 1
All the languages I know of which have implicit arguments require both the argument positions and the corresponding definitions to be annotated in some way. This has been bothering me for a while because it seems to confuse inference/elaboration with type signatures, and I've been wondering how a language might look if these annotations were taken out of signatures (everything is just an argument) and either indicated somewhere else, or alternatively the compiler just tries to infer as much as it possibly can given all in-scope definitions ... you could approximate this in Scala by having a bunch of definitions where everything is marked as implicit.
🤔 1
j
I'll add to the others that this gives me dynamic scope vibes, though possibly with a dash of unification if you choose logic-programming semantics for filling in the missing values through inference?
m
Implicits effectively do encode logic programming semantics. In Agda and Idris that's an explicit aim (term inference is basically proof search). Scala is the same but somewhat accidentally.
👍 1
n
Interesting! I need to think more about the possible similarity to logic programming (which is already inspiring me in other ways 🙂)
@Brent @Miles Sabin Ok, so I've investigated implicits in Scala 3, and the related features in other languages (traits, protocols...) including Coq and Agda. All of them have one thing in common: they use the type system to deduce which arguments should be passed implicitly. Most languages do this in a very messy way: they look at all the implicits/traits/protocols in the current module, submodule, imported modules etc., and the compiler picks one of those. This is a static selection strategy.

In my proposal, the implicit is provided on the stack like any other variable. There's no fancy "resolution algorithm" involved, and the implicit value can be constructed dynamically, which is a whole different kettle of fish.

Now, Scala 3 offers something similar: there's a command to specify a "given Int" or a "given Bool" on the stack that should be used as an implicit argument. The problem with this is that you can only declare one implicit value per type (in a given scope), whereas my proposal doesn't care about types at all: it uses unique parameter names instead. Also, Scala's "given" declarations can't shadow (i.e. override) each other, whereas variables obviously can. This allows you to implement the equivalent of "default arguments", and override them on specific invocations.

In conclusion: I think I've proposed a simpler, more general, and more intuitive approach to implicits than all the other manifestations I've seen.
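For reference, the Scala 3 shape I'm comparing against looks roughly like this (a sketch; note that resolution is by type, not by the parameter's identity):

```scala
// The caller puts a `given Int` "on the stack" and f picks it up by type;
// a second Int given in the same scope would make resolution ambiguous.
def f(using x: Int): Int = x * x

def demo: Int =
  given Int = 3
  f // 9
```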
g
nick. i regret to inform you that this is exactly how rebol and infra work
n
If by exactly, you mean dynamic scoping, we discussed that higher up in the thread.
Though I do remember Rebol's approach to scoping is unique, so now I'd better revisit it, you're right.
g
nope! dynamic scoping looks up symbols based on where they’re called from, right? infra and rebol let you do whatever you want with your substitution semantics. could be “look up lexically” or “look up the call stack” or “evaluate to a UUID and consult the editor for what the last-clicked on object was”. so i guess that the part that’s exact is the “absence of a parameter list” part, and the part where you’re performing arbitrary substitutions based on a set of rules from user input. they also work recursively, too
👍 2
for the record: i think it’s an extremely good idea. oh and the relation between uuids and functions in the same package sounds like the path infra would take—your modules have a structure, and you’re reflecting that structure via encoding it in otherwise arbitrary data. idk, smells like it’s almost a binary format to me 😉
n
Yes I believe under dynamic scoping you resolve all identifiers when a function is invoked, but I'm aiming for something in-between (which I'm not sure I've clearly described yet). My proposal has nothing to do with a binary format: users of the language won't give a damn about how a program or process is stored in RAM or disk, and the language won't expose it 🙂 (Unless they're profiling performance.)
A program definitely won't be serialized as ASCII, but that's an implementation detail.
Ok, conclusion on Rebol: it's not the same as my plan. In particular, the following code prints "2" in Rebol (and in languages with dynamic scoping), but I mentioned earlier that I want to preserve lexical/syntactic reasoning, so I'd have it print "1" instead:
x: 1
foo: [ print x ]
x: 2
do foo
`foo` would only look for `x` on the stack if it wasn't found at the definition site. So if you delete the first line, `foo` would print "2". This brings you closer to Scala's implicits than dynamic scoping, but it's still different, as I described a few posts up.
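In other words, roughly this lookup rule (a sketch of the semantics I have in mind, not real syntax):

```scala
// A free variable resolves at the definition site first, and only falls back to
// the invoking scope ("the stack") if the definition site has no binding.
def lookup(name: String, defSite: Map[String, Int], callSite: Map[String, Int]): Int =
  defSite.getOrElse(name, callSite(name))

@main def demo(): Unit =
  // foo = [ print x ], defined where x = 1, invoked where x = 2:
  println(lookup("x", Map("x" -> 1), Map("x" -> 2))) // prints 1
  // delete the definition-site binding and the call-site one is used instead:
  println(lookup("x", Map.empty, Map("x" -> 2)))     // prints 2
```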
a
Having the source of the variable definition change based on the context of the function definition makes me nervous. It's the same feeling as accidentally declaring a new variable when trying to assign to an existing one (almost a dual problem). That's the kind of thing for which I prefer an error message (at whichever phase) rather than unexpected runtime behavior. For a text language, I would probably nope out on seeing that in the docs. I guess it's less of a problem if all your variables are secretly UUIDs, though. For text, I would be happy with a declaration at the function level that a particular var always comes from dynamic scope.
n
Yes, I agree it would be a mess for a textual language to try this. But in a structure-based language, I think it will work quite nicely, because you're never going to accidentally shadow a variable through naming coincidences. For every variable assignment you write, you're going to know what that variable means, and the places in your program that are able to "receive" it.
(Side question: is there an existing name for "non-text-string-based" languages? I'm using words like "structure" willy nilly.)
b
Interesting to read your clarifications on the differences you have in mind to Scala's approaches. I really like how this also removes the ability to use any inconsistent naming - passing "the same" variable to a parameter with a different name is often an area that generates a fair bit of code smell. Have you considered how this will work with 3rd party libraries? It's likely that libraries will use a different naming standard; so will some glue code be required to map to the naming in the rest of the code set?
n
I doubt that the parameters used by different libraries will often overlap exactly in meaning, so they probably won’t want to share UUIDs in the vast majority of cases. Stuff like “color” will probably be that exception. There could be a standard registry (like schema.org), or yeah, a little glue code could unify things (and is easier to implement).