# share-your-work
j
Hi friends! I'm new to this Slack, but I've been dabbling in this space off and on for a while. I've completed a draft of a blog post about type inference in my projectional language / structured editor, and I'd love to get your feedback! "Type inference that sticks". Thanks! 🙂
k
That looks very promising! Types are statements about code, so it makes sense to move them to an annotation layer and have tool support for that layer. Next: make that layer optional (perhaps it already is). Then: allow multiple type annotation layers for the same codebase. Allow programmers to add their own domain-specific type systems.
j
Great stuff! My favorite part, and I wonder if this would be the selling point for structure editors for pro-coders (I don't think it will ever be "avoiding syntax errors" etc.):
the compiler is involved in the entire process of development, responding to programmer feedback, and giving context-aware feedback in return. In this model, the source code is enriched with the compiler's inferences along the way, which the programmer can inspect & correct (but can also be elided from display for a more compact presentation).
About this:
When typing an identifier in Jerd, if you select something from autocomplete, the ID of that term is stuck onto the identifier node, and then the compilation pass doesn't need to do any name resolution -- it knows exactly what is being referenced. And, if you later change the name of a variable, none of the references are broken.
How do you plan on handling the case where a new, higher-precedence function overload is added after a call (id + arguments) has already been 'bound' to an overload that the new one should now take precedence over? You could end up in a situation in which exactly the same function call expressions bind to different overloads depending on which overloads existed when the call was written (bound). I guess you could just run overload resolution again for all the calls, but what if the result for a call depends on a (now hidden) user-defined type annotation? You could still end up with visibly exactly the same function calls binding to different overloads. How would possible implicit casts/conversions fit in?
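To make the quoted passage concrete, here is a minimal Python sketch of references that carry a term ID instead of a name. It is an illustration only, not Jerd's actual implementation: `Term`, `Ref`, and `Env` are hypothetical names, and the integer IDs stand in for whatever identifier Jerd really uses.

```python
from dataclasses import dataclass
from itertools import count

_ids = count(1)  # hypothetical ID source; a real system might use content hashes


@dataclass
class Term:
    id: int
    name: str  # display name only, never used for lookup


@dataclass
class Ref:
    term_id: int  # "stuck" onto the identifier node when picked from autocomplete


class Env:
    def __init__(self):
        self.terms = {}

    def define(self, name):
        term = Term(next(_ids), name)
        self.terms[term.id] = term
        return term

    def rename(self, term_id, new_name):
        # Only the display name changes; every Ref still points at the same ID.
        self.terms[term_id].name = new_name

    def display(self, ref):
        # No name resolution: the reference already knows its target.
        return self.terms[ref.term_id].name


env = Env()
total = env.define("total")
ref = Ref(total.id)
env.rename(total.id, "grand_total")
print(env.display(ref))  # grand_total
```

Because the reference stores the ID, the rename needs no search-and-replace over call sites. The overload question above is about what happens when a second term with the same display name shows up later.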
j
Glad to see you working on interactive approaches to this problem. 🙂 If you're looking into interesting prior art in the space, I strongly recommend this paper: https://plover.com/~mjd/misc/hbaker-archive/TInference.html
j
@Konrad Hinsen 🤯 domain-specific type systems? Do you have any more thoughts on what that would look like? As for whether the layer is optional: it very much isn't, as knowledge of types informs code generation 🤷
@Jarno Montonen Thanks! So, in this system, we no longer have a notion of "precedence" in function overload resolution. If there is any ambiguity (multiple functions w/ the same signature), the answer is to always ask the user.
You could still end up with visibly exactly the same function calls binding to different overloads.
So that's already the case with a + b and c + d, if a & b are ints, but c & d are strings, right? Now I actually solve this by coloring all identifiers according to the hash of the definition, so the two function calls would very likely have different colors, allowing you to immediately recognize them as distinct.
@Jack Rusher thanks for the link! Fascinating read.
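The coloring idea can be sketched roughly as follows. This is an assumption-laden illustration, not how Jerd does it: the hypothetical `color_for` helper hashes a definition's source text with SHA-256 and uses the first three bytes as an RGB color.

```python
import hashlib


def color_for(definition_source: str) -> str:
    """Map a definition to a stable hex color derived from its hash (hypothetical scheme)."""
    digest = hashlib.sha256(definition_source.encode("utf-8")).digest()
    # The first three bytes of the digest become the red, green, and blue channels.
    return "#{:02x}{:02x}{:02x}".format(digest[0], digest[1], digest[2])


# Two definitions that share the display name "+" but differ in content.
int_add = "add(a: int, b: int) -> int"
str_add = "add(a: string, b: string) -> string"

print(color_for(int_add))
print(color_for(str_add))
```

The same definition always gets the same color, and two different definitions collide only if their first three hash bytes match, which fits the "very likely" wording above.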
j
Okay, so say there exists a 'myfunc (double x, double y) -> double' and an implicit conversion from int to double. The user writes the call 'myfunc(1, 2)', which binds to that function. Now the user adds 'myfunc (int x, int y) -> int' and writes another call 'myfunc(1, 2)'. Do you ask the user which overload to use, or bind to the int version? Do you ask the user to re-evaluate which overload the first call should bind to? Either way, the user could end up in a situation where exactly the same calls (down to argument types) bind to different overloads, possibly without the user ever choosing that the first call should bind to the less specific double overload. Of course, if the only difference in the signatures is numeric types, it's likely that the functions do the same thing, but you could end up with a more problematic case with user-defined types.
k
@Jared Forsyth My favorite example of a domain-specific type system is dimensional analysis in science and engineering (https://en.wikipedia.org/wiki/Dimensional_analysis). It illustrates the general characteristics of domain-specific typing: (1) types apply to some but not all data/variables in a program and (2) each piece of data and each variable can have types from another type system as well.
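As a rough illustration of dimensional analysis acting as a second type layer, here is a minimal Python sketch. The `Quantity` class and the (length, mass, time) exponent tuple are assumptions made for this example, not part of any system mentioned in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    value: float  # ordinary numeric type from the host language
    dims: tuple   # second "type": exponents of (length, mass, time)

    def __add__(self, other):
        # Addition is only meaningful between quantities of the same dimension.
        if self.dims != other.dims:
            raise TypeError(f"cannot add dimensions {self.dims} and {other.dims}")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        # Multiplying quantities adds the dimension exponents.
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value * other.value, dims)

    def __truediv__(self, other):
        # Dividing quantities subtracts the dimension exponents.
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value / other.value, dims)


LENGTH = (1, 0, 0)
TIME = (0, 0, 1)

distance = Quantity(100.0, LENGTH)
duration = Quantity(20.0, TIME)
speed = distance / duration
print(speed.dims)  # (1, 0, -1)
```

Both characteristics show up here: plain floats elsewhere in a program carry no dimension at all, and each `Quantity` has a host-language type (`float`) alongside its dimension. Evaluating `distance + duration` raises a `TypeError`. This sketch checks at run time, whereas an annotation layer would check the same rule statically.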
y
Regarding:
Prior Art
I don’t know if they [Lamdu] store the inferred types in the syntax tree, or if they re-compute them on every edit.
Lamdu does a combination:
• For every definition (global variable/function), it stores the types of every definition that it used, so that if one of those definitions changes, its old type is still used for inference until a manual (and informative!) “update type” action.
• Inside the definition, it re-infers on every edit, and an edit that would have resulted in an error in a normal language gets inserted wrapped in a “fragment” (aka “non-empty hole”, which was our old term and was also adopted by Hazel).
And btw, Lamdu also has the option to mark the new usage with the fragment as the correct one and push the error to other parts of the code. It’s all explained/demonstrated in more detail here:

https://youtu.be/viF1bVTOO6k

j
Oh fantastic, thanks @yairchu for the extra info! Exactly what I was hoping for 🙂
@Jarno Montonen If there's the possibility of ambiguity, always ask the user! Also, adding a new function never changes previously defined code.
j
So in my example, the user would have to go through all the calls to a previous overload and select the correct one? Could be quite a bit of work in a big codebase.
j
If you actually want to change the behavior of all of your code, it will take some work, yes 😄 There'll be automated assistance, however, that will allow you to "re-base" existing functions onto a new function (or indeed a changed definition of a previous function), while running any relevant tests. Hash-based identification is definitely a different paradigm. By default, the behavior of a function is deeply immutable, but there are tools to migrate changes broadly if that is what you want.
You can think of it as pull-based vs push-based. The current general mode is pull-based (everything always auto-updates dependencies), but Jerd (which takes inspiration from Unison) is push-based (dependency updates are explicit). We're used to that pattern in external dependencies, and here it's applied to internal ones as well.
(this is all experimental of course, it may well turn out that operating this way is prohibitively annoying 😅 )
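The push-based model can be sketched with the 'myfunc' example from earlier in the thread. This is a toy Python illustration under stated assumptions, not Jerd's or Unison's implementation: `Codebase`, `Call`, and `rebase` are hypothetical names, definitions are identified by a truncated SHA-256 of their source text, and the "run relevant tests" step is left out.

```python
import hashlib


class Codebase:
    def __init__(self):
        self.defs = {}   # hash -> source; entries are never overwritten
        self.names = {}  # display name -> list of hashes sharing that name

    def add(self, name, source):
        h = hashlib.sha256(source.encode("utf-8")).hexdigest()[:12]
        self.defs[h] = source
        self.names.setdefault(name, []).append(h)
        return h

    def candidates(self, name):
        # More than one candidate means the editor would ask the user to pick.
        return list(self.names.get(name, []))


class Call:
    def __init__(self, target, args):
        self.target = target  # hash of the definition chosen at write time
        self.args = args


def rebase(calls, old, new):
    """Explicitly repoint calls from `old` to `new`; returns how many changed."""
    changed = 0
    for call in calls:
        if call.target == old:
            call.target = new
            changed += 1
    return changed


cb = Codebase()
double_v = cb.add("myfunc", "myfunc(x: double, y: double) -> double")
first_call = Call(double_v, (1, 2))

# Adding a new overload later does not touch the existing call.
int_v = cb.add("myfunc", "myfunc(x: int, y: int) -> int")
still_bound_to_double = first_call.target == double_v

# Migration is an explicit, push-style step.
moved = rebase([first_call], double_v, int_v)
print(still_bound_to_double, moved, len(cb.candidates("myfunc")))  # True 1 2
```

The cost raised above shows up as the `rebase` step: nothing moves until someone asks for it, which keeps old behavior stable but makes broad migrations a deliberate action.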