What are best practices for writing a projectional...
# thinking-together
d
What are best practices for writing a projectional code editor? What data structures do I use? Source code isn't text, it is a tree. Probably the tree is annotated with semantic information: What binding is this identifier bound to (if known)? What is the type of this subexpression (if known)? How do you keep the annotations up to date while the code is being edited? Commentary is obviously treated differently. In old-school text-based languages, a comment is /*...*/ or //...\n; it is treated as white space by the lexical analyzer and discarded early, so it doesn't appear in the AST. In a tree language, comments must work differently. They aren't a kind of white space that appears between tokens; they are either tree nodes or attributes attached to tree nodes. Are there examples of how this works? What is the syntax of a comment, and are there different kinds, depending on the part of a code tree they are annotating? For example, within a module (a list of definitions), a "section header" comment that introduces a block of definitions might be treated differently from a comment that describes a single definition.
a
I'm brand new here, but I've been spending the past few weeks cobbling together something similar to what you're talking about. I took inspiration from something Joe Armstrong said about comments in this article [1]: namely, that comments don't really come before or after a block of text. They're an annotation on that block of text. In most programming languages, that means putting the comment directly above or below a block. With a projectional editor, you have the luxury of attaching a comment directly to a node in your language tree. In the tool I'm hacking on, I represent code on disk as JSON, and every object in the JSON file can have a "comment" property. [1] https://joearms.github.io/published/2013-05-31-a-week-with-elixir.html
I guess that doesn't actually answer your question. I don't have any formal justification for what I'm doing. Having different types of comments reminds me of the docstrings some languages use to auto-generate documentation.
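To make the "every object can have a comment property" idea above concrete, here is a minimal sketch in Python. The node shape (a "kind" field plus kind-specific fields) and the helper name `iter_comments` are assumptions made up for illustration; they are not taken from the tool described above.

```python
import json

# Hypothetical node shape: every node is a JSON object with a "kind" field,
# kind-specific fields, and an optional "comment" field.
program = {
    "kind": "module",
    "comment": "Geometry helpers",
    "body": [
        {
            "kind": "define",
            "name": "area",
            "comment": "Area of a circle with radius r",
            "value": {
                "kind": "call",
                "fn": "mul",
                "args": [
                    {"kind": "const", "value": 3.14159},
                    {"kind": "ref", "name": "r", "comment": "radius"},
                    {"kind": "ref", "name": "r"},
                ],
            },
        }
    ],
}


def iter_comments(node, path=()):
    """Yield (path, comment) for every node in the tree that carries a comment."""
    if isinstance(node, dict):
        if "kind" in node and "comment" in node:
            yield path, node["comment"]
        for key, child in node.items():
            yield from iter_comments(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from iter_comments(child, path + (index,))


# Round-trip through JSON text to show the comments survive being stored on disk.
restored = json.loads(json.dumps(program))
comments = list(iter_comments(restored))
```

Because the comment lives on the node itself, moving or wrapping the node in the editor carries the comment along with no extra bookkeeping.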
z
How do you define a projectional editor? If you define it as a non-text editor projecting code into text, then Yazz Pilot works in exactly that way: it has a text editor projection, a VB drag-and-drop editor projection, a database schema projection, and smaller projections for individual components too. I'm also considering building a VR projection.
d
I want to build a projectional editor for a general purpose programming language. Right now, I have an interpreter and compiler for a text-based language.
• The parser converts text to a parse tree;
• the semantic analyser converts the parse tree to an intermediate representation (IR) that can be either directly interpreted, or
• the code generator converts IR to code.
To make a projectional editor for this language, I need different internal data structures. Rather than grope in the dark, I was hoping to learn lessons from an existing successful design.
s
@Doug Moen Disclaimer: this is not informed by any other projects or literature, but just what I came up with. Regarding comments, in alv I also need to keep comments in the parsed tree rather than just ignore them during parsing. I need that for the editor prototype (although that is a minor interest), but mainly to be able to turn the parsed tree back into the exact string it was parsed from. You'll need that for a projectional editor too, I guess. To do that, I first treat comments as whitespace. In my syntax grammar, all bottom-level constructs (literals) contain no whitespace, but there is a piece of "required whitespace" between arguments in an S-expression and a piece of "optional whitespace" after the opening and before the closing parenthesis. So an S-expression contains N child expressions and N+1 (potentially empty) whitespace segments. I do not parse the comments specially or associate them with other elements, but your whitespace could also be a structure that explicitly stores comments as separate entities. EDIT: in case it's not clear, the structure I use is an AST.
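The "N children plus N+1 whitespace segments" layout above can be sketched roughly as follows. This is not alv's actual parser: the `;` line-comment syntax, the class names, and the function names are all assumptions for illustration, and the sketch does not enforce the "required whitespace" rule or report unbalanced parentheses gracefully.

```python
from dataclasses import dataclass, field


@dataclass
class Atom:
    text: str  # literal or symbol; contains no whitespace


@dataclass
class SExpr:
    children: list = field(default_factory=list)  # N child expressions
    gaps: list = field(default_factory=list)      # N+1 whitespace/comment strings


def skip_gap(src, i):
    """Consume whitespace and ';' line comments from i; return (gap text, next index)."""
    start = i
    while i < len(src):
        if src[i].isspace():
            i += 1
        elif src[i] == ";":  # assumed comment syntax: ';' up to end of line
            while i < len(src) and src[i] != "\n":
                i += 1
        else:
            break
    return src[start:i], i


def parse_expr(src, i):
    """Parse one expression starting at index i; return (node, next index)."""
    if src[i] == "(":
        node = SExpr()
        gap, i = skip_gap(src, i + 1)
        node.gaps.append(gap)
        while src[i] != ")":
            child, i = parse_expr(src, i)
            node.children.append(child)
            gap, i = skip_gap(src, i)
            node.gaps.append(gap)
        return node, i + 1
    start = i
    while i < len(src) and not src[i].isspace() and src[i] not in "();":
        i += 1
    return Atom(src[start:i]), i


def unparse(node):
    """Turn the tree back into the exact string it was parsed from."""
    if isinstance(node, Atom):
        return node.text
    parts = ["(", node.gaps[0]]
    for child, gap in zip(node.children, node.gaps[1:]):
        parts.append(unparse(child))
        parts.append(gap)
    parts.append(")")
    return "".join(parts)


source = "( add 1 ; first operand\n  (mul 2 3)  )"
tree, end = parse_expr(source, 0)
```

Since each gap is stored verbatim, `unparse(tree)` reproduces the input exactly, comments included. Replacing the gap strings with a small structure (whitespace runs and comment entries) would be the next step if comments need to be addressed individually.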
m
Are you familiar with zipper data structures? A zipper is a good starting point for the data structure representing your editor state.
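For anyone who hasn't met zippers: the idea is to store the node under the cursor plus a trail of "crumbs" describing how to rebuild each ancestor, so moving the cursor and replacing the focused node are cheap and never mutate the old tree. Below is a minimal sketch over a generic labelled tree; the names `Node`, `Crumb`, and `Zipper` are illustrative assumptions, and out-of-range moves simply raise `IndexError`.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    label: str
    children: Tuple["Node", ...] = ()


@dataclass(frozen=True)
class Crumb:
    parent_label: str
    left: Tuple[Node, ...]   # siblings to the left of the focus
    right: Tuple[Node, ...]  # siblings to the right of the focus


@dataclass(frozen=True)
class Zipper:
    focus: Node
    trail: Tuple[Crumb, ...] = ()  # path back to the root, nearest parent last

    def down(self, index=0):
        """Move the cursor to the child at the given index."""
        kids = self.focus.children
        crumb = Crumb(self.focus.label, kids[:index], kids[index + 1:])
        return Zipper(kids[index], self.trail + (crumb,))

    def up(self):
        """Move the cursor to the parent, rebuilding it around the current focus."""
        crumb = self.trail[-1]
        parent = Node(crumb.parent_label, crumb.left + (self.focus,) + crumb.right)
        return Zipper(parent, self.trail[:-1])

    def right(self):
        """Move the cursor to the next sibling."""
        crumb = self.trail[-1]
        moved = Crumb(crumb.parent_label, crumb.left + (self.focus,), crumb.right[1:])
        return Zipper(crumb.right[0], self.trail[:-1] + (moved,))

    def replace(self, node):
        """Replace the focused node; the trail is shared with the old zipper."""
        return Zipper(node, self.trail)

    def root(self):
        """Rebuild and return the whole tree."""
        z = self
        while z.trail:
            z = z.up()
        return z.focus


tree = Node("add", (Node("1"), Node("x")))
cursor = Zipper(tree).down(0).right()       # cursor is now on "x"
edited = cursor.replace(Node("2")).root()   # a new tree; the old one is untouched
```

Keeping earlier `Zipper` values around gives undo almost for free, since every edit returns a new value and leaves the previous one intact.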
d
@Aleks - I think you did answer the question, because the goal is not to have a perfect representation of textual code, but to capture the information that the textual source code is otherwise attempting to represent. For example, you don't so much need to represent a comment before, after, or inline with a line of code. In textual code, such comments are the only means of attaching metadata (annotations) to parts of a program. So attach the metadata directly, and then if you need to generate source code from the tree, you can inject comments as the only way to capture that metadata in the output. Not the other way around.
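The direction described above (metadata first, textual comments only as an output format) might look like this rough sketch. The node kinds, the `//` comment syntax, and the `render` function are hypothetical, chosen only to show comments being injected at projection time.

```python
def render(node):
    """Project a tree node to text lines, emitting its annotation as a '//' comment."""
    lines = []
    if node.get("comment"):
        lines.append(f"// {node['comment']}")
    kind = node["kind"]
    if kind == "module":
        for child in node["body"]:
            lines.extend(render(child))
    elif kind == "define":
        lines.append(f"{node['name']} = {node['expr']};")
    else:
        raise ValueError(f"unknown node kind: {kind}")
    return lines


# Hypothetical tree: the annotation is a field on the node, not a token in the text.
module = {
    "kind": "module",
    "body": [
        {"kind": "define", "name": "tau", "expr": "2 * pi",
         "comment": "Full turn in radians"},
        {"kind": "define", "name": "half", "expr": "tau / 2"},
    ],
}

text = "\n".join(render(module))
```

The placement decision (above the definition, here) belongs to the renderer, so a different projection could show the same annotation as a tooltip or margin note without touching the tree.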
g
Slightly tangential, but HTML natively includes comments in the source tree: https://developer.mozilla.org/en-US/docs/Web/API/Comment
Not recommending you take the same approach, I just think it's interesting.
a
FWIW, when I wrote a browser-based projectional editor for a blocks language, I found it easiest to write it just like I would any other form on the web. I had one UI component per type of AST node. The components rendered text fields and drop targets and other standard web UI affordances to support editability. Variable bindings, types, etc, are derivative information that I could cache when necessary, or just recompute from the AST as-needed. I mostly did the latter. Because I designed the AST for projectional editing, there were comment slots on any node that would benefit from special rendering of comments, and I had a comment block node that you could wrap any other block of code in to describe what that block did. If you’re using someone else’s AST, you may not have that luxury, but you may be able to write your own AST for editing and a translation layer between yours and the executable AST. I have parsed JavaScript ASTs to guess which comments document which other nodes and with some heuristics you can guess pretty well. Hope this helps!
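As an illustration of the "just recompute bindings from the AST as needed" approach above, here is a small sketch. The tiny expression language (`Num`, `Ref`, `Let`, `Add`) and the `resolve` function are assumptions invented for the example, not the AST from the editor described above.

```python
from dataclasses import dataclass


# eq=False keeps identity semantics, so two Ref("x") nodes are distinct tree positions.
@dataclass(eq=False)
class Num:
    value: int


@dataclass(eq=False)
class Ref:
    name: str


@dataclass(eq=False)
class Let:
    name: str
    value: object
    body: object


@dataclass(eq=False)
class Add:
    left: object
    right: object


def resolve(node, scope=None, out=None):
    """Map id(ref) to the Let node that binds it (None if unbound) by walking the tree."""
    scope = {} if scope is None else scope
    out = {} if out is None else out
    if isinstance(node, Ref):
        out[id(node)] = scope.get(node.name)
    elif isinstance(node, Let):
        resolve(node.value, scope, out)  # the bound name is not visible in its own value
        resolve(node.body, {**scope, node.name: node}, out)
    elif isinstance(node, Add):
        resolve(node.left, scope, out)
        resolve(node.right, scope, out)
    return out


x_ref, y_ref = Ref("x"), Ref("y")
tree = Let("x", Num(1), Add(x_ref, y_ref))
before = resolve(tree)                      # y is unbound at this point

# Simulate an edit: wrap the body in a new binding for y, then simply recompute.
tree.body = Let("y", Num(2), tree.body)
after = resolve(tree)
```

Recomputing from scratch after each edit is linear in the size of the tree, which is often fast enough for interactive use; caching only becomes worth the complexity when profiling says so.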