Hey folks! I'm doing some research into code formatters/pretty printers, with a view to potentially building my own. Does anyone have examples of code formatters that they've found especially fast, or that have a particular focus on performance?
I'm interested in learning about the techniques at play: what sort of caching and parallelization is being done, whether they do incremental parsing like tree-sitter, etc.
11/02/2023, 6:09 PM
I’ve found fnlfmt, for the Fennel language, really readable.
11/02/2023, 6:33 PM
FYI, maybe - I’m not interested in speed, so this might not apply. I note that Ohm-JS has an interactive parsing mode (I haven’t tried it). I wonder if that means one could attach semantics code (that’s the way Ohm-JS is used) that does some formatting?
[reply from John] Again, thanks! I’ll definitely take a look. Most of the formatters I’ve looked at (e.g. Prettier, gofmt) will parse the full file into an AST, build an IR from it, and then do the formatting.
I’m wondering if each of these steps can be streamed incrementally, and if you can do each operation in parallel.
My initial thought is: possibly, but not without some difficulties. For example, Prettier decides line breaks with some awareness of parent and subsequent sibling nodes in its IR.
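To make that concrete, here's a rough sketch of the kind of group-based IR I have in mind. This is not Prettier's actual code or API: the names (`Text`, `Line`, `Group`, `Indent`, `render`) are made up for illustration, and the fit check is simplified. The point is that a `Line` can't decide by itself whether to become a space or a newline. It depends on whether its enclosing `Group` fits in the remaining width, which is what makes naive streaming awkward.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Text:
    value: str


@dataclass
class Line:
    """A space if the enclosing group is printed flat, otherwise a newline."""


@dataclass
class Group:
    parts: List["Doc"]


@dataclass
class Indent:
    parts: List["Doc"]


Doc = Union[Text, Line, Group, Indent]


def flat_width(doc: Doc) -> int:
    """Width of `doc` if every Line inside it were printed as a single space."""
    if isinstance(doc, Text):
        return len(doc.value)
    if isinstance(doc, Line):
        return 1
    return sum(flat_width(part) for part in doc.parts)


def render(doc: Doc, width: int) -> str:
    out: List[str] = []
    col = 0  # current output column

    def go(d: Doc, indent: int, flat: bool) -> None:
        nonlocal col
        if isinstance(d, Text):
            out.append(d.value)
            col += len(d.value)
        elif isinstance(d, Line):
            if flat:
                out.append(" ")
                col += 1
            else:
                out.append("\n" + " " * indent)
                col = indent
        elif isinstance(d, Group):
            # The whole group must be measured before its first Line is printed.
            # Simplification: text that follows the group is not counted here.
            fits = flat or col + flat_width(d) <= width
            for part in d.parts:
                go(part, indent, fits)
        else:  # Indent
            for part in d.parts:
                go(part, indent + 2, flat)

    go(doc, 0, False)
    return "".join(out)


doc = Group([
    Text("["),
    Indent([Line(), Text("a,"), Line(), Text("b")]),
    Line(),
    Text("]"),
])

print(render(doc, 80))  # everything fits, printed on one line
print(render(doc, 5))   # too narrow, every Line in the group breaks
```

Even in this toy version, the printer has to see the entire group before it can emit the first line break, so the unit you could stream or parallelize is probably "one top-level group" rather than "one token".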
I know that CodeMirror uses Lezer, which is capable of doing incremental parsing. https://codemirror.net/examples/lang-package/
If you're not familiar with CodeMirror, it's a text-editor component that is highly customizable.
11/03/2023, 1:47 AM
tree-sitter is really fast. I don't know if anyone has built a formatter on top of it?