Daniel Garcia
01/30/2019, 4:41 AM
Felix Kohlgrüber
01/30/2019, 9:49 AM
Felix Kohlgrüber
01/30/2019, 9:49 AM
Ivan Reese
Ivan Reese
Ivan Reese
Ivan Reese
Ivan Reese
Ivan Reese
Daniel Garcia
01/30/2019, 7:06 PM
Felix Kohlgrüber
01/31/2019, 7:19 AM
Felix Kohlgrüber
01/31/2019, 7:59 AM
Ivan Reese
> Text as a storage representation is indeed universal in a sense that text-based tools work with all data that's text based....
I know what you mean, but there's a subtle implication that (e.g.) all compilers operate on all programming languages. (Sit down, LLVM.) That is of course not true, and you make that point, but we need to be careful that we don't accidentally hold VPLs to that absurd standard of universality.
Ivan Reese
Ivan Reese
Ivan Reese
Ivan Reese
Daniel Garcia
01/31/2019, 8:56 PM
Daniel Garcia
01/31/2019, 8:58 PM
Ivan Reese
Joshua Horowitz
01/31/2019, 9:51 PM
Daniel Garcia
01/31/2019, 9:56 PM
Daniel Garcia
01/31/2019, 10:02 PM
Joshua Horowitz
01/31/2019, 10:06 PM
shalabh
02/01/2019, 1:51 AM
Ivan Reese
Edward de Jong / Beads Project
02/01/2019, 9:59 AM
gman
02/07/2019, 5:52 AM
Felix Kohlgrüber
02/07/2019, 9:08 AM
1 literal to a 0), but fails in more complex cases (e.g. changing the order of elements, wrapping an element, ...). It's like comparing two strings by encoding them as images first and then performing a pixel-by-pixel diff on the images. Even small changes like adding a character would lead to large differences in the encoded images. We should really move past text-based diffs and use AST / data structure diffs instead IMO.
It's also interesting to note that more accurate diffs also reduce the number of conflicts. For example, if user A adds a parameter to a function and user B changes the return type, a line-based diff would yield a merge conflict. A diff that works on an AST-like structure could detect these changes accurately and combine them without producing a conflict.
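A minimal sketch of the parameter / return-type example, assuming a function signature is stored as a small record of fields rather than as a line of text. The field names (`name`, `params`, `returns`) and the helpers `three_way` and `merge_structured` are hypothetical, invented for this illustration; this is not how git or any particular tool works.

```python
class MergeConflict(Exception):
    """Raised when both sides changed the same unit in different ways."""


def three_way(base, ours, theirs):
    """Classic three-way rule applied to one unit (a line or a field)."""
    if ours == theirs:
        return ours
    if ours == base:
        return theirs
    if theirs == base:
        return ours
    raise MergeConflict(f"both sides changed {base!r}")


def merge_structured(base, ours, theirs):
    """Merge field by field; assumes all three dicts share the same keys."""
    return {key: three_way(base[key], ours[key], theirs[key]) for key in base}


# Hypothetical signature model, invented for this illustration.
base = {"name": "area", "params": ("w",), "returns": "int"}
user_a = {**base, "params": ("w", "h")}  # user A adds a parameter
user_b = {**base, "returns": "float"}    # user B changes the return type

merged = merge_structured(base, user_a, user_b)
print(merged)

# The same two edits expressed as one line of text: both sides touched it.
try:
    three_way("int area(w)", "int area(w, h)", "float area(w)")
    line_result = "merged"
except MergeConflict:
    line_result = "conflict"
print(line_result)
```

The same three-way rule is used in both cases; only the granularity of the unit being compared differs, which is why the structured merge combines the two edits while the line-level merge reports a conflict.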
Splitting code into smaller files seems like a hack to me. It forces git to only search for changes locally (within the file), which might indeed help get smaller diffs for local changes. On the other hand, non-local changes like combining two files into one become even harder. If we performed diffs on data structures instead, local changes would lead to small diffs quite naturally...
Stefan
02/07/2019, 9:33 AM
> We should really move past text-based diffs and use AST / data structure diffs instead IMO.
Yeah, agreed. This comes up every now and then, so it almost feels like we all agree on this, and yet I haven’t seen it anywhere. Why is that? Is it too hard to implement? Are there implementation challenges that I can’t see (because I haven’t tried)? Or is there a good solution I just haven’t heard about that needs to become more popular? Or is it too daunting, with all the text-based tools (git etc.) in place, so that nobody really believes we can pull this off? This seems like a smaller, much more manageable version of the grand “let’s revolutionize programming” problem, so if that is not happening for whatever reason, I wonder if there are any insights to gain for the even bigger challenges we discuss here.
duncanawoods
02/07/2019, 11:13 AM
> We all agree on this, but yet I haven’t seen this anywhere. Why is that?
I'll add some dissenting opinion for you 😄
duncanawoods
02/07/2019, 11:14 AM
duncanawoods
02/07/2019, 11:16 AM
maxd
02/07/2019, 11:32 AM
> It takes a lot of effort to look at an AST and actually match it to the code you wrote.
To be honest, it takes me some effort to read ordinary text diffs too; highlighting, as GitHub and Git UI clients provide, improves this a lot. So maybe AST diffs should always imply the ability to render them nicely?
duncanawoods
02/07/2019, 11:37 AM
duncanawoods
02/07/2019, 11:40 AM
Felix Kohlgrüber
02/07/2019, 12:27 PM
duncanawoods
02/07/2019, 12:41 PM
Felix Kohlgrüber
02/07/2019, 12:46 PM
Felix Kohlgrüber
02/07/2019, 12:54 PM
duncanawoods
02/07/2019, 12:55 PM
Felix Kohlgrüber
02/07/2019, 12:55 PM
Stefan
02/07/2019, 2:19 PM
duncanawoods
02/07/2019, 3:22 PM
duncanawoods
02/07/2019, 3:23 PM
> a concrete example where semantic merge conflict resolution is worse than text-based merge resolution
The typical "worse" result would be a type conflict leading to a dead end, whereas a text merge can just smush things together. It might be wrong, but it is closer to the solution. I'll try to give the simplest example. Assume a three-way merge within lines:
Original: var x = 3;
User A: var x = 4;
User B: var x = "3";
Text merge: var x = "4";
Semantic merge: conflict - can't merge string literal and number literal
Essentially I needed pairwise merge rules between different types of semantic transformation. Text merge only needs to implement one pairwise merge rule for all text edits.
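The example above can be sketched roughly as follows. This is an illustrative toy, not the merge tool being described: `text_merge` is a hypothetical character-level three-way merge built on Python's `difflib`, and `semantic_merge` is a hypothetical stand-in that only knows how to compare literal values and their types.

```python
from difflib import SequenceMatcher


class MergeConflict(Exception):
    """Raised when two edits cannot be combined."""


def edits(base, changed):
    """Return (start, end, replacement) spans that turn base into changed."""
    ops = SequenceMatcher(None, base, changed, autojunk=False).get_opcodes()
    return [(i1, i2, changed[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]


def text_merge(base, a, b):
    """One rule for every text edit: apply both edit sets unless spans overlap."""
    out, cursor = [], 0
    for start, end, replacement in sorted(set(edits(base, a)) | set(edits(base, b))):
        if start < cursor:
            raise MergeConflict("overlapping text edits")
        out.append(base[cursor:start])
        out.append(replacement)
        cursor = end
    out.append(base[cursor:])
    return "".join(out)


def semantic_merge(base, a, b):
    """Three-way merge of literal values; differing types are a dead end."""
    if a == b:
        return a
    if a == base:
        return b
    if b == base:
        return a
    if type(a) is not type(b):
        raise MergeConflict(
            f"can't merge {type(a).__name__} literal and {type(b).__name__} literal"
        )
    raise MergeConflict("both sides changed the value")


text_result = text_merge("var x = 3;", "var x = 4;", 'var x = "3";')
print(text_result)

try:
    semantic_merge(3, 4, "3")
    semantic_result = "merged"
except MergeConflict as err:
    semantic_result = f"conflict: {err}"
print(semantic_result)
```

The text merge needs only the single overlap rule, while the semantic merge would need an extra rule for every pair of edit kinds it wants to reconcile, such as a number change combined with a change of literal type.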
In this case, we could combine text and number edits by stringifying numbers, but doing this type of thing relaxes semantic constraints rather than taking advantage of them, so it now seems like a hard way to do a dumb text merge.
Stefan
02/07/2019, 3:32 PM
Ivan Reese
Ivan Reese
> So the interesting question to me is whether it'd be possible to create a "universal" general-purpose binary format that could be used to encode domain specific data structures.
UTF-8? * re-smoke bomb *
Felix Kohlgrüber
02/07/2019, 5:15 PM
shalabh
02/07/2019, 5:35 PM
> We should really move past text-based diffs and use AST / data structure diffs instead IMO.
> Yeah, agreed. This comes up every now and then so that it almost feels like we all agree on this, but yet I haven’t seen this anywhere. Why is that?
I think you can't retrofit this idea onto systems built around the plain text medium. For any language, the language designer has put a lot of thought into the text syntax with specific goals such as comprehension. How editors and version systems treat the text files is not something that is designed at all - it's already established and an important context that the designer works within. The AST is designed, but not for the same purpose as the surface syntax.
When talking about AST/graph-based versioning, we're really talking about a context shift - a deeper shift in the primary surface medium - i.e. some structured editing medium instead of plain text blob editing. Really we want a top-to-bottom redesign here - languages designed for structured editing combined with a common medium and tooling for viewing and manipulation. I don't think anyone has nailed this yet to get the fluidity and flexibility needed for multi-purpose expression.
As an example of a frame with a different medium - if we were to say we're going to implement a language to express programs as 'spreadsheets', we're immediately thinking about what the rows here or columns there could represent, and not the high-level plain text syntax. Diffs and versioning also fall out of the medium - we think of cell-oriented diffs.
Another aspect of structured editing is that versioning could be integrated with the editing experience. The idea is you want to preserve the identity of the cells/nodes/items while you manipulate them. The possible power for semantic merging does seem higher than with text. E.g. you might be able to encode that multiple items added to an entity are all merged, and that global identifier renames are merged properly with other changes. But we'd want any custom merge logic to also be embedded in the medium.
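A rough sketch of the identity-preservation idea, assuming a made-up storage model in which definitions and references are keyed by stable ids and references point at ids rather than names. The ids (`n1`, `u1`, `u2`) and the helper `merge_by_id` are hypothetical and exist only for this illustration.

```python
class MergeConflict(Exception):
    """Raised when both sides changed the same node in different ways."""


def merge_by_id(base, ours, theirs):
    """Three-way merge of {node_id: value} maps; a missing id reads as None."""
    merged = {}
    for key in sorted(base.keys() | ours.keys() | theirs.keys()):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:
            value = o
        elif o == b:
            value = t
        elif t == b:
            value = o
        else:
            raise MergeConflict(key)
        if value is not None:  # None means the node is absent from the result
            merged[key] = value
    return merged


# Definitions: node id -> display name. References: use id -> definition id.
names_base = {"n1": "total"}
uses_base = {"u1": "n1"}

names_a, uses_a = {"n1": "sum"}, {"u1": "n1"}                # A renames n1
names_b, uses_b = {"n1": "total"}, {"u1": "n1", "u2": "n1"}  # B adds a use

names = merge_by_id(names_base, names_a, names_b)
uses = merge_by_id(uses_base, uses_a, uses_b)

# Rendering resolves ids to names, so B's new use picks up A's rename.
rendered = [names[target] for _, target in sorted(uses.items())]
print(rendered)
```

In a text merge, B's new reference would still spell the old name after A's rename lands; here the rename and the added use combine because the reference never stored the name in the first place.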
Stefan
02/07/2019, 8:34 PM
Aidan Cunniffe
02/07/2019, 9:10 PM
shalabh
02/07/2019, 9:21 PM