# thinking-together
s
https://www.dropbox.com/s/ihrg4fg3f95kakq/tyranny-apply-eval-convivial-workshop.pdf?dl=0
Can Programmers Escape the Gentle Tyranny of call/return?
A paper from Marcel Weiher arguing against the prevalence of the call/return pattern and for replacing it with something more general.
🤔 1
🤘🏼 1
👍 2
e
The article is very interesting, and it uses a temperature converter as the sample project: you have two displays, one in deg F and one in deg C, and changing either one should update the other. Since I have used that example myself, I can see that his solution is basically identical to mine; you have an internal model of the temperature, then you have UI widgets that display that value, and dragging a widget updates the internal value, which causes the widgets to get updated.
However, the claim in the workshop's title, that call/return is some kind of problem, is not actually addressed in the article. Call/return (subroutines) are an intrinsic part of all programming, as they effectively allow you to reuse a section of code while keeping only one copy to maintain. The only reason we use subroutines is to save duplicate typing. You can always balloon up your code and have no sub-functions at all, but it would be exponentially larger. Call/return was implemented almost from the first computers because it is so useful at constraining code length. It is not a matter of tyranny; it is the obvious need to eliminate redundancy, which is one of the top sources of error in software.
👎 3
👍 3
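As a rough illustration of the shared-model approach Edward describes above (a sketch only: the model class, the listener functions, and the print-based "displays" are hypothetical stand-ins for real UI widgets), the pattern can be written in a few lines of Python:
```python
class TemperatureModel:
    """Single source of truth; the displays observe it rather than each other."""
    def __init__(self, celsius=0.0):
        self._celsius = celsius
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)
        listener(self._celsius)            # push the current value right away

    def set_celsius(self, value):          # e.g. the C widget was dragged
        self._celsius = value
        for listener in self._listeners:
            listener(value)

    def set_fahrenheit(self, value):       # e.g. the F widget was dragged
        self.set_celsius((value - 32.0) * 5.0 / 9.0)

# Stand-in "widgets": real ones would redraw instead of printing.
model = TemperatureModel()
model.subscribe(lambda c: print(f"C display: {c:.1f}"))
model.subscribe(lambda c: print(f"F display: {c * 9.0 / 5.0 + 32.0:.1f}"))

model.set_fahrenheit(212.0)                # both displays update: 100.0 C / 212.0 F
```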
m
the connectors feature looks like what svelte does https://svelte.dev/
the "Trigger efficient, granular updates by assigning to local variables. The compiler does the rest." part
e
Svelte is a wonderful way to get around some of JS's most glaring problems. One can have a very interesting discussion about whether it is better to bite the bullet and go for a new, cleaner language, or to use the existing language and soup it up with some inline constructs (the Svelte approach). People can get into Svelte very easily from existing JS code, while a new language could take 20 hours to learn.
n
I disagree with Edward's assertion that "call/return (subroutines) are an intrinsic part of all programming". There is no evidence that the only way to re-use code is to put it into a function whose output (and/or control flow) is routed back to the call site that supplied the inputs. In fact, we already have a counter-example in logic programming (Datalog more so than Prolog, which is a weird imperative hybrid). In logic programming, data flows between rules, and only ever flows "forward". There are no "return values", unless you start mashing logic programming together with imperative programming (which many people have done in the past). Dataflow programming has a similar philosophy.
👍 2
🤔 1
I'm working on a logic programming environment that has no functions, yet (eventually...) aims to be practical and allow re-use of code. This means there will probably be a need for macros to avoid needing slightly-different rulesets for slightly-different use-cases, but there will be no call/return behaviour.
k
Here's the website for the project: http://objective.st
I don't understand the paper at all. If you liked it, please post a summary in different words. Perhaps that will help me triangulate. As best I can tell, the author is advocating for manipulating objects directly rather than writing instructions that create the objects. Does that cover it?
w
@Nick Smith what exactly do you mean when you say that Datalog in particular shows that subroutines are not an intrinsic part of all programming? Let me throw out some ideas that you might want to latch on to. Certainly Datalog rules induce a tree-ish structure on how some facts are inferred from others. This inference structure has strong parallels to a trace of subroutine calls and returns. Now in a pure (functional) Datalog, inference has no temporal component: it can happen in any order, results can be precomputed or cached, who cares. Imperative programs, in contrast, are best understood within a strict call/return tree, what with shared-state threads immediately giving rise to endless trouble. Finally, good old-fashioned Prolog with cut has a strict search strategy, which, to me at least, feels practically identical to imperative call/return, albeit enriched.
n
@wtaysom The standard evaluation strategy for Datalog programs is known as "semi-naive evaluation". I've attached the algorithm to this message. Notably, the only function call is "Eval/Eval-incr", which is just a lookup table for a sequence of relational algebra (RA) operations for the current rule (the program code) being executed. The RA operations are applied to sets of input tuples to produce a set of output tuples. So Datalog evaluation requires no call stack! Accordingly, it's not helpful to think about subroutines when reading, writing, or executing Datalog programs. It's easy (and recommended) to think of Datalog programs as a bunch of rules that "fire" in response to inputs to compute an output (a deduction), for as long as new deductions can be made.
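The attached algorithm isn't reproduced here, so as a rough stand-in (not the paper's or anyone's actual implementation), here is a tiny Python sketch of the semi-naive idea for a single transitive-closure rule, `path(x, z) :- edge(x, y), path(y, z)`: facts are tuples in sets, each round joins only against the newly derived "delta" facts, and evaluation stops at a fixpoint, with no user-level call stack involved.
```python
edge = {("a", "b"), ("b", "c"), ("c", "d")}

path = set(edge)      # path(x, y) :- edge(x, y).
delta = set(path)     # facts derived in the previous round

while delta:
    # Join the base relation only against the new facts (the incremental Eval step),
    # which is what makes this semi-naive rather than naive evaluation.
    new = {(x, z)
           for (x, y) in edge
           for (y2, z) in delta
           if y == y2} - path
    path |= new
    delta = new

print(sorted(path))   # all reachable pairs: (a,b) (a,c) (a,d) (b,c) (b,d) (c,d)
```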
Prolog on the other hand (as you point out), was designed around a specific (convoluted) evaluation strategy involving imperative functions and call/return traversal. I don't have any good things to say about Prolog, beyond its role in inspiring better alternatives.
w
Weird that it discusses the obvious elephant in the room, FRP, only towards the end, and then in the form of non-native reactive frameworks, not actual FRP languages, and then dismisses it as being costly/complex/non-native... seems the author has some research to catch up with.
💡 2
that said, doing something FRP-like in an imperative context seems valuable to me, though I'd like to see it at a larger scale: functions that map intrinsic data to derived data (like functional relation programming - different from functional reactive programming)
i
@Kartik Agaram I'm also having some difficulty understanding the paper, both the problem it sees with call/return and the suggested solution. Here's my sense of it: most languages assume functions or subroutines (call/return) as the smallest unit of organization. You can use call/return to implement other organizational schemes, like dataflow or data-binding. But if you do that:
• you still need to think in terms of call/return when deeply analyzing the system, e.g. when debugging, or doing other similar things that require mechanical sympathy.
• the character of call/return sets the character of the language, affecting all code written with it.
Instead, it would be better if a programming language had additional kinds of organization at the fundamental / smallest / bottom level.
✔️ 1
@Edward de Jong / Beads Project You said:
Call/return (subroutines) are an intrinsic part of all programming
From the paper:
One of the reasons programming is so hard and requires seemingly excessive amounts of engineering is that the (linguistic) tools we use no longer match the systems we are expected to build using those tools.
However, the assumption that this particular architectural style is the only one that amounts to “programming” is so deeply entrenched that we tend to describe alternatives as not-programming, so modeling, configuring or architectural-description.
/ Objection! meme /
k
@Ivan Reese I see. So it doesn't matter that these additional kinds of organization are built out of call/ret at the lowest level, because he's taking the hit for the rest of us?
i
Yeah, pretty much. As long as you never need to think about things lower down than the language primitives. * tugs collar * At least, that's my read. Not sure I got the right takeaways.
👍 1
k
Struggling with the strange syntax gave me greater empathy for readers of my own paper with its own strange syntax.
🍻 2
e
I don't care what language or system you use, you will invariably find repeated sequences in whatever you are building. A macro system allows you to inline what would have been a function call/return block. Macros have positives: they are faster than function calls, and they can be customized, so they are more powerful in a way. Macros are used extensively in assembler, and with enough macros Microsoft Assembler can look a lot like a higher-level language. But macros make debugging much harder, because now you can't step through your source code, but instead have to step through the generated source code, and thus you have a second form of your source code (the output of the pre-processor) to consider. The other drawback of highly nested macros is that your source code size could explode geometrically, depending on how deeply nested the macros get. In some sense, you can think of a LISP program as a giant pile of nested macros, which makes them super powerful and also super hard to read.
You can play games with your language primitives and pretend you don't have IF statements, or subroutines with CALL/RETURN, or LOOPS, etc., but at the end of the day it must devolve into the instruction set of the 50-year-old computer we are using, which has only a very tiny set of atomic operations: a 2-way branch, an n-way branch, call/return, arithmetic, and so on. Re-use of some kind of formatting or methodology is desirable (if not mandatory), and I don't care what you call it, it will be some kind of subroutine that is called, and what goes down into a sub-function must come back up. This isn't tyranny; like arithmetic, it is a fact of existence that sub-patterns will be identified, coded, and re-used extensively. The more sub-patterns you can identify, the shorter your programs will be. All great programmers code in exactly the same way: instinctively seeking the minimal surface area of the program, balancing the number of sub-functions and placing them into layers so that they have as few interconnections as possible.
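As a toy illustration of the textual-expansion style (and the debugging cost) described above, here is a hypothetical sketch in Python; the `AREA` macro, the regex, and the `expand` helper are made up for the example and are not any real preprocessor's behaviour:
```python
import re

# A made-up, C-preprocessor-flavoured macro table: AREA(w, h) expands inline.
MACROS = {"AREA": "(({0}) * ({1}))"}

def expand(source: str) -> str:
    """Naive textual expansion: replace NAME(args) with the filled-in template."""
    def fill(match):
        name = match.group(1)
        args = [a.strip() for a in match.group(2).split(",")]
        return MACROS[name].format(*args)
    return re.sub(r"\b([A-Z]+)\(([^)]*)\)", fill, source)

program  = "total = AREA(3, 4) + AREA(5, 6)\nprint(total)"
expanded = expand(program)

print(expanded)   # total = ((3) * (4)) + ((5) * (6)) ...
exec(expanded)    # prints 42, but any traceback now points at the *expanded*
                  # text, i.e. the second form of the source you have to debug,
                  # as noted above.
```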
n
The things you said about macros only apply to textual/text-file languages wherein the expansion of a macro means inserting a bunch of text inline and feeding the result into a naive compiler. If you throw away implementation preconceptions and just think of a (parameterized) macro as "here is a pattern for achieving a behaviour, now fill in the blanks", then no problems immediately arise. The concept is not problematic, but realisations of it can be.
And I disagree with the rest of your assertions about call/return, but I already gave that counter-argument earlier in the thread.
e
@Nick Smith perhaps you can give an example of this other kind of macro you are referring to. I was referring to the classic macro/preprocessor systems used in MS Assembler, PL/1, and even C's preprocessor, which can be used to achieve inline functions. If you have a function that computes the area of a rectangle, it can be considered a macro if you expand it into inline code. A lot of clever compilers will look at short functions and inline them automatically. But my point is that repeated sequences of steps exist in every practical application. You don't just draw one of something; you often draw multiple copies of some widget, and the steps to draw that widget are encapsulated into a function/subroutine so as to centralize the code. How it gets called could vary by language. In Algol-family languages, you explicitly call the function(), but in other languages, which are deductive, the runtime system decides when it is time to call that function. I am a big fan of deduction; it saves a lot of work to have the computer decide when to call what. I have a preprocessor system designed for my Beads language, but have yet to use it. It comes up more when you are making variations of a system for different clients, and that is where #if allows easy customization.
s
I'll try and add my perspective as @Kartik Agaram suggested. Let's say you build a dataflow or reactive or (insert suitable style) system on top of Python or JS or (your language of choice). Now you run it, step through your code, and what do you see? Call/return functions! There's an error, what do you get? Call/return function stack traces! Wait, what happened to the dataflow objects, or the other objects of your programming model? They're there, but they don't correspond to the runtime/debugging/failure model. The standard answer is 'build extra tooling'. This is the additional overhead of mapping between the architectural model and call/return. A very common example here is async handlers: the call stacks are meaningless because the 'process of interest' is fragmented across multiple stacks and runtime state (the process of execution). The paper gives a few examples where call/return is not used in defining a system (e.g. constraint connectors and storage combinators). An excerpt from the constraint connectors language on page 12:
```
ivar:f |= (9.0/5.0) * ivar:c + 32 .
```
This expression describes the constraint between `f` and `c` and, importantly, does not employ call/return semantics. The key point is that you can build other models on top of call/return, so we end up doing it everywhere:
While there are few attempts to generalise from call/return to connectors, there are many; trying to simulate different architectural styles using call/return style programming is very common.
The main thing the paper is arguing for is more research into support for other styles natively, and not simulation by call/return. Call/return is a very specific kind of pattern - data in / data out - it's a calculator. But where does the data go? And where does it come from? The system is not a call/return function - but why is all the wiring and ongoing behavior implemented with call/return calculators?
💯 3
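A hypothetical miniature of that situation (not the paper's connector language, just a two-cell binding built out of ordinary Python functions): when the derived formula fails, the traceback is expressed in terms of `set()` and `recompute_f()` frames, i.e. the call/return plumbing, rather than the `c -> f` connector we were thinking in.
```python
import traceback

class Cell:
    """A toy reactive cell: observers are recomputed via plain function calls."""
    def __init__(self, value=None):
        self.value = value
        self.observers = []

    def set(self, value):
        self.value = value
        for recompute in self.observers:
            recompute()                      # call/return under the hood

c = Cell(0.0)
f = Cell()

def recompute_f():
    # Deliberate bug so the "dataflow" layer fails at runtime.
    f.value = (9.0 / 5.0) * c.value + "32"   # TypeError: float + str

c.observers.append(recompute_f)

try:
    c.set(100.0)
except TypeError:
    traceback.print_exc()   # frames for set() and recompute_f(), not the connector
```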
n
@Edward de Jong / Beads Project I'm using the term macro broadly. I could also be using the term "template" in its place. I'm referring to any language construct that describes a pattern with details to be filled in. Accordingly, my definition is independent of any compilation or execution details. My original assertion is that any realisation of this broad concept can fulfil the role that you claim functions with call/return semantics need to fill. Unfortunately, I can't give you a concrete example of a macro/template feature in a language that is implemented by a means other than functions or a debug-harming preprocessing step. But that's because the status quo of programming is functions (and sometimes preprocessing), which brings us full circle. Hopefully you can see that this broad definition admits other approaches for a language that doesn't include functions. A compiler with "smart preprocessing" that preserves the mapping between the original source and the (opaque, implementation-defined) runtime representation is probably the basis for any approach.
c
The problem with a lot of heterodox programming paradigms is they fundamentally obscure their own mechanics, because they have to run on an x86 processor (because x86 processors actually exist and are easily accessible to a lot of humans). My most hated experience in programming is hitting some leaky abstraction, and my program not doing what I expect for some reason that is literally impossible to work out from "within" the system. Eventually you have to dig around at a lower level to find out what's "actually" going on. For example, two functions might appear mathematically identical, but in one form the compiler is able to transform tail recursion into a loop, drastically changing what actually happens in terms of memory. So, my only contribution in this discussion is: if you are doing this, then please, please, please make it possible to seamlessly "peel away" these layers of abstraction. For example, in the temperature converter, in terms of actual physical reality, are there TWO places in memory, one storing 0 (Celsius) and one storing 32 (Fahrenheit), which are kept appropriately in sync, or is there ONE place in memory storing X, all operations on which are appropriately transformed? This distinction might seem trivial, but it is likely to have massive user-impacting consequences at some unanticipated place in the future.
👍 1
❤️ 1
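A small Python illustration of the gap Chris describes (Python never rewrites tail calls, so the difference is easy to observe): two mathematically identical definitions of a sum, one of which exhausts the call stack while the other runs in constant stack space.
```python
def sum_recursive(n, acc=0):
    """Tail-recursive on paper, but each call still consumes a stack frame."""
    if n == 0:
        return acc
    return sum_recursive(n - 1, acc + n)

def sum_loop(n):
    """The 'same' function after the rewrite a compiler might perform."""
    acc = 0
    while n > 0:
        acc, n = acc + n, n - 1
    return acc

print(sum_loop(200_000))             # fine: constant stack space
try:
    print(sum_recursive(200_000))
except RecursionError as e:
    print("blew the stack:", e)      # frames piled up instead of being reused
```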
i
@Chris Knott Are you familiar with this? https://queue.acm.org/detail.cfm?id=3212479
c
Yes, it is great, but I feel like people often use it in discussions as a kind of Continuum Fallacy: "You think your language is low-level? Ha! What about CPU cache side-channel attacks! C is no more low-level than Javascript!". Obviously, unless you are soldering wires you are abstracted away from what is actually happening in physical reality, but at the same time, clearly C is to a very great extent more "honest" about what is happening than e.g. Haskell or Lisp. This fact about C has obviously become much, much less true in the last few decades with multicore processors and deep memory cache hierarchies, which C dishonestly conceals, but to me that just means there is a need for a high-level language like C that is more honest, not a need for a more abstract language where physical reality "matters less", or for C compilers that slice and dice your code to a bewildering extent.
☝️ 1
To put it in more concrete terms, I think it is a bad feature of a language when you get StackOverflow questions from earnest users asking "Why is my program acting like this?!?", and then power users answer them with an answer from OUTSIDE the system ("Ah, see, this is a consequence of how CPython implements..."). When I was a novice programmer, I had an algorithm that was O(n) in CPython, and O(n**2) in Pypy. The reason was that CPython implements list.pop(0) as a pointer increase, whereas Pypy copies the memory to a new location. Discovering this information, i.e. discovering what was "actually happening" (obviously still an abstraction, just a less leaky one), was a process of digging around on Github, asking questions on dev IRC channels, etc. In the future of programming, this process of discovery should be seamlessly possible from within the environment. I should just be able to keep opening up the machine and "seeing how it works", until I hit (at least!) x86.
💯 2
i
Those are excellent points. My counter is that, with the skills I have, I would not be able to create a FoC language that allows someone to dig as far down as x86. By necessity, I need to make contributions at a level far removed from the true behaviour of the underlying machine. I will need to build on top of abstractions that already exist. If those existing abstractions are poor, I can't solve that problem myself, and I will almost certainly need to introduce my own abstractions explicitly to hide underlying issues (nobody wants to make the mess bigger, but sometimes you need to, given all the circumstances).
But perhaps with many people creating better things at the higher levels, it will motivate the people responsible for lower levels to improve their abstractions, like we've seen with GPU designers implementing new things in response to the demands of game developers, or folks working on RISC-V or ARM instruction sets removing complexities that don't fit with modern use cases. And then the compensatory intermediate abstractions can be peeled away (as we've seen with the new graphics APIs like Metal and Vulkan).
Yes, I'm just making a "perfect is the enemy of the good" argument. I agree broadly with all your points. I just don't think it's reasonable to say that all new FoC projects must or even should conform to them.
❤️ 3
s
So, my only contribution in this discussion is: if you are doing this, then please, please, please make it possible to seamlessly "peel away" these layers of abstraction.
In the future of programming, this process of discovery should be seamlessly possible from within the environment.
💯 Strongly agree with that. I call this permeable layers. However, I think we still need layers, and in fact better higher-level models. Programming x86 is too low-level, which is why compilers and higher-level languages are popular, of course. It's a tragedy that call/return is deeply embedded in the prevalent hardware; however, it's not the only thing available (there's also `jmp` 😉). It seems like there's a feedback loop that keeps some designs propagating: hardware influences software and software influences hardware. E.g. Intel is not going to release a chip that deviates from the call/return stack or uses tagged memory, simply because it would immediately become incompatible with all existing software. Industry will continue to evolve and extend C and C-like replacements. The only way out is to have different models in software first, mapped to the existing hardware. Over time new hardware might become economically feasible. In any case, good layers also let you stay at the higher level (they don't require dropping down) as much as they enable dropping down by choice. I'll leave this here: https://www.quora.com/People-who-are-really-serious-about-software-should-make-their-own-hardware-Why/answer/Alan-Kay-11 ("Computing is about _processes_")
1
👍 1
But perhaps with many people creating better things at the higher levels, it will motivate the people responsible for lower levels to improve their abstractions
Yes, this is what I was trying to say. I see Ivan's point too. I think the principle of permeable layers is solid (just like, in principle, we should redesign hardware too). Finding good higher-level programming models is very valuable and fully compatible with the principle itself (which is kind of a deferred goal).
c
I should clarify that in "real life" I am also totally "perfect is the enemy of good" and massively in favour of real tools that actually exist now and actually empower people. For example, I made a reasonably popular Python GUI library (https://github.com/ChrisKnott/Eel) that completely obscures its mechanisms. I get a lot of Github issues from people who have obviously run into trouble with this incredibly leaky abstraction. However, I am also often heartened by just how "bad" most of my users are: people who aren't interested AT ALL in "good programming" and just have some problem they need fixed as easily as possible. E.g. this guy - https://github.com/samuelhwilliams/Eel/issues/264 - hit some unsolvable problem today with my library, dodged it himself in some hacky way and went on his way. From his profile it's obvious he is a Psychology professor (http://www.reading.ac.uk/psychology/about/staff/a-haffey.aspx - "I am studying individual differences in human reward processing. I am specifically interested in the question of how autistic traits influence social and nonsocial reward processing.") and was just making some temporary program as part of an experiment. It would be borderline immoral to burden people like this with Rust ownership semantics. At the same time, ideally, if he wanted he should be able to dig through those layers.
❤️ 1
👍 2
s
Yeah, programming is a means to an end, which is some other human endeavor. Hmm, I'm now wondering if 'looking below' the layers of abstraction is just a personal preference of us 'programmers'. I mean, most people who drive cars don't need to know the mechanisms; they just need to learn the behaviors for the context they encounter: how the car reacts to pushing the pedals, acceleration, braking, skidding, etc. (note: a skid is a 'leak' in the 'brake' abstraction). A complete layer of abstraction is perhaps a higher goal than a permeable one.
k
At the risk of sounding like a broken record, @Chris Knott my project is doing precisely this sort of transparency all the way down to x86. https://futureofcoding.slack.com/archives/C5T9GPWFL/p1584343418341800?thread_ts=1584343418.341800 https://futureofcoding.slack.com/archives/C5T9GPWFL/p1578083883321500?thread_ts=1578008614.279100 @Ivan Reese:
with the skills I have, I would not be able to create a FoC language that allows someone to dig as far down as x86.
Please build your prototypes atop Mu! It's nowhere near ready yet. I have no idea how to do graphics or sound. But a forcing function would be helpful. For example, perhaps there's a minimal set of sound syscalls I could provide that would enable text-mode UIs for music.
@Chris Knott:
It would be borderline immoral to burden people like this with Rust ownership semantics. At the same time, ideally, if he wanted he should be able to dig through those layers.
From my abstract:
The hope is that rewarding curiosity will stimulate curiosity in a virtuous cycle, so that more people are motivated to study and reflect on the difference between good vs bad design and good vs bad architecture, even as the study takes place over a lifetime of specialization in other domains.
👍 1
@shalabh, the synthesis between your (paraphrasing of the paper's) point and Chris's seems to be:
a) It's good for abstractions to terminate in the machine model. That's a feature, not a bug.
b) There are some missing layers in between, where you could step through your programming model in its own terms.
This feels like a gap in tooling that necessarily depends on whoever is providing the model. Which is why I was so confused that the paper was blaming call/return. It's saying that implementors of new models are still cognitively captured by the model they're implementing in. Einstein was troubled by the implications of his theories in ways that we are quite comfortable with. Jesus was Jewish.
w
Note, the "tyranny" of call/return is not just enforced by CPUs, it is also enforced by compilers like LLVM. The moment your language implements a calling/control-flow structure not representable in LLVM by call/return, you immediately take a 10x or worse hit to performance, since now LLVM's suite of optimisations doesn't apply to your alternative structure (which is likely emulated by using memory directly). So to escape the tyranny of call/return while staying competitive, you need to be willing to put in similar effort to create a new compiler infrastructure.
👆 1
i
Well, yes and no. What if your structure is representable by GPU ops, or SIMD ops?
w
GPUs and SIMD are even less flexible than general CPU instructions or whatever LLVM IR can represent? How would they help?
i
They offer alternatives to CPU norms, and are faster for certain kinds of work. It's possible (though — sure — unlikely) to imagine implementing a novel programming model (or part of one) in terms of these. I mean, after all, look at what folks are doing with ML on GPUs. LLVM IR is not a great substrate. For instance, Lattner designed https://mlir.llvm.org as an even more general representation — even he wasn't happy being constrained to LLVM IR. So your point is a good one — sometimes escaping tyranny requires changing the compiler, indeed. Nice work if you can get it. (Also — note that MLIR includes support for dataflow right in the IR.)
s
Somewhat tangential, but this blurb by Mike Pall (author of Luajit) is interesting: TL;DR the layers of abstraction just above/under the machine code look wasteful as well: https://www.freelists.org/post/luajit/Ramblings-on-languages-and-architectures-was-Re-any-benefit-to-throwing-off-lua51-constraints
Also, lets not forget FPGAs.
👍 1
w
How does SIMD offer an alternative to call/return? I'm confused... And MLIR improves on the generality of the IR representation; it doesn't necessarily change anything about the computational model it represents.
i
SIMD offers an alternative to the perf hit of not aligning with LLVM — if your new language primitive can be expressed efficiently in terms of SIMD, then missing out on the LLVM optimizations doesn't matter to you. This is of course just a theory. For one, I don't have any particular primitive in mind. For two, I don't know enough about LLVM — perhaps there is, in fact, no computation that can be accelerated by SIMD that wouldn't be meaningfully slowed for lack of compiler optimizations.
n
Also I'll just point out that I'm getting a vibe from a lot of people that their "future of coding" involves C-like performance. Most applications aren't limited by operations per second... so I'm puzzled by why people are limiting their visions to ones with a clear mapping to C-like constructs. Only asymptotic complexity matters broadly, since that is what separates the possible from the impossible.
👍 4
e
LLVM, with its gigantic array of optimizations, is insanely complex, with diminishing benefits for ordinary programs. I once did a test of a huge program and compiled it with lots of optimizations, and then compiled it down to 486-era instructions only. About 1% difference. The sad truth is that CPUs are so fast now that both sides of a branch are executed, and many things that used to cost don't; what determines your overall performance is how you have your data arranged in memory. If you can keep from hopping around RAM and stay in the caches, your program runs like the wind. That's why some C programs are so fast: they were laid out very carefully from a data-structure point of view. And it's why Java is notoriously slow: it sprays objects all over the heap.
99% of computers are idle nowadays; outside of big data and very specialized applications, performance is really the last place to put your effort. Ease of maintenance trumps CPU costs in the vast majority of applications. That being said, there is no excuse for the sweathogs that we see today, where hundreds of megabytes are used to draw a single tab window in Chrome, etc.
It is also true that CPUs are often advertised as being much faster than they really are. I once did a benchmark on a fairly CPU-intensive app, and found that an E3 was faster than a far more expensive Xeon processor with many cores, because this was a mostly single-threaded program, and adding all those cores trades away single-threaded performance. I also discovered that some programs are faster with hyperthreading turned off, particularly those that do lots of interrupts, because pretending you have more cores than you really do entails tradeoffs as well. Intel's product line is bewildering, and a lot of products are sold for huge premiums just because they have some super-rare instructions that spy agencies want. There's lots of nonsense out there. Intel hasn't added a useful instruction since they added true random numbers; most of the new ones are crazy, designed primarily to waste the engineering time of the cloners in China.
w
Reminds me of a Ruby interpreter story from a while back. (I forget the details.) But the gist is that the struct for a Ruby object is generally five words in size. Padding the size to eight words yielded a performance improvement from lining up better with the CPU caches.
e
Alignment used to matter quite a bit. I don't think it does any more, because so much pipelining is happening. They have so many billions of transistors to throw around, but RAM is holding the whole thing back. RAM speeds have only slightly increased compared to CPU power; billions await the company that creates a breakthrough. DRAM has hardly changed, and it is the main cost in servers. A few years ago, Samsung had a terrible recall for the Note cellphones that caught on fire; they lost billions and made it up by tripling RAM prices, and then everyone else raised their prices. It took years for prices to come back down again. There is a revolution coming where RAM will no longer be volatile. That is a very interesting change that I have planned for in my Beads system. Writing to the disk will be a thing of the past.
c
Also I'll just point out that I'm getting a vibe from a lot of people that their "future of coding" involves C-like performance. Most applications aren't limited by operations per second... so I'm puzzled by why people are limiting their visions to ones with a clear mapping to C-like constructs. Only asymptotic complexity matters broadly, since that is what separates the possible from the impossible.
I didn't mean to suggest that C-like performance is necessary. In fact, I think optimising compilers are very bad for understanding. The examples I gave were cases of asymptotic complexity being different, and in high-level languages. Shalabh's async stack trace example is one that has nothing to do with performance. For me it's not about the performance so much as the tension of presenting a model of computation that is fundamentally at odds with reality. The disagreement is whether or not this matters. The answer to this question basically comes down to how airtight your abstractions are. My main point is that these issues can be avoided if you just make it possible to draw back the curtain. If your abstraction is blocking my understanding, let me easily remove it. If your abstraction is not blocking my understanding, whether or not I can remove it is moot, because I won't want to.
n
If a language has a bad abstraction, then the solution is to replace it with a better abstraction, not expose the machine code (the "reality") that the bad abstraction compiles to. I can't think of any situation in which exposing the machine code is a good idea, because if the user has to think about machine code, then my language has failed to achieve its purpose, which is to be a new foundation for specifying & reasoning about computation. What I do want to expose via my programming environment is the time & space complexity of programs, so the user can reason about what is possible and impossible. If the language is designed well, then the environment can provide an intuitive explanation for why a program has a particular complexity.
c
Possibly a failure of imagination on my part, but I guess I'm struggling to understand what you mean by "time", if not number of CPU instructions, and "space" if not number of memory bits. If you do mean these, then isn't your abstraction tied tightly to "C concepts" as well...?
n
The CS notions of time and space complexity have never been based on CPU instructions or memory bits. They talk about the rate of growth of time and space measured in abstract "operations" and "stored data" relative to the "input size" of your program, the concrete units being irrelevant (since we're interested in the slope).
You can't provide an exact number for operations or data storage regardless of what units you use, until code actually runs. The number of concrete operations in any program with a branch instruction varies with the input, but its (worst case) time complexity will be fixed, across all inputs.
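A small worked example of that point (the `linear_search` helper and its step counter are made up purely for illustration): the concrete number of operations varies with the input, but the worst-case time complexity is fixed at O(n).
```python
def linear_search(items, target):
    """Concrete work depends on the input; the worst case is still O(n)."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            return steps      # found early: few concrete operations
    return steps              # not found: n concrete operations

data = list(range(1_000))
print(linear_search(data, 0))     # 1 step     (best case for this input)
print(linear_search(data, 999))   # 1000 steps (the O(n) worst-case bound)
```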
c
Yes, but specifically in your language, how are you measuring "operations"? Surely in a way that maps very closely to x86 instructions? Otherwise, what's the value of this information? Obviously it's possible to define some notion of "operation" that treats add(a, b), sqrt(n) and factorize(n) all as "one operation", and then discuss time complexity in terms of these - but this is not an analysis that is particularly useful. We have actually built machines that map "memory space" to metres (RAM), and "instruction count" to seconds (CPUs), which is why time and space complexity analysis (in these terms) is useful to humans.
n
I’ll ask you the same question: how do you measure operations in a C program? An x86 program? How useful is that measure? Whatever measure I choose would be no worse.
For that reason, I’m not going to tell users how long a program did/might run by operation count, like “173”. It’s not useful. I could tell them the runtime in seconds, if they need to benchmark.
(Spoiler: you can’t specify how long an instruction is going to take across all CPUs, not even all x86 CPUs, nor all Intel x86 CPUs. It varies by model. You also can’t predict how long any memory access is going to take in any multi-threaded or multi-process system, which is all consumer systems).
c
The sentence I was struggling with was;
What I do want to expose via my programming environment is the time & space complexity of programs, so the user can reason about what is possible and impossible
Perhaps I will have to wait to see a concrete example of the kind of environment that you mean
n
You could, but I'm talking about the distant future. Designing a programming language by oneself is a Herculean effort. By that sentence, I mean literally O(n), O(n log n), etc., but with a better syntax, not math notation.