guitarvydas
01/22/2023, 2:06 PM
... function-name (...) {...}
In Ohm-JS, you can say this, whereas in a CFG-based parser generator you need to over-specify all of the niggly bits of C syntax. In Ohm-JS, this results in a few minutes of work and only a few lines of code. The Ohm-Editor assists in developing the micro-grammar.
In YACC and CFG-based approaches, though, you’re looking at a gargantuan job (days, weeks, months, ...) and you simply don’t bother to write such a quickie parser. You either drop the whole idea, or you use something like REGEX, which fails on a number of edge cases for this kind of thing. REGEX can’t search recursively for matching brackets; Ohm-JS can. Using REGEX, you might get away with a partial solution, or the project might grow larger as you hit unexpected speed bumps. You either persevere or you just give up.
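To make the bracket point concrete, here is a minimal Python sketch (not Ohm-JS, and not the author's code). It contrasts a naive regular expression, written for Python's built-in `re` module, which has no recursion, with a small depth-counting matcher. The function name `matching_brace` and the sample text are made up for illustration, and the sketch ignores brackets hidden inside strings and comments.

```python
import re

def matching_brace(text, start):
    """Return the index of the '}' that closes the '{' at text[start]."""
    if text[start] != "{":
        raise ValueError("text[start] must be '{'")
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced braces")

src = "f (x) { if (x) { g (); } h (); } tail"

# A non-greedy regex stops at the first '}', which closes the inner block.
naive = re.search(r"\{.*?\}", src).group(0)

# The depth counter finds the brace that really closes the outer block.
start = src.index("{")
end = matching_brace(src, start)
balanced = src[start:end + 1]

print(naive)
print(balanced)
```

The regex returns a truncated block, while the depth counter returns the whole outer block.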
For the record, the grammar plus the accompanying code fabricator specification for the above simple example are shown in the appendix.
### DaS Comes For Free
When you can build totally isolated building blocks, you can draw sensible diagrams of how the building blocks should be snapped together to solve a problem.
Later, you can steal (cut/copy/paste) chunks of previous solutions and use them as building blocks for new problems.
DaS: Diagrams as Syntax.
DaS is not diagrams as an Art Form. DaS is diagrams as programming languages. For example, instead of writing {...}, you draw a rectangle.
Programming languages were created by culling the English language and by choosing only the words and phrases that could be compiled to executable code.
Can we cull diagrams in the same way to invent new programming languages?
EEs have done this and they call the resulting diagrams “schematics”.
Building construction engineers have done this and call the resulting diagrams “blueprints”.
## Don’t We Already Use Building Blocks?
“Code Libraries” look like building blocks, but, contain subtle bits of coupling that discourage building-block-iness.
For example, the very common idiom of a function call f(x) introduces at least 3 kinds of coupling:
1. The name f is hard-wired into the caller’s code. The calling code cannot be cut/copy/pasted into some other solution without also dragging in the called code or futzing with the source code.
2. The function call f(x) waits for the callee to return a value. This is also known as blocking. Function call notation works fine on paper, where functions can be evaluated instantaneously. It’s different when you map function call syntax onto hardware that has propagation delays, wherein functions take finite amounts of time to “run”. This subtle difference in behaviour leads to hidden gotchas. A glaring example of the impact of such a difference can be seen in the Mars Pathfinder disaster[^pathfinder].
3. The function return v = f(x) hard-wires a routing decision into the callee’s code. The callee must direct its response back to the caller. This is called “returning a value”. Again, this doesn’t look like a problem when you just want to build fancier calculators, but this hard-wired routing decision discourages simple solutions to non-calculator problems, like machine control.
[^pathfinder]: https://www.rapitasystems.com/blog/what-really-happened-software-mars-pathfinder-spacecraft
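The naming and routing couplings listed above can be sketched in Python. This is only an illustration under assumptions: the `Part` class, the `run` function and the wiring table are hypothetical names invented here, not an API from the post. The idea is that a part sends data to one of its own ports, and a separate wiring table decides where the data goes.

```python
from collections import deque

# Style 1: call/return. The caller's text names the callee, waits for it,
# and receives the answer back at the call site.
def double(x):
    return 2 * x

def coupled(x):
    return double(x) + 1          # cannot be reused without "double"

# Style 2 (hypothetical): a part knows only its own port names.
class Part:
    def __init__(self, react):
        self.react = react        # react(part, port, data)
        self.outbox = deque()     # (port, data) pairs awaiting routing

    def send(self, port, data):
        self.outbox.append((port, data))

def run(parts, wiring, start, data):
    """Route messages by the wiring table; unwired outputs are the results."""
    queue = deque([(start, "in", data)])
    results = []
    while queue:
        name, port, value = queue.popleft()
        part = parts[name]
        part.react(part, port, value)
        while part.outbox:
            out_port, out_value = part.outbox.popleft()
            target = wiring.get((name, out_port))
            if target is None:
                results.append(out_value)
            else:
                queue.append((target[0], target[1], out_value))
    return results

parts = {
    "doubler": Part(lambda part, port, v: part.send("out", 2 * v)),
    "incr": Part(lambda part, port, v: part.send("out", v + 1)),
}

# The same two parts, snapped together in two different orders.
double_then_incr = run(parts, {("doubler", "out"): ("incr", "in")}, "doubler", 5)
incr_then_double = run(parts, {("incr", "out"): ("doubler", "in")}, "incr", 5)
```

Neither part mentions the other by name, so rearranging them means editing the wiring table and nothing else.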
When you don’t have complete isolation, you don’t have building blocks. Imagine a LEGO® set where all the pieces are joined together with a single, long sewing thread glued to each LEGO® block. Or, you have two real-world objects, e.g. one apple and one orange. You cut the apple in half. What happens to the orange?
As humans, we are used to the idea that objects are completely isolated. Programs don’t work that way. We have to stop and think hard when writing programs.
Cfunctions {
  program = item+
  item =
    | comment
    | string
    | applySyntactic<FunctionDecl> -- decl
    | any -- other
  FunctionDecl = name "(" param+ ")" "{" block+ "}"
  param =
    | "(" param+ ")" -- nested
    | ~"(" ~")" any -- flat
  block =
    | "{" block+ "}" -- nested
    | ~"{" ~"}" any -- flat
  name = letter (alnum | "_")*
  comment =
    | "//" (~nl any)* nl
    | "/*" (~"*/" any)* "*/"
  string =
    | bqstring
    | dqstring
    | sqstring
  bqstring = "`" (qbq | (~"`" any))* "`"
  dqstring = "\"" (qdq | (~"\"" any))* "\""
  sqstring = "'" (qsq | (~"'" any))* "'"
  qbq = "\\" "`"
  qdq = "\\" "\""
  qsq = "\\" "'"
  nl = "\n"
  spaces += comment
}
Can this grammar be improved and optimized? Probably. But, why would you care?
You would care only if you used this code in an end-user product.
If you use this code in something like a batch-editing environment, “efficiency” takes on a different meaning. End-users don’t care about the efficiency of your code editor and its Find-and-Replace function. End-users don’t care how efficient your command line tools, like grep, are.
When you treat Ohm-JS + Fab as batch editors for development, then, only development efficiency matters.
I strongly believe that one shouldn’t write code. One should write code that writes code. From this perspective, “efficiency” breaks down into 2 camps:
1. developer efficiency
2. end-user efficiency.
Note that traditional compilers are simply apps that write code. Developers use compilers. End-users don’t care if a developer created end-user app code by hand or by using a compiler. The only things that end-users care about are whether the app is cheap and whether it runs on cheap hardware. The final app is assembler, regardless of how it was created. Developers, on the other hand, do care about development time and effort. Hand-writing apps requires much more effort than using high-level language compilers to generate the final app code. Debugging apps is easier when using high-level languages with type-checkers. On the other hand, developers usually buy fancier hardware than that which is used by end-users. Developers can afford to burn CPU cycles on their fancy hardware to give themselves faster - and cheaper - development and debugging times.
The final step in development is that of Production Engineering an app to make it cheap enough to sell. Up until that point, the development workflow should consist of anything that speeds up and cheapens development, for example, dynamic language environments and REPLs. Rust, for instance, is a Production Engineering language and needn’t be used until the last moment.
## c.fab
A .fab file is a specification that creates strings based on the above grammar. Fab is an experimental transpiler tool that works with Ohm-JS. It generates the JavaScript code required by Ohm-JS. This could all be done with off-the-shelf Ohm-JS; Fab simply reduces the amount of keyboarding needed to create the JavaScript “semantics” code that Ohm-JS requires. Fab is written in Ohm-JS.
Cfunctions {
  program [item+] = ‛«item»'
  item_decl [x] = ‛«x»'
  item_other [x] = ‛'
  FunctionDecl [name lp param+ rp lb block+ rb] = ‛\n«name»'
  param_nested [lp param+ rp] = ‛'
  param_flat [c] = ‛'
  block_nested [lp block+ rp] = ‛'
  block_flat [c] = ‛'
  name [letter c*] = ‛«letter»«c»'
  comment [begin cs end] = ‛'
  nl [c] = ‛«c»'
  spaces [cs] = ‛«cs»'
  bqstring [begin cs* end] = ‛'
  dqstring [begin cs* end] = ‛'
  sqstring [begin cs* end] = ‛'
  qbq [bslash c] = ‛'
  qdq [bslash c] = ‛'
  qsq [bslash c] = ‛'
}
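For readers who do not use Ohm-JS, the combined effect of the grammar and the .fab file (keep the function names, drop everything else) can be loosely approximated in plain Python. This is a hand-written sketch, not a translation of either file. The helper names are invented here, identifiers are limited to ASCII, and empty parameter lists and bodies are accepted even though the grammar's `param+` and `block+` require at least one item.

```python
import re

NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")   # roughly: name = letter (alnum | "_")*

def skip_balanced(text, i, open_ch, close_ch):
    """If text[i] opens a bracket, return the index just past its match, else -1."""
    if i >= len(text) or text[i] != open_ch:
        return -1
    depth = 0
    for j in range(i, len(text)):
        if text[j] == open_ch:
            depth += 1
        elif text[j] == close_ch:
            depth -= 1
            if depth == 0:
                return j + 1
    return -1   # ran off the end: unbalanced

def skip_noise(text, i):
    """Return the index just past a comment or string starting at i, or i if none."""
    if text.startswith("//", i):
        end = text.find("\n", i)
        return len(text) if end == -1 else end + 1
    if text.startswith("/*", i):
        end = text.find("*/", i + 2)
        return len(text) if end == -1 else end + 2
    if text[i] in "\"'`":
        j = i + 1
        while j < len(text) and text[j] != text[i]:
            j += 2 if text[j] == "\\" else 1   # step over escaped characters
        return min(j + 1, len(text))
    return i

def skip_spaces(text, i):
    while i < len(text) and text[i].isspace():
        i += 1
    return i

def function_names(text):
    """Collect every name that is followed by (...) and then {...}."""
    names, i = [], 0
    while i < len(text):
        past_noise = skip_noise(text, i)
        if past_noise != i:
            i = past_noise
            continue
        m = NAME.match(text, i)
        if not m:
            i += 1            # like the grammar's "any -- other" case
            continue
        after_params = skip_balanced(text, skip_spaces(text, m.end()), "(", ")")
        after_body = -1
        if after_params != -1:
            after_body = skip_balanced(text, skip_spaces(text, after_params), "{", "}")
        if after_body == -1:
            i = m.end()       # not a declaration; move on
        else:
            names.append(m.group(0))
            i = after_body    # the whole body is consumed in one step
    return names

sample = '''
/* helper (not) { a function } */
static int add (int a, int b) { return a + b; }
// main (x) { also not a function }
char *s = "fake (x) { y }";
int main (int argc, char **argv) {
  if (argc > 1) { return add (argc, 1); }
  return 0;
}
'''
print(function_names(sample))
```

Like the grammar, this treats any top-level `name (...) {...}` as a declaration, so it does not look inside bodies and it would be fooled by brackets inside strings or comments within a body.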
## grep.c
The above was tested against grep.c from the GNU grep repo.
git clone https://git.savannah.gnu.org/git/grep.git
## Even Smaller
I’m playing with the design of a new tool that I call bred (bracket editor). It’s like a super-simple batch editor that walks through text that contains bracketed constructs.
The full specification consists of 2 strings
1. what to match
2. how to rewrite it.
The above specifications might be re-expressed as:
‛«name» («params») {«block»}'
‛«name»'
which reads as:
1. match, recursively, anything that looks like «name» («params») {«block»}
2. then, throw away everything except the name
Currently, my concepts have warts - what happens when a comment or a string or a character constant contains brackets, or, even worse, what happens if they contain unmatched brackets?
Kartik Agaram
Vijay Chakravarthy
01/22/2023, 10:39 PM
guitarvydas
01/23/2023, 12:19 PM
Marcel Weiher
01/30/2023, 8:46 AM
wtaysom
01/30/2023, 4:09 PM
y = f(x) but call/return unnecessarily couples how you enforce the relation, namely, by fixing y based on f of a precomputed x. You may do it eagerly, you may do it lazily, but you're still committed. With Prolog you can leave variables unbound. With bidirectional transformations / lenses you can update x from changing y. And there are more possibilities.
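As a small illustration of updating x from a changed y, here is a lens-like Python sketch. The class name and the temperature example are invented for illustration and are not taken from any lens library. The relation f = c * 9/5 + 32 is stated once, and either side can be the one that changes.

```python
class CelsiusFahrenheit:
    """Keeps the relation f = c * 9/5 + 32 without fixing a direction."""

    def __init__(self, celsius):
        self.celsius = celsius

    @property
    def fahrenheit(self):
        # "get": derive y from x
        return self.celsius * 9 / 5 + 32

    @fahrenheit.setter
    def fahrenheit(self, value):
        # "put": update x from a changed y
        self.celsius = (value - 32) * 5 / 9

t = CelsiusFahrenheit(100)
boiling_f = t.fahrenheit      # read y from x
t.fahrenheit = 32             # change y ...
freezing_c = t.celsius        # ... and x follows
```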
I've long been curious about decoupling relations over state from evaluation/update mechanisms.
By the way, Common Lisp's resumable exception handling mechanism is called the "Condition System."
guitarvydas
01/30/2023, 4:43 PM
Marcel Weiher
02/05/2023, 1:31 PM
guitarvydas
02/05/2023, 8:51 PM
I will try to whip up an example in some example language ...!
simple example
# Leaf component A: on any message, emit 'v' then 'w' on pin 'out'.
from leaf import Leaf

class A (Leaf):
    def __handler__ (self, message):
        self.send (xfrom=self, portname='out', data='v', cause=message)
        self.send (xfrom=self, portname='out', data='w', cause=message)

# Leaf component B: forward input to 'out'; on 'in', also emit 'z' on 'feedback'.
from leaf import Leaf

class B (Leaf):
    def __handler__ (self, message):
        if (message.port == 'in'):
            self.send (xfrom=self, portname='out', data=message.data, cause=message)
            self.send (xfrom=self, portname='feedback', data='z', cause=message)
        elif (message.port == 'fb'):
            self.send (xfrom=self, portname='out', data=message.data, cause=message)
        else:
            raise Exception (f'internal error: unhandled message in B {message}')

# Container Top: wires A and B together, including B's feedback loop.
from sender import Sender
from receiver import Receiver
from up import Up
from down import Down
from across import Across
from container import Container
from a import A
from b import B

class Top (Container):
    def __init__ (self, parent, name):
        a = A (self, f'{name}/a')
        b = B (self, f'{name}/b')
        self._children = [a,b]
        self._connections = [
            Down (Sender (self,'in'), Receiver (a,'in')),
            Across (Sender (a,'out'), Receiver (b,'in')),
            Across (Sender (b,'feedback'), Receiver (b,'fb')),
            Up (Sender (b,'out'), Receiver (self,'out'))
        ]
        super ().__init__ (parent, name, self._children, self._connections)
This example shows a small, 2-component feedback network.
The code does nothing useful, but, it demonstrates message feedback.
The problem statement:
• When A gets a message on its pin ‘in’, it produces 2 messages ‘v’
and ‘w’ in that order.
• When B gets a message on its pin ‘in’, it outputs the message on its
pin ‘out’ AND it produces a ‘z’ message on its pin ‘feedback’.
• When B gets a message on its pin ‘fb’, it outputs the message on its
pin ‘out’ (only).
The result of the system is 4 messages ‘v’, ‘w’, ‘z’, ‘z’ in that
order (left to right).
... for more details, see https://github.com/guitarvydas/py0d/blob/feedback/README.md (note that this is the “feedback” branch of that repo)
Feedback - why bother? In electronics, it is common to use feedback to self-regulate (“negative feedback”). In software, recursion (which only LOOKS like feedback) is used only as a form of divide-and-conquer.
The difference between Recursion and Feedback is the delay imposed by queuing. Recursion is processed immediately in a LIFO manner, whereas Feedback messages are put into a queue in FIFO order, to be processed when their time comes. It’s like someone waiting patiently in a lineup versus someone jumping the queue and going to the front of the line.
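The ordering difference can be seen in a few lines of standalone Python. This is a sketch written for this explanation, not the py0d implementation. `b_handler` mimics component B from the example, and the two runner functions are hypothetical names for the two delivery disciplines.

```python
from collections import deque

def b_handler(port, data):
    """Component B from the example: returns a list of (port, data) outputs."""
    if port == "in":
        return [("out", data), ("feedback", "z")]
    if port == "fb":
        return [("out", data)]
    raise ValueError(f"unhandled port {port!r}")

def run_with_queue(inputs):
    """Feedback: messages to self wait their turn at the back of a FIFO queue."""
    queue = deque(("in", d) for d in inputs)
    seen = []
    while queue:
        port, data = queue.popleft()
        for out_port, out_data in b_handler(port, data):
            if out_port == "feedback":
                queue.append(("fb", out_data))
            else:
                seen.append(out_data)
    return seen

def run_with_calls(inputs):
    """Recursion: a message to self is handled at once, before older work."""
    seen = []

    def deliver(port, data):
        for out_port, out_data in b_handler(port, data):
            if out_port == "feedback":
                deliver("fb", out_data)
            else:
                seen.append(out_data)

    for d in inputs:
        deliver("in", d)
    return seen

print(run_with_queue(["v", "w"]))   # ['v', 'w', 'z', 'z']
print(run_with_calls(["v", "w"]))   # ['v', 'z', 'w', 'z']
```

With the queue, both 'z' messages wait behind 'w'. With direct calls, the first 'z' jumps ahead of 'w'.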
Stuff like this matters when you are building sequencers instead of calculators.
The Architect can be very explicit in the design instead of having a certain semantics built into the lower-levels of the tool. Loops (not Recursion) become explicit messages-to-self. If the Architect really, really, really wants a Stack, the Architect builds it explicitly and gives it the desired semantics, instead of relying on the built-in call-stack to do the work implicitly.
Marcel Weiher
02/07/2023, 10:15 AM
guitarvydas
02/07/2023, 10:46 AM
Marcel Weiher
02/09/2023, 8:07 AM
guitarvydas
02/13/2023, 8:48 AM
Marcel Weiher
02/13/2023, 10:42 AM
guitarvydas
02/14/2023, 11:47 AM
Make it possible to directly write down non-call/return things.
number of inputs = m
number of outputs = n
where
0 <= m <= infinity
0 <= n <= infinity
where one input is a block of data that arrives simultaneously (“at the same time”, regardless of how you wish to destructure it (e.g. in f(a,b,c), “a,b,c” is but one block of data which is destructured into 3 elements a,b,c (this is how parameters are implemented in Assembler)))
and one output is a block of data that is sent simultaneously
(note that “functions” imply m = 1, n = 1, where the input is totally synchronous and the output is totally synchronous)
(note that daemons have m = 0 when in steady-state)
(note that buffered text filters have n = 0 most of the time, and n = 1 when they want to say something)
You can express these kinds of things in text, but the result is a mish-mosh, IMO. Diagrams express this kind of thing better.
In my nomenclature: Thing = Component.
FTR - parsing technical diagrams ain’t much harder than parsing technical text.
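The three cases mentioned (functions, buffered text filters, daemons) can be written down in plain Python as a rough sketch. All names below are invented for illustration, and returning a list of output blocks is only one possible way to model "0 or more outputs per input".

```python
def as_function(block):
    """m = 1, n = 1: one input block, destructured into a, b, c; one output block."""
    a, b, c = block
    return [a + b + c]

class LineBuffer:
    """A buffered text filter: most input blocks produce no output (n = 0)."""

    def __init__(self):
        self.pending = ""

    def handle(self, chunk):
        """Accept one input block; return a list of 0 or more output blocks."""
        self.pending += chunk
        pieces = self.pending.split("\n")
        self.pending = pieces.pop()     # keep the unfinished tail for later
        return pieces                   # complete lines, possibly none

def ticker(count):
    """m = 0: a daemon-like source that emits blocks without any input."""
    for i in range(count):
        yield f"tick {i}"

buf = LineBuffer()
first = buf.handle("he")           # nothing to say yet
second = buf.handle("llo\nwor")    # one complete line is ready
third = buf.handle("ld\n")
ticks = list(ticker(2))
```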
Marcel Weiher
02/15/2023, 1:45 PM
Ironically, all of the Smalltalks that I know about implement so-called “message sending” using synchronous CALL/RETURN.
Yes. And in fact, Alan’s famous quip “I made up the term ‘object-oriented’, and I can tell you I did not have C++ in mind” was followed immediately with the far less quoted “The important thing here is I have many of the same feelings about Smalltalk”.