I have a simple yet annoying validation problem Can any of y Future of Coding #present-company

I have a simple, yet annoying validation problem. ...

wtaysom

01/21/2021, 6:55 AM

I have a simple, yet annoying validation problem. Can any of you think of a language/framework that makes validating the following sort of schema as clear (or clearer) than the English language specification? Suppose we have a playhouse with a bunch of rooms. In each room, we want there to be a box, a ball or two, and no other toys. If there are two balls, one should be red and the other should be blue. Report all the rooms that have the wrong toys and exactly what's wrong.

wtaysom

01/21/2021, 7:04 AM

And yes, coding this up earlier today fizzed my buzz: 34 lines, half an hour, several mistakes.

Emmanuel Oga

01/21/2021, 7:29 AM

relaxng (XML validator) has an interleave operator for this sort of thing. Its been a while since I last wrote a gramamr but I think it would look something like this:

Copy code

grammar {
  start = element house { room*}
  room = element room { box & ((red & blue?) | (blue & red?)) }
  box = elem box { "box" }
  red = elem ball { "red"}
  blue = elem ball { "blue" }
}

👍 2

Emmanuel Oga

01/21/2021, 7:30 AM

or else any other grammar would help (ANTLR, PEG, etc):

Copy code

grammar = "house" room*
room = "box" ("red ball" "blue ball"? | "blue ball" "red ball"?)

Emmanuel Oga

01/21/2021, 7:30 AM

I think it is useful to think in terms of grammars because then becomes obvious when the thing you are trying to validate is not context free

👍 1

Emmanuel Oga

01/21/2021, 7:31 AM

in XML world, once you need to validate something were a context free grammar is not enough, you reach for Schematron.

Emmanuel Oga

01/21/2021, 7:37 AM

hah had to edit my grammars. Not sure if they match your requirements exactly but you get the idea 😛

Jack Rusher

01/21/2021, 8:32 AM

+1 @Emmanuel Oga's suggestion of grammars, to which I'd add any other automata generating-abstractions. For example, regular expressions over the domain of toys (as opposed to the domain of characters).

Don Abrams

01/21/2021, 11:58 AM

This writeup expresses the cleanest approach I know of: https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/

Garth Goldwater

01/21/2021, 2:40 PM

the trickiest part of this is the “report all rooms that have the wrong toys and exactly what’s wrong”—i’d be tempted to reach for the really dynamic grammars that let you pretty easily insert whatever side effects/logic you want, like OMeta/ohm or rebol/red’s parse dialect. do you just want console output or structured data about the errors?

🤔 1

Jack Rusher

01/21/2021, 3:54 PM

After someone asked in private what I was on about up above, I offer as an example this "regular expressions over sequences" library in Clojure: https://github.com/cgrand/seqexp The relationship between formal languages and automata is a worthwhile area of computer science study for practical minded practitioners. 🙂

Vijay Chakravarthy

01/21/2021, 8:41 PM

Wouldnt a datalog or mini-kanren style approach be useful?

👍 1

wtaysom

01/22/2021, 2:22 AM

Ah @Garth Goldwater, OMeta haven't heard that in a while! Yes friends, an important part here is the detailed error reporting. I'll need to read Parse, don't validate @Don Abrams. Seems many non-trivial programming problems have a parsing (state machine) feel, and at one level we have good parsing tools though they seem to be underutilized in practice. One interesting thing here that counts matter and order does not. (Of course, one quick fix is to sort the list of toys in a room and then match on that sequence.) As for error reporting, here's an idea. Imagine an interactive grammar generating tool. We start with golden path of validity (imagine a sorted list), something like:

Copy code

Room = "box" ("blue ball" "red ball" | Color " ball")

Then the tool would propose invalid sequences for which we have to generate error description productions. That would be neat. I suppose other perennial parsing problems are tokenizing nicely so that you can operate on higher-level constructs, and I mean once you are acting on tree-ish data rather than streams, it begins to feel like regular functional programming...

bmitc

01/28/2021, 11:08 PM

I am still learning it myself, but would Prolog be able to do this?

wtaysom

01/29/2021, 9:27 AM

Short answer: yes-ish. Prologs can be a bit odd in what they can and cannot do.

2 Views

Open in Slack

Previous Next