so, my technique is roughly this:
• use the AST as a guide for what to model. I'm using whitequark's parser gem for now because it sits closer to the level of abstraction users care about than the RubyVM::AST module does, and I'm aiming for usability over exhaustiveness right now
• assume primitive datatypes exist and focus on modeling their semantics first. so, dual prioritization: AST nodes, because they tie things back to source locations trivially, and primitive datatypes, because in dynamically typed languages with a propensity for duck typing, exhaustively exploring the semantics of an AST node is intractable
• pick out rubyspec examples for that AST node that imply some of its semantics
• symbolize the critical inputs, either manually or, more ideally, by retooling the runner to do it automatically
• take the inputs and outputs over to Z3 and find the implied minimal semantics (this part isn't in Ruby right now because the best Z3 gem is broken)
• refine the implied minimal semantics, manually or, ideally, with some tooling. for example, Z3 initially couldn't infer boolean-or semantics in a compact form because it couldn't perform some basic boolean rewrites that I'd expect something like Coq to handle. or at least, it can't as best I can tell yet
• test the semantics against symbolic execution of MRI (or other implementations) using symbolic execution engines targeted at lower-level semantics. probably Triton for this? maybe KLEE, it depends; Triton would be more portable to other implementations. S2E is an option too, but Triton really feels easiest at the moment
• then run the semantics against rubyspec as a final test.
the end goal is to be able to infer semantics for all dynamic languages somewhat efficiently.