# thinking-together
d
A single all-encompassing FoC language, for all domains, is unlikely. The future is DSLs. Some arguments for this. Also, how can we improve interoperability between DSLs? Requested by @Duncan Cragg
👍 4
What numeric types should your wide-spectrum, all-domain FoC language support? Douglas Crockford <http://dec64.com/> claims that decimal floating point (his DEC64 data type) should be "the only number type in the next generation of application programming languages". Binary floating point is unsafe for financial calculations, so DEC64 is a good choice for spreadsheets or any language that does financial calculations. But Crockford also says DEC64 is "well suited to most scientific applications", and that's less clear.

I do agree that for an end-user programming language focused on a specific domain, the simplicity of a single numeric type is valuable. And DEC64 is a good choice if the domain involves finance. For my language Curv, a DSL for 3D graphics, the best choice is IEEE binary floating point as the single numeric type.

For scientific computation, the popular choice is a batteries-included language with all the features needed by experts. Experts also need speed, because some scientific computations take a long time to run. That means you need vectorized IEEE binary floating point. Nobody will ever port BLAS and LAPACK to DEC64. Ditto for Machine Learning and 3D Graphics (which run on a GPU): you need vectorized binary floating point, because that's what the hardware supports.

For a functional language like Idris, where you use dependent types to prove programs correct, you need a natural number type, Nat, which implements the Peano axioms directly. In the Idris standard library, it is defined as `data Nat = Z | S Nat`. You need this so that you can perform inductive proofs over the natural numbers. You need other numeric types as well (Nat doesn't include negative numbers or fractions).

In systems languages like C, C++ and Rust, you need the full set of numeric types supported by the hardware, including int8, int16, int32 and int64, so that you can write high-performance code that optimizes memory consumption and memory access patterns and uses the L1 cache efficiently.

A true wide-spectrum language would presumably need to support all of these different kinds of numeric computation, and in each case do as good a job as more specialized languages designed by domain experts. And that would make the language much more complicated than a domain-specific language. For any given task, a DSL is probably a better choice.
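Not from the thread, but for readers who haven't met the Peano encoding: a minimal Haskell sketch (the Idris definition reads the same way) showing why a unary Nat suits induction. Functions defined by recursion on `Z` and `S` mirror proofs by induction over the same two cases; the `plus` function and the example value are purely illustrative.

```haskell
-- Unary (Peano) naturals, as in the Idris standard library.
data Nat = Z | S Nat
  deriving Show

-- Addition defined by recursion on the first argument.
-- An inductive proof about plus (e.g. that plus n Z == n) follows the
-- same two-case shape: a base case for Z and a step case for S.
plus :: Nat -> Nat -> Nat
plus Z     m = m
plus (S n) m = S (plus n m)

main :: IO ()
main = print (plus (S (S Z)) (S Z))   -- S (S (S Z)), i.e. 2 + 1 = 3
```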
🤘🏼 1
DSLs are better than wide spectrum languages because they are simpler. There is less to learn. The syntax can be specialized for the task. The semantics can be restricted to only what is needed for the task, which leads to cleaner, simpler semantics.
t
I rather like the way Haskell deals with numbers: numeric literals start out at the most general type class that can hold them (Num, or Fractional for decimals), and usage refines them to a narrower concrete type where necessary. That seems to me like a decent compromise between ergonomics and flexibility, and it's especially nice because the system does a lot of work for the user to figure out what data type is appropriate. I think any "serious" language must at least provide the capability for a user to implement those data types, even if they are not built into the language. And obviously the exposed "batteries-included" types strongly set the tone for the user about what it's like to work with the language.
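A small sketch of that literal behaviour in GHC; the names `half` and `count` are just illustrative:

```haskell
-- Numeric literals in Haskell are overloaded: a bare `3` has type
-- `Num a => a`, and the surrounding code picks the concrete type.
half :: Double
half = 3 / 2        -- here the literal 3 becomes a Double

count :: Integer
count = 3 + 4       -- the same literal syntax becomes an Integer here

main :: IO ()
main = print (half, count)   -- (1.5,7)
```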
I currently distinguish between integers and reals in my system, but I don't like it. I strongly agree with the spirit of your "single numeric type is simpler" sentiment. I have aims of moving to a more Haskell-y system in my platform. (Hindley-Milner is already there, so a good chunk of the groundwork is laid.)
Although if you want One Format To Rule Them All, quad-precision float has an appeal— the mantissa is 112 bits (which means integers 112 bits or smaller are exact), and my favorite fact about quad floats is that if 1.0 is 1 AU, then (1ulp) : (the radius of a hydrogen atom) :: (radius of hydrogen atom) : (radius of Earth). "128 bits ought to be enough for anybody."
g
what would you propose the DSLs should be implemented in? otherwise i see this as being another brick balanced on top of the host language pyramid
d
@Garth Goldwater I'm not sure I understand your post. I would implement a virtual machine, garbage collector, etc, in a systems programming language: something that provides direct access to machine data types, full control over memory layout of objects, and lets me write unsafe code where necessary. You know: like C, C++, maybe Rust. I consider these DSLs, and I wouldn't want to use these languages for everything. A compiler, on the other hand, can quite reasonably be written in a higher level language. What's your proposal, and what about the "brick balancing" metaphor?
k
Racket is a nice example of a platform designed for making interoperable DSLs. Assuming that everyone adopts it, of course, which is unlikely.
d
About 40 yrs ago I coined the acronym DTIL - Domain and Target Independent Language - as the definition of what I was striving for, and have been ever since. Target independence means that nothing in the language is affected by how the underlying hardware likes to work. Hence efficiency has to be achieved by whatever cleverness the language implementation can deploy; plus I expected hardware to get very much faster in the future (it did).

Thus DSLs are the opposite of DTILs, as indeed would a "TSL" be - a Target Specific Language. C is a TSL: you can specify `uint16`, etc. and fiddle with bits. It's very close to the metal. I guess GLSL and OpenCL are also kinds of TSL: any language that is a wrapper for a machine type. So I'm not interested in a number type that is machine-specific, or that makes things faster. If needed, that speed can be achieved by handing off or wrapping or clever tricks. And I am not interested in `data Nat = Z | S Nat` as my pos-int number system, because no-one talks of three bananas by saying `S S S Z bananas`!

I store numbers as strings, so they can be any length. My challenge, of course, is that that's not efficient, but I'm on this forum to ask for help with that, amongst other things!
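For illustration only (not Duncan's actual representation), here is a toy Haskell sketch of adding two non-negative integers stored as decimal strings; it shows where the cost goes, since every operation has to walk every digit.

```haskell
import Data.Char (digitToInt, intToDigit)

-- Add two non-negative integers written as decimal strings.
-- Reversing first lets the recursion handle the least significant
-- digit and carry at the head of each list.
addDecimal :: String -> String -> String
addDecimal a b = reverse (go (reverse a) (reverse b) 0)
  where
    go [] [] 0     = []
    go [] [] carry = [intToDigit carry]
    go xs ys carry =
      let (x, xs') = pop xs
          (y, ys') = pop ys
          s        = x + y + carry
      in  intToDigit (s `mod` 10) : go xs' ys' (s `div` 10)
    pop []       = (0, [])
    pop (c : cs) = (digitToInt c, cs)

main :: IO ()
main = putStrLn (addDecimal "999" "1")   -- "1000"
```

For comparison, GHC's built-in Integer offers the same "any length" property but keeps numbers in a packed binary form (GMP in the default build), which is where most of the speed difference comes from.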
❤️ 2
e
Mathematica stores numbers as strings also, and they get away with it no problem, and offer as much precision as you request. So clever. So there is something to be said for ditching IEEE floating point and its crazy parts, like NaN not equalling NaN, and having many thousands of NaNs but the hardware only supporting quiet NaNs. The number of errors stemming from the fact that 0.1 + 0.2 does not equal 0.3 is staggering.
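Both quirks are easy to reproduce; a quick check in Haskell (GHC's Double is IEEE binary64 on common platforms):

```haskell
main :: IO ()
main = do
  print (0.1 + 0.2 == (0.3 :: Double))  -- False
  print (0.1 + 0.2 :: Double)           -- 0.30000000000000004
  let nan = 0 / 0 :: Double             -- IEEE division yields NaN
  print (nan == nan)                    -- False: NaN is not equal to itself
  print (isNaN nan)                     -- True
```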
d
Thanks for that tip-off; I wonder how they make that work underneath?
t
I'm glad this thread happened: it's this discussion that has prompted me to switch Lynx to quad-precision floats (which LLVM supports: https://llvm.org/doxygen/group__LLVMCCoreTypeFloat.html#ga1c02fb08f9ae12a719ed42099d42ccd8). Other numerical types will be supported eventually, but this will spare users from having to understand the difference between an integer and a real, which they really shouldn't have to worry about. A 112-bit mantissa is more than enough to count anything, and should also be enough to do finance calcs in any reasonable currency.