I really like this post for the deep thinking on p...
# linking-together
k
I really like this post for the deep thinking on precisely where a type system should be nominal vs structural: http://pling.jondgoodwin.com/post/searching-for-quarks
👍 1
e
The type system of Cone is awfully low level, and tied very closely to hardware. It is very close to what was used in the sequel to Pascal called Modula-2 that was designed for systems programming. I used Modula-2 for over 20 years to great effect. With all the extra checks and great module system, it proved in the long run to deliver software half the size of C, and 10 times as reliable, because every assignment, parameter passed, etc. was checked against very strict typing, and in the final golden master release you turn off the checks and your program shrinks by 30%... But nowadays, even for a systems language i think it ill-advised. Anyone in this day and age that thinks a Boolean only has 2 values is uninformed. In JS and AS3 there are 3 possible values for a boolean (true, false, undefined), and in Beads, there are 4. You cannot make a robust language where all functions are total if you have booleans that lose track of unknown and error conditions. The reason most programs are fragile directly relates to this very important consideration. All IF statements resolve to boolean expressions, and having your IF statements flow the right way is such a primeval, fundamental task that to gloss over this is to prevent provably correct software.
k
Huh, I thought a Boolean had just two values!
m
@Edward de Jong / Beads Project what exactly do you mean by boolean having 3 values in JS? MDN docs explicitly state that boolean has 2 values. Maybe you are referring to null value in JS? Another simple example is Haskell -> Boolean certainly has 2 different values there. And if you need to transfer additional information, you can e.g. use Maybe to achieve 3 values, or Either, which makes it explicit what is happening, keeping Boolean being just a Boolean. Sorry if I am missing something obvious, would be happy to hear the more detailed explanation! Btw nice job on beads language, seems impressive :)!
âž• 1
e
It is very important to not lose information. If you know you don't know the value of something, i.e. it is unknown or undefined), you should never be permitted in your code to turn that into information. For example if you have a variable needs_diapers = age < 2, if age is unknown, then so is needs_diapers. JS implements 3/4 of Excel's protected arithmetic. In Beads I implemented 4/4ths of Excel's protected arithmetic, because it is one of the secret weapons as to why Excel is so preferred as a programming tool.
m
Hm I think I see what you mean, it is about seamlessly propagating unknown value or error through the expression, instead of handling it explicitly. I can see how that can be useful in certain scenarios (and as you said, that works well in spreadsheets)! Btw. (and sorry if you already know about it) Haskell is interesting in this sense, with its Maybe type constructor. While Bool alone in Haskell has strictly 2 different values, if you use Maybe Bool, you effectively get True, False or Nothing, and if you then do operations in Maybe monad you get short-circuiting in case of Nothing, without explicitly handling it! Certainly not as handy as if it was built in, but you can get pros of both worlds.
d
Haskell has an unsound type system. For example,
head
returns the first element of a list, and has type
[a] -> a
. But wait, what does
head
return if the list is empty? There's no error condition encoded in the result type. Well actually, the program aborts. This problem is everywhere, since case expressions are not required to enumerate all of the possible cases. A sound type system ensures that the program "can't go wrong", and Haskell doesn't manage that. Edward says Beads is a robust language where all functions are total, which is more ambitious than Haskell. Apparently Beads has a sound type system. That might account for the extra complexity in the boolean type.
e
When you have 3 values stored in a Boolean variable as in JS, you have 9 combinations for each operator. In Excel/Beads there are 16 different combinations that have to be considered. There is no question that moving away from the hardware's simplistic 4 combinations you pay a dear price in efficiency, but the additional robustness is quite salutary. If you look deeply into the history of programming, you will see that in the beginning with Assembler, people ignored errors. Then in FORTRAN, people used to write code to validate the inputs to their functions at the beginning of the function, and every function would typically have an error return code, usually some big unique number, that you might look up in a big manual. Then people got tired of all that error checking, and in C often ignored checking parameters for validity (i'm talking about you MS!), and maybe just had an error code which was often just success or failure. In Java errors are blithely ignored until you get 50 levels deep and finally hit a null pointer exception. In the JS word people don't seem to check anything any more, probably because they have so little experience, and as a result I find web software to be the flakiest software ever built. The beauty of the web is you just reboot the page and the V8 super duper just in time compiler fixes it all fresh. We are in an era where computers are so fast people don't care much about reliability or robustness. However, i find this sloppiness intolerable, and i want to align programming with the logicians who are striving so hard to prove programs have no errors in them. To that end we must eliminate sources of intermittent/undefined behavior, and that means going back to the bottommost arithmetical operations and ensuring they are airtight, and don't leak.
👍 1
m
@Doug Moen thanks for this extra explanation! In Haskell idea is to avoid using partial functions like head, but as you say, it is not enforced in any way, so there are no guarantees some third party library is not using them. Thank you both @Edward de Jong / Beads Project and @Doug Moen for explaining this, this is certainly some food for thought!
k
Doug, I'm not an expert on type systems, but I'm going to need a citation on "Haskell's type system is unsound." Best I can find from Googling is http://okmij.org/ftp/Haskell/types.html#unsound-typeable which refers to a much more arcane feature than
head
.
Edward, checking for errors is orthogonal to the number of states a Boolean has. Your comments suggest more states are always better, which is obviously wrong: adding the null state caused Hoare's famous billion-dollar mistake. And the third state in JavaScript booleans that you seem to like so much is the much maligned
undefined
if I'm interpreting you correctly. These are not good ideas. Booleans have 2 states. Variables that need more than two states are absolutely a good idea. They shouldn't use booleans. In general, the problem of type modeling is to have a Goldilocks number of states for each variable: not too many so some states are rare and programmers don't test for them, and not too few so programmers can't test for some. All this has absolutely nothing to do with booleans.
d
@Kartik Agaram The term "sound type system" has a general meaning, which applies to formal mathematical systems as well as programming languages. Here's a nice definition: https://eschew.wordpress.com/2009/08/31/sound-and-complete/ To apply the concept of "sound type system" to Haskell, you need to make a choice about the kinds of programs you consider unsound. okmij.org is only concerned with excluding programs that will cause a segmentation fault/corrupt memory/etc, and you have to look deep into Haskell to find such unsafe features. In Haskell, the case expression doesn't require you to specify all the cases, and aborts the program if an unhandled case is encountered at runtime, while in safe languages like Rust, you have to explicitly handle all the cases. I think this comparison makes Haskell look bad, and it is reasonable by modern standards to classify this as unsound. Standards of soundness have gone up since 1998. There are a few languages that take an even stricter stance on soundness, and require the type system to reject any function that isn't total. "Total" means: the function is guaranteed to return a value: it can't throw an exception and it can't go into an infinite loop. This usually means that recursive functions are not allowed, because they could go into an infinite loop. "Dhall" is a currently trendy language that guarantees totality.
k
Certainly, Rust's core type system is more rigorous than Haskell's. I just want to use words the way their experts defined them. I'll subside now. Ping me when others agree with you that runtime error = unsound.
d
Here's part of an email thread where well known Haskell expert Conor McBride is attempting to argue that Haskell has an unsound type system. I can't find his original post, just followups that quote his statement. Conor said "As an obvious consequence, Haskell type system would be unsound." https://mail.haskell.org/pipermail/haskell/2004-April/013979.html https://mail.haskell.org/pipermail/haskell/2004-April/013995.html
❤️ 1
e
@Kartik Agaram sorry, but if you assume that your input data is perfect, your program will malfunction in unpredictable ways the minute some data isn't perfect. As software changes over time, programmers tend to add more fields to records, and the existing data doesn't have values for these new fields, and are therefore best stored as "missing" or "unknown" or "undefined" or whatever word you prefer. To have any chance at provably correct, robust software one must consider the cases where the input data is either unspecified or known to be in error, which are two different states in Excel. Just because George Boole 1815-1864 didn't consider these extra states in his theory (which was fairly useless until Turing and Von Neumann dusted it off 100 years later), doesn't mean that in the 2000's we can't finally admit that erroneous and missing inputs should not result in some undesired outcome. Visicalc/Quattro/Lotus123/Multiplan/Excel has proven with overwhelming popularity and usage that protected arithmetic is the preferred solution, and unfortunately language designers which keep looking in the historical tradition of logic which ignores such impurities as beneath them, are doomed to repeat the same mistakes.
k
That is a valid concern, and the solution is to not use boolean types to model booleans in input data. But boolean types are still useful for internals. Adding a state to all boolean types is excessive and imprecise. The term means something, and what it means is useful. JavaScript violating the meaning isn't sufficient reason to ditch it entirely. In fact, JavaScript illustrates that just adding extra states doesn't actually solve anything. The onus is still on the programmer to use them wisely.
For including a third state I like
Optional<Boolean>
. The nice thing about it is that it's impossible to forget about the third state, a problem JavaScript has in spades.
e
@Kartik Agaram I think you are mistaken that there are only 3 states. There are 4 states for a Boolean quantity: true, false, undefined, and error. If you evaluate an expression using real numbers such as
val = sqrt(-1) < 0
, then the result is neither
true
, nor
false
, nor
undefined
, but as Excel would calculate it,
#ERROR
. There is no undefined or missing quantity, but rather the operation is erroneous. This is the fourth state which is necessary to be able to store. Perhaps your program will halt on an error or undefined quantity in debug mode, but in production may propagate the error value (as does Excel) so that it never crashes. People see the red
ERROR
cells quickly, and the propagation effect means that all derived quantities based on the original error value are also flagged. This is of great value in long chains of dependencies. It is neither excessive nor imprecise. In fact it is more precise because no information is being destroyed or made up. That a boolean value would take 4 bits instead of 2 to store is a trivial additional cost. There is no additional cost to have the fourth state given that you needed at least 2 bits to store it anyway.
k
This is an exceedingly stupid debate. If you need 4 states, use 4 states! My argument doesn't care if you have 3 or 4 or 6 (because 5 is right out). Meanwhile, if you unilaterally change what a word means and call everyone who disagrees 'uninformed' -- well, good luck with that.
s
Is “three states for Boolean” about modeling what many languages do with exceptions? That's the only interpretation I can think of to make sense of this. And then aren’t you really debating if a language should use a pure functional approach (where all possible states are explicitly modeled in a data structure eg. with Maybe/Optional or Result types) or an exception based approach (which seems to be argued against here?). Taking the pure functional approach leads to the discovery that this model isn’t perfect either — which I believe is what Haskell's mentioned “unsoundness” is about? Seems like a rather theoretical concern. There's this “hack” to talk about the category Hask, which sneakily adds the value Bottom to all type classes, which is probably equivalent to that mysterious third Boolean state discussed above…? If I’m not mistaken there is also a compiler switch in Haskell to enforce exhaustive pattern matching, as well as best practices to avoid partial functions, which means in essence Bottom becomes Haskell's way of saying “This function could never halt, take too long for you to wait on, or run out of computing resources (stack overflow etc.)”. It's there, but in a way that you don’t have to be concerned about it all the time, which is what it sounds like you’re doing when you talk about “Booleans really have three states”.
e
Above i meant a fully modeled boolean has 4 states, 2 bits data store, not 4 bits. Excel has an extended arithmetic, which is carried in each cell. In Excel each cell holding a value has these 2 extra states (#UNDEF and #ERROR) so that you don't need a second cell. In a programming language, if you only allocate 1 bit for the boolean, you cannot represent these two additional states, so you then must allocate another variable to hold it. Or maybe you set it to false which is making up a possibly incorrect value (a bad thing). A second value for tracking additional values is massively inconvenient, and would mean when you pass that value to a function you would also need to send the second parameter. Since every program is about manipulating data via code, when you start talking about type systems you are also assuming some kind of arithmetic system. Whether you like it or not, you always have these 2 extra states for every variable. You might have omitted to enter the value, or the value is derived from another value that was in error. For example a = sqrt(-1); b=a*2; c = b<2; In this case, A would be ERROR, and B would evaluate to ERROR as well, and so would C. You can't claim that true or false is correct in the case of C. Yes this is a rarely encountered edge case. But if you want your program to survive erroneous or missing data, you need this extended and protected form of arithmetic. The problem i have with type discussions in this forum is the way people gloss over these exceptions. You won't bother to check every arithmetic operation for some missing data, so the correct approach is to let your arithmetic be closed, and no operation stops the program (unless you are in debugging mode). Every data type needs these base values, because strings can be missing too, and in loosely typed languages you might try to rotate a number instead of a bitmap... since it is nonsensical, you need that error state to track this so you don't build on top of an error. Having protected arithmetic means you won't have to do as much testing on your code, because you aren't so worried about bad inputs causing crazy behavior. Test driven design has become popular with some because they know how flabby their code is and they would prefer to more fully exercise the combinations and permutations. I say screw all that extra work on testing, just make the code impregnable and save the effort of doing exhaustive tests. Nowadays with consumers running millions of copies of software, it isn't considered acceptable to crash on erroneous or missing data. Maybe it was okay back in the day where the person controlling the input data was the same person running the program, but exceptions are a terrible way to handle errors. My assertion is that exceptions are a bad idea. You must be prepared to survive bad or missing data. You are not going to be able to figure out where all the numerical exceptions could originate, much less recover from them, and to have anything less than Excel's extended arithmetic is the wrong direction IMHO. There are too many statements in your program that do arithmetic. Every atomic value in Beads has an error or undefined base value. JS got tantalizingly close with their undefined and NaN value sets, but not allow an Error value on booleans, nor did they consider it for strings or objects, etc. The arithmetic in JS is asymmetrical and incomplete. This is a mathematical issue stemming from the improvement in logic that people called "fuzzy logic", which was popular perhaps 20 years ago, but now forgotten.
k
Doesn't catching exceptions at the right granularity address your criticism? There seem to be complementary strengths and weaknesses here. Operations that silently propagate errors make bugs hard to diagnose. Exceptions cause crashes. This is why my (implied but not explicitly stated) position in this thread has been that the extra states you need to represent error are a domain-dependent concern. And, what's more, you often want different solutions for different parts of your program. These are just tools. A decree that "all variables shalt have #UNDEF and #ERROR" seems too blunt. Particularly since it requires programmers to be mindful of these states to work. JavaScript programmers largely try to ignore
undefined
as much as possible, which entirely defeats the point. Here's an example of a major domain where you absolutely want 2-state booleans: Boolean satisfiability (SAT) solvers. Adding extra states here would make tractable problems intractable. (Apologies for the harshness of my previous comment in this thread. I plead sleep deprivation.)
e
Exceptions in many languages create action at a distance, making it extremely hard to understand program flow. I think exceptions are one of the worst features of Java, and a major factor in its poor reliability. I can't recall how many times i have seen a 100 nested level deep nil pointer exception in some Java program. Erlang/Elixir uses an interesting approach, that of crashing on an error, with the plan that the process will be automatically respawned by some monitor program. I like how Excel has never crashed on me. There is something comforting about a system that is sturdy and doesn't collapse or freeze the minute someone forgets to enter a value, or puts in a bad piece of data. Of course one validates input data as thoroughly as possible, but with data arriving over the internet, you can't assume it is going to be perfect.