How does the future of coding handle missing/mixe...
# thinking-together
m
How does the future of coding handle missing/mixed data like undefined, null, or values that change type over time?
o
Do you mean how future-oriented programming languages move away from those constructs?
m
or how they don't move away 🙂
or sideways 😛
o
Dependent types, typed holes, pure functional languages, all have interesting ideas. Then there’s state, I highly recommend Rich Hickey’s talk

‘The Value of Values’

👍 1
m
data exists before entering the system and its shape doesn't necessarily match what a type system would like to handle
👍 1
💯 1
d
That's two questions! I distinguish "null" from "undefined"; there's a difference between "nothing known (yet)" - undefined - and "known (to be) nothing" - null.
m
at least two questions
☝️ 1
d
Or between Not Applicable and Not Available
m
I once implemented an integration for a system that sent numbers when there were numbers, empty strings sometimes (I guess it meant no data), and the string "N.A." some other times
d
The other question, about types: My types are basically syntactic: strings that can be matched by parsers. So any property can change type if it wants.
c
I see "null" as a very bad implementation of "missing". In general it's an attractive pattern to be able to build up an object gradually over separate steps. The way C++ or Java are designed, the easiest way to do this is using null. (It can also be done by using lots of separate interfaces, but this is more work.) The reason null is a billion dollar mistake is that it allows you to 1. null everything, 2. retrieve a null value. If the null pointer exception were thrown at the moment you called x = obj->value (if value is null, not obj), it wouldn't be able to permeate so much. JS actually does the concept of 'missing' correctly but then throws in null as a kind of turd in the punch bowl. The other point has been addressed by modern languages such as Kotlin, where properties must be explicitly marked as nullable (I would make it 'optional') and are then referenced with ?. instead of .
p
I tend to see null as ‘optional’, and then handle ‘partial data’ as a different ‘entity’: http://theprogrammersparadox.blogspot.com/2015/11/containers-collections-and-null.html
a
The ideas I'm kicking around for values that change type over time look vaguely like dependent types, specifically types that depend on a time or generation/version parameter. I don't think you can avoid ending up with something morally equivalent to dynamic typing: you're going to branch on (or use as a lookup key, whatever) the current type somehow, even if you manage to stash the branch in the runtime. I more often think of this in terms of migrating between different serialization formats or database schemata. Maybe another (equivalent?) perspective is as a discriminated union type where you keep adding variants over time. (Have you ever noticed that an instance of a sum type looks a lot like a dependent pair, with the payload dependent on the value of the discriminant? Is that an artifact of my shaky understanding or does everyone know that already?)
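A minimal sketch of the "discriminated union that keeps gaining variants" idea above. The names `PriceV1`, `PriceV2`, and `to_cents` are hypothetical, picked only for illustration; the point is that some code still has to branch on which generation of the value it is holding.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class PriceV1:
    # Hypothetical first generation: the price arrives as free-form text.
    text: str


@dataclass(frozen=True)
class PriceV2:
    # Hypothetical later generation: the price is stored as integer cents.
    cents: int


# The union gains a variant each time the representation changes.
Price = Union[PriceV1, PriceV2]


def to_cents(price: Price) -> int:
    """Normalize any known generation of the value to integer cents."""
    # This branch on the current variant is the "morally dynamic" step.
    if isinstance(price, PriceV1):
        return round(float(price.text) * 100)
    if isinstance(price, PriceV2):
        return price.cents
    raise TypeError(f"unknown price variant: {price!r}")
```

Adding a third generation would mean adding one class and one branch; old data keeps working because its variant never goes away.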
g
here’s a very pragmatic approach to null in the (unfortunately dormant i think) tulip programming language:

https://youtu.be/lvclTCDeIsY

😎 1
and @Andrew F here’s one of Tulip’s designers talking specifically on variants:

https://youtu.be/ZQkIWWTygio

a
@Garth Goldwater that looks neat. Tagwords are a great idea. The little ascii faces are great too. :)
s
I second Hickey's talk - interesting questions in there. I have more than a few things to say about this topic. My position is that generic nullability is bad. Also, empty strings are bad. Yes. Think about this: we've had the concept of 'zero' for 1000s of years but only very recently added the concept of 'empty string' - why? Could it be mainly for modeling in computers? There is no zero-like symbol you can write in a paper form where text is expected. Yes, you could write
N/A
but that conflates null and "". Sometimes you might be asked to write why it is
N/A
. Consider that there are valid questions for which the meaningful answer is zero: "How many toilet paper rolls are there in the store?", "What is the temperature of the snow?" etc. There are no questions where the 'empty string' answer has a clear-cut meaning: "What is your spouse's name?", "What city do you live in?" etc. What does an answer of
""
mean? It could mean "I don't know" or "I don't want to tell you" or "There isn't one". Can these always be mapped to null or ""? I don't think so. In type systems that have both the empty string and null, you end up with the question of 'what does null mean' vs 'what does empty string mean'. Sometimes when you have two missing values in the world that you want to model, you end up mapping these to '' and null. But really this has nothing to do with a generic solution - it's often just reality twisted to fit the types. If you want to model three missing values, e.g. 'unknown', 'non-existent', 'secret/intentionally-blank', how do you map these to "" and null? Maybe the best solution here is to model the value as an enum?
unknown, non_existent, intentionally_blank, value(real_string_here)
. AFAIK no systems or languages have a great solution. However, I feel better solutions lie in exploring the 'information modeling' space - RDF etc. (perhaps with some kind of versioned schemas to describe what is possible and necessary). Basically we don't want multiple data representations to correspond to a single reality (which is what happens in null vs "", or sometimes with nested
option<option<option<t>>>
).
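A rough sketch of the enum suggested above, with one variant per kind of "missing" plus a wrapper for real text. `Missing`, `Value`, and `describe` are hypothetical names, and rejecting the empty string inside `Value` is an assumption made here so that "" cannot sneak back in as a fourth, ambiguous state.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class Missing(Enum):
    UNKNOWN = "unknown"
    NON_EXISTENT = "non_existent"
    INTENTIONALLY_BLANK = "intentionally_blank"


@dataclass(frozen=True)
class Value:
    text: str

    def __post_init__(self) -> None:
        # Assumption: an empty string is never a real answer in this model.
        if self.text == "":
            raise ValueError("use a Missing variant instead of an empty string")


Answer = Union[Missing, Value]


def describe(answer: Answer) -> str:
    """Render each case differently, so no two realities share one representation."""
    if isinstance(answer, Value):
        return answer.text
    if answer is Missing.UNKNOWN:
        return "(not known)"
    if answer is Missing.NON_EXISTENT:
        return "(there isn't one)"
    return "(withheld)"
```

With this shape, "What is your spouse's name?" can be answered `Missing.NON_EXISTENT` or `Missing.INTENTIONALLY_BLANK` without either being confused with the other, and there is no nesting of options to collapse.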
o
@shalabh to add to your points on how reality and implementation play into our constructs of strings. If we take away implementation and ways to model them, strings are really just symbols in a total order. In that context, spaces don’t make much sense, nor do tabs or empty strings. In my humble opinion we need to build a construct for orders and symbols (not just text, but any symbol you can imagine) and then modern string types can be considered special cases where we are talking about a total order of ‘text’ symbols.
d
In Onex I have no empty string concept and also no control characters (space is a control character in that statement)
o
@Duncan Cragg sounds interesting! Do you have a link?
d
I have unknown (perhaps like undefined) and nothing (maybe that's null) as distinct concepts or special symbols
Oh hai Orion, um, documentation is a little thin and/or dated 😧
Other shocking things about Onex: a property can be a single symbol but you can add more symbols, at which point it becomes a list. Indexed from 1. 😊
Maybe I should write an up to date description of this..
👍 1
s
@Orion Reed I assume you mean a total order on words? Yes that could be one way to model strings. 'Text' however is a widespread concept outside computers and often mixed in with presentation as well (are paragraph boundaries important? Are underlines important?) So in some sense having a flexible 'container' of media provided by the user is reasonable too. Kinda like a 'bitmap drawing' or even 'rich text'. I don't think strings are particularly fundamental - they model some aspects of text (e.g. paragraphs) but not others (color, underline). In any case, the system won't look inside this media object to make decisions - it will just pass this object around and the meaning is entirely interpreted by another person at some other time. This is where I think "" and null become interesting: often we write
if text is none: ...
. So the system does look inside this shape.
The 'auto-list' model is interesting, and probably right. "I have one apple" should only be represented one way, so
apple
vs
[apple]
looks suspicious.
m
The thing I find interesting is that if a single value is not a special (and default) case then you don't need null or similar
if everything is a sequence (of potentially zero or one items), then an empty sequence would be "null". the cool thing is that if operations on values internally translate to map, then handling the empty case comes for free. This sounds a little bit like "nil punning" in Clojure, which is a pain when you find a nil and don't know where it became nil, since everyone is passing it around happily
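A small sketch of the "operations translate to map" point, assuming values are modeled as tuples of zero or more items. `lift` and `shout` are hypothetical names used only here.

```python
from typing import Callable, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def lift(fn: Callable[[A], B]) -> Callable[[Tuple[A, ...]], Tuple[B, ...]]:
    """Turn a single-value function into one that maps over a sequence of values."""
    def lifted(values: Tuple[A, ...]) -> Tuple[B, ...]:
        # An empty input produces an empty output; no null check is needed.
        return tuple(fn(v) for v in values)
    return lifted


shout = lift(str.upper)

nothing_in = shout(())                  # the "null" case
one_in = shout(("apple",))              # the "single value" case
many_in = shout(("apple", "pear"))      # the "list" case
```

The downside mentioned above still applies to this sketch: an empty tuple flows through every lifted function silently, so nothing records where the value first became empty.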
d
In Onex, empty sequence = empty symbol = nothing = the whole property isn't there. In an object property, if you empty the value or clear the list, the property itself is deleted; if you want a placeholder you just use unknown
conflating nothing with unknown is a source of uncountable glitches in the history of software
amiga tick 1
Note that this is all to help non-techies feel at home. Yes, non-techies would wonder why
apple
isn't the same as
[apple]
, or even why have those brackets
so I have
fruit: apple
then
fruit: apple pear
and no brackets, no "list type"
and so
fruit:[1]
will always be
apple
☝️ 2
d
If undefined means "not known yet" and null means "known to be nothing", then what about the situation when a value is known to be the value
undefined
?
d
(just recovering from the neuronal warp that question induced)
maybe "not known yet" can also encompass "not known, and who knows why or when"
got any examples?
g
i’m dealing with that trying to do something somewhere between (un)typed holes and a structural editor: if a user has a key but its property isn’t defined yet, and then they skip ahead to elsewhere in the syntax tree, i’d say that i know they want a value for the property later but it’s not strictly defined yet
👍 2
pretty sure i can just like... ignore that situation though lol
i haven’t stopped thinking about your no lists/everything is a list model @Duncan Cragg... i think you might be spiritually correct, not making affordances for user convenience
reminds me of how a lot of stuff “just works” in APL as a result of rank polymorphism, but also seems a bit further than that
s
In Python, there is no char type. They are just strings of length 1. So
a[0] == a
is true for strings of length 1. You can also loop over them
for char in a
will work for zero, one or longer length strings. Things don't come crashing down.
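The Python behavior described above can be checked with a short snippet. `count_chars` is a hypothetical helper name; everything else is standard string behavior.

```python
def count_chars(text: str) -> int:
    """Count characters by looping, which works for any length, including zero."""
    total = 0
    for _char in text:
        total += 1
    return total


a = "x"

# Indexing a string returns another string, so a length-1 string is its own first item.
single_is_its_own_first_item = (a[0] == a)

# There is no separate char type to fall into.
first_item_type = type(a[0])

# The same loop handles zero, one, or many characters.
lengths = [count_chars(s) for s in ("", "x", "xyz")]
```

One caveat: it is the loop that handles the empty case gracefully. Indexing an empty string with `""[0]` still raises `IndexError`.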
BTW, one angle to think about this is that 'nothing' is a property of the field itself (e.g. fields are boxes, and one box is empty). While 'unknown' is a special kind of object that I can put in a box. Interesting.. can I put two unknowns in a box? Or an
unknown
and an
apple
?
d
I think that falls apart if you can get back 'nothing' as a value. But if you can't, then what happens when you try to get it? The only other options I know are throwing an error, or some kind of Haskell-like "Maybe" construct
d
@shalabh to avoid that problematic subtlety, I don't have empty boxes, if you empty it, it disappears! (i.e. the object property is deleted) If you try to get it, you get nothing back.
You can put
fruit: *unknown* apple *unknown* banana
, yes
But
fruit: *nothing* apple *nothing* mango
collapses to just
fruit: apple mango
and of course
fruit: *nothing* banana
is just
fruit: banana
(no list any more)
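A toy Python model of the rules described above, not Onex's actual implementation. The sentinels `NOTHING` and `UNKNOWN` and the helpers `set_property` and `get_item` are hypothetical names; the behavior modeled is only what the messages state: nothing disappears, unknown stays, a single symbol is not a list, an emptied property is deleted, and indexing starts at 1.

```python
from typing import Dict, List, Union

NOTHING = "*nothing*"   # absence: never stored
UNKNOWN = "*unknown*"   # placeholder: stored like any other symbol

PropertyValue = Union[str, List[str]]


def set_property(obj: Dict[str, PropertyValue], name: str, *symbols: str) -> None:
    """Set a property from symbols, letting every NOTHING disappear."""
    kept = [s for s in symbols if s != NOTHING]
    if not kept:
        obj.pop(name, None)      # emptied property: the property itself is deleted
    elif len(kept) == 1:
        obj[name] = kept[0]      # one symbol: no list any more
    else:
        obj[name] = kept


def get_item(obj: Dict[str, PropertyValue], name: str, index: int) -> str:
    """1-based lookup that treats a single symbol like a one-item list."""
    if name not in obj:
        return NOTHING
    value = obj[name]
    items = value if isinstance(value, list) else [value]
    if 1 <= index <= len(items):
        return items[index - 1]
    return NOTHING
```

In this model `fruit:[1]` gives `apple` whether the property holds `apple` or `apple pear`, and getting a deleted property hands back `NOTHING` rather than raising.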
@Dan Cook what falls apart if you can get back nothing ?
d
If nothing is not a value and just a property or state, then what is the result of getting nothing? If you can get back nothing, then nothing has to be a value. ... But the solution you just described (i.e. nothing just disappears from containers) is interesting! So nothing is somewhat like a value, but it's a disappearing value.
d
I suppose it depends if you're on the "left hand side" (matching) or "right hand side" (setting) of a rewrite rule. If you're matching, you have to be able to match with a symbol, so can use nothing to say you really want that to be absent. If you're setting, and use nothing, that's when the disappearing trick happens.
Obviously, with homoiconicity (ooo), you can't let the rewrite rule itself reduce because the matching side will disappear!