Debian is currently struggling with allowing UTF-8 in usernames | Future of Coding #thinking-together

Nilesh Trivedi

12/06/2024, 2:13 PM

Debian is currently struggling with allowing UTF-8 in usernames. I wrote about various challenges involved in NAMING things (i.e. human-readable unique identifiers for concepts/topics or people etc): https://github.com/learn-awesome/learndb/wiki/Naming-Things

Denny Vrandečić

12/13/2024, 8:07 AM

For a large domain I am not a friend of human-readable unique identifiers, but prefer meaningless unique identifiers coupled with a system to label or name things in different languages.
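A minimal sketch of what "meaningless identifiers plus labels in different languages" could look like. The `LabelStore` class, its method names, and the fallback-to-English rule are all assumptions made for illustration; nothing in the thread specifies them.

```python
import uuid


class LabelStore:
    """Hypothetical store: opaque IDs, with labels kept separately per language."""

    def __init__(self):
        self._labels = {}  # item id -> {language code: label}

    def create(self, labels):
        # The identifier carries no meaning; it is just a random UUID.
        item_id = str(uuid.uuid4())
        self._labels[item_id] = dict(labels)
        return item_id

    def label(self, item_id, lang, fallback="en"):
        # Prefer the requested language, then the fallback, then the raw ID.
        labels = self._labels[item_id]
        return labels.get(lang) or labels.get(fallback) or item_id


store = LabelStore()
math_id = store.create({"en": "mathematics", "de": "Mathematik"})
print(store.label(math_id, "de"))
print(store.label(math_id, "fr"))  # no French label, so the English one is used
```

Renaming or translating a label never touches the identifier, which is the property being argued for here.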

Nilesh Trivedi

12/13/2024, 12:22 PM

@Denny Vrandečić Yeah, I often reconsider this choice. In fact, for items like books or courses, I used (uuid + labels) without requiring these labels to be identifiers. But for topics/subjects, I think identifiers are a net benefit. Languages are effective because people have shared meanings for words. By adding uniqueness and removing ambiguity, we get to leverage this aspect. It is very annoying when people who use the topic tags "math" vs. "maths" vs. "mathematics" can't find each other's data, unless we invest time and effort in building and maintaining a high-quality search engine.
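The "math" vs. "maths" vs. "mathematics" problem can be sketched as a small alias table that maps every spelling to one canonical topic identifier. The `ALIASES` table and `canonical_topic` function are hypothetical names for this illustration, not part of learndb.

```python
# Hypothetical alias table: every known spelling points at one canonical identifier.
ALIASES = {
    "math": "mathematics",
    "maths": "mathematics",
    "mathematics": "mathematics",
}


def canonical_topic(tag):
    """Normalise a user-supplied tag to its canonical topic identifier."""
    key = tag.strip().lower()
    # Unknown tags pass through unchanged (after trimming and lowercasing).
    return ALIASES.get(key, key)


tags = ["math", "Maths", " mathematics "]
print({canonical_topic(t) for t in tags})
```

With a unique, unambiguous identifier per topic, all three users land on the same key without a search engine in between.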

Daniel Harris

01/04/2025, 2:20 PM

I don't understand why it isn't really clear. Every thing/object should have a UUID. Then people can add display names as they like. And there's also a way to map relationships between things/objects. And these relationships are adoptable by people as they so wish. Surely this covers all cases and then we are done. That's the underlying structure. Optimisation can happen in the form of indexes. Just objects and relationships. Need anything else? Too simplistic?
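One way to read this proposal as a data structure is sketched below. Everything here (`Thing`, `Relationship`, `Graph`, the `asserted_by` field, and the per-subject index) is an assumption for illustration; "adoptable by people" is modelled simply as filtering relationships by whose assertions the reader trusts.

```python
import uuid
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Thing:
    # Every thing gets an opaque UUID; display names are optional extras.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    display_names: dict = field(default_factory=dict)  # language -> name


@dataclass(frozen=True)
class Relationship:
    subject: str      # UUID of the source thing
    predicate: str    # e.g. "part_of"
    obj: str          # UUID of the target thing
    asserted_by: str  # who proposed this relationship


class Graph:
    def __init__(self):
        self.things = {}
        self.relationships = []
        self.by_subject = defaultdict(list)  # index, purely an optimisation

    def add_thing(self, display_names):
        thing = Thing(display_names=dict(display_names))
        self.things[thing.id] = thing
        return thing.id

    def relate(self, subject, predicate, obj, asserted_by):
        rel = Relationship(subject, predicate, obj, asserted_by)
        self.relationships.append(rel)
        self.by_subject[subject].append(rel)
        return rel

    def related(self, subject, predicate, trusted):
        # Only follow relationships asserted by people the reader has adopted.
        return [
            r.obj
            for r in self.by_subject[subject]
            if r.predicate == predicate and r.asserted_by in trusted
        ]


g = Graph()
algebra = g.add_thing({"en": "algebra"})
maths = g.add_thing({"en": "mathematics"})
g.relate(algebra, "part_of", maths, asserted_by="alice")
print(g.related(algebra, "part_of", trusted={"alice"}) == [maths])
```

A reader who has not adopted alice's relationships gets an empty result for the same query.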

Nilesh Trivedi

01/04/2025, 3:38 PM

@Daniel Harris I think there's a lot of value in meaningful names as identifiers:

• Providing a search UI (to turn display names or fuzzy queries into UUIDs) everywhere is not feasible. This is why names (like file names or domain names) are useful. Would you prefer just using /usr/local/bin, or instead searching "local in user bin" and getting a response like dd50f5eb-5845-4001-86c7-de5de793ab9b that is the identifier of the /usr/local/bin directory?

• Meaningful names are portable and can communicate meaning without requiring an extra lookup. Which one is preferable: <https://www.youtube.com/veritasium/railroads> or <https://www.youtube.com/watch?v=Rdj5-6t6QI8> (or dd50f5eb-5845-4001-86c7-de5de793ab9b, if we get rid of meaningful domain names as well)?

Nilesh Trivedi

01/04/2025, 3:44 PM

Meaningless IDs have benefits too: they can be more reliably permanent (stable against real-world human concerns, whether political, economic or otherwise), and they are language-agnostic (although UUIDs in hex use the English letters a to f, which are not exactly universal).

Daniel Harris

01/05/2025, 11:54 AM

@Nilesh Trivedi I feel there should be a clear and clean division between human-readable and machine-readable. And I wonder how much of this problem exists because, a while ago, code and addresses were written down on paper, which can't display semantic links. In the guts of the machine, file and directory names are just display names for pointers to data structures. Domain names are just pointers to IP addresses. If we were more rigorous about abstracting human vs. machine, I reckon we'd have a much easier time of it, because a human should never see a UUID. A human should see a display name in their own language. So file paths could be in their own language. Why not? It would mean more to the human. Anyway, I'm a human-first kind of guy, so I think about my/our user experience before I look at the tech to implement it. If only everyone did! 🤣

guitarvydas

01/05/2025, 12:36 PM

De Bruijn concludes that names don't matter. I think he means human-readable names. Every nameable object - variable, function, etc. - could have multiple human-readable names. Experts get to see single Greek letters, learners see long phrases (with embedded whitespaces, yet).
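A small sketch of the De Bruijn idea, assuming lambda terms are encoded as nested tuples (`("var", name)`, `("lam", name, body)`, `("app", f, arg)`) and using 0-based indices. The encoding and the `to_de_bruijn` function name are choices made for this example only.

```python
def to_de_bruijn(term, bound=()):
    """Replace variable names with the distance to their binder (0 = innermost)."""
    kind = term[0]
    if kind == "var":
        # tuple.index returns the first match, i.e. the innermost binder.
        # A free variable raises ValueError; this sketch handles closed terms only.
        return ("var", bound.index(term[1]))
    if kind == "lam":
        return ("lam", to_de_bruijn(term[2], (term[1],) + bound))
    if kind == "app":
        return ("app", to_de_bruijn(term[1], bound), to_de_bruijn(term[2], bound))
    raise ValueError(f"unknown term kind: {kind!r}")


# The same function written for an "expert" and for a "learner".
terse = ("lam", "x", ("lam", "y", ("var", "x")))
verbose = ("lam", "first argument", ("lam", "second argument", ("var", "first argument")))

print(to_de_bruijn(terse) == to_de_bruijn(verbose))
```

Both spellings reduce to the same nameless term, so the human-readable names, including ones with embedded whitespace, can vary per audience without changing what the machine sees.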
