# thinking-together
Jared Windover
I was reading an old rfc for Transclusion in HTML, and I noticed this line:
If the source document is HTML and the optional PLAIN attribute is specified, all HTML tags are removed and all SGML entities converted to the characters they represent.
This strikes me as bizarre: that the desired behaviour would ever be to strip out the structure of the source in order to include it. I'm thinking that maybe this made sense at the time, and that HTML really was used as a markup language, in the sense that you had a valid plain-text document and then you marked it up to add additional information. It has me thinking about how much the web is held back by continually trying to adapt this notion of markup, rather than accepting that the data is meaningless without the structure. I feel like a lot of semantic web stuff is held back by this. I'm curious what other people think of markup as a concept. Does it still apply?
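To make the behaviour concrete, here's roughly what that PLAIN stripping amounts to with today's DOM APIs (my own sketch, nothing from the RFC itself; `toPlainText` is just a name I made up):

```typescript
// Sketch of what PLAIN-style transclusion would do to an HTML source today:
// drop the tags and decode SGML/HTML entities back to the characters they represent.
function toPlainText(html: string): string {
  const doc = new DOMParser().parseFromString(html, "text/html");
  return doc.body.textContent ?? "";
}

// toPlainText("<p>Fish &amp; chips</p>")  -->  "Fish & chips"
```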
Ivan Reese
As someone who does a bit of prose-y writing in Markdown, I'm very fond of plain text as a substrate, with minimal markup added for a touch of structure. And in the same way that I like being able to take a small step up toward more structure, I like being able to take a small step down back to less structure. There are times where I've taken an HTML document and stripped all the tags out to recover the plain text. Not often, but it has come up. I do build web apps, sure, but I also still enjoy that the web is a network of linked text documents, for which it's nice to have tools to add and remove structure.
the data is meaningless without the structure
That depends on what the data is. If the data is prose, then the only structure you need, at a minimum, is spacing and punctuation. The web was designed for writing and linking prose-y documents, and just because we've incremented it into an app platform doesn't mean the original prose-y purpose is any less valid, no?
🍰 2
Gustavo Goretkin
Hello! This is my first message here. This thread makes me think of bidirectional transformations (see e.g. http://bx-community.wikidot.com/ ). You'd like to strip out some structure, not by throwing it away but by putting it on the shelf. You'd then like to do some modification on the less structured representation (editing the plain text file). And the dream would be to magically add the structure back in, with the help of the structure you put on the shelf. It's obviously an ill-posed problem in general, but as I understand it, the research in bidirectional transformations aims to characterize when the problem has a solution.

My opinion is that it is generally not possible to find one all-encompassing structure for your data that does everything you'd ever possibly hope for. Not to mention that there are countless existing structures you might want to interoperate with. For example, I could see myself arguing "well, if you had to remove structure to make the plain text more convenient to edit in your text editor, that's just a tooling issue. A better text editing tool would let you conveniently edit HTML files directly." Not only would I be requiring your text editor to be a lot more sophisticated (probably to the extent of solving ill-posed problems), I would also be failing to leverage the workflow and tooling developed over decades for editing plain text.
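To make the get/put idea concrete, here's a toy lens in TypeScript (my own sketch, not something from the bx literature; the `Lens` interface and `Heading` type are invented for illustration):

```typescript
// Toy lens: "get" projects a structured source down to a plain view,
// "put" rebuilds the structure from an edited view plus the original source.
interface Lens<S, V> {
  get(source: S): V;
  put(source: S, view: V): S;
}

// Example source: a heading whose attributes we shelve while editing its text.
type Heading = { level: number; id: string; text: string };

const headingText: Lens<Heading, string> = {
  get: (h) => h.text,
  put: (h, text) => ({ ...h, text }), // level and id come back off the shelf
};

const original: Heading = { level: 2, id: "intro", text: "Hello" };
const edited = headingText.put(original, "Hello, world");
// edited is { level: 2, id: "intro", text: "Hello, world" }
// Round-trip laws: get(put(s, v)) === v and put(s, get(s)) === s
```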
👍 1
Jack Rusher
@Jared Windover It really was a linked network of documents with no pretense of becoming an app platform back in the day. The first version of JS wasn't created until '95, Microsoft didn't add their own scripting thing (which was incompatible!) to IE until '96, and the first ECMA JS standard wasn't released until June '97 -- months after this RFC was published.
💯 1
Jared Windover
@Ivan Reese That is a good perspective. I think I'm frustrated by the conflict of the modern web building so much into individual documents rather than pushing the structure into the links themselves. You're totally right that the web can be used in a way that makes it perfectly valid to strip out the presentational markup to quote something. But I think most of the web (or at least most of what people experience on the web) would be made unusable by this sort of transclusion.

@Gustavo Goretkin Thanks for the link! I hadn't heard of that as an area of study. It's interesting that you comment on the possibility of an all-encompassing structure, because the main project I'm interested in is making a generic graph data structure usable as an application datastore. As soon as you have multiple overlapping schemas, that immediately brings in concerns about maintaining different (potentially mutually inconsistent) notions of consistency. I'll do some reading.

@Jack Rusher That's good context. I have this sense that if we had made browsers more powerful we could have avoided a lot of the js that gets written. An example I've been thinking about lately is authentication. Every website has its own login form, but browsers are (well, could be if digest auth were fully supported) totally capable of doing a login handshake with a server. Do you (or anybody) have thoughts on that?
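(To be concrete about the kind of handshake I mean: with nothing but HTTP auth headers, a server can lean on the browser's built-in login dialog. A minimal sketch below, using Basic auth since Digest support is the patchy part; the credentials and port are made up, and this is obviously not how you'd store anything real.)

```typescript
import { createServer } from "node:http";

// Sketch: let the browser's native login prompt do the handshake via HTTP auth.
// A request without credentials gets 401 + WWW-Authenticate; the browser prompts
// the user and retries the request with an Authorization header.
const expected = "Basic " + Buffer.from("jared:hunter2").toString("base64"); // toy credentials

createServer((req, res) => {
  if (req.headers.authorization !== expected) {
    res.writeHead(401, { "WWW-Authenticate": 'Basic realm="example"' });
    res.end("Authentication required");
    return;
  }
  res.writeHead(200, { "Content-Type": "text/plain" });
  res.end("Logged in without a login form or a line of page JS");
}).listen(8080);
```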
👍 3
Ivan Reese
browsers more powerful we could have avoided a lot of the js that gets written. An example I've been thinking about lately is authentication.
Pour one out for Mozilla Persona.
Jack Rusher
@Jared Windover In terms of "more powerful browsers", major projects from giant incumbents promising secure, network-distributed, platform-independent applications were developed in parallel to the modest and very much document-oriented WWW. For example, Oak was a language and virtual machine targeting embedded systems, developed at Sun starting in 1991, which was eventually released (with a billion dollar marketing campaign) under the name Java. Most people believed in the mid-90s that the kinds of things we do now with HTML/CSS/JS would be done with Java applets. At the same time, Bell Labs released what they built after Plan9, Inferno, which was an OS/language/virtual machine targeting the same space that (ahem) did not get a billion dollar marketing campaign.

As for authentication, we talked back in the wayback about using an open PKI-based system to handle auth on the web, but it didn't stick. All of the messiness you see today was grown very organically, sometimes by the unilateral action of one or another vendor and sometimes through standards-body design-by-committee, all of it etched in stone forever because of concerns about backwards compatibility. It's an unhappy accident of history that we ended up with everything bolted onto a document viewer. 🤷🏻‍♂️