Alan Kay on web browsers, document viewers, Smalltalk, NeWS and HyperCard (2021) (donhopkins.medium.com)
234 points by gjvc on Jan 8, 2023 | 269 comments


If our operating systems didn't blindly trust programs, we wouldn't need web browsers as a special category of interface.

You could just put your binaries up for the major OSs to access your data, and people could use them in a native context.

Alas, we're in the timeline without capability based security, where users can't just give a file to a program at runtime using a (powerbox) dialog.


Wouldn’t capability based security just bring you the current situation on mobile phones where most users simply approve access as they have no clue what is being asked, nor do they have real alternatives?


No: one possible alternative interaction would be the user handing the program a representation of an instance of the class of object requested. For instance, if the program wants a camera, it says "drag and drop a camera here". The user picks up a camera object - maybe the real camera, maybe a proxy for the real camera that will only work once to take a single photo, maybe a dummy camera, the app can't tell - and drops it on the region.
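
A rough sketch of that interaction in TypeScript (all of these names are made up for illustration): the app only ever asks for "a camera" as a parameter, and it has no way to tell which implementation the user actually dropped on it.

  // What the app declares it needs: just "a camera".
  interface Camera {
    takePhoto(): Promise<Uint8Array>;
  }

  // Held by the user's shell, never constructed by the app.
  class RealCamera implements Camera {
    async takePhoto(): Promise<Uint8Array> {
      return new Uint8Array(); // ...pixels from the sensor in a real system
    }
  }

  // A single-use proxy: works once, then the capability is spent.
  class OneShotCamera implements Camera {
    private used = false;
    constructor(private inner: Camera) {}
    async takePhoto(): Promise<Uint8Array> {
      if (this.used) throw new Error("capability already used");
      this.used = true;
      return this.inner.takePhoto();
    }
  }

  // The app's drop handler receives whichever one the user chose.
  async function onCameraDropped(cam: Camera) {
    const photo = await cam.takePhoto();
    // ...do something with the photo
  }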


So? Nothing really changes with this scheme, compared to "Allow camera access" banner.


It’s different in that with capabilities the choice is no longer allow/deny, but allow/deny/circumvent. The allow/deny choice makes it so that the user is incentivized to allow things they don’t want just to get access to the app. If I can’t use the app without giving it access to my location, then I’m incentivized to just give it access in order to get what I want. Capabilities let you programmatically customize how to manage resources, so you’re not forced to click through just to get to the app.

Furthermore, capabilities are more fine-grained. Today an app will ask for your location and you get to choose yes/no. If you say yes, it has full access to the GPS forever until revoked, sometimes in the background. Give it network access and it has that forever as well, and it can send anything to anyone (including the GPS data you just gave it access to.)

Capability based security would allow you to give it access to the GPS for a limited time, or to give it access to a sensor spoofing service if it really needs that data. If you give it network access it would be for certain times, or it could only send data to a whitelist of servers.

Moreover you can restrict the app from sharing granted capabilities to other apps. Or vice versa, you can delegate apps to work on your behalf and share capabilities with other apps, which you can later centrally revoke or restrict.

It’s really a better security regime than what the major operating systems have ossified around.
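
A minimal sketch of what time limits and revocation could look like, assuming nothing about any real OS API (TypeScript, names made up):

  // A location capability the user can attenuate (time-limit) and revoke.
  interface LocationCap {
    current(): { lat: number; lon: number };
  }

  function withExpiry(inner: LocationCap, validUntil: number): LocationCap {
    return {
      current() {
        if (Date.now() > validUntil) throw new Error("capability expired");
        return inner.current();
      },
    };
  }

  function revocable(inner: LocationCap): { cap: LocationCap; revoke(): void } {
    let live: LocationCap | null = inner;
    return {
      cap: {
        current() {
          if (!live) throw new Error("capability revoked");
          return live.current();
        },
      },
      revoke() { live = null; },
    };
  }

  // Hand the app a one-hour, centrally revocable view of the GPS:
  // const { cap, revoke } = revocable(withExpiry(realGps, Date.now() + 3_600_000));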


Unfortunately we know now it wouldn't have helped much in practice. Take your GPS example: you can spoof GPS on Android, globally or per-app. Of the apps that are actually useful, many (most?) of the ones that want/need location data will attempt to detect GPS spoofing and deny you service until you turn it off.

And lest you think you can work around this by rooting your phone, an increasing number of actually useful apps refuse to run on rooted devices, "because security", and Google is actively helping this by making available an API that streamlines those checks.

Capability-based security is a good idea, but like all security technology, it only helps the owner of a device, and who is the owner of a device in any given context is a contested topic, with platforms, vendors, manufacturers and users all having a different opinion about it.


How do they detect the GPS spoofing?


That’s way too cumbersome for a user, I wouldn’t use it. I simply want an app that I can trust. If I use it to send a picture, obviously it should have access to the camera. But it shouldn’t be able to send that image to a destination unrelated to my use case.


> I simply want an app that I can trust.

Sure, we all want that, but we’ve learned that trusting apps does not scale. Even if you like an app and trust it, it can be updated to break that trust. Capability based security puts you in control so you don’t have to rely on trust.

> But it shouldn’t be able to send that image to a destination unrelated to my use case.

Capability based security could do this, and you don’t have to trust anyone to implement that feature for you (or to not silently remove it via an update). You would create a capability that would require camera and network access. Any app that needs both resources could take such a capability and be restricted in where it can send photos.
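
For instance (an illustrative TypeScript sketch, not any existing API): the app is handed one object that bundles the camera with a pre-restricted network path, so it can upload photos without ever holding a general network capability.

  interface CameraCap { takePhoto(): Promise<Blob> }

  interface PhotoUploader { captureAndSend(): Promise<void> }

  function makePhotoUploader(cam: CameraCap, allowedHost: string): PhotoUploader {
    return {
      async captureAndSend() {
        const photo = await cam.takePhoto();
        // The only reachable destination is the one baked in by whoever minted this.
        await fetch(`https://${allowedHost}/upload`, { method: "POST", body: photo });
      },
    };
  }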


This grumpy old man made a video just for this reply:

https://www.youtube.com/watch?v=kFRy4IwUNog

Phones do not do capability based security. They could, but the OS doesn't offer that as a possibility; we get the low-rent knock-off imitation instead.


I share his sentiments.

We keep layering more and more gloop on the web. It will always be a game of whack-a-mole.

It's like Windows requires TPM, but then gives itself permission to snoop on your files. Or we have OAuth, but then GitHub is cracked to reveal the private keys. And so on and so forth, etc. ad nauseam. The clowns at Microsoft, Facebook and Google simply can't be trusted. They've proven that time and time again.

Security is better obtained "via negativa" (to quote from Taleb). Be parsimonious in what you allow. Keep things simple. If you don't really need security, then don't have it.


Cluelessness has been catered to to insane degrees in the name of sales, and users are now used to just clicking through whatever pops up until they see the thing they want to see. But if (1) they'd been properly taught, (2) apps asked only for the permissions they need instead of acting like all your data are belong to them, and (3) people actually read those damn pop-ups (which they might have if there weren't so many)… capability based security is actually fairly easy to understand.


A $5 bill in your wallet is a capability. Capability based security is like handing someone $5, not eternal access to all the funds in your wallet.

Phones don't do capability based security. There's no way to say, here... you can have access to THIS photo, or this location, once, in an easy way.

On PCs, you have to trust the application to do what it says when it uses the system supplied dialog boxes to "open" files, etc. In a capability based system, that's not the case. The OS never trusts the application, ever.
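
For what it's worth, the web's File System Access API (currently in Chromium-based browsers) already has roughly this shape: the page never sees a path or the filesystem namespace, only a handle to the one file the user picked in a browser-owned dialog. A sketch:

  async function openUserChosenFile(): Promise<string> {
    // The picker UI belongs to the browser, not the page; the page just
    // receives a handle (a capability) for the single file that was chosen.
    const [handle] = await (window as any).showOpenFilePicker();
    const file = await handle.getFile();
    return file.text();
  }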


In iOS you can grant apps access to a limited set of photos, though the process is multi-step and most users probably just default to "allow all".


iOS does this: you can grant permission for a capability always, only while you use the app, or only once.


macOS is doing this.


This does not appear to be the case. There is no reference to single-file, run-time assignment of capabilities on their security page [1].

"Give X permission to access Y forever" is not capability based security.

[1] https://www.apple.com/macos/security/


There is no alternative. Mobile devices and browsers are equivalent to a locked down mode where the user is only allowed to approve a restricted set of capabilities.


It's on them if they're dumb or lazy enough to do that.


Actually all operating systems support PowerBoxes these days. It's an automatic part of the macOS Sandbox, it's a part of the Flatpak sandbox, and there's an equivalent for Windows too. You can also implement this yourself the way browsers do, because every OS supports the sending of file descriptors and handles to other processes, and they all support kernel-level sandboxing now.


For people who (like me) have never heard the term "powerbox", it's this: http://wiki.c2.com/?PowerBox


>Alas, we're in the timeline without capability based security

Mobile operating systems (aka modern personal computing) say hello.


No, that's not the same.

There is no capability security on the OS level.

Otherwise Android wouldn't have a malware issue and iPhones wouldn't need to be locked up against third party apps.


Yes, it is. Try and read GPS data without asking the operating system for that capability. Even if you call to native code there is no way to trick the operating system into giving you that data. Malware can't get it either because the security IS on the OS level.

>Otherwise Android wouldn't have a malware issue and iPhones wouldn't need to be locked up against third party apps.

Android malware is not as bad as what is possible with Windows malware. Over the years Google has been cracking down and working to remove permissions that apps can use without the user knowing. Unless the app has an exploit for the operating system, it can't do much harm to you. iPhones have allowed for third-party apps since the very first iPhone a decade and a half ago.


> Try and read GPS data without asking the operating system for that capability

Generally speaking, capabilities are not asked for; that would be a permission.

Capabilities can be used to implement permissions, but that's not what is usually meant by 'capability-based security'. Capabilities are essentially just inputs; a GPS-reading capability would be anything we can read GPS data from (e.g. a callback; or a pseudo-file handle; etc.).


Capabilities and permissions are the same thing. The app needs a capability or permission in order to use an API. The whole point is that an app should have permissions or capabilities for only the minimum set of things it needs in order to work.


Capabilities are needed to "do" things; e.g. the URL of a document is a capability to GET it (via HTTP, or whatever). Without a URL (capability), performing a GET doesn't even make sense.

Security is enforced by choosing who is given a capability (like a URL); we can get extra security by minting new capabilities, and revoking those which should no longer have access.

Permissions are not required to "do" things; they are enforced elsewhere, to determine whether that thing should succeed or fail. For example, giving out URLs to search engines, then trying to have the server reject requests that we don't want (AKA 'closing the barn door after the horse has bolted')
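
A tiny "capability URL" sketch in TypeScript (the storage and names are made up): possession of the link is the access right, and control comes from minting and revoking links rather than from checking who the requester is.

  import { randomUUID } from "node:crypto";

  const grants = new Map<string, string>(); // token -> documentId

  function mintCapabilityUrl(documentId: string): string {
    const token = randomUUID();           // unguessable, so unforgeable in practice
    grants.set(token, documentId);
    return `https://example.com/doc/${token}`;
  }

  function revoke(url: string): void {
    grants.delete(url.split("/").pop()!); // the link simply stops working
  }

  function resolve(token: string): string | undefined {
    return grants.get(token);             // no ACL lookup; the token is the grant
  }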


Android's Binder seems to be exactly what you are talking about. A Binder object is a token for something you can send transactions to and receive transactions from. You use Binder to talk to LocationManagerService, which lets you get the location. Android has a service manager and lets any app get the LocationManagerService. LocationManagerService itself does do the permission checks.

The operating system uses both capabilities and permissions.

Regardless, this is all implementation details and for the user it is equivalent. There is the same principle of least privilege where apps are limited in what they can do.


> The operating system uses both capabilities and permissions.

No, that's still wrong.

The Android framework uses something that goes a little bit in the direction of capabilities.

But the Android framework / runtime runs (by now) on an OS (Linux) that is not capability safe.

> Regardless, this is all implementation details and for the user it is equivalent.

No, of course not.

On a capability secure OS your mouse driver can't read the disk or make network requests. Even if you found an exploit in the driver.

That's a completely different level of security!

It's a fact that there is still no capability secure OS out there in broad use. (Fuchsia could be the first that brings this concept into mainstream. But this will take time. Also Fuchsia isn't as strict as it could be, for different reasons).


> Capabilities and permissions are the same thing.

No, those are two fundamentally different concepts.

A capability is something that needs to be held and passed along with every call. If you can't show the needed capability (because you don't own it, where owning means literally holding a special kind of object in your hands) you can't perform the action. At the most basic level: you just can't call a function, because it needs a parameter of capability type.

This is fundamentally different from permissions. Permissions are checked by the called code. You don't need to "pass a permission" alongside the actual action you try to perform.

Please at least read the Wikipedia page, as you seem to have some severe misconceptions about this topic: https://en.wikipedia.org/wiki/Capability-based_security
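
To make the distinction concrete, a minimal TypeScript sketch (hypothetical types):

  // Capability style: the function cannot even be called without being handed
  // the resource. There is nothing to "check" later; no capability, no call.
  interface GpsCap { read(): { lat: number; lon: number } }

  function whereAmI(gps: GpsCap) {
    return gps.read();
  }

  // Permission style: the code reaches for ambient authority, and the callee
  // (or the OS) decides afterwards whether that access should succeed.
  const ambientGps = { read: () => ({ lat: 0, lon: 0 }) };
  const callerHasPermission = (_perm: string) => false;

  function whereAmIWithPermissions() {
    if (!callerHasPermission("location")) throw new Error("denied");
    return ambientGps.read();
  }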


You are getting too hung up on the implementation details.

Capability based security means that applications are very limited in what they can do by default and need to be given capabilities in order to do something. Yes, one way to give an application a capability is by representing that capability by a token, but another way to give an application a capability is by giving it a permission.

If you want the term to refer to a specific implementation, ignoring that it is identical from the end user's perspective, then I'm fine with conceding.


We are using "capability" as a term of art here. Frankly, I'm not sure there is even a colloquial meaning of "capability-based security". Approximately no-one uses that phrase unless they're talking about capability-based security, the specific approach to managing access. In the same way that no-one calls functional programming "object-oriented programming" because "objects are just anything that exists and functions exist".


Dude, what's your problem? Are you trolling?

> A capability (known in some systems as a key) is a communicable, unforgeable token of authority.

That's verbatim the second sentence on Wikipedia!

That's not "an implementation detail", it's the core of the whole concept.

> Capability-based security is to be contrasted with an approach that uses traditional UNIX permissions and Access Control Lists.

> Although most operating systems implement a facility which resembles capabilities, they typically do not provide enough support to allow for the exchange of capabilities among possibly mutually untrusting entities to be the primary means of granting and distributing access rights throughout the system. A capability-based system, in contrast, is designed with that goal in mind.


> Try and read GPS data without asking the operating system for that capability.

That's not capability-based security: you have to do that whichever model is in use. The ocaps come in when you use dependency injection (aka "function call") everywhere to pass in rights to make use of resources. Neither Android nor iOS are meaningfully object-capability based.


Yes, it is. Capability based security by definition just requires that apps have a list of capabilities and those capabilities govern what APIs / resources they have access to. There is no requirement of dependency injection. That is an implementation choice.


> just requires that apps have a list of capabilities and those capabilities govern what APIs / resources they have access to

No, that would be mandatory access control. Completely different concept.


You seem to be getting caught up in the implementation details. The whole point is that programs shouldn't be given overly broad access to the system. Access can be granted to programs by giving them capabilities. Whether that's done via tokens or something else doesn't matter.

Please remember the first comment:

>Alas, we're in the timeline without capability based security, where users can't just give a file to a program at runtime using a (powerbox) dialog.

This is literally how Android already works today. An app can not read a random file from the system. It has to open a file selector and you have to select a file before an app is able to read it.


> This is literally how Android already works today. An app can not read a random file from the system. It has to open a file selector and you have to select a file before an app is able to read it.

That's not implemented on the OS level.


>That's not implemented on the OS level.

Yes, it is. The Storage Access Framework is a part of the operating system.


You can fake GPS data. Also, capabilities are far beyond the Android and iOS model.


I don't understand your point. In capability based security nothing prevents people from making up fake data if they don't have the capability for something.


> In capability based security nothing prevents people from making up fake data if they don't have the capability for something.

That's completely wrong.

The whole point of capabilities is that they're unforgeable.

You can't mint a capability! You need to get it passed "from above". Nobody can create capabilities from thin air (not even the OS kernel).
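
In a memory-safe language you can sketch the unforgeability like this (illustrative only; in a real system the mint is held by the kernel or loader, not by app code):

  class GpsCapability {
    // Private constructor: the only way to hold one is to be handed one.
    private constructor(private readonly device: () => { lat: number; lon: number }) {}

    read() { return this.device(); }

    // Only the trusted platform layer would ever call this.
    static mintRoot(device: () => { lat: number; lon: number }): GpsCapability {
      return new GpsCapability(device);
    }
  }

  // Elsewhere, `new GpsCapability(...)` is a compile error. You can fabricate
  // { lat, lon } data all day long, but not a GpsCapability object.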


I do not follow your logic. Anyone can define a data type { latitude: 100, longitude: 100 }. The operating system is not going to stop you from creating this object.


Such a made up object would not be accepted by other parts of a capability secure system as it would lack the actual capability (which is unforgeable).

But I'm coming to the conclusion that you're only trolling. You refuse to look at the facts and definitions, and just repeat the same nonsense over and over. I'm done here as I obviously can't help you understand this topic.


SeL4 uses capability-based security, but unfortunately it's not in common use.


seL4 has a different "issue": The actual OS is missing… ;-)

seL4 is just the core of a micro kernel. The hard part would be to build an OS around that.


Have you considered that an approval process is one aspect of a capability solution?

You know, the same way that you ask for code review before merging to prod.


No, I didn't, as that makes no sense; it's not what this technical term describes.

I've linked Wikipedia on that topic already in this thread. Please take a look.


Not in the direct ancestors of this comment you haven't. When referring to cousin comments like this one https://news.ycombinator.com/item?id=34306946 it's generally a good idea to put a link to them instead of having us dig through the whole discussion (I literally had to [Ctrl-F] your handle).

And for bystanders, here's the Wikipedia link: https://en.wikipedia.org/wiki/Capability-based_security


Thanks for the pointer!

My reasoning was that it would be rude to link to myself, to a comment that at the time showed up on the very same screen as this one.

But I guess you're right. A link would not hurt, and it would make the mentioned content easier to discover.


I don't understand.

What is the security model you propose?

With at least two examples, please. To help me understand.


Maybe as a starter:

https://en.wikipedia.org/wiki/Capability-based_security

For examples have a look at the "Implementations" section.


Example 1

Take $5 from your wallet to pay for a sandwich. That bill is a capability.

You can't accidentally give away your car when you do it. You can easily delegate the giving of the $5 to someone else safely.

Example 2

Take a file on a floppy disk for an IBM XT and make a backup of it. You can hand that backup disk to anyone and your original is safe.

You can use that file with any random shareware disk from your user group meeting and in no way could it cause you any more loss than the disk it is on.

A disk is a capability.

Example 3

You are asked your location. You reply correctly.

The answer does not allow you to be tracked forever.

Example 4

You have a 15 amp outlet. You can't take down the power grid by accident, nor burn down the house. The outlet is a 15 amp capability.


When you download a program from the internets, it does not have access to any of your stuff. Can't read files. Can't make network requests. Can't access cameras or microphones.

You explicitly give it permissions to access something: give it a document you want to edit, allow it to save a result at some location (without giving it an idea what else exists at that location), allow it to connect to certain hosts on the internet, but not others, politely decline a request to use your camera, etc.

Very much like you do with a web app running in your browser.


Another example: you have a printer, a printer server, and N clients connected to the server. You send a capability to the clients that grants them usage of the printer. That capability allows them to print at certain time in black and white only, but the marketing department gets a capability that allows them to print in color.

Your customer comes to the office and wants to use the printer while he's there, so you use your capability to send them a capability that will give them access to the printer (b/w only still; they can't get color because you can't get color) for an hour. However, their capability is restricted such that they will not be able to grant anyone else access to the printer.
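
A sketch of that delegation chain (TypeScript, hypothetical types): each capability wraps a narrower one, so the guest's one-hour, b/w-only grant can never exceed what you yourself were given.

  interface PrinterCap {
    print(doc: string, opts: { color: boolean }): void;
  }

  function blackAndWhiteOnly(inner: PrinterCap): PrinterCap {
    return { print: (doc, _opts) => inner.print(doc, { color: false }) };
  }

  function expiring(inner: PrinterCap, millis: number): PrinterCap {
    const until = Date.now() + millis;
    return {
      print(doc, opts) {
        if (Date.now() > until) throw new Error("printer access expired");
        inner.print(doc, opts);
      },
    };
  }

  // const guestCap = expiring(blackAndWhiteOnly(myCap), 60 * 60 * 1000);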


It's moving that way. Even WASM is expanding out of browsers with WASI.


Only to bring us back to Unix legacy APIs.

Don't get me wrong, WASM is a nice idea with some nice properties. But WASI isn't a real improvement to the status quo in general.


Not yet, at least. But we're working on that :-).


Honest question: How can this work out if the legacy APIs are still at the bottom?


It depends on what you mean by "at the bottom".

If you mean at the bottom of the Wasm stack, then the answer is: those won't be legacy APIs. The direction we're heading is to provide POSIX compatibility as an emulation layer on top of a cleaner foundation, rather than just doing POSIX at the base layer.

If you mean that all Wasm engines today are implemented on top of traditional operating system APIs, then yes, that is how things will often work, but that's ok. What really matters is how the virtual platform works. We don't have to expose things like "the filesystem namespace" directly to wasm, even if it's present in the host. And if we don't expose "the filesystem namespace", then we don't have the associated problems, even if the underlying host has them.
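
As a small illustration of "no ambient filesystem namespace": a WASI host only preopens whatever directories the embedder chooses to hand in, and the guest's path space starts and ends there. Roughly, using Node's built-in WASI host (option names can differ slightly between Node versions):

  import { readFile } from "node:fs/promises";
  import { WASI } from "node:wasi";

  const wasi = new WASI({
    version: "preview1",
    // The guest sees "/data" and nothing else; "./shared" is all we granted.
    preopens: { "/data": "./shared" },
  });

  const wasm = await WebAssembly.compile(await readFile("./guest.wasm"));
  const instance = await WebAssembly.instantiate(wasm, {
    wasi_snapshot_preview1: wasi.wasiImport,
  });
  wasi.start(instance);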


> The direction we're heading is to provide POSIX compatibility as an emulation layer on top of a cleaner foundation, rather than just doing POSIX at the base layer.

What's that "cleaner foundation"?

Next thing is: POSIX is the legacy stuff. If it's there nothing will change.

And not even to mention that the stack is going to look something like:

Hardware >> Hardware emulating some ISA (e.g. x86) >> C abstract machine >> POSIX >> VM emulating some ISA >> "container" emulating POSIX >> WASM VM >> emulation of POSIX APIs >> legacy applications written for some ISA & POSIX

I call this construct "layers of madness"… ;-)


That cleaner foundation is shared-nothing linking, capability-based security, virtualizable APIs, and more, and a WASI organized around things like streams as first-class types.

The goal is to build a new platform. Initially, that looks like adding layers on top of existing platforms (which, as you say, already have multiple layers). If we succeed, then we get to start taking out some of the layers.


WASM was designed to be used outside of browsers from the beginning.


Fuchsia is incredibly interesting for this reason.


How so?


Because Fuchsia is exactly a capabilities-based OS as the OP describes.


If Fuchsia ever appears as an end user operating system, this might be true.


These comments go hand in hand with another observation he made in a talk of his (or several): how the first books that were printed on the printing press resembled the books before the printing press. It took people decades to figure out they could print smaller, easily portable books, because books no longer had to be viewed as these precious hand-made things. In the same way, the web is an imitation of a paper document. There are some improvements (you can now, e.g., read the New York Times from anywhere on the globe), but overall it could do so much more than that.


> It took people decades to figure out they can print smaller, easily portable books because they did not have to be viewed as these precious hand made things.

This is wrong.

The first thing Gutenberg printed after the prestigious Bible (which not only resembled previous Bibles, but surpassed them all in respect of the regularity of the printing types) was the Donatus, Aelius Donatus' Latin grammar book "Ars minor". This was a standard textbook of the time. It was short (approx. 30 pages) and extremely popular: there were at least 24 editions published up to 1468.

From 1454/1455 we have indulgence certificates, which show that the printing press was immediately used for cheap mass production, where "precious hand-made things" were of minor importance.

The next things Gutenberg printed (1455/1456) fell into the "news and political pamphlet" category: a 6-page Calendar Against the Turks ("Eyn manung der cristenheit widder die durken") and a 25-page German translation of a papal bull, also against the Turks.


I think the parent is misremembering Kay’s point a little.

It’s not that people didn’t catch on to the idea that the printing press could make books cheaper/more portable because they weren’t so laborious to create. That much was obvious about its utility.

Kay’s point as I remember it was that, at least initially, automating previous tasks was all they wanted to do with it. They got this great new machine that opened up tons of possibilities, but all they used it for was to do the same thing they were doing before, only faster. But we all know that rather than saving some monks some time, what the printing press really did was fundamentally transform society. That’s what people missed, and it’s something people keep missing about transformative technologies, even today.

In his book Mindstorms (which every programmer needs to read), Seymour Papert makes the same observation about computers when they were introduced in classrooms; he says that instead of unlocking the full potential of those machines, educators just used them to automate activities they were doing before - times tables, essay writing, drawing graphs, etc. Very few people realized the potential at the time, and most still don’t.


I love it when plausible-sounding factoids are dismantled by people who actually know history.


I'll try to link to several of Alan's talks [1] referred to by localhost on the first books and the printing press. This first one [2] is too short; there are more detailed versions. Talk [4] has different slides than I remember.

[1] https://tinlizzie.org/IA/index.php/Talks_by_Alan_Kay

[2] https://vimeo.com/6606439

[3] https://vpri.org/writings.php

[4] https://tinlizzie.org/IA/index.php/Alan_Kay_at_Xerox_PARC_(1...


I actually find the arguments to be stronger for the opposite lesson. Some examples of successful designs from the physical world that are also successful in purely digital interfaces:

How smart watches display the time, VST instruments (e.g, https://mixedinkey.com/captain-plugins/wiki/best-vst-instrum...), ebook "pages", on-screen keyboards, visual-coding environments that use a "cables" metaphor (e.g., https://twitter.com/robenkleene/status/1280182525013475330).

I actually think the lesson here is that if you can take a metaphor from the physical world, and can incorporate its essence into your digital design, it'll probably improve your design. And I don't think the reason is that we're leaning on our familiarity with the physical world (too many purely digital designs have proved successful for that to be the case). I think the reason is simple: our brains are evolved for the physical world, and we've had thousands of years to master building things with physical materials, so the human race has created truly fantastic interfaces in the physical world. If we can bring the essence of those designs into the digital world, we should.


A lot of the design constraints remain the same. For example, all three of hand-crafted books, printed paper and web pages need a specific optimal line width.


Browsers wrap lines automatically, so line width can be chosen by the reader, and even varied depending on the situation (the same applies to font, size, colours, etc.).


Ideally the reader should still not be bothered, and sensible defaults should be applied (even if they are dynamic defaults).

Not to mention that readers left to their own devices tend to choose stupid, unoptimized color schemes and font sizes all the time, hurting their readability and slowing their reading...

(Of course for disabled readers it makes sense to be able to set whatever suits them, but they also need sensible defaults for their case, perhaps even more so than abled readers, as it can be more difficult for them to change configurations.)


> Not to mention that readers left to their own devices tend to choose stupid, unoptimized color schemes and font sizes all the time, hurting their readability and slowing their reading...

... thus reducing their exposure to our content marketing texts and/or ads embedded on the page, thereby impacting our bottom line!

I think it's safe to assume most people capable of choosing "stupid unoptimized color schemes and font sizes" are also capable of iterating on their choice, or reverting to the better defaults, when they feel the reading experience of things they want to read worsens.


>I think it's safe to assume most people capable of choosing "stupid unoptimized color schemes and font sizes" are also capable of iterating on their choice, or reverting to the better defaults, when they feel the reading experience of things they want to read worsens.

You'd be very surprised


Aren't Single Page Applications implementing most of the ideas? Sending programs (JavaScript) and objects (JSON). I think we've already moved to very different formats than print, for better or worse, i.e. endless streams vs. a limited paper resource.


In the 80s/90s era that these ideas are rooted in, an "object" means both code and data fused together. In this worldview JSON is not a way to send objects despite the name, it's a way to send structs. In the sense Kay means, sending an object implies receiving a thing that implements one or more interfaces, on which you call methods/send messages, and then stuff happens. The data held by the object is encapsulated.

What Java was trying to do with RMI, classloaders, etc was a lot closer to this vision because in that mechanism you could actually receive over the network a full object with both code and data, in which the actual underlying data structure was something you'd never seen before but it didn't matter as long as it implemented an interface you recognized, because the code to process the data came with it. This vision hit a number of practical engineering problems, most obviously security and less obviously the pervasive assumption of flat fast LAN-like networks that the CORBA/RMI/DCOM/etc people kept making.

The web won out over this vision partly because it was simple enough to reimplement. When Tim Berners-Lee didn't have the resources to take it to the full potential (not even having proper inline images) NCSA Mosaic took over, then Netscape took over from them, then Microsoft from Netscape, then Chrome/Safari from Microsoft. The history of the web is basically a history of failed business models up until Google decides to just subsidize the whole thing in order to drive search traffic.


Only to have everyone now use REST and gRPC for the same kinds of purposes.

The only major difference between what I was doing with RMI/CORBA and DCOM and now, is the tooling and the protocols.

Everything else, IDL files, schemas, boilerplate generation, client SDK to wrap boilerplate code,.... is just the same story with a different dressing.


Well yes, RPC systems are nothing but tooling and protocols so if those are the things that are different then everything changed ;)

REST works because it's stateless message passing (no object state), supports one-way connectivity (no dialback), the tooling was much better (no equivalent in RMI/CORBA/DCOM to a transparent cache or a load balancer), HTTP/1.1 is easy to understand and implement (or looks easy at least) so every language can get on the stack very fast, it has auth systems that work, it's more layered/incremental because it leaves language and type system mappings to higher layers, and so on. Lots of valid reasons.


Stateless message passing with stateful servers.... :)


The core of the web is a system for displaying a static page with one or two images in a single rendering pass.

SPAs etc. are insanely complicated and complex attempts to bend this into something it's not, and can never be.

The reason people are so excited about WebGL and upcoming WebGPU is that they will be freed of constraints of this model (but forget about the constraints of doing low level graphics)


and forget about the constraints of legally mandated accessibility.


Most countries still don't have those kinds of legal mandates.

In all these years, accessibility has only been part of UAT for government-related projects.


https://ec.europa.eu/social/main.jsp?catId=1202

Sure, it's true that most countries only mandate it for government-related projects, but that doesn't mean that there aren't enough important markets where that isn't the case.


EU directives are to be adopted as laws by the respective members, being a directive alone is not enough.

As per experience, anything that isn't legally binding usually ends up not being part of project requirements.


The history of the web is littered with attempts to build more generalized engines for safe rich media / app execution. Microsoft silverlight, java applets, adobe flash etc.

These attempts ran into problems around openness (one company did not want to promote another) and friction with resource contention of the host operating system. For example rich web media needed to be tightly coordinated with iPhone development for performance concerns and to force publishers to retool for mobile experience.

I don't think we have sacrificed so much in possibility once you weigh the constraints of the time against what is possible today. Device monopoly and the app store becoming what they became is probably not something the open web could have prevented even if there had been a more holistic vision for web tech up front.


>attempts to build more generalized engines for safe rich media / app execution

Again, the inability to run untrusted code thwarts progress.

Imagine if you had to grant access to your wallet forever to spend a $5 bill. Cash would never become a medium of exchange.

We need general purpose computing that gives no access by default to anything. Just like me and my wallet.


It was inevitable; what people generally want is to be able to interact with another node on the network in a visual way. The fact that browsers started as simple HTML is only due to technology limitations.

If the web was being made from scratch today, it would probably start with a repurposed game engine.


> If the web was being made from scratch today, it would probably start with a repurposed game engine.

I still hold that reinventing the underlying operating systems' text rendering and text input is almost always a bad idea. Some things this "game engine" would need to work with the underlying operating system to solve:

- Accessibility (screen readers, etc)

- OS-specific text kerning behaviour

- Ligatures (including for Arabic and Korean text)

- IME for Korean / Chinese / Japanese / etc.

- Right-to-left and left-to-right language support.

- Input events on every platform. There's about 20 keyboard shortcuts to interact with text on windows, macos and linux. They're different on every platform. iOS / Android are different again.

- Native scrolling

There's probably more. It's an obscene amount of work to reimplement this stuff correctly on top of a game-engine-like API like Canvas. The browser already does almost all of this work, and despite that it's still a ridiculous hassle to implement web-native rich text editor components. (Though we may disagree on why.)

As a developer, there are a lot of advantages to having your application behave in an identical way on every platform. But as a user, I don't actually want that. I want your application to fit in with the rest of my operating system. (Be it windows / macos / linux+GTK / iOS / Android).

Mind you, Raph Levien is smart and as I understand it, he disagrees with me on this. Here's a video of him talking about it: https://www.youtube.com/watch?v=zVUTZlNCb8U


These things are hard, and I don't want to discount any of it. But three observations.

First, browser engines do reimplement most of this stuff, though that's been an evolution; it used to be that they relied a lot more on platform text layout, for example. And for the stuff that's not completely reinvented (accessibility), they have put in a lot of work to make cross-platform abstractions. In many cases (interfacing with the compositor comes to mind), browsers are the only viable open source code bases you can read.

Second, to a large extent OS platforms are stagnating. UWP was going to be a big advance over the old WinAPI ways of doing things, but it ended up being a dud. SwiftUI has performance problems for a number of reasons, including the fact they're still using software rendering for a lot of stuff, and their compositor model requires allocating huge amounts of memory for intermediate textures for all their UI elements.

Third, the business structure around platforms requires them to each do things differently, which creates gratuitous incompatibility. You really only need one high quality library to do, say, text layout, rather than having slightly different and slightly incompatible versions on each platform. (And very few places are gratuitous incompatibilities more frustrating than GPU infrastructure)

So for these reasons, I do believe it's worth exploring a new cross-platform GUI toolkit. It is pretty speculative, though; there are lots of ways it could fail.

Game engines are decent at a lot of this stuff, of which of course GPU infrastructure (and graphics rendering in general) is quite good. Most of them could use work in the text department though :)


I think there’s tremendous benefit to what you’re proposing. The future I’m the most worried about is where every application ships its own bespoke, slightly wrong text layout and rendering engine. We see this already in video games and it’s a mess. (Which games allow you to copy+paste in and out of their chat system? It’s a crapshoot!)

The amount of work to do this well is high. Probably (easily) measured in engineer-years. All to reimplement something that - janky as it may be - already exists in some form in web browsers. But as you say, worth it if we can pull it off. But it’s a common good. The hardest part in my mind would be figuring out how to fund a project like this.

My ideal application platform isn’t a video game engine. It’s probably something like Electron, but with most of the browser stuff stripped out. (Dump JavaScript, the gamepad API, MIDI, USB, Bluetooth, tab isolation, and probably the DOM. And use Houdini or something instead of CSS.) And then I’d happily swap the rendering for piet-gpu or whatever will perform the best on modern platforms.

Basic desiderata: small and light, fast, cross-platform, and platform-native look and feel everywhere. Accessibility, IME, localisation, RTL and LTR language support, and so on.

Mobile is still a big question. But solvable if sufficient engineer time was poured into it. It’s just a big project.


How would a new cross-platform UI toolkit be meaningfully better than e.g. Jetpack Compose or JavaFX?


To me, the single biggest opportunity is performance. There's a lot that can be done if you optimize for that: plumb incremental computation through the pipeline from the app logic all the way to the GPU, run your logic in multiple threads, and just generally minimize the work done. I gave a talk[1] which goes into more detail.

Now, I freely admit that many applications don't need that much performance. But I think having a toolkit that is optimized along those lines could be a good basis for doing work on the other things (developer experience) that make a toolkit great.

[1]: https://www.youtube.com/watch?v=zVUTZlNCb8U


Thanks Raph! I'll watch your talk.

I think the Jetpack Compose guys have plans to multi-thread composition, and they already do it incrementally.

JavaFX has a reactive/observable properties framework where you can build chains of lazy computations which update only the part of the UI that needs to be changed, this is then propagated to a parallel render thread that runs concurrently with the app logic. They originally wanted to further parallelize the render thread via tiling, but never did so.

Performance problems obviously still occur, usually when trying to do complex layouts with lots of measurement.


> But as a user, I don't actually want that

Depends on the users I guess. Mine are really adamant that they want the exact same app no matter the OS. The OS is just a tool to launch actual productivity apps.


Alan Kay has been thinking and talking [1] about a game-engine-based web, called Croquet [2], for more than 20 years, and they are just now launching into the open WebXR world because, like James Cameron waiting decades for the film tech to catch up with his vision of Avatar, the delivery of open 3D immersive experience is becoming possible. And that will affect all the old underlying design assumptions for things like the Dynabook required by real-world physical constraints.

[1] https://www.youtube.com/watch?v=uQTeWJNkylI

[2] https://www.croquet.io


There is much better documentation on Croquet [2] and its many forks (Open Cobalt, Teleplace, OpenQwaq, 3DICC) and I have many more video demonstrations of what this massively scalable collaboration environment and virtual unlimited desktop GUI could do [1]. It is still alive and being worked on. You can contact us and get involved [3]

[1] https://www.youtube.com/watch?v=1s9ldlqhVkM

[2] https://scholar.google.nl/scholar?hl=nl&as_sdt=0%2C5&q=alan+...

[3] morphle at ziggo dot nl


The Smalltalk based OpenCroquet was interesting. This might be too, but if so it is hidden well under the buzzword bingo marketing.


My hope is that the engine Godot could fill this role as well. It is lower level, but it's also nice that it's a full engine designed for general game use cases.


First I thought this would be super cool.

But then I read "Metaverse" on the front page. So it's something like Web3, I guess.


More Web3D than Web3. They are building an Immersive PaaS and you can do with it what you want. Please don't let crypto own that word but maybe that ship has sailed for now.


The point is: I'm quite skeptical about this new Second Life fad.

It's a little bit off-putting when someone jumps on the "Metaverse" train.

But OK, maybe I should have a second look despite this marketing stunt.


The fact that browsers started as simple HTML is only due to technology limitations.

It's not clear that this is true. Smalltalk and Hypercard existed before the Web. The Web started with static HTML because its vision was based on static documents, not apps.


This is exactly right. HyperCard and HTML approached the problem from opposite ends, HyperCard being a GUI app development tool and HTML being a way of joining together text documents.

That said, a network-enabled HyperCard would have been a security nightmare. It was way too complex to secure given the coding practices of the time. Heck, even "simple" web browsers had a reputation as a security weak point in the early days. There were no end to people who thought integrating the web browser with the OS was pure lunacy that was going to result in an endless string of compromised machines. Luckily "live desktop" and the like ended up being such weak features that they didn't have quite that much impact.


> There were no end to people who thought integrating the web browser with the OS was pure lunacy that was going to result in an endless string of compromised machines

Fortunately OS and web browser developers didn't really consider that to be a big problem so they did it anyway and implemented a system that downloaded random code from the internet and happily executed it in a non-secure sandbox on top of an OS filled with exploitable security flaws. See: Java, Flash, JavaScript, webasm, PDF, canvas, etc.. Not that it mattered too much anyway since media and HTML renderers were already exploitable.


Right but with "containerization" it could have been secured earlier.


Containerization is easier said than done, especially with no hardware support. Ultimately you have to make compromises to keep it performant (remember this is on 386/68030-class hardware) and those compromises come back to bite you. Containerization also has a memory penalty, and memory was precious back in those days. Remember that early web browsers were heavily criticized for running poorly on less than 8MB of RAM.


This exactly. Isolation is a very hard problem, and it's even harder when you're running on top of hardware and operating systems that prioritized features and speed over security for decades - certainly due in part to users who had similar priorities and/or were unwilling to pay more for security and reliability.


> If the web was being made from scratch today, it would probably start with a repurposed game engine.

I came to the exact same conclusion.

Modern browsers work like a game engine anyway, as that's the only way to be fast with rich interactive content.

https://hacks.mozilla.org/2017/10/the-whole-web-at-maximum-f...

We would just need to get rid of all the HTML / HTTP baggage (JS as such is OK-ish as a scripting engine, but could of course be any other VM too; and CSS has great utility everywhere, so it's not web-specific). Then switch to some sane RPC protocol for networking (as all the "RESTful" stuff today is anyway only badly made RPC in disguise). Such client-server protocols are actually already available in game engines, because a lot of online games need super efficient real-time communication. Efficient content distribution and client self-updates are also part of game engines.

I start to consider Godot as an application framework, to be honest.

A usual (completely self-contained) GUI app with networking is only a few MiB, is super snappy, and does not need many resources at runtime. The engine and its editor are really nice to work with. You can finally click together a UI with the editor again. Like in the good old days.


> Then switch to some sane RPC protocol for networking (as all the "RESTful" stuff today is anyway only badly made RPC in disguise).

RPC is a dumb protocol because it tries to abstract away the necessary aspects of "remote" as if they don't exist.

REST APIs done correctly are great because they expose the possibility of network failures and partitions, allow for idempotency (GET doesn't change stuff), caching and all of the other things that the architectural style offers.

RPC doesn't have that.


> RPC is a dumb protocol

RPC is a broad concept, even farther from a protocol than REST, which is an architectural style.

> because it tries to abstract away the necessary aspects of "remote" as if they don't exist.

Generally, it does not. There are some languages/frameworks/other technologies that have been promoted on the basis of transparent RPC or distribution across machines, but that’s not the normal case of RPC.

> REST APIs done correctly are great because they expose the possibility of network failures and partitions, allow for idempotency (GET doesn't change stuff), caching and all of the other things that the architectural style offers.

More accurately, HTTP has all that, and REST layered over HTTP gets it for free from HTTP, while many RPC approaches over HTTP (though this is not inherent in “RPC”) would, e.g., tunnel all actions over POST and reinvent the distinctions between safe/idempotent/neither operations, if they are handled at all, within the RPC protocol, and would require a specialized caching mechanism for things that are cachable, rather than HTTP caches. Or, worse, do the same thing but tunnel over GET.
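
A small illustration of the contrast (assumed endpoints, plain fetch):

  // REST-ish: the method carries the semantics, so caches, proxies and retry
  // logic can act on a GET without knowing anything about the application.
  await fetch("https://api.example.com/orders/42");                       // safe, cachable
  await fetch("https://api.example.com/orders/42", { method: "DELETE" }); // not safe

  // RPC-over-HTTP as commonly practiced: everything tunnels through POST, and
  // the real semantics ("is this safe to retry or cache?") live only in the body.
  await fetch("https://api.example.com/rpc", {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ method: "getOrder", params: { id: 42 } }),
  });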


> If the web was being made from scratch today,

Doubtfully, for the same reason that VR is still looking for a general purpose problem.

Visual metaphors are great when demonstrated visually (e.g. in futuristic movies), but their information bandwidth sucks compared to text et al.

Or to put it another way, how many emojis would it take to convey the content of this post reliably?


> Visual metaphors are great when demonstrated visually (e.g. in futuristic movies), but their information bandwidth sucks compared to text et al.

Holy false dichotomy there, Batman!

VR had some problems in the first batch of devices with low resolution, but as resolution improves there's no reason why it can't be integrated as first-class in it.

There's no need for the structure of the text to be a long linear sequence. When you put together a sequence of more than a few paragraphs, it is convenient to group them into a text chunk under a headline; and in fact these chunks can be placed in non-linear structures such as trees and networks, which benefit from a visual representation. The 3D environment of VR would allow better manipulation of these structured text structures in space, compared to the limited 2D possibilities of minimaps and tables of contents.


> it would probably start with a repurposed game engine.

Like the html canvas or webgl/webgpu?

I think it would be cool if you could encapsulate chunks of canvas drawings as components and have native dev tools for inspecting, debugging, etc.

Or were you mostly implying that the web would have been more immersive (3D) from the start.


> Like the html canvas or webgl/webgpu?

I think more of a full-fledged game engine. With all the bells and whistles.

> I think it would be cool if you could encapsulate chunks of canvas drawings as components and have native dev tools for inspecting, debugging, etc.

Have a look at how modern game engines with their editors work.


Yeah I work in the gaming industry. I was mostly daydreaming of that in the browser.


Unreal Engine can (or at least used to) compile games to WASM and render on canvas. There used to be an excellent demo based on Infinity Blade.


> Unreal Engine can (or at least used to) compile games to WASM and render on canvas.

So does Unity and Godot.


> If the web was being made from scratch today,

This is a brilliant prompt, for discussion / thought provocation.

My answer is: start with a reliable system for exchanging CSV files. Add a few extensions and boom, everything’s possible.


> This is a brilliant prompt, for discussion / thought provocation.

The web is such a complicated mess right now that I wish we would revert to it being just a document-based format. That's why I'm interested in things like gopher. Gopher pages can look surprisingly good.

In ye olden days there were BBSs (Bulletin Board Systems). They were stateful and all the heavy lifting was done server-side.


Please no, not CSV. A strictly specified format would be needed instead.


And isn't this one reason why XML was invented?


Sure. XML is a great tech. Much underrated today.

(Besides the insanity that correct parsing would need network access, it's mostly very sound).


> If the web was being made from scratch today, it would probably start with a repurposed game engine.

Possibly. But I don't think such a "web" would be anywhere near as successful. It was critical for the success of the early web that anyone could write a webpage in Notepad and link to other pages.


I thought most of us just wanted to do the cool new thing, like Wargaming!


Would that be WASM?


Alan's observations, especially in regards to Netscape and their lack of vision, make me morose for OpenDoc which did not deserve its untimely death.

https://en.wikipedia.org/wiki/OpenDoc

Some good discussion from 2021: https://instadeq.com/blog/posts/why-opendoc-failed-and-then-...

https://apple.fandom.com/wiki/OpenDoc (the links at the end are better than the page)

https://www.wired.com/1997/03/closing-opendoc-a-great-leap-b...


Web browsers achieved all the dreams of Java developers. In fact, try declaring to a newly minted college graduate that - back in the olden days of yore - software not only had to be downloaded, it had to be hand-built against each operating system, until a magic wonder called Java came around and abstracted those things away. Then better browsers came and brought Java into the browser, and finally JavaScript and rendering reached enough maturity to implement essentially an entire operating system / virtual machine AS the browser, and Java applets were gone.

truly incredible in a sense


Incredible, but also incredibly wasteful. Often websites send our CPU fans spinning for stuff that should not even wake up a CPU fan. In that regard the browser ecosystem needs to seriously lose weight. We have more powerful machines than ever, but we are wasting their potential to no end with fat, bloated web apps.


Speak for yourself. I run a very battery-efficient browser on a very efficient processor architecture that gets 10+ hours of battery life, and I don't think I've ever heard the fan once - an M1-based 16-inch MacBook Pro.

My wife has a MacBook Air that doesn’t even have a fan.


That's not the fault of the browser ecosystem. Give people a real programming language and they will create terrible bloated software with it.


There could be a standardized execution speed. Similar to the home computers of old, and video game consoles, developers would then have to make do with the fixed performance afforded by the platform.


Maybe there will be a W3C spec for perf profiles. No JS, no dynamic CSS, PDF-like interactions. Maybe that is stupid, but with that information your browser can go into a different, near-zero-power mode.


Didn't Google's AMP project set out to achieve something similar? https://en.m.wikipedia.org/wiki/Accelerated_Mobile_Pages (albeit as a bit of a walled garden)

Personally I almost wish that the majority of sites out there didn't pretend like they need to be highly dynamic, interactive and inevitably wasteful experiences.


Google's AMP project was Google's weapon against Facebook's instant news or whatever it was called.

It had nothing to do with a quest for better performance. It had everything to do with ad impressions.


> It had nothing to do with a quest for better performance. It had everything to do with ad impressions.

Why not both? Even if the former was the "public justification", AMP pages still performed better than bloatware-ridden websites that otherwise took 2-10 MB to load, not even mentioning the amounts of JS and its impact on performance, especially on mobile devices (as well as battery usage on those).


> not even mentioning the amounts of JS and the impact of it on performance

The only reason it performed better is because Google would preload over 1MB of Javascript needed to run AMP on its search page.

A cold start for AMP was barely better than most of the top sites that used it.

It was cheating, pure and simple.

Edit: IIRC it was also preloading the entirety of the search carousel, too. And you could get into the carousel only if you had an AMP page.


> The only reason it performed better is because Google would preload over 1MB of Javascript needed to run AMP on its search page.

I wonder what the actual metrics are for how often that was used - what % of people chose one of the AMP results because those were at the top of the list. That preloaded data might be common for most AMP pages, which is much better than the mess of each site having different React/Angular/Vue bundle versions, or even jQuery versions back in the day.

As a thought experiment, I wonder how much we could save on bandwidth, if all of the "popular" frameworks/libraries split their base code from each individual application's code (we already have code bundling, after all) and browsers included all of those packages of base code, preloaded.

Maybe even follow SEMVER as close as possible, so the browser can always ship the latest minor/patch release for a given major release, to cut down on space needed in the browser.

Something along the lines of:

  <script src="/browser/react@18"></script>
  <script src="/js/my-app-react-bundle.js"></script>
Whereas another app that needs an older version could use:

  <script src="/browser/react@17"></script>
  <script src="/js/my-older-react-bundle.js"></script>
And if you don't want to participate in such an initiative (or really want a custom version), you could always make sure that your web server serves the "/browser" directory itself.

After all, browser install sizes are already horribly bloated; a few dozen MB will hardly matter at this point, whereas each site needing 1-5 MB of JS is absurd, especially because those bundles are re-downloaded for each site despite the fact that most sites out there use one of like 3-5 popular solutions.
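One real primitive such an initiative could lean on today is Subresource Integrity: pinning the exact bytes of the shared script with a hash, so a hypothetical browser-bundled copy and a self-hosted fallback under "/browser" could be treated as interchangeable. A rough sketch (the hash is a placeholder, and the "/browser" paths are the hypothetical convention from above, not anything browsers support today):

  <script src="/browser/react@18"
          integrity="sha384-REPLACE_WITH_REAL_HASH_OF_THE_CANONICAL_BUNDLE"></script>
  <script src="/js/my-app-react-bundle.js"></script>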


As it happens, AMP pages look great in the text-only browser I use and are also an easy solution to many "paywalls". (The text-only browser does not autoload resources, run JS or process CSS.) Even without a browser, it is easy to request AMP pages using a command-line utility, extract the strings enclosed in "<p></p>" and produce an even smaller, readable web page with no ads.
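For the curious, here is a rough Node.js sketch of that last trick (the URL is a placeholder, global fetch assumes Node 18+, and a regex is a crude stand-in for real HTML parsing, but it matches the spirit of "extract the strings enclosed in <p></p>"):

  // amp-text.js - fetch a page and keep only the text inside <p>...</p>
  const url = process.argv[2] || "https://example.com/article/amp";

  (async () => {
    const html = await (await fetch(url)).text();
    const paragraphs = [...html.matchAll(/<p[^>]*>([\s\S]*?)<\/p>/gi)]
      .map((m) => m[1].replace(/<[^>]+>/g, "").trim()) // strip any nested tags
      .filter((t) => t.length > 0);
    console.log(paragraphs.join("\n\n"));
  })();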


That's the glory of web browsers: 1-click sandboxed installation of programs on demand.

This is incredible!

> enough maturity to implement essentially an entire operating system / virtual machine AS the browser

Yes, an operating system-- but one without user-facing persistence! Only per-application, networked, silo'd persistence, where the user sees a rendered form of data but not the data itself.

IMHO these two facts, that

a) Web browsers are BETTER than Linux, Windows, etc as an OS in a very important way, package management

and

b) They're missing one of the primary features of an OS, persistence

overshadow any other facts about them. What a strange tool they are.


I would temper your message with the caveat that the data is still accessible: one click to open devtools, another to switch to the "Storage" tab, and all your persisted data is visible to the user.

You can copy and/or edit said data from that view as well!

Albeit the convention is clearly to not presume your browser app user will be interacting with their data at _all_ through the devtools, which I find regrettable but unavoidable with the current state of "computer literacy" and the state of "devtools-as-an-interface" (obviously the ergonomics aren't great for the average user today).


That's likely just a keyhole view into your data though.

E.g. take web forums (not this one, since there's weird Lisp stuff down there and I don't know how it works, but the average web forum).

On the app (server) side they have access to everything, all of your posts in nice beautiful structured SQL.

On the user side you have access to none of that, just HTML soup and maybe some cached stuff in the "Storage" tab, or maybe not.

> Albeit the convention is clearly to not presume your browser app user will be interacting with their data at _all_ through the devtools, which I find regrettable but unavoidable with the current state of "computer literacy" and the state of "devtools-as-an-interface" (obviously the ergonomics aren't great for the average user today).

I really like your use of "convention" and "presume" here, because I think that's the essential lens to view these things through. It's absolutely possible for an app to provide all of a user's data in a nice structured form in IndexedDB, that will just be rare because it's not the convention.
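As a sketch of what that could look like (store and field names here are made up, not any real forum's schema), an app could keep the user's own posts as plain structured records in IndexedDB, where the Storage/Application tab shows them as data rather than HTML soup:

  // Keep the user's data as inspectable structured records, not rendered soup.
  const req = indexedDB.open("forum-client", 1);

  req.onupgradeneeded = () => {
    // One object store per "table"; visible under devtools > Application/Storage.
    const posts = req.result.createObjectStore("posts", { keyPath: "id" });
    posts.createIndex("byThread", "threadId");
  };

  req.onsuccess = () => {
    const db = req.result;
    const tx = db.transaction("posts", "readwrite");
    tx.objectStore("posts").put({
      id: 42,
      threadId: "browsers-as-document-viewers",
      author: "me",
      body: "My own post, stored as data I can actually see and export.",
      createdAt: Date.now(),
    });
    tx.oncomplete = () => db.close();
  };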


Couldn't Java (the JVM) or .Net have technically accomplished the same thing earlier if web browsers didn't already exist? Java was much bigger than applets. Bill Gates recognized the threat of both Netscape and Sun to the windows dominance, because of the possibility of running apps over the internet.

Maybe even Smalltalk in the 80s if Kay and company had foreseen the rise of the global internet back in the 70s.


Java applets versus HTML DOM is a debatably parallel competition, but the web was also a connected & online medium, with its own protocol, and that begat interoperability & connectivity to a degree that applications never came anywhere close to.

Ironically, for all the fragmented wild ecosystem of webdev today, with a million different libraries & toolchains & frameworks, the web has maintained & iterated on a very consistent & coherent underpinning that is fundamentally cross-computer and globally connected via the URL. Applications were always a lesser beast, damned by lower horizons, and if they dared to try to do more, they were on their own, with no other applications to interoperate with and with their own ad hoc protocols. The web grew many UI development options atop a consistent DOM and protocols, whereas applications had consistent UI development options atop scattered connectivity protocols, if any (and often none).

Looking at the internet as an application delivery platform ignores that the web grew because it was both an application and content delivery system, interwoven, together, and infinitely remixable & extensible.


Applets relied on the browser. I’m talking about using Java or languages targeting the JVM outside of the browser as applications communicating over network protocols to deliver content and communicate. Java was designed with networking in mind. Applets were added to Netscape because the browser and web had become popular. But Java itself was positioned to be an application delivery platform over the internet.


And I'm saying that is a much, much smaller dream/ambition than what the web went after.


Apologies for saying applets. I meant the entire Java apps world. That was misleading, you were right to be confused.


But I also want just a document viewer. Great job building all that, developers of the world; now how can I read a news article or an email, in a modern sense, without a complicated machine executing code that's not at all for my benefit?

I do very little on the web where I actually want there to be code running on my document.


More news orgs should offer up something like lite.cnn.io.


>I do very little on the web where I actually want there to be code running on my document.

Just some random examples, but how do you do your tax, pay your bills, buy/sell stuff online, make appointments etc?


Can’t all of that be done with simple form submissions?


How did people do all of that without the modern web? How could humanity actually survive so long without web browsers?

If you think about that, maybe you'll also come to some conclusions regarding the questions you just asked.


[flagged]


Which just strengthens my point.


None of those things require client side code other than an HTTP POST.

For some things, yes, client-side code makes sense, like an appointment finder, and that's fine, but such things make up a tiny amount of my internet activity.


Sure, but you are talking about a completely minimalist implementation, one that, before all of this JavaScript explosion, made people pull their hair out. For example, you need JavaScript for dropdowns or form validation to begin with. I think there is a lot of stuff done by front-end developers these days that people take for granted.


Dropdowns do not require JavaScript. HTML has basic form validation, and your POST endpoint needs to validate the form anyway; sending the form back to the user for corrections is not very difficult.
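A minimal sketch of that claim, with no JavaScript at all (field names and the action URL are placeholders): the native select element gives you a dropdown, and the required/type attributes give you client-side validation before the POST ever happens.

  <form action="/appointments" method="post">
    <label>Branch
      <select name="branch" required>
        <option value="">Choose...</option>
        <option>Downtown</option>
        <option>Airport</option>
      </select>
    </label>
    <label>Email <input type="email" name="email" required></label>
    <label>Date <input type="date" name="date" required></label>
    <button>Request appointment</button>
  </form>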


This is just the same argument as the comments upthread, rehashed. In the absolute most minimalist, hair-pullingly annoying implementation, that is correct. However the polish you are used to on dropdowns plus countless other controls you interact with every day require javascript.


> However the polish you are used to on dropdowns plus countless other controls you interact with every day require javascript.

I for one can do without the "polish I am used to". I'm generally fine with HTML form controls.

My problem with JavaScript frameworks is they are huge beasts that build their components out of non-semantic block elements. A button or text field is just a giant mass of divs and spans with events attached everywhere with one-off CSS. The browser is stuck doing tons of work in its single-threaded JavaScript engine to build and manage these components.

It's a lot of wasted memory and clock cycles to recreate a text field the browser already provides. It makes for a worse user experience, wastes my battery, and eats up my bandwidth. To me it's not polish but waste.


Ok, so when registering on a new site you don't mind waiting to hear back from the server after clicking submit, to find out your passwords don't match, or don't meet the required policy? That's fine. But the majority of impatient and perhaps spoilt users do. In fact I hear people these days complain violently about the most trivial UX issues on websites they visit. Users' expectations are sky high.

Regarding the massive nestings of divs: well, this is computer-generated HTML. It is ugly as hell, but browsers are well geared up to deal with it; it doesn't create any serious performance issues unless taken to stupid levels. The other side of the coin with this rather ugly generated HTML is that you get a human benefit, a time-saving efficiency. You can spin up a web page with React/Bootstrap in a fraction of the time it took you in the 1990s.


> However the polish you are used to on dropdowns

Emphatically no. The absolute vast majority of those "polished dropdowns" are horrible to use, and break all possible platform conventions (for example, they're not accessible or usable with a keyboard).

> countless other controls you interact with every day require javascript.

Very few, if any, "polished controls" used on the absolute vast majority of web sites require javascript.


Emphatically yes. The most popular UI framework's dropdown component documentation reads as follows:

"Dropdowns are toggleable, contextual overlays for displaying lists of links and more. They’re made interactive with the included Bootstrap dropdown JavaScript plugin"


Bootstrap dropdowns:

- don't follow platform conventions (don't trigger the actual system dropdowns with actual expected behavior)

- get cut off by browser chrome when the page is zoomed in (because you can't control that behavior from JS)

- default implementation's touch targets are too small

And that's for a framework which has had over a decade to make this work.

This is not to throw undue shade: they've done a very good job. But that's about as far as you can get.


> However the polish you are used to on dropdowns plus countless other controls you interact with every day require javascript.

Or just a proper GUI framework not based on HTML.


>software had to not only be downloaded, but it had to be hand built against each operating system

Back in the day, we didn't even have modems or compilers, let alone internet - so we would go to Ye Olde Computer Store and copy the disks of Fred Fish. That was how I got my first experience of Hack, which was great, and Emacs, which has always been a WTF for me.

I wonder if for some reason computer hardware had stopped getting faster and more complex at the stage of, say, a 25 MHz 486, how much different the world would be, or whether it would just force software to be consistently purged of cruft.


I surely did not dream of Web browsers; additionally, plenty of bytecode languages predate Java, with Pascal's P-Code being the most well-known.

Back in the day there were several P-Code aware platforms.


Of course, the big question is if the web worked because it didn’t start out as a dynamic programmable media platform like HyperCard.


In the early days of a field or subfield there is such an explosion of creativity, but this is followed by a series of consolidations and we tend to forget the “losers” of the initial competition and the beautiful paths they illuminated. See C/Unix vs Lisp/Lisp Machines vs Smalltalk/Alto. Or NeWS vs X.

Alan Kay shows the beauty of HyperCard (the "loser" in networked hypertext) vs the Web. We had in HyperCard a visual hypertext app-dev and reading environment that was trivially programmable/writeable in some very powerful ways for any determined user, using a GUI authoring tool. Now what do we have? Yet another Algol-like programming language whose capabilities are severely constrained, and, more importantly, which is not remotely a realistic user authoring environment; hell, even paid professional programmers have trouble keeping up with JavaScript, the browsers as dev platforms, and the surrounding tooling/libs.

More broadly, it’s dismaying how rapidly and seemingly permanently our collective expectations as developers have fallen. Kay is right: What we have in the web stack is not enough and we should not settle.

A quote from Kay I enjoyed:

“You have to include all media that computers can give rise to, and you have to do it in a form that allows both “reading” and “writing” and the “equivalent of literature” for all users.

“Examples of how to do some of this existed before the web and the web browser, so what has happened is that a critically weak subset has managed to dominate the imaginations of most people — including computer people — to the point that what is possible and what is needed has for all intents and purposes disappeared.”

It’s also amazing what we settled for in networked computing when I read his notes on a system from 40 years ago:

“ perhaps the best early structuring and next stage design of Unix was Locus by Gerry Popek and his researchers at UCLA in the early 80s. Locus allowed live Unix processes to migrate not just from one machine to another on a network, but to a variety of machine types. This was done by combining the safety required for interrupts with multiple code hooks in each process, so an “interrupt” could allow the process to be moved to a different machine and resumed with different (equivalent) code. ”


> Locus allowed live Unix processes to migrate not just from one machine to another on a network, but to a variety of machine types.

My personal take on this is that we need to make the data layer in computers transparent.

Files are currently opaque sets of bytes which need an application-specific parser to interact with them. And when you have programs that can interact with a file, the only operations are read and write. Realistically, you can only have one program editing a file at once. Even having multiple programs reading a file at the same time is often a bit broken.

Instead, what if files were structured (think JSON or SQL)? And instead of overwriting the file when you hit save, the application could send granular updates to the file's content as edits happened (e.g. update Foo.bar.baz = 22, or the SQL equivalent). And applications would listen for changes in the data model, so another application (maybe on a different computer!) can edit something in the file and the changes get propagated live. (A rough sketch of what this might look like appears at the end of this comment.)

There’s all sorts of benefits to this:

- We can enable collaborative editing in all applications for free

- It’s super easy to make plugins for applications - just make another program which interacts with the data. No plug-in API needed.

- The Unix philosophy is strengthened because multiple small programs can work with the same data at once. We can split out IDEs into a lot more small tools.

- Users have lots of devices. This sort of thing would make data able to move between their devices in a way that’s transparent to the application. Like a live, collaboratively edited Dropbox.

We need CRDTs to do it properly. But the technology behind that is more or less solved at this point.

Honestly I think the hardest thing will be getting OS people on board. I think it’s about time we shook up how the filesystem works. open/read/write basically haven’t changed in 50 years.
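A rough sketch of the idea in plain JavaScript (names and the op format are illustrative only; a real system would persist these ops and sync them across machines):

  // A toy "structured file": path-based updates instead of whole-file writes,
  // with change notifications so several programs can edit the same data live.
  class StructuredFile {
    constructor(initial = {}) {
      this.data = initial;
      this.listeners = [];
    }

    // Apply a granular update like ["Foo", "bar", "baz"] = 22
    set(path, value) {
      let node = this.data;
      for (const key of path.slice(0, -1)) {
        if (typeof node[key] !== "object" || node[key] === null) node[key] = {};
        node = node[key];
      }
      node[path[path.length - 1]] = value;
      // Broadcast the op itself, not the whole file.
      for (const fn of this.listeners) fn({ op: "set", path, value });
    }

    onChange(fn) {
      this.listeners.push(fn);
    }
  }

  // Two "applications" sharing one document:
  const doc = new StructuredFile({ Foo: { bar: { baz: 1 } } });
  doc.onChange((change) => console.log("editor B sees:", change));
  doc.set(["Foo", "bar", "baz"], 22); // editor A's edit propagates as a small op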


Microsoft made a huge bet on this exact idea in the mid-1990s as part of their Cairo OS project.

When Cairo fizzled, the database file system became a separate component called WinFS but didn’t result in any end-user product:

https://en.wikipedia.org/wiki/WinFS


Apple too, with OpenDoc (that His Steveness killed on his return)


What went wrong? Was it just a failure to deliver?


My impression is that it was a combination of three things: internal Windows team dysfunction, ambition too far ahead of the storage and network technologies of the day, and lack of a clear customer use case at a time when the focus of Windows+Office shifted towards enterprise.

The OO database storage project was apparently Bill Gates's personal baby, so it didn't lack in executive support. That probably explains why the dream was kept alive for over a decade (Cairo started in 1991, WinFS was finally left out of Vista in 2006).

Steven Sinofsky's interesting online book "Hardcore Software" has some chapters about Cairo and later WinFS:

https://hardcoresoftware.learningbyshipping.com/p/020-innova...

"Cairo aimed to advance personal computing with dramatic changes in how we thought of files—rather than single files and folders, Cairo intended for files to have the capabilities of a database. Everything on your PC was to be stored in a database to easily search, find, and show relationships between items: files, email, contacts, photos, documents and more. Advancing storage was a long arc of innovation Bill favored.

"Ultimately, the human toll of Cairo was high in the sense that so many people spent so much time early in career working on a project that not only didn’t ship but was viewed as squandering resources, at best, and misguided at worst. It was a bit of a black eye for Microsoft among the press and analysts who believed Microsoft would deliver on the idea of object-oriented the way that NeXT had done but at scale. The magnitude of the project would leave many people with Cairo war stories for years to come. I wish I could say that the lessons learned would prevent another experience like this from happening, but that isn’t the case as we will see."


You're on the right track. A good place to start is by defining some behaviours for cases which are errors in POSIX, because you know apps can't be using them. The most obvious move is to let directories also be files. This isn't allowed today mostly for historical reasons - directories were originally implemented as files internally, so the OS couldn't let you store your own stuff there and the side effects of this ripple all the way up and down the tech stack. If you fix this then you can "mount" a file onto itself using FUSE, such that apps that want to read the custom binary format or JSON as text can do so, and other tools that want to explore a directory structure can do that too (e.g. a shell), and the two are linked by a collection of user space format-specific filesystems.

Collaborative editing is too much to bite off at once. The semantics of edit conflicts are very app specific and high level. You can get a long way by just hooking file IO in userspace.


> Collaborative editing is too much to bite off at once. The semantics of edit conflicts are very app specific and high level. You can get a long way by just hooking file IO in userspace.

I've been working in the collaborative editing space for over a decade at this point. So (predictably) I think it should be a day-1 feature. In my mind realtime collaborative editing is one of the main motivating use cases for this whole system; it's one of the things we can't do with the traditional local-file-based software architecture.

Based on my experience with collaborative editing I don't think you need a special app-specific conflict resolution strategy most of the time. I think about 95% of applications just need CRDTs for:

- Tables with editable values. (Or, equivalently, collections of JSON objects). I think an MVRegister is the way to go here.

- Text

And then the longer tail of apps need:

- Generic sequences (eg for the list of layers in photoshop)

- Rich text documents

- Text with conflicts (eg for Git)

For editing regular data, MVRegisters are the star of the show. MVRegisters, aka multi-value registers, usually just store a single value. If two users concurrently set it to different values, a read operation can either (consistently) choose a single winner, or present the application with all of the conflicting values and let it choose how to resolve the conflict. It's the best of all worlds in my mind, because conflicts are rare in practice, and this lets application developers start simple and make their apps more complex over time as they desire.

MVRegisters are also crazy simple to implement.
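To back that up, here is a minimal sketch of one in plain JavaScript, assuming version-vector tagging (the class and its structure are my own illustration, not code from any particular CRDT library):

  // Minimal multi-value register: each write is tagged with the writer's
  // version vector; a value survives only while no other value's vector
  // strictly dominates it. Concurrent writes are all kept until resolved.
  class MVRegister {
    constructor(replicaId) {
      this.replicaId = replicaId;
      this.entries = []; // { value, vv } where vv = { replicaId: counter }
    }

    currentVV() { // merge the vectors of everything this replica has seen
      const vv = {};
      for (const e of this.entries)
        for (const [id, n] of Object.entries(e.vv)) vv[id] = Math.max(vv[id] || 0, n);
      return vv;
    }

    set(value) { // a local write dominates every entry seen so far
      const vv = this.currentVV();
      vv[this.replicaId] = (vv[this.replicaId] || 0) + 1;
      this.entries = [{ value, vv }];
    }

    get() { // one element = no conflict; several = concurrent writes to resolve
      return this.entries.map((e) => e.value);
    }

    merge(other) { // keep every entry not strictly dominated by another entry
      const lessThan = (a, b) => {
        let strictly = false;
        for (const id of new Set([...Object.keys(a), ...Object.keys(b)])) {
          if ((a[id] || 0) > (b[id] || 0)) return false;
          if ((a[id] || 0) < (b[id] || 0)) strictly = true;
        }
        return strictly;
      };
      const all = [...this.entries, ...other.entries];
      const seen = new Set();
      this.entries = all.filter((e) => {
        if (all.some((f) => lessThan(e.vv, f.vv))) return false; // dominated
        const key = JSON.stringify(e.vv); // drop duplicates of the same write
        if (seen.has(key)) return false;
        seen.add(key);
        return true;
      });
    }
  }

  // Two replicas write concurrently, then sync:
  const a = new MVRegister("a"), b = new MVRegister("b");
  a.set("x"); b.set("y");
  a.merge(b);
  console.log(a.get()); // ["x", "y"] - the app sees both and can pick or merge
  a.set("z");           // a resolving write dominates both previous values
  b.merge(a);
  console.log(b.get()); // ["z"]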


The biggest failure of HyperCard was not integrating networking. Bill Atkinson has lamented that oversight. Indeed, the whole visual layer of the Internet might have taken a totally different path.

Even more maddening since AppleTalk was a simple, cheap and effective way to network Macintoshes.


I think this is copied from his 2021 Quora answer: https://www.quora.com/Should-web-browsers-have-stuck-to-bein...


Yes, that's where the first link points, to where the discussion started on Quora, but it continued in the comments (which require a whole lot of scrolling and pointing and clicking to discover and read on Quora), over email, and on David Rosenthal's blog, so I thought it was worth putting the whole thread in one place, and updating the links.


Thanks! I completely missed the link because I thought it was an ad.


Alan Kay made scathing comments [1] on the World Wide Web browser (reinventing a broken wheel) at length in a lecture "Normal Considered Harmful" for computer science students at University of Illinois Urbana-Champaign.

For context: Alan himself invented WYSIWYG at Xerox PARC. He was present at "the mother of all demos" by Douglas Engelbart [9]. Ted Nelson, who invented bidirectional hyperlinks 'outside of the file' and Xanadu [6,7], got Alan in touch with his future wife [2], and Alan was there at Apple when Bill Atkinson made HyperCard.

He mentions that when he told Tim Berners-Lee about the much better previous designs of Douglas Engelbart, Ted Nelson and Alan's group at Xerox PARC, Tim was chagrined (he had not read any of the papers about those earlier and much better designs from 28, 25 and 20 years before the WWW browser in 1993).

Of course Alan also discusses how the web should have been done. In short, when you click a hyperlink you should load a virtual machine in a secure sandbox of an operating system kernel. The virtual machine would then render the (bitmap or vector) pages. Operating Systems 101, well known in 1965. Of course, Alan's team did implement this (several times). I can demo all this software if you contact me. I also give lectures on this.

Email me at morphle [at] ziggo [dot] nl for a video where Alan Kay, Douglas Engelbart and Tim Berners-Lee meet and talk about this at a forum discussion. You can ask me about anything related to Alan, Ted or Douglas; I maintain an extensive archive with all scientific papers, emails, talks, interviews and lecture videos.

Alan, Ted and Douglas worked closely together and were friends. Ted's beautiful eulogy [3] for Douglas is a wonderful speech where he bitterly, indirectly refers to Tim's ignoring of Douglas' work.

Alan made a demo with Apple in 1987 on what the WWW should be able to do [4,5].

[1] Normal Considered Harmful https://youtu.be/FvmTSpJU-Xc?t=963

[2] Alan Kay's tribute to Ted Nelson at "Intertwingled" Festival https://www.youtube.com/watch?v=AnrlSqtpOkw

[3] Ted Nelson's Eulogy for Douglas Engelbart https://www.youtube.com/watch?v=yMjPqr1s-cg

[4] https://www.youtube.com/watch?v=9bjve67p33E

[5] https://www.youtube.com/watch?v=umJsITGzXd0

[6] https://www.youtube.com/watch?v=En_2T7KH6RA

[7] https://en.wikipedia.org/wiki/Project_Xanadu

[8] https://golden.com/wiki/Project_Xanadu-P6JE4

[9] https://www.youtube.com/watch?v=yJDv-zdhzMY


Well, as a text-only browser user for decades, all I can say is that there is a great deal I can do with the text-only browser that I can no longer do with the enormous "modern" graphical browsers. Even something simple like opening a very large HTML file will cause Chromium to choke.

With text-only browser I can fly around the www and consume information at a speed that is just impossible with a "modern" browser. No auto-loading resources, no pre-fetching, minimal headers, no user-agent, no cookies, no Javascript or CSS, and so on.

In the "modern" browser one ends up with an ever-growing number of tabs. In text-only browser I have no tabs. Yet I have a complete navigable browsing history of content URLs.^1 I have enhanced this with a TLS proxy that records every URL. In a "modern" browser any such comprehensive browsing history generally becomes a liability, along with cookies and too many other things. And the history would be cluttered with auto-loading URLs for no-content URLs that must be filtered out Too complex. Too much work.

I'm in text mode most of the time, and the "modern" browser generally cannot operate there, so it is of limited utility. Even just the sizes of the binaries for "modern" browsers are absurd; Chrome is like 150MB. Trying to compile a "modern" browser is, IME, more difficult than compiling an entire UNIX-like operating system. Many folks simply refuse to even try, or just give up.

One can still download W3C's line-mode browser from the 1990s; it will still compile and it still works. I use a curses-based browser, but I still do many HTTP requests using nc or similar along with a localhost-bound TLS forward proxy. HTTP/1.1 pipelining for bulk text retrieval is a favourite feature of the www, and that cannot be done inside a browser (a rough sketch of the idea follows below).

1. "Content URLs" here means URLs pointing only to pages at the domainname I entered, pointing to pages with text content that I can read, not stock images, pixels, ad servers, telemetry and other garbage.


Out of curiosity, which text-only browser do you use?

I personally use w3m for many things, including HN and viewing links from HN (if the website permits it).


I haven't found a text browser that is able to properly render the nesting of HN comments; they all appear at the same level. I haven't looked recently though, so maybe it works in some text browser now.


w3m should work, so long as you enable inline image display (for HN's spacer gifs.)


Links does.


I volunteered with a non-profit several years ago, and due to the competition for the available Windows machine, I located an old iMac nobody was using - large flatscreen era, but still apparently too old for system updates. And it apparently could not properly connect to secure websites.

It made me sad, particularly because an organization like that can't really afford Apple prices, and yet they also can't afford an IT department, which is probably why someone bought it originally, figuring it would "just work".

I realize in principle I could've installed Linux on it, but didn't see any point.


I have a 2012 iMac that is no longer supported, so I installed the latest Ubuntu on it and it works very well. So if you run into that situation again, Linux is a very viable option.


I didn't want to work (without pay) for that non-profit for the rest of my life, and if I had, I would've had to deal with their small-time local IT service provider, and it was already a nightmare prior to a hypothetical initiative like me standing up computers or servers of my own.

My experience with small (non startup) organizations, meaning under a dozen people, is limited, but it's led me to believe that every role in that situation is like being a high level executive, only without the pay of a big corporation or an assistant (other than the volunteers...).

They might have had a place for me if I had been able to wrangle their hostile service provider, but doing technical things wasn't what they needed. I couldn't even start if all their servers and their network were under the control of someone who wouldn't talk to me.

They would've been happy if I could've dealt with their domains, and everything else related to IT, but at an administrative/managerial level - it was entirely about talking to people, and I was keenly aware I didn't know where to start.


Any x86 based Mac can have Windows installed. The older ones didn’t need BootCamp. You could just pop a Windows DVD in.


It might have been PowerPC. I can't recall for sure.


It would have been nice if it had evolved into something simple like Markdown documents, but also supported an alternative markup language geared towards apps.


Like XUL?

The browsers didn't stick to being document viewers. It's HTML that stuck to being a document format.


Like HTML embedded Java applets?


I am as ever amazed by Alan Kay's ability to comprehensively remember events, interactions with people and thoughts he had from 40 or 50 years ago, and in such detail.

I cannot remember all the interactions and conversation topics I had two weeks ago or explain in detail how an issue got fixed or what its root cause was. It feels to me as if things happen so fast today, that my brain drops old information the moment I consider it done to make room for all the new information.

Is there a trick or technique to it [1] or is his brain wired differently with higher neuroplasticity?

=== [1] I do use bullet journaling, for example. It helps to stay on top of the gazillion things that need my attention, but I feel it makes matters worse in this regard, as I offload stuff and my brain knows it can forget about it ever more.


There is actually no evidence he remembers any of this accurately, just like most of us.

The difference between you and him is that you (rightfully) doubt your memory while he doesn't doubt his.

I think you are more likely to be correct than him.


I wonder if it has to do with the fact that prognosticating about technology and being a visionary and having conversations about it has basically been his job for the last 30-40 years, while people like yourself (presumably) and I are having to write code. I'm not sure how much code Alan Kay has written for the last few decades. He certainly thinks a lot about it, but not sure there's much program authorship.

Put it another way, your focus might just be elsewhere, holding bags of knowledge perhaps in more dispersed ways, or on other topics. But these conversations Kay has, and the thoughts and plans he's made around them... that's kind of his whole thing.


Ah, I wonder: are these recent memories worth anything to you? A colleague told me my memory was impressive, yet we both struggled to remember what we had worked on a month earlier. As soon as a release was done, our brains wiped mostly everything to start on new work. I was personally shocked, and I assume it's because it's of no interest to my brain.

Some stuff gets a lifelong spot in your brain, some are week-long refugees.


My mother had memory problems in the last years of her life. My sister would take her to a coffee shop, where she would order a latte or some such. To my mother, it was new and delightful every time--and she would exclaim over how she'd never had anything like that before, although in fact she'd had it just the week before.

When I get old(er), that's how I hope I'll be: delighting in new experiences, even when they're old.


I discovered a weird thing: our brain seems to have "taste memory", separate from "regular" memory. If you have the same dinner several times in a row, for example, you become averse to it. You then have to stop having it for a while so that the brain "forgets" what it tastes like.


Alan Kay was a big influence for me and I still respect him a lot, but you know, something about Smalltalk didn't work... and it was exactly this whole live object/heavyweight runtime distribution stuff.


The HotJava browser was the closest we've come to advancing the state of the art of dynamic documents.


James Gosling's article, "SunDew - A Distributed and Extensible Window System" from "Methodology of Window Systems", is a fascinating preview of NeWS, but even more so is the transcript of the discussion between some amazing pioneers of interactive computer graphics who attended the workshop where the paper was presented.

I love how James gently humors the standards guy (P. Bono) who suggests he should have considered the glorious "CGI graphics model" (Core Graphics Interface) standard, and seems incredulous that he is "ignoring the standards" by choosing PostScript over his favorite standard of the day in 1985, CGI.

My favorite quote is: "There is really nothing new here. It's just putting it together in a different way." -James Gosling

SunDew - A Distributed and Extensible Window System, by James Gosling: Discussion

http://www.chilton-computing.org.uk/inf/literature/books/wm/...

5.4 DISCUSSION

Chairman - George Coulouris

Williams: What is the scope of the devices you are considering? I don't suppose you intend running the window manager on a graph plotter.

Gosling: The crudest display we are willing to accept is 1 bit per pixel black and white but we also support 8 or 24 bits per pixel colour or 4 bits per pixel black and white.

Williams: Essentially bitmap raster devices.

Gosling: Right - although when you get to grey scale devices, things stop behaving in a model that is comfortably compatible with RasterOp. You have got to be able to deal with antialiasing, such things as subpixel positioning begin to make sense. It makes sense to draw a character midway between two pixels because you can use antialiasing to shift the character over by subpixel amounts.

Bono: When you use the word PostScript, do you mean literally that PostScript is in some way your virtual machine instruction set? It has not been extended or generalized?

Gosling: It has been extended. It is a superset. We are committed to implementing everything in the PostScript telephone book that makes sense. They have a few commands that are particular to their storage allocation and to the fact that they are going to a printer and these are not implemented. We have imported some things that are peculiar to a display and a small number of incompatible changes to the language have been made which we spent a long time talking to the people at Adobe about to make sure that was reasonable. In particular, we added garbage collection and lightweight processes. There are very tiny ways in which the semantics of PostScript's memory allocation strategy shows through in the printer version because they have a quick and dirty storage allocation mechanism and that wasn't really useful in this system.

Bono: The virtual machine was not virtual enough.

Gosling: Right. When we made the generalization to multiple processes, their storage allocation mechanism just completely broke and so we had to use garbage collection instead and that necessitated some small semantic changes but they are not things you are likely to see. All of the important things such as how you specify a curve, whether you can render an image rotated 37 degrees, all of that is there or intended to be there.

Hopgood: How do you handle input?

Gosling: Input is also handled completely within PostScript. There are data objects which can provide you with connections to the input devices and what comes along are streams of events and these events can be sent to PostScript processes. A PostScript process can register its interest in an event and specify which canvas (a data object on which a client can draw) and what the region within the canvas is (and that region is specified by a path which is one of these arbitrarily curve-bounded regions) so you can grab events that just cover one circle, for example. In the registration of interest is the event that you are interested in and also a magic tag which is passed in and not interpreted by PostScript, but can be used by the application that handles the event. So you can have processes all over the place handling input events for different windows. There are strong synchronization guarantees for the delivery of events even among multiple processes. There is nothing at all specified about what the protocol is that the client program sees. The idea being that these PostScript processes are responsible for providing whatever the application wants to see. So one set of protocol conversion procedures that you can provide are ones that simply emulate the keyboard and all you will ever get is keyboard events and you will never see the mouse. Quite often mouse events can be handled within PostScript processes for things like moving a window.

Sweetman: How do windows relate to canvases?

Gosling: I did not use the word window because its overloaded with all kinds of semantics. Does it have a border? All that a canvas is, is a thing on which you can draw. It is not even rectangular.

Sweetman: Do you see canvases on your display?

Gosling: Yes you can. A canvas is a thing on which you can draw. It might be visible on some display and it might not. If it is visible, you can specify its position in a 2.5D coordinate system. It is opaque but you can get a transparent effect if you want something to show through a window, by cutting a hole in the window.

Williams: You say clients have to accept redraw commands. Is there any indication as to how soon they are supposed to do them?

Gosling: The client can ignore the request if it wants. The screen image will look a little funny. Or it can wait half an hour - it does not affect the integrity of the screen. It will affect the integrity of its window but nothing else because all the canvases are maintained in apparent isolation from each other.

Williams: Will the visual image just contain what was there before?

Gosling: You can do that, if you wish. For most applications it does not make any sense to retain the old bitmaps so you might as well just blow them away and replace them by whatever is most convenient - which is probably what was there before. There are some times when you would like to retain the old bits - that is there as an option. I really want to make it difficult for people to exercise that option as it is a very tasteless thing to do in general.

Bono: I hope I am not being overly opaque about this but it seems like I have a sense of deja vu in that it's a lot better way of doing what people did with display lists, where you sent things down to Vector Generals or whatever. They were little programs and there was some grouping structure but everything is done so much nicer now in the sense of having complete programs and a better set of primitives etc. Am I missing something from the model or is that really what is going on?

Gosling: One of the things that tended to characterize all the display list languages was that they tended to be tailored very much towards what the hardware could do. PostScript tries to stand back and say "I don't want to know what the hardware can do, I want to know what makes sense for the user".

Bono: What I would like people to do when they do look at standards, instead of ignoring them, is to look at, for example, the CGI work; this is a set of functions which some people claim is a good instruction set (forget the syntax for the moment, just look at the functionality). We would like to get some comment on whether it is a good set of functions and that is what you should be looking at rather than ignore the standards. All of this could be fitted in a standards context if we get it right.

Gosling: One could easily use the CGI graphics model for this instead.

Teitelman: The innovation here is not that we are using PostScript. The reason we chose PostScript is due to a lot of historical connections and proximity to the people who are doing it.

The interesting thing is that all these processes look as though they are executing in one of those old single address environments. It is a single user process that James has been able to implement lightweight processes in. You don't have lightweight processes in Unix systems, which you really need to implement realistic user interfaces.

Gosling: There is really nothing new here. It's just putting it together in a different way.

Rosenthal: It is worth bringing out a number of points about this style of window manager. There are some real limitations, though I think this is the way we should go.

Some help from the kernel is needed. This is easy in 4.2 which has a sensible IPC. It could be done in System V through shared resources etc.

A reliable signal mechanism is essential. The window manager has to stay up. It is the only routine for talking to the system and hence must stay alive. We have 100 systems running and only experience 1 - 2 crashes per week. This is good by Unix standards!

Applications cannot read pixels back - this is just too slow.

There must be a way to make the client think it has a real terminal, such as the 4.2BSD PTY device, otherwise all old programs misbehave.

There must be a mechanism for downloading application code into the window manager process, for example, rubber band line mouse tracking is done by a user level process.

There must be off-screen space in the window manager.

Examples of systems built in this way are:

BLIT [50]; Andrew (CMU)(see Chapter 13); VGTS (Stanford) [36]; X (MIT).

The BLIT system is successful - people wrote programs for it. I believe it is the only successful window manager for a Unix system.

Myers: Maybe that is because BLIT doesn't run a Unix operating system.

Rosenthal: Overall, it is a Unix software environment.

Williams: We should bear in mind that Rob Pike says that to do anything sensible using the BLIT, you need to write two processes which communicate: one running in the Unix host and one running in the BLIT, and this must be harder to do than, say, writing two programs which don't communicate.


Browsers still fail at being good document viewers:

1. For a long time now Chromium hasn't supported MathML; support is only coming back in the upcoming version.

2. Firefox's implementation of MathML is buggy: <http://0x0.st/o7HQ.16.mov>.

3. There is no CSS property to enable grouping the digits of large numbers into threes, so you need to put spaces into the markup, resulting in them being included when the text is copied, and in search finding completely different numbers than the ones you want to present.

4. And just try making this table reproducing the proper alignment of numbers in HTML: <https://practicaltypography.com/samples/number-grid-after.pd...>. It's hell.


I've never encountered a document viewer with property 3. Does this space technology exist?


Excel.


Regarding 4. Only the "weight" column looks tricky to be honest.


If browsers had stuck to being document viewers, we'd have invented a new technology to do everything we do today, and the web would be like gopher.


Yeah, we would have invented a tech stack that is actually good. JavaScript + HTML is just about the worst option imaginable if you were to design a cross-platform GUI application runner from the ground up.


In a field that includes X Windows, you might want to reconsider "worst design imaginable".

There were dozens of serious attempts at a remote GUI applications framework from the 70s to the 90s. HTML + JavaScript is the one that finally succeeded. It is not a coincidence.

Unlike all its predecessors it was conceived in a free software environment, and worked with both degraded and upgraded environments. It was fast-evolving enough for commercial interest, standardized enough for mass adoption, and trivially simple to author.

Yes, the web has grown complex beyond all reason and if you want to make a real state-of-the-art web app today you'll have to use an absurdly deep stack of tools. It's probably time to rethink things. But give credit where it's due. No other system would have even been _able_ to evolve like HTML + JavaScript has done.


> Yes, the web has grown complex beyond all reason and if you want to make a real state-of-the-art web app today you'll have to use an absurdly deep stack of tools.

The absurdly deep stack of tools is the reason why the web has grown complex, not the solution to it. If front-end devs applied a bit of fucking YAGNI every now and then, instead of churning out tutorial #193859812 on how to get set up in 10 minutes with a WXYZFooBarHyperCSSJS stack and then getting completely brick-walled a few months into the project because they're now committed to tech that would take 15 years of study to actually understand, then front end would be perfectly fine.


> In a field that includes X Windows, you might want to reconsider "worst design imaginable".

You can implement an X server in only a few MiB, one that runs on a potato (as seen from today). Compare that to a current web browser.

> Yes, the web has grown complex beyond all reason and if you want to make a real state-of-the-art web app today you'll have to use an absurdly deep stack of tools. It's probably time to rethink things.

It's long overdue. We should get rid of all these "layers of madness".

> No other system would have even been _able_ to evolve like HTML + JavaScript has done.

How can you know that?

The problem is just that the market always takes the cheapest solution that barely works. And you can't compete with "free" (as in beer).


Emacs itself does much more than HTML and JS.


I hear some version of this all the time. Why then are GUI systems like Electron so popular, and why do new design systems continue to incorporate HTML/CSS as the design tools? If it's so bad, why hasn't something else replaced it? If something truly better was developed it would be wildly successful. I don't know the answer, but I'm guessing cross-platform frameworks are just really difficult to develop. You say "just about the worst option imaginable", but the evidence doesn't really suggest that.


"Why then are gui systems like electron so popular"

Electron is popular because Chromium is an extremely well funded project that has a bullet-proof business model (more web apps = more crawlable html = more searches = more ad clicks). There are probably more than 1000 people working on it, many of whom are very senior and expensive engineers.

Does Electron make sense in the abstract, i.e. with unlimited resources and no business constraints, would we have ended up with apps shipping a full copy of a web browser with themselves for their UI rendering? Probably not, no. But does it make sense in the concrete, in a world where Google is willing to spend billions every year on a monolithic codebase that does things like optimizing text rendering, abstracting native APIs and so on, then give it away for free? Sure!

So you ask, if it's so bad why hasn't something else replaced it? Money is a big part of it. Yes, writing cross-platform frameworks is hard, but it's not that hard. What matters is who is willing to fund it. Google has money, gives away lots of free beer, and that beer comes in the form of an HTML engine.


> You say "just about the worst option imaginable", but the evidence doesn't really suggest that.

Notably, the anti-web mob almost never bothers to articulate why it is they think the web is bad at all! This is another classic case of inchoate disparagement, with no specific claims to be backed or refuted.

I want to be a sponge, to embrace good ideas, to hear & listen to what people think might improve or enhance things. But there's rarely real problems cited (or there are gross mis-understandings), and much rarer still is a better path or possibility articulated. I'd love some genuine discussion of the web, but the silent majority seems ok-enough or better working the web regularly, and the protestors against the web tend towards blunt & undirected.

One conflict I do see as visible again and again is the expectation of what the web should be. There seems to be quite a lot of hostility to the web being a low level platform, to it not having a single consistent development style. This leads to a lot of "the web was never meant to do this!" sort of feelings, but that doesn't really argue whether this iterated adaptability the Document Object Model has spawned serves us well or poorly.

One thing that is true is that the DOM is a low level platform, that it's not higher level. It doesn't compactly solve all or even most of the problems. To me, this is the critical value of the web, that which makes it so much better & more powerful: that we are free to express multiple architectures, free to innovate our UI patterns & do so, while still keeping the same underlying bones. This has begotten great cross-pollination across many different tribes of web development, and has continued to make the web a crossroads of development, where many ideas cross paths & new ventures, new trials are begun. It compares favorably against near-literally everything else, where isolated communities come and go (even within corporate walls, e.g. Microsoft's GDI, WPF, Silverlight, WinUI, DirectDraw, WinRT, UWP/Uno, .NET MAUI, or GTK's major 1, 2, 3, 4, & Adwaita reworks), and ideas for how to make interfaces rise & fall, never quite satisfying.

There are protests that the web isn't fast enough, but frankly that's horsepucky. The browser is intensely optimized & stunningly fast at all kinds of things. It does not, however, prevent misuse or bad code. Often installing ad-blockers (something not doable in apps, mind you) is enough to get users to lightspeed.

The web has made a great Bazaar, a great place for a conflux of possibilities, and has proven vastly malleable as a medium & platform, boosted so much further by its connectivity-at-the-core versus applications which bolt-it-on, and given far greater reach than the developer-centric GUI toolkits of yore. So much is possible here, and the great reverberating om keeps giving, keeps exploring. But it feels like so many just want the Cathedral, the one specific incarnation that is well shaped, that everyone can just use. I just hope we can have some discussions about what would make a good Cathedral, what properties & virtues it would have, because I think the Bazaar will probably ultimately accomplish those ends better, but if we can't engage & discuss, no one wins.


> the DOM is a low level platform

The DOM is not "low level"; it's the wrong abstraction. It feels low level to you because it is fundamentally hard to express the layout of a GUI application in terms of a document layout.

Compare the HTML DOM to Android's XML layout. It is so much easier to work with Android's layout system. You can trivially do things like anchor element X to the left / right / top / bottom of element Y, without workarounds like display:inline and adding div upon div upon div to make things work right.

I could list out another dozen things that JS does poorly, but I don't think that would be very productive.

> The web has made a great Bazaar, a great place for a conflux of possibilities

The web has been successful in spite of its bad technology. The concept of the web and the internet is wonderful. And the reality of the web would be better than it is today if websites weren't so slow, bloated, and buggy.


> Notably, the anti-web mob almost never bothers to articulate why it is they think the web is bad at all!

They articulate it all the time. Unfortunately "sponges" with their "reverberations of yore" are too enamoured with the sound of their own voice to hear that.

Meanwhile, as far as app development goes, the web cannot even reach the heights of Turbo Vision from the 1990s.

Just some of the reasons articulated in a Twitter-sized manner: https://news.ycombinator.com/item?id=34218520

> boosted so much further by its connectivity-at-the-core versus applications which bolt-it-on

Tell me you know nothing about history without telling me you know nothing about history


Here are some concrete issues. These aren't "web is bad" kind of things but more like "sometimes the web is bad at certain tasks and that can't easily be fixed".

1. Not layered in any meaningful way. Seen another way: you can't implement a web browser using web tech.

This matters because a typical way to write apps outside the web is to pick an opinionated high level platform that you think will do what you want, but if it turns out at some point to not quite work in one part of your app you can drop down to a lower level. You can use the operating system's text rendering but you don't have to, and if you don't for some reason you won't really lose much performance because you can use the same lower level APIs that the OS's own text rendering library uses. This ability to drop down a level when you hit the limits of an upper level significantly limits development risk and makes planning easier, because you know you will always be able to do something, it's just a question of effort. On the web, if you hit the limits of the platform, you're just SOL.

2. GUI only paradigm. Useless for CLI apps, servers, etc. There's no deep reason for this beyond history, inertia, business models and so on.

3. No working notion of software components. This has actually got worse with time: web components got rolled back, plugins were removed, HTML includes don't exist. That's something ViolaWWW got right in 1992! The web's leading component technology is React which doesn't even come from browsers and browsers lack obvious features it could use to improve performance. See how Sciter doesn't even need ReactJS core at all because they've got a DOM extension that does DOM diffing and patching in C++, but regular browsers don't!

4. The DOM model is not high performance. It's highly optimized in Chrome for what it does but the DOM is an extremely complex and generic system that nonetheless fails to deliver modern and competitive widgets causing a high level GUI API to be treated as if it's "low level", even though it isn't. It can't really remove anything, so browser code is absolutely full of slow branches, lookups, workarounds and so on. If you take it as given then browsers are fast because Google and Apple invested so much into it, but if you take a step back and say "what's the best way to render documents and GUIs fast" then it's not that great.

5. The web is absolutely not malleable as a platform. It's not at all bazaar like. Take just one obvious pain point where some malleability would be nice - a malleable system would let you use non-JS languages! The web's current attempt at this is WebAssembly, which can't do anything useful unless you call into JS to do it, which imposes a large performance penalty vs native code and which doesn't even really work for anything higher level than C++ or Rust. If the web was malleable someone would have fixed that years ago, but in reality the web can only be changed by Google and only via a convoluted and opaque political process.

Don't get me wrong, the web gets a LOT right! There are dozens if not hundreds of lessons to learn from its success. But the critics do have concrete issues.


> web components got rolled back

Unfortunately, they haven't. Google especially pushes them at every opportunity, and they are poisoning the web. E.g. scoped CSS, which covers about 80% of what web components are used for, cannot progress because it now has to account for weird corner cases in the Shadow DOM.

Meanwhile none of the major and very few of the new frameworks view web components as a viable foundation. They still have multiple issues that will maybe someday be solved by throwing more and more Javascript at them (see how incompatibility with forms was "solved")


For everything else, there's Electron.


The technology already existed with the JVM, .Net, Flash, Silverlight, etc. It would just be a different runtime than the browser. Mobile app stores are that.


Even Flash was much more sane for application development than HTML / JS.

But as always, the cheapest "solution" won.


Let’s say you’re blind. How do you use a Flash application with a screen reader?


It was apparently possible, although 15 years late. http://maccessibility.net/2010/03/24/adobe-announces-flash-a...



We are getting there again with WebGL and WASM.


In other words, proprietary fiefdoms.


And it's entirely unclear what any of these non-view-source, minimal-user-agency alternatives has that's actually an improvement over what the web can do. There are no concrete contentions about what in these alternatives would be an improvement, or for whom it would be an improvement.

Most of these architectures could probably be recreated on the web, if we cared to, if we thought they provided sufficient advantage. Projects like swf2js are surprisingly competent at playing Flash. But it's... just not a better medium, for users, for extension builders, or for developers. It did have some good tooling for creatives at least, but there are plenty of really good modern web tools with similar capabilities, they're just not as popular, not as important anymore, I'd argue because Flash was actually a pretty crude & raw interface builder that lacked much of the refinement, navigability, control, & capabilities we now take for granted.


> It did have some good tooling for creatives at least, but there are plenty of really good modern web tools with similar capabilities

There really aren't. Because the web can't even do animations properly. And Flash had a faster vector rendering model than SVG.

> I'd argue because Flash was actually a pretty crude & raw interface builder that lacked much of the refinement, navigability, control, & capabilities we now take for granted.

Ah yes, the refinement, navigability, control and capabilities of... what exactly do we take for granted? Where are these amazing tools that replaced Flash and can be used for creative work, games, interactions...?


What do you have in mind that beats Flash as an SVG animation development platform?


This would be so good.

DHTML is the greatest hack in all history.

It's almost unbelievable that we're using such a kludge for almost all user-facing application development today.


Any hack that lingers long enough is a paradigm.


There seems to be truth in that. Now I'm thinking about Unix. (I may say that as someone who has used a Linux desktop exclusively for more than 20 years ;-)).


Aaaand here we are, still bound to the atrocity that is the DOM.

Still with limited interactivity and terrible workarounds (canvas).

And still with a data silo problem (though not quite as bad as the data silos created by 'apps' on your devices).


The DOM is what's commonly known as "retained mode" graphics, and the canvas is what's known as "immediate mode" graphics. You literally have the two primary strategies for visualization available in one platform to use as you please, but you're unhappy. I'm curious why?
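For concreteness, here's the same rectangle both ways (the sizes and colors are just illustrative):

    // Retained mode: describe the box once; the browser keeps the scene graph
    // and repaints it for you whenever needed.
    const box = document.createElement("div");
    box.style.cssText = "width:100px;height:50px;background:tomato";
    document.body.appendChild(box);

    // Immediate mode: you own the scene state and issue draw calls every frame;
    // nothing is retained between frames.
    const canvas = document.querySelector("canvas")!;
    const ctx = canvas.getContext("2d")!;
    function frame() {
      ctx.clearRect(0, 0, canvas.width, canvas.height);
      ctx.fillStyle = "tomato";
      ctx.fillRect(10, 10, 100, 50);
      requestAnimationFrame(frame);
    }
    requestAnimationFrame(frame);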

The browser APIs we have today have legacy cruft, no question about it, but they also have all the good, modern parts any other system for GUIs and graphics would have. It's surprisingly complete today.

I think browsers suffer from an "image" problem, where we just assume what's there is bad, for some reason. I'd be curious to hear about specific shortcomings you encounter in your work.


There's a big design space within the retained mode concept, and you can see places where the DOM probably isn't the right design for apps. For example, I've never seen reflow or the "flash of unstyled content" problem in real GUIs. Sure, there are ways to prevent these problems when using the DOM, but in traditional toolkits those problems are not present to begin with.
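To illustrate what "reflow" costs in practice, here's the classic DOM layout-thrashing trap (the .row selector is hypothetical): interleaving layout reads with style writes forces the engine to recompute layout synchronously on every iteration, a failure mode traditional toolkits simply never hand you in this form.

    const rows = document.querySelectorAll<HTMLElement>(".row");

    // Slow: each offsetHeight read after a style write forces a synchronous
    // reflow of the page.
    rows.forEach(el => {
      el.style.width = "50%";        // write: invalidates layout
      console.log(el.offsetHeight);  // read: forces layout to be recomputed
    });

    // Faster: batch all the reads, then all the writes, so layout is
    // recomputed at most once.
    const heights = Array.from(rows, el => el.offsetHeight);
    rows.forEach(el => { el.style.width = "50%"; });

The fix exists, but the point stands: the toolkit makes you responsible for it.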


> For example, I've never seen reflow or the "flash of unstyled content" problem in real GUIs.

I've seen reflow happen in real GUIs when they're fetching tabular data (from the network or from disk), the columns are flexible, and the content is too big. And some frameworks love to hide scroll bars, so when those pop in, everything moves around a bit.

Unstyled content is rarer, because the styling information is usually compiled in, but sometimes that pops in late too, most often when your computer is overloaded and storage and network are terribly slow. Sometimes you get a couple of different revisions of the splash screen as the thing starts up.


Traditional GUI toolkits are local apps, not streaming content.


That's true, but so are a lot of SPAs and Electron apps. I don't think downloading the whole app is much of a burden if it makes the UI less janky.


In the case of SPAs and Electron apps, we have a platform being used against its intended purpose, and the companies building the platform and paying for its R&D simply don't care to optimize it for that side scenario.

Downloading an entire site up front is an existing strategy; that's what SPA (single-page application) means, and these often have terrible UX. You wait a long time for it to load, then it's clunky, and the moment you navigate away by clicking a link, you lose a ton of your app state.


Sure, an "image problem".

https://gifer.com/en/embed/7iJT


Yes. The modern web is an example of how many things are just worse now. It’s like Hollywood movies, power tools, and Applebee’s. Many things were just better in the 1990s.


Does anyone have a year for this material?



Ok, let's go with that for now. Thanks!


I blame the day that Mozilla trashed the end-user principle and removed the ability to shut off JavaScript from the preferences screen.


But HTML is for sissies, and the JavaScript debugger has come so far! ;)

For the truly brave and principled, you should be able to disable the HTML view completely, and browse the web entirely through the JavaScript debugger console, network log, DOM and CSS editor, JavaScript sources, breakpoints, scope, and call stack, by searching and opening up DOM elements and JavaScript objects and stacks as needed by clicking them in the outliner and typing JavaScript expressions into the console.


>I blame the day that Mozilla trashed the end-user principle and removed the ability to shut off JavaScript from the preferences screen.

In that case, the Disable JavaScript[0] add-on might interest you.

[0] https://addons.mozilla.org/en-US/firefox/addon/disable-javas...


No, that misses the point. Turning off JS might not be impossible, but once it became inconvenient enough, more and more sites came to fundamentally depend on it and became impossible to use without it. So if you turn it off, lots of the web becomes unbrowsable.

Regarding the browser internals being in JS: the UI option to turn off JS was about not running JS found on web sites. The browser internals are another matter.


>No, that misses the point. Turning off JS might not be impossible, but once it became inconvenient enough, more and more sites came to fundamentally depend on it and became impossible to use without it. So if you turn it off, lots of the web becomes unbrowsable.

Sorry for the late reply.

Actually, I am very much aware of this. Mostly because I default to browsing with JS disabled and don't enable it unless it's required for the content on a particular page.

Oftentimes, I just don't even bother enabling JS on pages that require it and move on without viewing whatever "important" information can "only" be displayed with JS.

So yes, I am aware of the overuse of JS. That said, I can't control what tools/frameworks website developers use to create their sites.

And since I am not a big fan of running arbitrary remote code on my devices, I choose to limit the ability of remote sites to do so.

Is that a solution? No. But asserting control over my devices is the only option I have.

Well, aside from raising a few tens of millions to start a web development training academy to teach the world's web developers that they shouldn't use JS to do simple things like displaying static text or images, and that all computing required should be done server-side and not on the client, unless the client affirmatively consents to such code running on their device.

Unfortunately, I'm guessing the VC world has little interest in such an endeavor.

As such, I'll continue to do as I do WRT JS. It's not great. Heck, it's not even good. But it allows me to at least attempt to retain control over my property (in this case, browser-capable devices).

I take other steps too, such as using DNS block lists (Pi-hole), egress filtering, and browser containers/private tabs, as well as various content blockers.

Those steps may turn out to be ineffective in the long term, but hopefully I'll be dead before that happens. I will continue to employ and tweak those methods until they aren't useful.

My devices should do what I tell them to do, not what web developers want my devices to do. While those things may be the same, I'd like to be able to decide that for myself.


Well, large parts of Firefox are programmed in JS. So you couldn't disable it completely anyway.


I dunno who this Alan Kay guy is, but he needs to get to the point faster.


[flagged]


Please read the article, it says the opposite of what you think it says.


The full title is "Alan Kay on “Should web browsers have stuck to being document viewers?” and a discussion of Smalltalk, NeWS and HyperCard"

The first sentence is Alan Kay's answer: "Actually quite the opposite..."

Looking past the poorly edited title, this whole thing is clickbait answered with one question. No one really thinks all the interactivity we have in web pages is bad, and implying that someone does for clicks is dishonest.


No one really thinks that? Well, I doubt you've never opened a page and gotten sick of it downloading megabytes of JS and breaking navigation because, instead of links, the navigation is, say, React with mouse events hooked, loading parts of the page in the background with spinner graphics.

Everything can be done poorly or well. And because modern sites are so often done poorly, the power of interactivity is also seen in a poor light (it allows a site to be worse than it would be without it).

This does push some people toward contemplating restrictions or subsets of the web platform as a standard, where you simply couldn't commit the travesties we see on popular sites today.


> No one really thinks that? Well, I doubt you've never opened a page and gotten sick of it downloading megabytes of JS and breaking navigation because, instead of links,

There will always be poor examples, but they don't invalidate the concept.

Smart web pages, when done right, are an incredible time saver, and if you're not convinced, go visit, for example, your bank's web page from 15 years ago on the Wayback Machine.


The prevalence of poor examples does invalidate the concept.

More accurately, it does NOT invalidate it, but it points out that the concept needs refining. There's no selective pressure toward making a page well designed and lightweight. Google is trying to rank better-designed sites higher, but they can't hide good content just because it's delivered through a horribly bloated UX.

I'm not literally against JS? :D


The full title does not fit in the description field, or did you not realise that?


Sorry for wasting your precious time, but the point of the article was to publish the entire discussion thread so it was easy to read in one place, including: some previously unpublished private email discussion; a citation to the classic "A Clipping Divider" paper; David Rosenthal's corrections and comments on some mistakes in Alan Kay's Quora answer; Alan Kay's corrections and comments on some mistakes in Warren Teitelman's article "Ten Years of Window Systems — A Retrospective View" from Methodology of Window Management; and a citation to David Rosenthal's blog posting "History of Window Systems", in which he goes into much more detail in response to Alan's original Quora answer.

Ten Years of Window Systems - A Retrospective View, by Warren Teitelman:

http://www.chilton-computing.org.uk/inf/literature/books/wm/...

A Clipping Divider, by Robert F. Sproull and Ivan E. Sutherland:

https://dl.acm.org/doi/10.1145/1476589.1476687

I think a better title of this discussion would be simply "History of Window Systems", which is the title of David's blog posting that I linked to at the end, if you made it that far.

Alan Kay thanked David Rosenthal for writing up that History of Window Systems: "Your blog is a real addition to the history and context needed to really understand and criticize and improve today."

Or did you get that far? I guess if you didn't make it past the first 50 characters of the title, you probably didn't make it to the end where all the good stuff is.

DSHR's Blog: History of Window Systems:

https://blog.dshr.org/2021/03/history-of-window-systems.html

>History Of Window Systems: Alan Kay's "Should web browsers have stuck to being document viewers?" makes important points about the architecture of the infrastructure for user interfaces, but also sparked comments and an email exchange that clarified the early history of window systems. This is something I've written about previously, so below the fold I go into considerable detail. [...]

Also, David Rosenthal posted to Slashdot 23 years ago, in 2000, to correct some misleading comments other people had posted:

https://slashdot.org/comments.pl?sid=5311&cid=1075535

>I'm Dave Rosenthal. I worked on window systems for a long time. I was one of the authors of the Andrew window system, and NeWS. I did the first port of the X Window System to non-DEC hardware and was one of the team that got X11 release 1 out the door.

>This comment has good references, all worth reading. The Methodology of Window Management book is the record of a workshop in 4/85 - it has a lot of pointers to work further back, in particular a paper called "Ten Years of Window Systems" by Warren Teitelman, who was at PARC and then ran the Windows group at Sun. This describes many of the early window systems at PARC.

>I'm not going to try to write a complete history, but I do want to correct several misleading statements in the comments to this post. There were several streams of development which naturally influenced each other- broadly:

>- Libraries supporting multiple windows from one or more threads in a single address space, starting from Smalltalk leading to the Mac and Windows environments.

>- Kernel window systems supporting access to multiple windows from multiple address spaces on a single machine, starting with the Blit and leading to SunWindows and a system for the Whitechapel MG-1.

>- Network window systems supporting access to multiple windows from multiple address spaces on multiple machines via a network, starting from work at PARC by Bob Sproull & (if memory serves) Elaine Sonderegger, leading to Andrew, SunDew which became NeWS, and W which became X.

>Gosling & I & others at C-MU's Information Technology Center wrote the Andrew window system. We were all post-docs by then, not students. My memory is that it was working OK in the spring of '84.

>NeWS was not based on Display Postscript. Gosling built a completely independent (and much faster) implementation of PostScript from the specification that Adobe published as a book. This was SunDew. It was never a fully functional window system. NeWS was a ground-up re-write of SunDew, adding extensive multi-threading and garbage collection to the PostScript clone. My memory is that it was working pretty well by the summer of '86.

>Early versions of X were specific to bizarre DEC hardware. While I was working on NeWS at Sun I also did the first port of X10 to non-DEC hardware - to Sun 1,2 & 3 workstations. Experience from this and later ports of X10 was a major input to the redesign that resulted in X11, and this careful design has led to reasonably easy porting to X11 to a wide range of hardware.

>I'm not going to comment on the politics which have surrounded window systems, except to say that an individual's rankings of the importance of different technical, aesthetic or licensing features are no more than an individual's views. The market's view of the same features can be quite different without the operation of conspiracies or malign forces. It may simply be that the majority of customers don't agree with your priorities.

Original Slashdot discussion, "What GUIs Came Before X11?":

https://slashdot.org/story/00/05/04/1321234/what-guis-came-b...


> No one really thinks all the interactivity we have in web pages is bad, and implying that someone does for clicks is dishonest.

This forum is literally the epicenter of people who think that.


Agreed, I flagged this article due to the clickbaity "and..." in the submitted title.


You are judging a book by its cover by doing this.

This site has a length limit on the description field. The "..." is there to signify that something else is following. Is this new to you?


IME, Hacker News titles tend to be edited to fit within the character limit; I actually opened the article despite low personal interest because of the ellipsis.

BTW, I caught the updated title before the "unflag" action disappeared, so I was able to remove my flag. Thanks!


Very big of you. And so courteous to thank me to boot!


GUIs, images, and video are all a mistake.

Computers should have nothing more than text terminals, as Ken & Ritchie intended.

That way the world would be a better place, without TikTok, Instagram, or YouTube.


> Computers should have nothing more than text terminals, as Ken & Ritchie intended.

Odd statement, because that same Bell Labs crew created Plan 9 as an unapologetically bitmapped GUI; even 30 years ago they already considered TTYs a smelly anachronism.


You know what…I sort of agree with you. You’re on to something.



