I have been working on an interesting experiment over the past four years.

If you needed to build your own user interface from scratch, how much work would it take?

There are quite a few embedded projects around, but I wanted to find out what it really takes for higher-end graphics and office applications.

I am an old timer who started dealing with user interfaces by writing games in the 1980s. I saw listings of games in news articles ... yes, we used to read magazines with code printed on paper. We just typed it in and played. I came up with my own versions of games like a bomber, a gambling game of horse races, archery, banking apps, and so on.

I used a Commodore Plus 4 at the time. The company does not exist anymore. Later I switched to AT PCs, the 286, 386, and 486, with ever-improving graphics. The Mac was fascinating, and especially the iPhone. It teaches you how to excel at what you are doing.

illustration

Copyright© Schmied Enterprises LLC, 2024.

The experiment started with building out some drawing primitives for a 1Kx1K buffer. Displays usually draw their output to such a buffer, an uncompressed matrix of pixels. The buffer is large enough to recognize patterns. It is also small enough to be forced into a single unit of computing, like one processor core or tensor core. Higher resolutions typically employ multiple cores and just copy the same logic to avoid Big O issues.
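To make this concrete, here is a minimal sketch in Go of such a buffer and one drawing primitive. The names, the solid-fill routine, and the fixed 1024x1024 size are illustrative assumptions, not the project's actual code.

```go
package main

const (
	width  = 1024
	height = 1024
	bpp    = 4 // red, green, blue, alpha
)

// newBuffer allocates the uncompressed pixel matrix.
func newBuffer() []byte {
	return make([]byte, width*height*bpp)
}

// fillRect paints a solid rectangle, assuming non-negative coordinates
// and clipping at the right and bottom edges of the buffer.
func fillRect(buf []byte, x, y, w, h int, r, g, b, a byte) {
	for row := y; row < y+h && row < height; row++ {
		for col := x; col < x+w && col < width; col++ {
			i := (row*width + col) * bpp
			buf[i], buf[i+1], buf[i+2], buf[i+3] = r, g, b, a
		}
	}
}

func main() {
	buf := newBuffer()
	fillRect(buf, 100, 100, 200, 150, 255, 0, 0, 255) // a red box
}
```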

There is a width, a height, and a pixel size. The pixel size is usually three or four bytes of red, green, and blue, plus an optional opacity byte called alpha. The memory banks of processors and graphics chips usually require that such high-bandwidth buffers are aligned to specific boundaries. This makes a row longer in memory than on the screen. It is extended to something called the stride, the actual byte width of a row in memory. The advantage of the design is that each pixel can be addressed and drawn on demand.
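Here is a short sketch of the stride idea, assuming a 64-byte row alignment; the real alignment depends on the processor or graphics chip, and the numbers are only for illustration.

```go
package main

import "fmt"

const (
	width     = 1000
	bpp       = 4  // red, green, blue, alpha
	alignment = 64 // assumed row alignment in bytes
)

// stride rounds the visible row width up to the next aligned boundary.
func stride() int {
	rowBytes := width * bpp
	return (rowBytes + alignment - 1) / alignment * alignment
}

// pixelOffset returns the byte offset of pixel (x, y) inside the buffer.
func pixelOffset(x, y int) int {
	return y*stride() + x*bpp
}

func main() {
	fmt.Println("row bytes on screen:", width*bpp) // 4000
	fmt.Println("stride in memory:", stride())     // 4032
	fmt.Println("offset of pixel (10, 2):", pixelOffset(10, 2))
}
```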

The experiment expanded to compositing and scaling images, and eventually some font rendering. It turned out that the server processors of today are not very good at any of these. Drawing was very slow with the default cloud processors optimized for web-based text. You can rent bigger ones with sixteen cores or so, and then rendering is fast. However, rendering is used only rarely in office applications. What is the point then?
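As an example of the kind of per-pixel loop that strains a general-purpose core, here is a rough nearest-neighbor scaler. It assumes a tightly packed RGBA layout with no stride padding and is not the code used in the experiment.

```go
package main

const bpp = 4 // red, green, blue, alpha

// scaleNearest resizes src (srcW x srcH) into a new dst (dstW x dstH)
// by sampling the closest source pixel for every destination pixel.
func scaleNearest(src []byte, srcW, srcH, dstW, dstH int) []byte {
	dst := make([]byte, dstW*dstH*bpp)
	for y := 0; y < dstH; y++ {
		sy := y * srcH / dstH
		for x := 0; x < dstW; x++ {
			sx := x * srcW / dstW
			si := (sy*srcW + sx) * bpp
			di := (y*dstW + x) * bpp
			copy(dst[di:di+bpp], src[si:si+bpp])
		}
	}
	return dst
}

func main() {
	src := make([]byte, 1024*1024*bpp)
	_ = scaleNearest(src, 1024, 1024, 640, 480)
}
```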

I quickly figured out that serverless rendering is probably the right solution. You need a processor with the AVX-512 instruction set, which helps with graphics. If you need more, there are plenty of options these days to rent NVidia graphics cores left behind after AI training.

The economic limitation is the customer base. I learned at an NVidia conference that companies have started to play with serverless GPU services. Check out Cloudflare, for example.

Still, the user interfaces of Windows, Mac, and Linux are seasoned. We can expect some refreshes soon. I also cannot move away from the iPhone user interface, which is so streamlined compared to Android, and there are few other options. It was simply designed to be easy to use once, and the teams are strict enough to stick to what works.

It is likely that a fractured world will introduce more user interfaces. Nowadays, the solution is to drop another theme on Linux. The design of Linux seems a bit bloated to those of us who grew up on DOS and slightly simplistic European engineering patterns.

User interfaces evolved over decades, but every software innovation was sparked by hardware. Faster microchips and megabyte-scale memory helped move the user interface from character terminals to pixel graphics. Even faster processors had plenty of idle time, which brought the multitasking of the Mac and the widgets of Windows XP.

Everybody expected processors to follow Moore's law, doubling their frequencies from time to time. Making silicon with multiple cores became easier instead, allowing the growth of NVidia and AMD. AMD's implementation of the 64-bit processor started to spread in the early 2000s, when I graduated.

Multiple cores distorted innovation in favor of graphics. Browser JavaScript started to download and render in parallel and asynchronously, making Flash graphics obsolete. Better and better games and video compression improved the experience. Lots of engineering was required to handle the asynchronous programming of multiple cores. It is not trivial, and once AI removed much of that need, it explains some of the layoffs in tech.

Standard processors could not keep up with graphics cores. The complexity of the x86 and amd64 architectures is huge. Compatibility requires keeping every single feature from the 1980s. Have you heard about binary coded decimals? They take up precious silicon space today in every data center processor, limiting core counts. The proprietary designs of NVidia, ATI, and AMD allowed graphics to evolve. Should you change the instruction set, you can lose the well-tested system base of your paying enterprise customers. Not all data is equal. Some data is more valuable, which allows the purchase of more expensive hardware.

This proprietary advancement in the parallel vector computing of graphics cards supported the AI training wave that started in the mid-2010s and lasts to this day. Graphics enthusiasts can opportunistically leverage the "wastage" and build on the AI-optimized vector processors.

The proprietary design became a hurdle. Should you write your own user interface, you need to move image blocks between CPU and GPU memory over PCIe interfaces. This requires the installation of gigabytes of drivers and CUDA libraries. User interface design is limited by the number of applications written for it. Any change must go through rigorous testing to keep the paying app developer base of app stores and browser JavaScript on board. Why is that? The computer was personal, so once you bought expensive hardware, you expected the wealth of services that came with it. Simplicity was not a business.

My approach ended up like this. Energy usage affects battery sizes tremendously. Smaller batteries allow lighter devices, more devices per user, and a rich set of use cases. This will push some browser JavaScript based logic to move to the server side. You can already see this with Angular, React, and Node.JS.

AI-based solutions can tweak adaptive codecs like Codec21 of our parent company Schmied Enterprises LLC to balance between client-side power, bandwidth, latency, and edge servers. Graphics can render on serverless GPU services shared by multiple services on cloud providers. One disadvantage of Google Stadia was its pricey design.
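As a purely hypothetical illustration of such balancing, the sketch below picks a render location from client compute, bandwidth, and latency. The thresholds and the types are made up for this example and are not Codec21's actual policy.

```go
package main

import "fmt"

// renderLocation says where a frame should be produced.
type renderLocation int

const (
	renderOnClient renderLocation = iota
	renderOnEdge
)

// link captures a rough view of the client and its connection.
type link struct {
	clientGFLOPS  float64 // rough client compute budget
	bandwidthMbps float64 // measured downlink
	latencyMillis float64 // round trip to the nearest edge server
}

// chooseRenderer is a toy heuristic, not a real codec policy.
func chooseRenderer(l link) renderLocation {
	// A weak client has no choice but to stream rendered frames.
	if l.clientGFLOPS < 50 {
		return renderOnEdge
	}
	// A fast, nearby edge server beats draining the client battery.
	if l.bandwidthMbps > 100 && l.latencyMillis < 20 {
		return renderOnEdge
	}
	return renderOnClient
}

func main() {
	l := link{clientGFLOPS: 200, bandwidthMbps: 20, latencyMillis: 60}
	if chooseRenderer(l) == renderOnClient {
		fmt.Println("render locally")
	} else {
		fmt.Println("stream frames from the edge")
	}
}
```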

Serverless sharing drops the price of compute by an order of magnitude. Serverless requires a better process model than the virtual memory handling of x86 and ARM. Our solution relies on services like tig and sat of eper.io. They treat variable-sized memory blocks as the unit of computation. Such blocks were traditionally called segments. They are oftentimes static and can be downloaded on demand by their SHA hash. The approach is inspired by the design of Docker, but for memory blocks, not just files. The Intel 386 also had a similar approach of variable-length segments for memory blocks.
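Here is a minimal sketch of the content-addressed idea, assuming a plain HTTP block store that serves each segment under its SHA-256 hash. The URL, the cache layout, and fetchSegment itself are hypothetical and are not the tig or sat API.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"net/http"
	"os"
	"path/filepath"
)

const store = "https://segments.example.com/" // hypothetical block store

// fetchSegment returns the block for a hash, downloading it once,
// verifying the content against the hash, and caching it locally.
func fetchSegment(hash string) ([]byte, error) {
	cached := filepath.Join(os.TempDir(), hash)
	if data, err := os.ReadFile(cached); err == nil {
		return data, nil // already on disk, no download needed
	}
	resp, err := http.Get(store + hash)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	sum := sha256.Sum256(data)
	if hex.EncodeToString(sum[:]) != hash {
		return nil, fmt.Errorf("segment %s failed verification", hash)
	}
	return data, os.WriteFile(cached, data, 0o600)
}

func main() {
	// Placeholder hash (SHA-256 of the empty string), just to show the call.
	data, err := fetchSegment("e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855")
	fmt.Println(len(data), err)
}
```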

Game developers usually have serious artists who are limited by the restrictions of platform-based user interfaces. Designers do not like boundaries. Writing software for DOS was a very liberal experience. Sometimes you could write an entire piece of software in Borland Pascal without ever turning on the internet. Writing browser code requires a copilot or chat assistant that guides you through the decisions of the past.

With serverless segments, script logic no longer needs to install gigabytes of libraries. Only the required blobs are downloaded, identified by their hashes. Should you need the tool vim, the stub just downloads the binary by its hash on first use. Processes using segments are much faster to start up. They require only the memory they actually need.
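A hypothetical stub for a tool like vim could look like the sketch below: the binary is addressed by its hash, downloaded from an assumed block store on first use, cached, and then executed. Verification against the hash would work as in the previous sketch.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"os/exec"
	"path/filepath"
)

const store = "https://segments.example.com/" // hypothetical block store

// runStub executes the binary identified by its hash,
// downloading and caching it on first use.
func runStub(hash string, args ...string) error {
	path := filepath.Join(os.TempDir(), hash+".bin")
	if _, err := os.Stat(path); err != nil {
		// First use: download the binary by its hash.
		resp, err := http.Get(store + hash)
		if err != nil {
			return err
		}
		defer resp.Body.Close()
		data, err := io.ReadAll(resp.Body)
		if err != nil {
			return err
		}
		if err := os.WriteFile(path, data, 0o755); err != nil {
			return err
		}
	}
	cmd := exec.Command(path, args...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	return cmd.Run()
}

func main() {
	// Placeholder hash; a real stub would ship with the published hash of the binary.
	fmt.Println(runStub("0000000000000000000000000000000000000000000000000000000000000000", "notes.txt"))
}
```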

Such processes can run bursts in remote serverless containers to execute graphics, inference, or page rendering code. This is the way tools like Flink work for the narrower use cases of scientific workloads.

The granularity of such serverless bursts eliminates the drawbacks of the proprietary platforms of the day. More edge hardware draws in more bursts. Cheaper servers move the workload there. AI can generate and verify systems; people can validate them.

Platforms used to follow the management style of San Francisco startups. Designers, product managers, and developers used different tools, which is reflected in the rich standards of CSS, HTML, SQL, Java, and JavaScript. The proprietary set of standards had something to do with communities and job roles, I believe. This limited flexibility and required more staff. We will probably define more code in English with copilots in the future, especially user interfaces.