Monday, September 7, 2015

Apache Spark Overview

Let me tell you a secret.

Well, it shouldn't be a secret, but given how few people seem to know about it, we may as well be talking about Higgs-Englert physics.

It's Apache Spark.

"WTF is Spark?" you will probably say. You'll have heard about Apache, the venerable webserver software. You might not have really been paying much attention to their background projects which have been merging together over the last year like the arms and legs of some kind of Voltron. ("And I'll form the Head!")

Apache is making the transition from being the kind of software you run your old website on, to being the kind of software you run Twitter, or Facebook, or eBay on. Or Netflix, which is possibly the best case study for the software we'll be talking about, although that's more Apache Cassandra, a topic for another time.

At that level, there are new problems that are oddly different from the old ones. All these guys use cloud computing resources; they don't really depend on physical machines. They rent them 'out of the air' for as many hours (or minutes) as they need. This is so they can increase the size of their clusters from say 20-50 to a few hundred for a couple of hours in order to handle peak loads.

eg: They don't repair servers. For most of those machines, no human will ever ssh into the box. In fact, they are usually put on a countdown for a rolling 'refresh' which shuts down the most ancient servers and replaces them with fresh ones, in the equivalent of a slow A/B testing transfer. Really clever systems stop the rollout of the new software and clone copies of the old, as automated reliability statistics come in.

But if a box is giving you trouble, you don't spend any of your time on it whatsoever. You mercilessly send it back to the cloud, after getting a replacement.

At that level, it's all about "devops". Specifically what they call either orchestration or choreography. Depending, I suppose, on whether you listen to chamber music, or prefer dancing the samba.

Here's the Netflix devops problem: In each country, there are daily viewing peaks. There are weekly viewing peaks. These peaks are 10x the baseline, and last a couple of hours. Then most people go to bed. This is the predictable part.

Then there's the unpredictable side. When a beloved actor like Leonard Nimoy dies, there is a tendency for millions of people to go home via the bottle-shop and queue up every movie he's ever done as a kind of binge-tribute. I've heard.

And that's the kind of situation that your scalable internet service has to handle, if you're going to serve movies on demand to 100 million people. Very rarely, you have to be able to service everyone at once. And you cannot FailWhale. That was funny once, when it was Twitter. Once.

The most amusing thing about Twitter is that once they got past the FailWhale, their company value went from merely silly to completely ludicrous. We are talking 10 million to 200 million, because they proved they were finally in the big league. What was the technical miracle which banished the white Whale? It was Scala... the primary language behind Spark.

So, what's Spark? In a nutshell, it's next-gen Hadoop.

"What was Hadoop again?" you probably ask, since you probably never used it. Well, it was a giant hack that allowed hundreds of computers to be ganged together and carry out various file-processing tasks across the entire cluster.

What for? Logfile processing, mainly. The daily re-indexing of Wikipedia. Places like eBay and Amazon used it for their first recommender systems ("other people also bought this!") and all because of the simple necessity of churning through more gigabytes of text than any single computer can manage.

You have to realize that, to a large extent, the billions of dollars that eBay and Amazon are worth are because of their "people also bought" recommender systems. That list of five other things (five is the optimum psychological number) absolutely must be the best possible, where "best" is defined as "most likely to buy next". This is not advertising, this is lead generation. There are metrics.

The point of lead generation is to turn each sale into an opportunity for another sale. "Accessorize, accessorize, accessorize!" And when those systems break, or just degrade, then the bottom line impact is direct and palpable. Companies live and die by their ability to snowball those sales.

Netflix had this happen, and they offered a million dollars to the mathematician who could solve it for them. This was the famous "Netflix Prize". One of the matrix-factorization techniques popularized by that contest is now known as "Alternating Least Squares", and the details are a topic for another day.

Spark implements the ALS algorithm in its standard MLlib library. It's core. It's yours. You can have a million dollar algorithm, gratis. If you want to run ALS large scale, and this is most important - in real time - then Spark is the only option.

The only option, unless you want to spend about a man-century implementing the equivalent of fine-grained distributed transaction control and data storage, and that's just the infrastructure your math needs to sit on top of.
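For a feel of what ALS actually does, here's a minimal single-machine sketch in plain Python. To be clear, this is not Spark's MLlib code, and it's nothing like production scale: it's a rank-1 toy over made-up ratings, and the names (`als_rank1`, `ratings`, `reg`) are mine, purely for illustration. The idea it shows is the "alternating" part: freeze the item factors and solve for the user factors, then freeze the users and solve for the items, and repeat.

```python
# Toy rank-1 Alternating Least Squares. Illustrative only: not Spark's MLlib code.
# ratings maps (user, item) -> observed rating; the values are made up.
ratings = {
    (0, 0): 5.0, (0, 1): 3.0,
    (1, 0): 4.0, (1, 2): 1.0,
    (2, 1): 2.0, (2, 2): 1.0,
}

def squared_error(ratings, u, v):
    """Sum of squared errors over the observed ratings only."""
    return sum((r - u[i] * v[j]) ** 2 for (i, j), r in ratings.items())

def als_rank1(ratings, n_users, n_items, reg=0.1, sweeps=20):
    """Fit rating[i][j] ~ u[i] * v[j] by alternating closed-form updates."""
    u = [1.0] * n_users
    v = [1.0] * n_items
    history = []
    for _ in range(sweeps):
        # Hold item factors fixed; each user's factor is a 1-D ridge regression.
        for i in range(n_users):
            num = sum(r * v[j] for (a, j), r in ratings.items() if a == i)
            den = sum(v[j] ** 2 for (a, j), _ in ratings.items() if a == i)
            u[i] = num / (den + reg)
        # Hold user factors fixed; solve for each item's factor the same way.
        for j in range(n_items):
            num = sum(r * u[i] for (i, b), r in ratings.items() if b == j)
            den = sum(u[i] ** 2 for (i, b), _ in ratings.items() if b == j)
            v[j] = num / (den + reg)
        history.append(squared_error(ratings, u, v))
    return u, v, history

u, v, history = als_rank1(ratings, n_users=3, n_items=3)
predicted = u[0] * v[2]  # user 0 never rated item 2; this is the model's guess
```

The real thing uses many factors per user and item and spreads those per-user and per-item solves across the cluster, which is exactly the part you don't want to write yourself.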

If you want to grow into the size of one of these services, you need to start with a seed capable of growing that large. Fortunately, in this analogy, those seeds are happily falling off the existing trees, and being blown about by the winds of open source software to the fertile fields of... some poetic metaphor I probably should have shut down a while ago.

That means Scala, and Cassandra. That means Zookeeper, and message queues. haproxy to spread the load. Graphite to chart the rise and fall of resources. Ansible to spin up the servers. It means dozens of support tools you've never heard of, and would never run by choice if you didn't have a pressing need to get the job done.

And these are all sub-components needed to support an overarching system like Spark - which schedules "parallel programs" across the entire cluster which are tolerant to the various slings and arrows of the internet.

There is a level above Spark which is still in formation - exemplified by the Mesos project. That one seeks to be a kind of "distributed hypervisor" that can manage a cluster of machines and run many flavors of Spark, Hadoop, Cassandra or whatever within the single cluster. Otherwise we tend to get 'clusters of clusters' syndrome where each 'machine' is effectively only running one program.

You have the dev cluster, and the testing cluster, as well as the production cluster of course. Oh, and that's one each for the database cluster, webserver/app cluster, and the small front-end routing clusters or logging cluster that hang off the big clusters...

Yeah. Fire up the music, let the dance of clusters begin. Oh, and once you put on those magic dancing shoes, you can never ever take them off again until the company dies. This is that fairy-tale.

Spark is the answer to questions you haven't asked yet. Literally, that's the kind of algorithms it is specialized to run. And it scales all the way. That's its value. That's what Apache is doing these days, trying to close the conceptual gap so both ends, big and small, are using the same base code. I love it.

But no-one sells it, and the people who do use it in anger are too busy making billions of dollars to spend much time explaining exactly how, or writing documentation. You really gotta tease the information out of them, and watch a lot of their talks at Big Data conferences to see where all the pieces actually fit. There is an enormous learning curve.

And that's why it's still such a secret.

And given how many people read my blog, I'm sure it will remain so.

Thursday, August 6, 2015

Tricopters, Birthdays, and Life

Today's my birthday, so I'm blogging. That alone should tell you how well my life's going at this point. The last year has been a little harsh, to be honest. Much of the good stuff turned out to be various kinds of vaporware, but all the bad stuff happened with full inevitability, all the same. You don't want to know.

One of the few things that's been working out is the multirotors. Apparently, I'm quite good at them. My latest one is a little 180mm Tricopter made from a handful of HobbyKing parts and zip-ties in an afternoon, to a plan that existed solely in my head, but based on experience with previous builds. (Especially the David Windestal tricopter)

My little one, called "Fourth"

Putting that in Jedi terms; I now construct my own light-sabers - even if my skill at wielding them could still do with some work.

Yesterday was my first successful FPV session, where 'successful' is defined as "wandering around an empty soccer field for three minutes without crashing". This builds on Tuesday's effort which only managed the first half of that statement, and required a day of CA glue and clamps to fix. Fortunately, crashing upside-down on a soccer goal net is one of the gentlest fails possible, and I feel like, for once, the universe gave me that mistake for free. Thanks, Spidey.

"Dancer", folded up for storage. 

ps. reason for crash? A butcherbird took me down. Damn little dark magpie harried me, made me panic, (via my spotter) and he's been insufferable about it ever since. Keeps turning up to gloat. Bird 1, Tricopter -1. Let's keep it there.

If you want to get into flying robots, I recommend watching Iron Man again, especially the bits where he's still developing the suit and face-plants into the ceiling on his first attempts. He learns via failure, and it's good to have a friend standing by with a fire extinguisher. It's a lot like that, only without JARVIS.

Well, so far. The HUDs are getting cooler. The equivalent of JARVIS is coming. And I'm sure someone's quad is already rocking a sound system capable of playing "Thunderstruck" at 99dB. If they exist, there's a good chance they'll be at the QLD FPV racing competition happening here in town in a couple of weeks, which I hope to attend. I've got my little 5.8GHz facebox sorted out, so I'll be riding virtual shotgun as the (currently reigning) best in the world hurls his tech-avatar around a converted sheep-shed.

Remember that each multirotor is its own little TV station. (pure analog video, baby!) The pilots stay locked on their own channel, but spectators - if we have the right equipment - can frequency-hop from craft to craft at will. I won't be winning any style competitions with my Borg visor compared to the cool Fat Sharks, but it should get the job done.

Friday, March 20, 2015

Cheat your way to Success!

First, just in case you're new to my sense of humor, the title is a Joke.

Also, it's kind of not.

Lying to other people is still bad, m'kay? Coping with reality is difficult enough, without you inventing bits of it that never existed. I don't mean "cheating" as in "I know I said I agreed to some common rules before starting the game, but, surprise!"

I mean in the sense of "cheating nature", or even "cheating death". Those are not rules we agreed to in good faith, coming into the game. They weren't even properly written down. And it's also very difficult to take the re-negotiation route with Reality and, for example, talk Time or Gravity into being a little less strict.

So in that case, it's perfectly fine to, in all good honesty, cheat like hell.

If in doubt, you know you're doing the right kind of cheating if you'd be happy to arrange a press conference, and rock up like Tony Stark in his Iron Man suit to announce it in advance.

Aerodynamics is a cheat. Medicine and Science in general is a cheat. The Large Hadron Collider is a complete and utter cheat... deliberately causing billions-to-one chances to come up nine times out of ten. (Talk about weighting the die!) And if you're going up against Ebola, I don't want you playing fair. There is no sportsmanship with pandemics.

Technology is cheating. And the skillful use of it requires, above all, a kind of joyful cheating that is forever thinking up new things to do with things no-one has noticed, and gets hackers into trouble when they can't distinguish between natural and man-made restrictions.

If you were in a foot race, suddenly substituting yourself with a robot scooter mid-event would be frowned upon. But in general life, it's a brilliant idea that gives us things like dishwashers and ATMs and insulin pumps.

So, cheat at life. Cheat the universe! Just be honest with other people when you're doing it, and everything will be fine. They might even help.

Tuesday, January 6, 2015

Once more unto the 3D Breach, dear friends.

Haven't posted in a while. I built two flying robots recently, but everyone does that these days. My David Windestal V3 Tricopter kit turned up last week, and that's going to be a joy to put together.

Yes, this one...

But amongst all that, I've decided to get back into 3D printers.

Well, replace "decided" with "saw a UV laser and complete 20kpps galvo kit on eBay and couldn't pass that up."

UV lasers, 30 years ago, were the kind of thing you needed to be in the Defence Department to get a hold of. These days, $20 on eBay. That's what quantum mechanics has done for us lately.

"Galvos" are the laser aficionado's slang term for galvanometers that are specially designed to have a mirror stuck on the end. They're conceptually no different from the "meters" that have pointy dials attached to magnets, so that currents in a nearby field cause them to twitch to and fro, and which are named in honour of Galvani. Except your average multimeter needle isn't designed to accurately hit the mark 20 thousand times per second.

Galvos are. And while moving significantly more mass than your average meter needle. Which is why they need big-ass driver boards and +/-24 volt power supplies. "Closed Loop Galvos" even have feedback systems so the galvo knows how wrong it is, and can correct. Necessary at those kinds of speeds.

Why am I buying lasers and signal-driven mirrors? To build one of these:

Which is a later-generation version of these:

Which are both hobbyist (but still multi-thousand dollar) versions of a fairly old idea called "Stereolithography". Basically, using light (from lasers) to selectively cure a special epoxy resin.

A 3-D laser printer.

Now, all of these printers have suffered a problem that took a while for me to really appreciate - the "release" part of the layering cycle. You get the impression from all the videos that the object builds downwards from the platform like a stalactite from a cave roof, but in fact each layer forms upwards from the optical window until it reaches the previous layer, which it hopefully sticks to.

Then the "build platform" is supposed to move upwards a fraction, and that's when the issue comes in. The epoxy is stuck to the window. And even if your window is "less sticky" to the epoxy than the layer above it (you hope) there's still a moment where you have to pretty much rip the newly formed layer off the optical window and prepare for the next layer.

This is the infamous "clunk" that sometimes yanks the protective silicone coating right off the bottom of the tank, especially if you've in-filled too much of the previous layer. And once that coating degrades, you have to recoat or prints start to fail badly.

I actually thought the Achilles heel of these machines was the cost of their "toner". But that's come down to the point where you can buy a useful quantity for $50 from a place like

But it's not. The worst thing is literally the sucking sound of the next layer being ripped off the optical window's protective surface. If you're lucky, you get ten coatings out of a $50 bottle of the stuff. And you'll go through the coatings in less than a dozen prints. So the tray coating alone is a $0.50 per-print consumable.
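If you want to check that number, here's the back-of-envelope sum spelled out. The bottle price and coatings per bottle are the figures quoted above; the ten prints per coating is my assumption for illustration, since "less than a dozen" is as precise as it gets.

```python
# Back-of-envelope cost of the tray coating, using the figures quoted above.
bottle_price = 50.00        # dollars per bottle of coating material
coatings_per_bottle = 10    # "if you're lucky"
prints_per_coating = 10     # assumed; the text only says "less than a dozen"

cost_per_coating = bottle_price / coatings_per_bottle
cost_per_print = cost_per_coating / prints_per_coating
print(f"${cost_per_coating:.2f} per coating, ${cost_per_print:.2f} per print")
```

Fewer prints per coating (a bad run of heavily infilled layers, say) pushes the per-print figure up from there.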

There is only one 3D printer which has managed to avoid this form of sucking. The "Peachy" printer - which uses a liquid float system, and builds in the opposite direction.

It's full of elegant ideas, but some terrible design decisions. (like trying to make his own galvos)

After thinking about it for a while, I've come up with a compromise that combines the key idea of Peachy (liquid resin float) with the more repeatable inverted Z-axis build tables of the B9 and the Form 1+. It's even a fix that could be retrofitted to those machines, with some work.

My key idea is to replace the solid silicone protection layer with Wax.

Why? It's second only to teflon in having a non-polar surface, which epoxy doesn't like to stick to. (That's why carnauba wax is used in mold release agents) It should contaminate the epoxy less than saline solution in the Peachy. And finally, it's incredibly cheap and available everywhere.

Possibly even use Paraffin oil, although I'd need to float a clear layer above the epoxy to provide the non-stick optical window, due to relative specific gravities.

If there's no solid surface, there's no sucking, so the Z axis drive can be weaker. A careful retract - extend would still be advised to pull in fresh epoxy to heavily infilled areas, but maybe this method can 'continuous build' thin-wall structures without doing a release cycle. (ie: the way people actually think these machines work.)

Now, there's a couple of minor things to consider when combining flammable waxes with high-power lasers, but I really don't need very much in there. Just a shallow layer. I'm planning on a much smaller build volume, too.

So, I'll let you know how it all goes. Clearly the machine has to be called "Waxer".

And any day now, I expect a breakthrough or two with the UV epoxies that are the expensive 'toner' these machines consume - probably by replacing the metallic catalysts with cheap and safe organic dyes squeezed from colourful fruits, similar to what's happening with dye-activated solar cells.

The Fused Deposition (melty plastic gun) printers never really excited me, I've got to say. But the photolithographic ones - I think - are on the right path at last.

Writing with light. Writing solid plastic with light.

Friday, August 22, 2014

What "good" looks like

Here are two queues, which are attached to programs I wrote that do things in response to reality. You actually don't need to know the details, merely observe the graph of "stuff waiting to be done"...

Heartbeat erratic but strong...

He's dead, Jim.

One of those is a "good queue" with low latency, and it's pretty obvious which. The numbers help quantify just how bad, but the shape alone tells you which queue you'd rather join.

The thing is, I've seen that shape elsewhere in my readings this week.

Those are 'neuronal spike patterns' in human cortical columns. You'd be surprised how large an overlap there is in the math used to analyse correlations among spiking neurons and, for example, doing security intrusion detection using logfile events.

I'm not making any deep claims about the equivalence of computers and brains... I know better. I'm just pointing out that it's interesting that, when both information systems are working well, they create graphs with similar shapes. Probably for similar underlying reasons to do with good engineering.

In essence, it's better to sit around with nothing to do most of the time between bursts of activity. Such systems can weather storms that tend to bring more consistent and 'globally efficient' systems to their knees - at exactly the time they are most needed. Especially if there's any dependence on shared resources.

In other words, the blinky lights shouldn't just stay totally on. That's almost as bad as completely off. We all know this, which is why we like blinky lights.
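There's a textbook way to put a number on that. The sketch below uses the standard M/M/1 queueing result (random arrivals, one worker), where the average number of jobs in the system is utilisation / (1 - utilisation). Applying that model here is my assumption for illustration; the two queues graphed above weren't analysed this way, and the function name is made up.

```python
# Average backlog in a textbook M/M/1 queue (random arrivals, one worker).
# The M/M/1 model is an assumption for illustration, not a fit to the graphs above.
def mean_jobs_in_system(utilisation):
    """Mean number of jobs waiting or in service, for 0 <= utilisation < 1."""
    if not 0 <= utilisation < 1:
        raise ValueError("utilisation must be in [0, 1)")
    return utilisation / (1 - utilisation)

for busy in (0.50, 0.90, 0.99):
    jobs = mean_jobs_in_system(busy)
    print(f"{busy:.0%} busy -> about {jobs:.0f} jobs in the system")
```

Under that model the backlog goes from about one job at half load to about ninety-nine when the worker is 99% busy, which is the lights-stuck-on case.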

Then again, earthquakes and avalanches follow much the same power-law pattern. Don't read too much into it. It's probably just the universe itself infusing into our data, unstoppably shining through it. Or, as we generally call it: 'the noise'.

Saturday, June 28, 2014

Why Ted Nelson was (almost) Wrong

Most people have never heard of Ted Nelson. I've had the privilege of sitting a few meters away from him as he tried to convince a roomful of us to join the Xanadu project.

I think that was the same day I (briefly) met Tim Berners-Lee, on an escalator, shortly before his keynote at WWW7. Lovely man. Pretty much everything he ever said has turned out to be right, which is why I've never really felt the need to rant about him.

Ted Nelson, on the other hand, causes many people to rant. He pretty much invented the concept of Micropayments, which I'll get back to in a moment, but as with many things, it's not that Ted didn't have the amazing foresight to recognize the importance of the concept, he just has a tendency to grasp the stick by the transverse end, and never let go.

There's a reason I put Ted Nelson and Tim Berners-Lee in the same paragraph, not unrelated to why they were both there that day... it's because they're kind of weird polar opposites of each other. The WWW equivalent of Steve Jobs and Bill Gates, although with entirely different personalities.

The first thing to know is that Ted Nelson is very, very bitter that we didn't build the Web the way that he suggested. That alone explains half of what project Xanadu is about.

The second thing to know is that Ted really does have prodigious powers of foresight, especially when it comes to the intersection of technology, content, copyright, digital rights management, and information access. Ted Nelson is a brilliant writer, engaging speaker, and can coin a phrase like few others. "Hyperlink" is actually Ted's word. Yes. Really. He invented that word.

If we'd built the web the way Ted wanted, then all of us would still own the copyright to all our blog entries, and would get paid per view (in 'flecks', a gold-backed digital currency that was a decade before Bitcoin) so that, like professional photographers, whoever posted the cutest cat video (in Ted's mind, most interesting article) would get showered with money. Sort of like YouTube has become.

But the third and last thing to know is that Ted is a much better writer than he is an engineer, and that nearly every attempt to realize his grand dream collapsed utterly into a black hole of development hell that took some of the best and brightest down with it. It was legendary.

Tim Berners-Lee, on the other hand, didn't worry about shaping the social impact. He just made the technology work, using what little he had. So now we live in his world.

Then came Samasource, and Mechanical Turk, and Crowdflower. If you don't know what any of those are, go look. I'll wait.

These are technically "Micropayment" services, but they work in the opposite direction from what most people, including Ted, predicted. And they raise serious ethical concerns, although not necessarily the ones you think at first.

For example, is it more ethical to take the money you would normally spend employing one bored teenager in the west for a few hours, and instead employ dozens of Kenyan refugees for days? If you know the teenager has a fallback, and the refugees don't?

Is "digital sweatshop" a bad term? It is better than a "real sweatshop", and real sweatshops already have the ethical concern that they're often better than peasantry and hauling coal. Are we adding rungs to the ladder here (by making education and computer skills a valuable, exploitable talent in the third world...) or is this leading to yet another decimation of the western "middle class" that I'm quite fond of?

What happens when you can hire a thousand people for fifty cents each, to write letters to your senator? Because that's happened. Or fill the comments sections of a major news outlet with your point of view? That happens too. Or create ten thousand accounts that praise one company's product? Because some less ethical crowdsourcing firms offer that one on their front page!

People are being paid... to be opinionated people. Right now, that's only worth a couple of dollars a day, which will make Ted very sad to hear. To me in the west, that's literally not enough to justify leaving the house. To a hundred million people, that's more than a living wage.

And what's heartbreaking, in some ways, is that sub-Saharan Africans are just like Victorian England cockneys, in terms of pride. They don't want handouts, they want work. We have the opportunity to show them that 'work' can be sitting in front of a computer for a couple of hours a day, typing crap that no-one will read into textboxes. Or playing the "pick the right drop-down menu" game to feed their families, like the rest of us do.

We have databases to clean, product descriptions to write, companies to hype, and endless comment boards to filter for hate or euphemistic profanity. And that doubles every year. Let's turn each into a "database game" (with pretty graphics and sound effects, why not) that earns someone a few cents, and see if we can't empty out the refugee camps that way. Nothing else has worked, possibly because no-one else sees such people as valuable, or able to contribute, stuck in the middle of nowhere with nothing but a smartphone.

I recommend having a very close look at the Crowdsourcing services and seeing if they can be used to help your projects. If nothing else, you might finally understand some of the odd trends you've noticed on the internet lately. (Especially in comments sections.)

If you are reading this, then you are a citizen of the world, with access to technology and resources. And you know how annoying it all can be. Now you can reach out across the Internet and employ hundreds of people who want to fill out forms all day. You can connect the dots, be the magnate of your own global workforce. It's the new Gilded Age, after all.

But should you? I wish I knew the answer to that.

I feel there is a new future here, that none of us really saw coming. People are being paid to be human. Not to act like machines, but to act like humans. (Well, 'writers', which is mostly the same thing.) To have opinions, and express them. To notice things.

We've long had an established 'market' for Popstars and Celebrities to be rewarded for just being themselves, (for their 'performances') and now we've established a bottom to that market. I rather hope there will eventually be a middle.

As Ted foresaw, I would quite like it if all people could get a decent income simply from being alive and typing opinions on the internet.

It's, like, the main thing I do.

Monday, June 9, 2014

Astromech Update

I've been quiet, because I've been working on some new stuff. Cluster Computing stuff. I'm starting to get the hang of it, too. My capabilities have expanded in the last week, to scary levels.

Alas, this next project is going to take up the majority of my time, which means "Astromech" will lose focus for a bit. But hopefully I've got it to the stage where it can sit there and gather some attention from others. The video should help, as well as the on-line demos, but...

A two-minute video is no substitute for proper documentation, and I know this. For anyone who wants to get involved, here's the 30 second primer on what it takes to create a new Astromech "world":

  1. Download and install "Blender", the 3D modelling package.
  2. Spend a week learning how to model static scenes. YouTube tutorials help.
  3. Export your Blender model to a ".dae" COLLADA file, with all the options turned on.
  4. Now follow the video steps... log into Astromech, upload the file, set all the options, and publish.

You're not going to be able to take an existing complex scene, dump it out, and expect everything to be perfect. There is a severe mis-match between all the material options you can set in "Blender" and what you actually get in Astromech. In particular, the "Lighting Model" is still very primitive, and only cares about "Sun" lights. (And don't even consider using "Cycles" yet.)

But the saving grace is that the vertex painter and UV Maps do work, and Blender allows you to "pre-bake" texture maps with ambient occlusion and shadows. This is what makes most of the Astromech levels look good, while keeping framerates up at 60fps.

If you want to try the system, I heartily encourage it. Start small with a single cube, and slowly build. Most of the effort will be in learning "Blender", and figuring out what gets lost in translation. (when I get more time, I'll be making that translation better.)

And since I haven't written any docs yet, you're encouraged to hit me with questions and queries. And definitely to tell me about stuff you've done.

If there get to be enough users, I might finally sort out the "Teleporters", (or the "Farcaster Network" as one friend thinks of it) which are supposed to allow you to step from world to world without all that tedious mucking around with hyperlinks.

Saturday, May 24, 2014

What Hits Bikes (in Queensland)

To answer that question, you go here:

I warn you, it's layer intense - and probably needs a legend (although you can figure it out by clicking and thinking) and it should even come up area-appropriate for you.

Big kudos to Google for making the tech so public and useful.

If you want to delve into the source data for this map, check out here:

This is a filtered extract from the full TMR dataset which can be viewed here:

... but you will quickly see why I made the extract, because it's slooowww.

TMR Crash Data Analysis Results (or: Bicycles! Bicycles! Bicycles! )

These maps are generated from the Queensland TMR Road Crash Location data, obtained from:

If you are all googlified, you can access the Fusion Tables here:
(and potentially build your own views by example)

All incidents recorded a "casualty", but the few "fatalities" are marked with a big flag. In most cases, I would assume the fatality (or casualty) was the person on the bicycle, except perhaps in motorcycle vs bike where it could be the faster-moving vehicle. Unfortunately the data does not assign casualties to vehicles (well, not obviously) but again, it's very probable that if there were casualties, one of them was the person on the bike.

Sometimes it is hard to concentrate on the data analysis, and try not to imagine what each of those little flags actually mean...

[Update: the maps that used to be here were just not working well enough... go to to see all in one map]

Sunday, May 18, 2014

Astromech in Two Minutes

I finally announce my latest project. Ready for users.

Contact me if interested. Hoping to write a manual shortly.

Tuesday, April 29, 2014

Google Drive Realtime API - First Thoughts

Technologies tend to arrive with a bang of hyperbole, and then settle into the valley of despair before reaching the plains of enlightenment. It's a well known process. Look it up.

Google's Realtime API is probably going to buck that trend. Well, unless a lot more people read my blog than I know about, because this is a technology that is so sufficiently advanced it's essentially magic. But few people know about it.

I'm assuming you've used Google Docs at some point, and had the pleasure of watching other people's cursors (or your own) wandering through the document. Collaborative text editing is older than SubEthaEdit, but Docs took it to the point where it "just worked" across the entire Internet. Do not underestimate that achievement.

The Realtime API is that.

In short, you can write Javascript web pages that hold a shared 'document' object, and every copy of that data structure in every other browser in the world updates to include any changes.

Seriously, stop and think about that. Text boxes which update when anyone changes their content. We used to laugh about such things being depicted in bad 80's action movies, but that's now state-of-the-art.

I have made my own attempts at such a technology, which is why I find their solution to be so familiar. And I also understand the limitations... that while the API does its best to appear magical in operation, you pay for it in other ways: atomicity being the primary one.

So before going into its strengths, let's go over the weaknesses of the OT (Operational Transform) approach, in order to better dance around the landmines.

The big one: binary data. OT depends on having a structured understanding of the data it's transforming - it wants to be 'git' (the version control system) but without the possibility of ever having 'unresolved edit conflicts' that require manual intervention.

Binary blobs are - by definition - unstructured, and the Realtime API cannot patch large blocks of binary data without fundamentally stepping all over the toes of everyone else trying to do the same thing. The upshot: patches can't be combined, so changes go missing.

So, don't keep BLOBs in Realtime.

A DOM-like object tree is the exact opposite. It is so structured that every branch insertion, every node deletion, can be tracked as a separate "mutation". That's great! The OT system has a stream of micro-operations that can be 'transformed' against each other in a more granular way. Google sat down and figured out the full "theory of patches" for that limited case.

Text strings are a kind of half-way between the two, and one where the OT 'rules' are simple enough that most programmers could sit down and work them out in a half hour.
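To give a feel for what that half hour produces, here is a minimal sketch of the rules for plain text, in Python. It is not Google's implementation. The `Insert`/`Delete` classes, the `site` tie-breaker and the `converge` helper are all names I've made up for illustration, and it only handles single-character deletes.

```python
from dataclasses import dataclass, replace
from typing import Optional, Union


@dataclass(frozen=True)
class Insert:
    pos: int    # index at which the text goes in
    text: str
    site: int   # hypothetical editor id, only used to break ties


@dataclass(frozen=True)
class Delete:
    pos: int    # index of the single character to remove


Op = Union[Insert, Delete]


def apply(doc: str, op: Optional[Op]) -> str:
    """Apply one operation to a string; None means 'nothing left to do'."""
    if op is None:
        return doc
    if isinstance(op, Insert):
        return doc[:op.pos] + op.text + doc[op.pos:]
    return doc[:op.pos] + doc[op.pos + 1:]


def transform(op: Op, other: Op) -> Optional[Op]:
    """Rewrite `op` so it still means the same thing after `other` has been applied."""
    if isinstance(other, Insert):
        if isinstance(op, Insert):
            # Two inserts at the same index: the lower site id goes first.
            stays = op.pos < other.pos or (op.pos == other.pos and op.site < other.site)
        else:
            stays = op.pos < other.pos
        return op if stays else replace(op, pos=op.pos + len(other.text))
    # From here on, `other` is a Delete.
    if isinstance(op, Delete) and op.pos == other.pos:
        return None  # both sites removed the same character
    if op.pos <= other.pos:
        return op
    return replace(op, pos=op.pos - 1)


def converge(doc: str, a: Op, b: Op):
    """Apply a and b in both orders, transforming whichever op arrives second."""
    a_first = apply(apply(doc, a), transform(b, a))
    b_first = apply(apply(doc, b), transform(a, b))
    return a_first, b_first


print(converge("abc", Insert(1, "X", site=1), Delete(1)))
print(converge("ac", Insert(1, "b", site=1), Insert(1, "B", site=2)))
```

Both orders should end up with the same string, which is the whole trick. Range deletes, undo and more than two concurrent editors are where the half hour starts stretching out.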

Creating an OT 'grammar' is necessary for each datatype. The rules which work when combining text edits in a "description" box are not adequate to make sure that two "legal" edits to a JSON string result in a syntactically legal combination. The strings may combine to produce invalid JSON... bad if that field is storing program config data.
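A small Python illustration of that failure, assuming the field is merged with nothing smarter than plain-text insert rules. The `insert` and `is_valid_json` helpers and the sample data are mine, not anything from the Realtime API.

```python
import json


def insert(doc: str, pos: int, text: str) -> str:
    """Plain-text insert: the only operation a string-level merge knows about."""
    return doc[:pos] + text + doc[pos:]


def is_valid_json(text: str) -> bool:
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


base = '{"tags": []}'
pos = base.index("]")          # both users type just inside the brackets

alice = insert(base, pos, '"red"')
bob = insert(base, pos, '"blue"')

# Text-style merge: keep Alice's insert, shift Bob's insert past it.
merged = insert(alice, pos + len('"red"'), '"blue"')

print(alice, is_valid_json(alice))
print(bob, is_valid_json(bob))
print(merged, is_valid_json(merged))
```

Each edit is fine on its own. The merged string has both values but no comma between them, because a text merge has no idea that commas exist.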

If you know that two binary blobs represent GIF images, then extracting the 'differences' between two versions (with the intention of applying the 'difference' to a third image) is a simple set of photoshop operations. Without that knowledge, a 'standard binary merge' is only going to corrupt the GIF file.

Clearly, the OT rules for combining images are useless for combining XML data. Every datatype needs an OT definition, and it's not proven (or possibly provable) that all datatypes can have one.

The academic area that looks into this is called the "Theory of Patches". If you read the papers in the hopes of finding a solution, what you tend to get is "Oh no, it is so much worse than you thought... have you considered these pathological merge cases?" and then your head hurts.

The best thing about the "Theory of Patches" is that, in academic style, at least it lays out the general shape of the minefield, and mentions some particularly impressive craters from past attempts to get through it.

For the moment, the Drive API only has built-in rules for three datatypes: Strings, Lists, and Maps. ('Custom' objects are possible, but they're really Maps in disguise) And frankly, Lists are a pain in the ass.

But since you can build pretty much any data tree structure you want out of those, you're generally good. And by doing so, your 'document model' is granular enough that OT can merge your micro-patches with everyone else's version and keep that tree in synch.

Then there are the consequences... because you have a data structure that doesn't only change when you tell it to, but when any bloody person in the world does. What happens when, three steps into a dialog wizard, someone else deletes the file? Well, you're going to have to code a listener for that.

There are other hard limits: 10MB per 'realtime document', and I think it was 500KB per 'patch'. But you should plan to never hit those limits: if you store bulk data in realtime, you're doing something wrong. (That's what the normal Drive API is for.) Realtime is for co-ordination and structure, not streaming, and not data transfer.

Google handles all the authentication, and permissions can be set for files through the usual Drive web interface, which is nice. Realtime documents get their permissions from the 'base' Drive file they're attached to - like 'conversations' that can be started about any existing file. (Actually, file version. If you change the base file, you invoke a new realtime branch - watch that.)

Although the OAuth sign-in process is a lot slicker than it used to be, it still has problems... Mostly caused by pop-up blockers. But that's part of a much bigger discussion I want to save for another day.

And they have automatic undo. Undoooo! Do you know how hard that is? How much of your time that saves? How happy your users will be?

What the "Realtime API" does is make Google into the biggest 'Chat Server' in the world. Every document is hooking up to its own little social networking hub, to discuss what buttons their users are pressing, and how their cursors are moving today. A billion tiny brainstorming sessions, between browsers.

There's a lot of guff being written about how people are leaving the social networks. That's fine... Google's social network isn't just for people. It's a peer message-passing layer for our software, arguably more useful and important to the internet's long-term future.

I really encourage you to start writing code that depends on this modern view. Spend time with the paradigm, learn its flaws and graces - you can probably have a shared 'ToDo' list app working in a few hours that instantly scales to millions of users. But be prepared to let go of a lot of baggage. This is Star-Trek level technology, so don't try applying 20th century thinking. Start fresh.

Enlightenment awaits.

Wednesday, April 23, 2014

FireFox still missing MessageChannel

I've managed to 'find' three browser bugs in the last few weeks. One is wending its way through the Chromium bug process (it's nice when you get a good one), one was already logged, and the third was FireFox.

Well, I say "bug" but really it's "unimplemented HTML5 feature"... specifically the MessageChannel object necessary for cross-document messaging. Without it, you can still use .postMessage for everything, but you take a performance hit because of all the "marshalling" required, and security goes out the window.

Oh, so FireFox has not implemented a bleeding-edge new feature. Aw, cry me a river, you might say. Except this is a rare case where even Internet Explorer has managed to get over the line, making Mozilla the last holdout (except Opera Mini), and it's such a simple feature!

In fact, if you look at the great repository of knowledge on such things:
You'll see that Chrome has had it for five versions, IE has had it for two, even the Blackberry browser has always had it. It's one of the easier parts of the HTML5 spec, and is a fundamental part of security in future apps.

So why is it missing from FireFox?

Reading through the bugtrack is quite illuminating, and paints a winding story. I think the tone was set in the initial days by one developer saying "I don't see the point of this." and another replying "Because it's in the HTML5 spec." and the first replying, "Oh well." and then nothing happening for three months.

Recently a flurry of people reporting 'bugs' prompted some work, and the feature was mostly implemented except for web workers, and then pulled back at the last moment. From what I can tell, the patches are sitting in some kind of purgatory until someone cares enough again.

In the meantime, the fallback for FireFox is to continue routing everything through window.postMessage and the window's single "message" event. Why is this bad? Perhaps the best reason I can give is that, once I attached my "onmessage" handler to the outer window to receive messages back from the inner frame, that handler was passed all inter-frame traffic... including the Google+ widget messages that I didn't even know were there. And it was clear that the Google+ widget was also receiving all the messages intended for my scripts (filled with things like authentication tokens), so I was essentially trusting them to ignore messages intended for my origin and not be evil. (Just like they were trusting me.)

Google, I do trust. But what happens when I add a Facebook or other third-party web-library-component-widget things to my page? Will they start conducting industrial espionage on each other because of the globally accessible message queue? How would you know?

So it's not just me working around the missing features in FireFox, or the dozen other edge-dancing coders complaining in the forums, it's Google. How much effort is being put into coping with this out in the field?

Monday, April 21, 2014

node.js finally arrives, thanks to RedHat OpenShift

I've been messing with node.js for about two years now. It's a beautiful system. Most people don't get it, but that's fine. The bigger problem had been that there was nowhere to run your code, if you didn't maintain your own servers.

And who seriously wants to do that, if you're not getting paid for it? It's a time sink.

I've been waiting for a major provider like Dreamhost to enable node.js. A couple of small providers have started up, but it's kind of a premium service. And that's a shame, because the first job of node.js is not to take over completely from your old site, but initially to "plug a few gaps" in the big internet puzzle - mostly all this crazy new 'real time' websockets and messaging stuff. A chat client. Everyone loves a little chat client. But you're not going to redevelop your entire site and chuck Apache, with all the pain that would cause, to get one. You want a cheap little 'experimental' side-server that fills that gap, so cheap you essentially forget about it.

Well, along comes RedHat with their cloud computing platform called "OpenShift". I think they changed it recently, because I remember it having a much more sensible name and a lot fewer features the last time I checked.

I'll skip over explaining their funky new words for everything (like "cartridge", clearly intended to harken back to your halcyon Nintendo days) and just say that, FOR FREE, you can have THREE servers spun up for you, pre-loaded with all kinds of linux-based server setups such as Apache, PHP, JBOSS, or node.js.

If you're used to FTPing into your hosting server, then OpenShift is going to be a shock, due to the heavy use of GIT to transfer files. It seems crazy at first, especially setting up the SSH access keys if you don't have a unix box handy, but there is a method to the madness which becomes apparent when you start to 'scale'. Using GIT to deploy files to one server seems completely excessive - although it does do some cool 'merging' of your code with the latest server upgrades - but the moment you need to deploy to half a dozen instances, you'll see there's no better way.

All other providers I've tried work in a similar way. If anything, RedHat has streamlined the process to its simplest. But why does the typical node.js deployment system 'force' you into being scalable? Because if you don't start out that way, you never will.

Yes, and websockets work, I've tested it. (You have to be aware it appears on port 8000 externally, regardless of what you thought it would be. And only the websocket connection method works - no comet fallback that I know of.)

Widespread use of node.js and websockets is likely to reduce global internet traffic, by removing all the inefficient ajax 'polling' or comet 'long poll' IP connections being made and remade, a billion times a minute. node.js can handle more users and page hits than PHP. But none of that mattered while we didn't have any damn servers to run the code on.

So, Thanks RedHat! You've made my day, and solved one of my longest-standing network plumbing problems, for free. And you've made the internet better. Thank you.

Monday, April 7, 2014

Steam Family Sharing renegotiates copyright in our favour

Steam (the video-game distribution service) just renegotiated your copyright with them. And unusually, they've taken a step on the long road towards ending the pointless copyright wars, instead of escalating further.

You can now "lend out" digital games to your family and friends. It's a small step, but a good one:

Jaron Lanier gave a talk recently where he pointed out one consequence of the "paper-thin" Digital Rights Management around DVDs, compared with the "open-access" data on CD. "Imagine that 20 years ago, you put a CD and DVD into a safe." he thought-experimented. They cost about the same when you bought them, and twenty years later you open the safe and ask "what capabilities does owning this medium offer?"

Well, with DVDs, you can do pretty much what you could do back in the day - play them on your region-locked DVD player, unskippable FBI warnings and all.

But with CDs, you can shift the music to your iPod, digital music collections, Spotify... there's more you can do with a CD today (again, legally speaking) than ever. Amazingly, the unprotected CD has increased in utility value, thanks to technology. That investment you made back in the day keeps paying off. Weird.

Valve and Steam have just done something equivalent. They "renegotiated" your copyright with them - as is allowed in that EULA you clicked through, as could be done by any media company at any time - but instead of screwing us, they have taken the opportunity to act on our behalf.

(In this way, they are outperforming many actual governments.)

Basically, you can now lend your game library to other people as easily as letting them sit down at your own computer to play a game, or loaning them the game media. (with the added advantage they can't lose or step on it, and you can yank it back any time you need to.)

What this does, once again, is increase the utility value of your Steam digital library. All those games you bought can convert into favours amongst friends - one of the hardest currencies of all.

Steam gave you more digital rights, rather than trying to restrict them even further. Isn't that refreshing? Being treated with respect? Feels good, doesn't it?

We're closer to the day that "baby's first steps" can't get yanked from YouTube because a digitally-bought-and-paid-for copy of "Finding Nemo" was on in the background. Where buying an album means you can sing your own terrible karaoke version in public, or do a heavy-metal guitar cover that many think is better than the original, without fear of prosecution.

More generally, the digital rights afforded us need to align with the way we (and our children) actually use the technology in our modern daily lives, or the copyright industry is just a way of criminalizing the entire population for being alive in the 21st century, for profit.

Saturday, March 22, 2014

"LeoDroid" Design Decisions

I should explain a few of the design decisions I made with LeoDroid. First, DC Power!

Power Systems

Notice the USB "Power Bank" on LeoDroid's back? That's an 8500mAh NiMH battery capable of delivering 1.5 Amps. They're designed to recharge mobile phones, and come cheap from eBay. You can't build one cheaper, trust me.

If you have built power systems before and didn't know about these brilliant new devices, all you need are these magical words; "5.2V @ 1.5A DC step-up regulator, built-in USB charger, all matched to the battery, less than twenty bucks."

They can be plugged into any standard USB charger. They can charge each other. 5.2V drops to exactly 5V after a Schottky protection diode or LDO regulator. 1.5 amps is enough to spin robot motors.

Unreliable power is one of those things that can ruin your whole day. Buggy self-made power supplies can screw up your whole month. These power banks can be bought for less, in many cases, than the naked battery.

They come in a variety of shapes, colours, sizes, and optional features like solar cells. If one breaks or dies, you just plug in another. You don't have to care about battery chemistry, charge curves, or special connectors. If you use USB-powered Arduinos, there's no soldering required; just unplug from the PC, attach the USB Power Bank, and away you go.

Power Switch

Note well the power switch! The use of a DPDT switch is not accidental - the idea is that when turned on, the battery pack power is connected to the Arduino and Motor Driver - but when turned off, those systems are isolated from each other.

See it? Over to the right, above the wheel well.

To understand why, notice that the Arduino's USB port isn't connected to the battery, (which would be easy with standard cables) instead we route the power to the RAW pin. This arrangement is absolutely necessary because we can't drive enough power through the Arduino USB port to turn high-current motors. It's not a matter of wire gauge - it's the protection polyfuse. If you try to pull more than 500mA through that fuse, it will shut down.

So we have to run power to the motor driver directly, which means soldering wires. And that same input power has to get to the Arduino, because we're using its on-board regulator to supply the digital circuits with 5V. Power just can't go through the Arduino's USB connector on the way to the motors.

Unfortunately, a problem arises because we occasionally want to plug the Arduino into a desktop PC to upload new program code. We really don't want the battery turned on when doing this, because it's possible that the battery pack (if the PC is underpowered or defective) could feed power INTO the computer and burn out ports, even with the polyfuse. Or the PC could try to charge the battery bank backwards, via the wrong port. Don't cross the streams.

If you plug your PC into one side and the battery into the other, power can flow ALL the ways.

If you unplug the battery to prevent this but leave the microcontroller and motor driver connected, the host PC will need to supply all your Droid's needs - including the motors if they happen to suddenly power on because you're debugging, and that will trip the polyfuse.

If you've got good equipment, this will merely reboot the board. If.

With the double-pole switch, "on" means the battery is connected to both controller and motors, and "off" means the controller is safely isolated for programming.

Throw ze switch, Igor.

If you use a single-pole switch (or none at all) then you need to be damn sure your equipment (and everything you plug it into) can survive the conditions. Be aware that I have seen USB ports that burst into flame because someone plugged a double-headed cable in twice to the same PC.

Another simple option is to not use the RAW pin but hack up a USB cable so it splits to supply the motor driver as well. Therefore, you'd be forced to unplug the battery USB connector before plugging in the programming USB cable to the PC.

The most foolproof solution, for the usual class of fool.
It's still a switch, it's just harder to flip.

If you think about it, you would be 'manually' performing the action of the DPDT switch by having a mutually exclusive 'power cable' and 'programming cable'. If you intend to do software development, plugging and unplugging over and over can be annoying, and physically break things. Hence the actual switch. Of course, you could leave the switch in the wrong position...

If you don't intend to connect the Arduino to a PC ever again after emplacing it, then a single pole switch is probably fine. But why not do it properly?

(Did I hear someone say "diodes"? Sure, you could use diodes. They'd better be expensive Schottky ones, or the voltage drop will take the circuit out of spec. Even good ones are wasting 5% of your power.)
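For the curious, the 5% is just the diode's forward drop divided by the supply voltage, since the same current flows through both. A quick Python check, where the forward drops are ballpark figures I've assumed (real parts vary with current and temperature) and `series_diode` is a made-up helper name:

```python
def series_diode(supply_volts: float, forward_drop_volts: float):
    """Return (volts left for the load, fraction of power burned in the diode)."""
    return supply_volts - forward_drop_volts, forward_drop_volts / supply_volts


SUPPLY = 5.2  # the power bank output quoted earlier in this post

# Assumed, typical-ish forward drops; check the datasheet for a real part.
for name, drop in (("Schottky", 0.25), ("silicon", 0.7)):
    volts, lost = series_diode(SUPPLY, drop)
    print(f"{name:8s} {volts:.2f}V at the load, {lost:.1%} of power lost")
```

With those assumed numbers the Schottky costs you roughly a twentieth of your battery, and an ordinary silicon diode costs well over a tenth while dragging the rail noticeably below 5V.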

Power Loom

This is such a simple and obvious concept - have a PCB or connector block that is the central power loom. With dupont connectors, I use old 2-row IDC connectors (such as the IDE connectors off dead hard drives) and solder all the pins in each row together to create two 'rails'. Then all the connectors plug into that. It's basically a 10-socket mini power board.

Like this, only tiny.

The alternative, which I see all the time, is the 'daisy chain' that develops when you run power wires to the nearest other powered device.

The 'loom' is not only better electrically (to prevent ground loops) but it makes visually checking hook-up polarity straightforward - if all the black wires are on one side, and all the red on the other, then you're obviously good at the loom end. (This is how we battle Murphy's law.)

It also means you can quickly isolate individual devices without worrying about chained devices. And you can untangle wires easier.

"LeoDroid" photos

Hooray, pics uploaded at last. Here is my latest robot, based around an Arduino Leonardo: I'm just about to pull it all apart again to do the final trim and shaping of plastic bits, and add The LASER.

Hi there.

Looking pensively into the distance.

The NiMH battery pack has a solar panel built-in, and runs for hours.
Also, can recharge your phone in a pinch.

The thumbstick controls a pointer on the screen

...which controls the Integrated Programming Environment

Under the hood, kinda looks like a prop from the technicolour 60's.

The actual computer is nearly invisible. It's all really just different lengths of coloured wire.

Release: Fritzing design files for "LeoDroid"

Here are the "Fritzing" design files for the Arduino Leonardo-based robot that is sitting on my desk. I'll be posting some photos shortly, and explaining some of the design decisions.

Common Features:
  • Atmega32u4 based design
  • 128x160 pixel TFT display
  • Thumbstick
  • Ultrasonic Ranger
  • IR Remote Control Decoder
  • Magnetometer Compass
  • Sound Effects
  • PWM Motor H-Bridge control
  • PWM LED/Laser output

I've added all related files (Fritzing designs, exported images, and library parts) to the GitHub repository:

Pro Micro Variant

This is the target hardware I developed for, the 16MHz 5V variant of SparkFun's Pro Micro.

It might look as though there is one pin free: D14. Nope. Once SPI is switched on, it gets locked as MISO and can't be used for general I/O, even though our SPI Apprentice device doesn't have a data output line. Every single pin is used.

Arduino Leonardo Variant

The genuine Arduino "Leonardo" is completely compatible with the Pro Micro, and even has a few extra pins to play with. (though not as many as you think)

Remember that SDA and SCL are 'aliases' of pins D2 and D3, so we really only have D11, D12, A4 and A5 unused. (Though that is enough for many things!) And on the Leo, the SPI bus is on the ICSP header (only) so make sure you have the right cables.

I'll try to post errata here, but the files on GitHub will always be the 'canonical' ones. Make sure to check those before you start building. :-)

Parts List

Thursday, March 20, 2014

Review: Hack (The Facebook Language)

I design new computer languages, when I can't avoid it. It's something I've been doing since before my Honors Thesis, in which I created a distributed language that allowed programs to dart and jump from machine to machine, architecture independent, at will.

So when someone announces a "Major New Language", I tend to have a look-in. When Facebook announces a new language, well...

Here's the thing. I don't want to be mean, but it's going to come out, because some of the criticisms I have border on cruel and unusual. In fact the more I read the spec, the more I wonder what they were thinking...

Actually, I do know what they were thinking. They wanted to improve PHP. That is a laudable goal, but frankly, and this is going to sound mean, the fastest way to do that is to stop using it. Hack is designed to solve problems which are central to Facebook's business, but irrelevant to the majority of platforms and users that run PHP. For those platforms, PHP's shortcomings are not a bug, they are a feature.

Case in point: Async. Hack has some lovely Awaitable metas (sort of similar to generics) which you stick on a function and when you call it, you instantly get back a kind of 'handle' to the result which should be available eventually. 

That's nice. It makes parallelizing parts of the code really simple. 
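I'm not going to reproduce Hack syntax from memory, so as a rough analogue only, here is the same shape in Python's asyncio: calling the function hands back a handle straight away, and you collect the results later. The `fetch_profile` function and its fake delay are invented for the example.

```python
import asyncio


async def fetch_profile(user_id: int) -> dict:
    # Stand-in for a slow database or network call.
    await asyncio.sleep(0.01)
    return {"id": user_id}


async def main() -> list:
    # create_task returns immediately with a handle to the eventual result.
    handles = [asyncio.create_task(fetch_profile(i)) for i in (1, 2, 3)]
    # The three fetches overlap; gather returns results in the order given.
    return await asyncio.gather(*handles)


results = asyncio.run(main())
print(results)
```

Note that nothing here limits how many handles you create, which is exactly the resource problem a shared host would worry about.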

But it's not that you couldn't already do something similar in PHP (granted, clunky as hell) it's just that most Hosting providers had PCNTL (the Process Control Extensions) turned off, or would suspend your account if you called too many long-running shell scripts, besides killing your script after 30 seconds of runtime, no matter what you set the timeout to.

Hack Async solves the problem internally for Facebook where they trust all their code, but doesn't do a thing for Dreamhost or Gator clients. In fact it guarantees they can't run it, because even the simplest script can now balloon out into infinite parallel threads, and there's no resource management to limit it.

Also, easy creation of parallel algorithms means easy creation of deadlock problems... where are the tools to solve those?

Other issues that are important to a company with a huge monolithic codebase are static typing and annotations (check) immutable database tuples (check) generics (check) collections (check) and UML modelling. (um, no!)

Then there are missed opportunities. Traits are broken for the one thing they should be useful for - macros which prevent copypasta and tortuous subclassing in cases where even generics can't cope. It could have been a "literate programming" primitive. Now I don't know what it's for. "Implementing Static Interfaces" apparently.

What really amazes me is that "Lambda Expressions" are more than halfway down the list. "Nullables" (the ability to make static types include 'null') is given more prominence. That's like casually mentioning "oh, and we've got warp drive." after explaining the many benefits of the cup-holder. 

Probably because, like warp drive, it is very hard to retrofit after-the-fact and make it completely functional. Javascript had it built into its core, and still had difficulties.

There are some genuinely nice things; Iterators and Continuations are great (I'm still waiting for them in Javascript, ahem!) although frankly very few people know how to use Continuations, and Iterators are just a standard class interface, and I feel another missed opportunity to combine two things which are really the same thing.

But none of that addresses my original point; which is that Hosting providers chose PHP because it was limited and constrained, and could be used to sandbox user accounts on a shared machine. Everything in Hack was already available in C++ and UNIX, but we were not allowed that full power, because abuses (or bad code) brought down the system for everyone, partly because hosting providers are quite terrible sysadmins.

But also because our languages are really bad at allocating finite resources like memory, bandwidth, and CPU power among sub-processes. Or informing us how much is available, or being consumed by our own processes. But that's another discussion...

Because PHP was largely cut-off from the OS, (deliberately!) Hack has incorporated large chunks of the OS inside itself to make that interface frictionless and performant. Switching on all that power will go wonderfully inside the walled garden of the Facebook server farm and other such corporate installations, but it doesn't address anyone else's needs.

This is an heroic effort to get Facebook out of the language dead-end they had painted themselves into, a way to transition from PHP to C++ incrementally, without re-writing the whole damn codebase. Because they're right: PHP starts breaking down when you get too large and complex. I'm amazed they've pushed it this far. It's designed to spam out a web page in 100ms, not multithread a distributed computation. Fortunately, Turing Complete means what it says on the box.

But if you're looking for a cool server-side language for your next project, I'd still go with node.js, unless you too have a million lines of legacy PHP hung around your neck like a dead albatross.

That's not the most resounding accolade, granted, but at least Hack has a definite use-case. (More than can be said for many languages) For PHP programmers with a formal background, it offers a partial escape from our sad and sorry lives. Well, if the sysadmin would just install it...

Monday, March 17, 2014

Overview: Arduino Graphical User Interface classes

Part of last week's release was a very hacked-together GUI for my programmable droid interface. Here's a couple of screenshots that I've posted before:

Not shown is the thumb-stick that controls the square cursor that you can see near the middle of the screen. The basic features are:
  • point-and-click style interface using thumbstick.
  • line-oriented layout that favors runs of text symbols
  • clicks initiate immediate action, or enter menu mode where moving the joystick will modify the item, after which clicking again will commit the choice.
  • redraw manager maintains a list of updated regions, and can be called incrementally by the main loop.
  • screens can be 'paged', so that the cursor is either limited to the edge of the screen, or navigates over to a new page, or a different screen.
If you take a look in <unorthodox_droid_raster.h>, you'll find a set of 'Pager' classes that are 'bound' to the Raster classes. (it asks the raster classes to draw things) These are refactors of the older <unorthodox_droid_gfx.h> classes that were bound to the Adafruit GFX library.

I have no intention of 'abstracting' them both - the raster classes are the only ones in use, but the older classes are left as examples and prototypes for people who have already committed to the GFX library, or who perhaps want to port the classes to a third hardware/driver library.

Frankly, the classes aren't really important except as examples. That's going to be the hardest part of this explanation: my 'library' isn't what you expect. There's really not a lot of code there - which is the point. What I have is a way of breaking down the problem so that custom GUIs can be created within the very hard limitations of the Arduino.

Here's a couple of things to consider, issues which make this particular version of the problem different from 'classic' cases:
  • Low screen bandwidth - because the screen is usually on the end of a serial link, instead of dual-port SRAM on a video card, simply clearing the screen can take over a second.
  • Write-only - the screens I work with have no way to read pixels back from the display.
  • Memory mismatch - Arduino total RAM is 2.5K. A 24-bit 320x256 image needs 245K. The Arduino would need 100 times more RAM to keep a local 'copy' of what is on the screen.

If we assume our UI is made up from some 'fixed' elements and some 'variable' bits, what we're really asking is to create a structure that fits within the tiny 2.5K Arduino RAM, that we can modify, which will somehow be able to repaint that entire 245K screen surface.

There's a name for trying to keep something big in a small space: Compression. 

Therefore the real secret to building an Arduino GUI is not about widget toolkits and pen styles, it's going to be about efficiently accessing compressed data. 

The moment we have a cursor pointer moving around, we're going to have to redraw little chunks of the screen where it is, and when it leaves. We have to be able to efficiently reach into the structure to redraw line spans when the cursor moves, or text content changes.

Compression is all about context. The briefest recorded telegram conversation consisted of one symbol sent by each participant. The first sent "?", and the second replied "!". In their context it made perfect sense, and couldn't have been clearer. If you haven't heard the story, then it is less so.

So our first layer of compression is to cut down the problem space in a way that 'everybody knows about', but that doesn't limit us too much. Since most of the 'interfaces' that I needed to build were essentially various kinds of configuration menus, and menus are lists of text items, it was pretty clear I could build such an interface out of "lines of text" and not much else. (Curses, you say.)

Font glyphs (especially European ones) can be compactly represented in 6x8 pixel blocks, or even 5x7 if you always assume a 'spacer' pixel between them. 8x8 might sound neater, but the aspect ratio of 6x8 glyphs is a lot nicer. 8x8 fonts look very 'fat', and fit less per line on small screens.

If we assume our screen is made of a grid of 6x8 pixel blocks, each one of which might contain a character, then we vastly simplify our data load from, say, 320x256x3 bytes (RGB pixels) to 53x32x2 bytes (assuming one character glyph and one 'style' byte per cell).

Thus, keeping a coloured fixed-font grid would require 3.3K of memory. Still too much.
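
The arithmetic so far, written out as compile-time constants. The constant names are mine, and I'm taking '2.5K' to mean 2560 bytes.

```cpp
#include <cstdint>

constexpr uint32_t kScreenW = 320;
constexpr uint32_t kScreenH = 256;
constexpr uint32_t kRamBytes = 2560;  // assumed: 2.5K = 2.5 * 1024

// Full 24-bit framebuffer: three bytes per pixel.
constexpr uint32_t kFramebufferBytes = kScreenW * kScreenH * 3;

// Character grid of 6x8 cells, two bytes per cell (glyph + style).
constexpr uint32_t kCols = kScreenW / 6;  // integer division drops the partial cell
constexpr uint32_t kRows = kScreenH / 8;
constexpr uint32_t kGridBytes = kCols * kRows * 2;

static_assert(kFramebufferBytes == 245760, "the 245K framebuffer");
static_assert(kGridBytes == 3392, "the 3.3K character grid");
static_assert(kGridBytes > kRamBytes, "the grid alone still overflows RAM");
```

The grid is about 72 times smaller than the framebuffer, and it is still bigger than all the RAM we have.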

Wow. We can't even keep a full-screen character grid. Even the cheapest microcomputer of the 1980s could do that.

We make another assumption: on most of our 'menu based' screens, there are going to be lists of things. Lists, by definition, are a bunch of lines that have a similar format. If they have the same general format, but different 'drop-in' values, then that's essentially good old printf().

In the screenshots shown at the top, you can squint and notice that most screens only have two 'line formats': the header line and the bulk list lines (empty lines not counted). In fact, the classes that back these screens have a very simple 'redraw' handler which tests whether the row is <1 (in which case it draws the header) or >1 (in which case it subtracts two, adds the page offset, and draws that item number).

These classes are not filled with data structures that are parsed by an outside framework - they are full of if/then and case statements that hard-code the screen layout into compiled instructions - the fastest possible way to render.

The RasterPager class is the base class that each screen is built from. (What might be called a 'screen' is internally referred to as a Pager, because it 'knows how to draw a page'.)

Pagers are asked by the render manager to draw RasterSpan objects which are essentially defined as a sequential span of characters on a single text line, plus some raster context. "line 10, chars 8-17" is a typical span. 

The draw manager keeps two bytes for each line to record the 'invalid' span: how much of that line we need to redraw next time, which may be nothing. If the manager is asked to 'invalidate' more spans on a line which already has one, the existing span is extended to include those characters as well.
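
A minimal sketch of that bookkeeping, assuming a 32-line screen and using 0xFF as a 'nothing to redraw' marker. The DirtySpans name, its fields and the sentinel value are my own choices for illustration; the real draw manager may encode this differently.

```cpp
#include <cstdint>

constexpr int kTextRows = 32;        // assumed: 256 pixels / 8-pixel cells
constexpr uint8_t kNoSpan = 0xFF;    // hypothetical 'line is clean' marker

// Two bytes per line: first and last invalid column, inclusive.
struct DirtySpans {
    uint8_t first[kTextRows];
    uint8_t last[kTextRows];

    DirtySpans() {
        for (int row = 0; row < kTextRows; ++row) clear(row);
    }

    void clear(int row) { first[row] = kNoSpan; last[row] = 0; }

    bool is_dirty(int row) const { return first[row] != kNoSpan; }

    // Mark columns from..to as needing a redraw. If the line already has
    // a span, grow it to cover both instead of storing a second one.
    void invalidate(int row, uint8_t from, uint8_t to) {
        if (!is_dirty(row)) {
            first[row] = from;
            last[row] = to;
            return;
        }
        if (from < first[row]) first[row] = from;
        if (to > last[row]) last[row] = to;
    }
};
```

The trade-off is that two small changes at opposite ends of a line merge into one long span, so a few extra characters get repainted. In exchange the whole structure costs 64 bytes for 32 lines and never grows.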

When the draw manager is given a 'time slice' to spend on rendering pixels, it finds the next line which has a span that needs to be drawn, and asks the Pager to redraw it.

The draw manager has no idea how the span is going to be painted - it might have character glyphs drawn into it, or just pretty patterns - it just knows which parts of which lines are waiting to be done. When it comes time to do the job, the draw manager passes the span request on to the Pager class.
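
Here is a rough sketch of that hand-off. Everything in it is hypothetical: the SpanRequest, PagerSketch and DrawManagerSketch names, the round-robin scan, and especially the 'time slice', which I've modelled as a simple cap on the number of spans drawn per call rather than real elapsed time.

```cpp
#include <cstdint>

constexpr int kLineCount = 32;      // assumed number of text lines
constexpr uint8_t kClean = 0xFF;    // hypothetical 'no pending span' marker

struct SpanRequest { uint8_t row, first, last; };

// Stand-in for the Pager side: it only needs to know how to draw a span.
struct PagerSketch {
    virtual void draw_span(const SpanRequest& s) = 0;
    virtual ~PagerSketch() = default;
};

// Test double that just records what it was asked to draw.
struct CountingPager : PagerSketch {
    int calls = 0;
    SpanRequest last_request{0, 0, 0};
    void draw_span(const SpanRequest& s) override { ++calls; last_request = s; }
};

class DrawManagerSketch {
public:
    DrawManagerSketch() {
        for (int r = 0; r < kLineCount; ++r) { first_[r] = kClean; last_[r] = 0; }
    }

    void invalidate(uint8_t row, uint8_t first, uint8_t last) {
        if (first_[row] == kClean) { first_[row] = first; last_[row] = last; return; }
        if (first < first_[row]) first_[row] = first;
        if (last > last_[row]) last_[row] = last;
    }

    // Draw at most max_spans pending spans, resuming the scan where the
    // previous call stopped. Returns how many spans were drawn.
    int render(PagerSketch& pager, int max_spans) {
        int drawn = 0;
        for (int scanned = 0; scanned < kLineCount && drawn < max_spans; ++scanned) {
            const uint8_t row = next_;
            next_ = static_cast<uint8_t>((next_ + 1) % kLineCount);
            if (first_[row] == kClean) continue;   // nothing waiting here
            const SpanRequest s{row, first_[row], last_[row]};
            first_[row] = kClean;                  // mark clean before drawing
            pager.draw_span(s);                    // manager never touches pixels
            ++drawn;
        }
        return drawn;
    }

private:
    uint8_t first_[kLineCount];
    uint8_t last_[kLineCount];
    uint8_t next_ = 0;
};
```

Note the manager never learns what was painted. It only hands over 'row, first, last' and moves on.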

So when the Pager gets the request to draw parts of 'line 0' (for example) it can resolve this in any way it likes. Case statement, array lookup, random number generator, whatever. In many ways, the Pager classes are like HTML 'stylesheet' classes; they mostly contain formatting information.

Because we are interested in rendering lines of coloured text, the RasterPager class has a method called format_span() which works very similarly to printf() in that it has a fixed "format string" parameter (compactly stored in program flash memory) and a set of variable parameters in an array.

format_span() is written to efficiently render sub-spans of the entire line - it can efficiently skip over parts that aren't involved, and doesn't buffer. It can extract single digits from the middle of decimal-formatted and aligned numeric values, and use single parameters multiple times in multiple ways, as decimal values, style indexes, or string selectors.
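
The 'single digits from the middle of a number' trick is worth a tiny illustration. This is not the real format_span() code, just a sketch of one way to do it: decimal_char_at() is a hypothetical helper that answers 'what character sits in column N of this right-aligned number?' without ever building the string.

```cpp
#include <cstdint>

// Returns the character found in column `col` (0 = leftmost) when `value`
// is printed right-aligned in a field `width` characters wide, padded with
// spaces. Requires col < width. No buffer is used.
char decimal_char_at(uint16_t value, uint8_t width, uint8_t col) {
    // Distance from the right-hand end: 0 means the units digit.
    const uint8_t from_right = static_cast<uint8_t>(width - 1 - col);
    for (uint8_t i = 0; i < from_right; ++i) {
        value /= 10;                    // discard one low-order digit
        if (value == 0) return ' ';     // ran out of digits: this is padding
    }
    return static_cast<char>('0' + value % 10);
}
```

So redrawing only column 2 of the value 42 in a 4-wide field costs one division and one remainder, and gives '4'. If the value has more digits than the field is wide, this sketch silently drops the leading ones.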

Here are the top two line definitions used by the SignalPager screen shown. (There's a debug footer I've omitted.) It looks bulky in source, but the first string is only 14 bytes long (compiled) and the second is 26 bytes.

// title line
PROGMEM prog_uint8_t SignalPager_title_line[] = {
  // fragment metadata
  RasterPager::Span | 6,
  RasterPager::Span | 9 | RasterPager::Last,
  // fragment instructions
  RasterPager::Lit | 1, ' ',
  RasterPager::Lit | 9, 'X','-','D','R','O','I','D',' ','1'
};

// list line
PROGMEM prog_uint8_t SignalPager_signal_line[] = {
  // fragment metadata
  RasterPager::Style | RasterPager::Vector | 5, RasterPager::Span | 2,
  RasterPager::Style | 6, RasterPager::Span | 4,
  RasterPager::Style | 3, RasterPager::Span | 2,
  RasterPager::Style | 0, RasterPager::Span | 7,
  RasterPager::Style | 3, RasterPager::Span | 5,
  RasterPager::Style | RasterPager::Vector | 3, RasterPager::Span | 1 | RasterPager::Last,
  // fragment instructions
  RasterPager::Lit | 2, '<',' ',
  RasterPager::Dec | 0,
  RasterPager::Lit | 2, ' ','(',
  RasterPager::Dec | 1,
  RasterPager::Lit | 3, ' ',')',' ',
  RasterPager::Lit | 1, '>'
};

Here's the overridden draw_span() method on that class, which selects which of those formats to use (based on the line), stuffs the vector with the relevant parameters, and then calls format_span().

void draw_span(RasterSpan * s) {
  int row = s->row;
  prog_uint8_t * line_format = 0;
  int line_vector[8];
  if(row==0) {
    // header row
    line_format = SignalPager_title_line;
  } else if(row==1) {
    // empty line under the header: no format selected
  } else if(row<18) {
    // list rows: map the screen row to an item number
    int line = row + scroll_y*16 - header;
    line_format = SignalPager_signal_line;
    line_vector[0] = line;
    line_vector[1] = droid->signal_value[line];
    line_vector[2] = droid->fs.token_size(line&0x7F);
    line_vector[3] = line_vector[2] ? 2 : 3;
    line_vector[4] = droid->fs.token_size((line&0x3F)|0x80);
    line_vector[5] = (line<64) ? (line_vector[4] ? 2 : 3) : 7;
  }
  // one shared call renders whichever format was chosen
  format_span(s, line_format, line_vector);
}

Note that apart from the function calls to obtain the parameter values (which we have to do anyway), the code essentially consists of stuffing a local fixed array with values (very cheap) and then calling one method. Remember that function calls have overhead, and can incur anywhere from a dozen to hundreds of bytes of compiled program code. Calling the raster equivalent of drawText() and setColor() for each parameter directly would clearly take more code bytes - the fewer calls we need to make, the better. One is pretty minimal. One shared is even better.

The chain now goes: the RasterDroid draw manager knows which bits of which lines to redraw, so it incrementally calls the pager draw_span() for each one, and the pager passes that span request (along with the appropriate line definition and looked-up parameters) over to format_span(), which then draws individual font symbols to the raster device using draw_text() as needed. Whew!

This is how we decompress the instruction 'redraw span 8-10 on line 3' into pixels which hit the screen. Not with data structures - because they don't fit in RAM - but with pure code. Our program flash budget is much bigger, so we depend on that instead. Also, by compiling the code we create the fastest possible execution path, optimized for each 'screen' while sharing the common 'drawing primitive' they all use.

This is all totally backwards when compared with how a UI toolkit like Windows or Aqua or GTK does things, with abstract widgets. But abstraction is a luxury that, on our memory budget, we can't afford. Knowing our context means abandoning polymorphism, to some degree, by "hardcoding magic numbers", or in this case, format strings and click positions.

Although... since the code is so structured, there's no reason why a 'widget layout' system couldn't be created which code-generates these classes. That system could be full of interesting widget code that aligns columns or draws ASCII art, but that can all be done pre-compile.