Tuesday, November 17, 2015

Harr Harr!

Here's today's development screenshot of Astromech: (from the virtual DSP desk.)

An image that will give your PNG decompressor conniptions, no doubt. The middle screen-full of leafy trees is a live webcam feed from out my window. The pink lines all across it are because it's a shitty webcam that cost $6 off ebay.

The left-hand screen is a 256x256 real-time Fast Fourier Transform of the webcam luminance. That's not big news; Astromech has always done that. It was its first trick.

The right-hand screen is the new thing for today. It's a 512x512 "H-Transform", where the H probably originally stood for "Two-Dimensional Haar Transform". I also call it the "Hubble Transform", because it's the basis of the compression format the Hubble data team invented in order to distribute 600Gb of their pretty pictures.

The full text I'm following here is "Tiled Image Convention for Storing Compressed Images in FITS Binary Tables", published by NASA.

Don't let that NASA appellation fool you into thinking there's anything hard about the H-Transform. Compared to the FFT or Cosine Transform or Huffman coding it's very, very simple. And the best thing about the H-Transform is that it's parallelizable on WebGL. (as you can see.)

That's how I'm doing this in real time in my browser (about 12 frames per second I'd guess, limited by webcam speed), and my CPU usage is 20%.

Why the H-Transform? Why not just use something built into most modern browsers, like h264, VP8, or a stream of JPEG/PNG images (MJPEG!)? Well, in a nutshell, because nice as they are, they're not "scientific".

There's a really big difference between a compressor that optimizes for the human perceptual system, and a compressor that tries to preserve the scientific integrity of the source data. The H-Transform is the second type.

Similar to a 'MIPMap', the H-Transform encodes a pyramid of lower-resolution (but higher entropy) versions of the source image into the lower-left corner, like a fractal. The larger 'residual' areas become easier to compress.
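
For the curious, here's roughly what that pyramid looks like as plain Python. It's a CPU sketch of the idea, not the WebGL shader Astromech actually runs. The names `h_level` and `h_transform` are made up for this example, the coefficients are left as unnormalized integer sums and differences (no divide-by-two), and the image is assumed to be square with a power-of-two side. Row 0 is treated as the bottom row, GL-style, so the low-resolution sums pile up in the "lower-left" corner.

```python
def h_level(img, n):
    """Apply one H-transform level, in place, to the n x n corner of img.

    Each 2x2 block becomes one sum and three differences. The sums go to
    the first quadrant of the region, the differences to the other three.
    """
    half = n // 2
    src = [row[:n] for row in img[:n]]  # read from a copy, write into img
    for by in range(half):
        for bx in range(half):
            a = src[2 * by][2 * bx]          # block pixel (row 0, col 0)
            b = src[2 * by][2 * bx + 1]      # block pixel (row 0, col 1)
            c = src[2 * by + 1][2 * bx]      # block pixel (row 1, col 0)
            d = src[2 * by + 1][2 * bx + 1]  # block pixel (row 1, col 1)
            img[by][bx] = a + b + c + d                # low-res sum
            img[by][bx + half] = a - b + c - d         # left minus right
            img[by + half][bx] = a + b - c - d         # row 0 minus row 1
            img[by + half][bx + half] = a - b - c + d  # diagonal difference


def h_transform(img):
    """Repeat h_level on the shrinking corner until a single sum is left."""
    result = [row[:] for row in img]
    n = len(result)
    while n >= 2:
        h_level(result, n)
        n //= 2
    return result


# A perfectly flat image collapses to one non-zero value in the corner.
flat = h_transform([[5] * 4 for _ in range(4)])
```

On a flat image every difference is zero, which is the whole point: the smoother the source, the emptier the 'residual' areas get.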

That's why NASA trusts data that has been stored in that format. It has certain very nice mathematical properties. It's a 'lossless' compressor, but one with a tuneable 'noise floor'. If that seems a contradiction, welcome to the magic of the quantized H-transform, where 60:1 compression ratios are possible.
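
The 'tuneable noise floor' idea can be sketched as plain uniform quantization of the integer coefficients. This is a generic illustration under my own assumptions, not the exact scheme from the FITS convention: `quantize`, `dequantize` and the sample coefficients are invented for the example. With a scale of 1 nothing is lost; with a larger scale the error per coefficient is capped at half the scale, and most of the small residuals turn into zeros.

```python
def quantize(coeffs, scale):
    """Map each integer coefficient to the index of its nearest multiple of scale."""
    # Adding scale // 2 before the floor division rounds to nearest.
    return [(c + scale // 2) // scale for c in coeffs]


def dequantize(indices, scale):
    """Undo quantize(), up to the rounding error it introduced."""
    return [q * scale for q in indices]


# Hypothetical H-transform output: one big sum, mostly small residuals.
coeffs = [80, -3, 2, 0, 1, -1, 17, -40]

exact = dequantize(quantize(coeffs, 1), 1)   # scale 1: lossless
coarse = quantize(coeffs, 8)                 # scale 8: lots of zeros
restored = dequantize(coarse, 8)
worst_error = max(abs(a - b) for a, b in zip(coeffs, restored))
```

Long runs of zeros are what the later entropy-coding stage feeds on, so raising the scale trades a known, bounded amount of noise for a better compression ratio.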

There's a couple of stages to go before that image on the right is turned into a FITS file, but the hard part is done and the rest is just shuffling bits around. Well, assuming the browser will let me save a stream of data to a file. That's really tricky, it seems.

Update: 22/Nov

The whole thing provably works now, since I've also implemented the inverse H-transform. (there were a few bugs)

The inverse of the transformed cat is also a cat. Well, you'd expect that, surely?
Basically in this example, the middle webcam image is encoded into the right-hand image (which looks fractal yet empty - that's the H-transform) and then that is run through the inverse transform (a separate bit of code that does everything in a different order, using the big mostly-empty texture) to go in the left-hand window.
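
The round trip is easy to check on a single 2x2 block. This is a minimal Python sketch, assuming the unnormalized integer form of the transform (sums and differences, with the divide-by-four saved for the inverse); `forward_block` and `inverse_block` are illustrative names, not the shader code.

```python
def forward_block(a, b, c, d):
    """Turn four pixels of a 2x2 block into one sum and three differences."""
    return (a + b + c + d,   # s: low-res sum
            a - b + c - d,   # h: left column minus right column
            a + b - c - d,   # v: first row minus second row
            a - b - c + d)   # g: diagonal difference


def inverse_block(s, h, v, g):
    """Recover the four pixels. Each numerator is exactly 4x a pixel,
    so the integer division loses nothing."""
    return ((s + h + v + g) // 4,
            (s - h + v - g) // 4,
            (s + h - v - g) // 4,
            (s - h - v + g) // 4)


# Round-trip a few made-up blocks, including an all-black and all-white one.
blocks = [(1, 2, 3, 4), (0, 0, 0, 0), (255, 255, 255, 255), (12, 200, 7, 93)]
round_tripped = [inverse_block(*forward_block(*blk)) for blk in blocks]
```

If `round_tripped` matches `blocks`, the pair of functions is lossless on integers, which is the same "the cat is still a cat" check done one block at a time.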

It's almost too easy.

And so the fact that the two left images look boringly identical is a good thing, given the (poor quality to begin with) data has been mangled twice in between by me. Cat sitting on warm computer staring at cursor. It's a common test case around here.


  1. Awesome! Very leafy lossless images, and I'm guessing it runs fast! :)

  2. Indeed - probably 60fps if I had a camera that put that much data out. And it even works properly now! I just wrote the Inverse transform to match, and that worked out all the bugs. Now I can round-trip an image and not be able to tell the difference from the original. That's the prerequisite before quantizing the shit out of it to push the compression ratios. Plus it's always good to be able to read what you wrote.