Monday, May 27, 2013

Transform and Roll Out

I've spent most of the last week deep in the Fast Fourier Transform and WebGL Shaders, two things I have not had that much experience with before. Oh sure, I knew the theory, but there's a big difference between that and being able to write actual code.

This paper was extremely useful. My implementation really only added one line.

It didn't help that I set myself a pretty hard challenge on this one. Implement a famously brain-bending algorithm that I didn't entirely understand in a language/environment that I'd never used before. And I have to admit, this time yesterday morning I was staring in frustration at corrupted transforms with no idea how to fix them.

But, when in doubt, work it out by hand. I knew I was close. Exactly half a pixel off, in fact. That's pretty close.

Turns out the gl_FragCoord  coordinates passed to the Fragment Shader are not integers. They are half-integers, which align with the pixel's center. So they run [0.5, 1.5, 2.5...]. (It seems DirectX, which the original paper targeted, doesn't do this.)
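To see why half a pixel matters, here is a minimal sketch in plain JavaScript that mimics the offset calculation from the shader further down. The function name `passOffset` is made up for illustration. It only reproduces the arithmetic, not anything the GPU does.

```javascript
// Hypothetical helper: the shader's mod(p.x, Ns/2.0) step, done on the CPU.
function passOffset(x, Ns) {
  return x % (Ns / 2);
}

// gl_FragCoord.x for the third pixel in a row is 2.5, not 2.
const fragX = 2.5;

// Using the raw coordinate leaves the index half a pixel off.
const rawOffset = passOffset(fragX, 4);               // 0.5

// Flooring first recovers the integer pixel column.
const fixedOffset = passOffset(Math.floor(fragX), 4); // 0

console.log(rawOffset, fixedOffset);
```

The floored version is what the `p.x = floor(p.x);` line in the shader achieves.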

Because I like to show the code, here's the WebGL fragment shader which performs a radix-2 Stockham FFT pass on a texture.

<script id="fft-rows" type="x-fragment-shader">
precision mediump int;
precision highp float;
precision highp sampler2D;

uniform sampler2D image; // source texture
uniform float N; // FFT Size (eg. 256)
uniform float Ns; // FFT Pass Size (eg. 128)

// Perform one pass of a multi-pass dual FFT whose size is declared in the uniforms.
// "Dual" means the texture's XY is treated as one complex number, and ZW independently as another.
vec4 dualfft_rows(vec2 p) {
  // gl_FragCoord is pixel-centered (0.5, 1.5, ...), so recover the integer column
  p.x = floor(p.x);
  // locate the two source samples for this output column
  float base = floor(p.x / Ns) * (Ns/2.0);
  float offset = mod(p.x, Ns/2.0);
  float iy  = p.y / N;
  float ix0 = (base + offset) / N;
  float ix1 = ix0 + 0.5; // second sample is half the row away
  vec4 v0 = texture2D(image, vec2(ix0, iy) );
  vec4 v1 = texture2D(image, vec2(ix1, iy) );
  // twiddle factor for this column, computed directly
  float angle = radians(-360.0*p.x/Ns);
  vec2 t = vec2( cos(angle), sin(angle) );
  // transform two complex number planes at once: v0 + t*v1 on XY and on ZW
  return v0 + vec4(t.x*v1.x - t.y*v1.y,
                   t.y*v1.x + t.x*v1.y,
                   t.x*v1.z - t.y*v1.w,
                   t.y*v1.z + t.x*v1.w);
}

void main() {
  gl_FragColor = dualfft_rows( gl_FragCoord.xy );
}
</script>

It computes the twiddle factors explicitly instead of looking them up in a texture as other variants do (modern cards probably perform trig faster than texture accesses), and performs two independent FFTs on the xy and zw pairs per pass. Floating Point textures are essential for this.
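For working it out by hand, a CPU version of the same pass can be handy. The following is a sketch in JavaScript that mirrors the shader's indexing for a single row of complex numbers. It handles only one complex plane rather than two. The names `stockhamPass` and `fftRow` are made up here, and running the passes with Ns = 2, 4, ... up to N is an assumption about how the shader would be driven, since the post doesn't show that loop.

```javascript
// One radix-2 Stockham pass over a row of complex numbers.
// `src` is an array of [re, im] pairs whose length N is a power of two;
// `Ns` is the pass size (assumed to step through 2, 4, ..., N).
function stockhamPass(src, Ns) {
  const N = src.length;
  const dst = new Array(N);
  for (let x = 0; x < N; x++) {
    // same source lookup as the shader, but with integer indices
    const base = Math.floor(x / Ns) * (Ns / 2);
    const offset = x % (Ns / 2);
    const i0 = base + offset;
    const i1 = i0 + N / 2;
    // twiddle factor for this output column
    const angle = (-2 * Math.PI * x) / Ns;
    const c = Math.cos(angle);
    const s = Math.sin(angle);
    const [re0, im0] = src[i0];
    const [re1, im1] = src[i1];
    // dst = v0 + t * v1 (complex multiply-add)
    dst[x] = [re0 + c * re1 - s * im1, im0 + s * re1 + c * im1];
  }
  return dst;
}

// Full transform of one row: log2(N) passes, each reading the previous output.
function fftRow(row) {
  let data = row;
  for (let Ns = 2; Ns <= row.length; Ns *= 2) {
    data = stockhamPass(data, Ns);
  }
  return data;
}

// An impulse at index 0 should transform to a flat spectrum.
console.log(fftRow([[1, 0], [0, 0], [0, 0], [0, 0]]));
```

Comparing the output of a function like this against a texture read back from the GPU is one way to spot an indexing error such as the half-pixel offset.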

Of course there's another 1500 lines of code wrapped around this little fragment shader, but a lot of that is 'test' scaffolding that can come down now. However there seems to be a good 700 lines of code that I've packaged into a 'GPU' class that acts as a friendly wrapper around WebGL, and lets me 'compute' with textures like I'm doing simple algebra.

The main point is this: two weeks ago, I had favorable initial impressions of WebGL. There were some pretty demos. It worked great as a toy. But I had some reservations about how far I could take it. Since then, I've implemented one of the most important and powerful scientific analysis tools there is - the Fourier transform. And it flies.

Today's challenge will be to combine the working FFT code with some other experimental code, and jump from images to transforming video. I've already got the code to access the webcam from inside the browser. HTML5 makes all this integration not only possible, but almost trivial.

Lastly, this is a huge load off my mind. I'd been proceeding on the assumption that this was possible (that I can actually write a working FFT in GLSL), so I'm gratified that has proved correct. I spent last weekend standing in a very, very cold paddock capturing astronomical imagery on that assumption. Now I can process that data.

Beyond that, I've now got new tools in my bag of tricks. And a new respect for browsers.

Sunday, May 12, 2013

WebGL Initial Impressions

You know how you put things off, and then when you finally make yourself do it you discover it's actually great fun, and you wonder why you ever waited? I had a week like that with WebGL and the FFT.

WebGL is another of the HTML5 technologies which is poised to make an enormous impact on how we use the internet, and how browser code can access our hardware. It is essentially a binding of the OpenGL ES 2.0 spec within Javascript.

Web pages that can do OpenGL. Think about that for a minute.

Then consider that WebGL isn't some wonderful future standard, it's already in most major browsers. (IE people have to wait until 11.) If you have Chrome or Mozilla and a 3D Card, you already have it. Really. Mobile devices also support it, depending on their hardware, and that's really important. (Most new phones have a 3D chip, if only for fast 2D compositing.)

So yes, Minecraft and Quake could be written in Javascript now, and run in the browser on Android. That's great. Lots of people are writing new games, as you would expect.

There's also a second reason why you want to access the GPU from Javascript - it's an enormous chunk of optimized processing power that we can use to do cool stuff. On many machines, it's actually more powerful than the core CPU.

Texture Shaders, in particular, are special programs that are downloaded to the GPU and run per-pixel, in parallel, on perhaps hundreds of 'texture units'. They can perform massive parallel computations that are fast on a lot of useful algorithms besides drawing animation, like Fourier transforms.

The first texture shader I wrote in WebGL computed Mandelbrot sets. In a 1024x1024 canvas element, I was generating 1024-iteration Mandelbrots at real-time framerates of over 10fps. At, say, 8 math operations per iteration, that calculation consumed 80 GFLOPS, easily. And I wasn't pushing it.

80GFLOPS. Eighty Billion Floating-point operations per second. The original Cray Supercomputers weren't that fast.
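The arithmetic behind that figure can be checked in a few lines. This is a back-of-envelope sketch, and the function name `estimateGflops` is made up. It assumes every pixel runs all 1024 iterations, which makes it an upper bound, since points that escape early do less work.

```javascript
// Rough throughput estimate: pixels * iterations * ops per iteration * frames per second.
function estimateGflops(width, height, iterations, opsPerIteration, fps) {
  const opsPerSecond = width * height * iterations * opsPerIteration * fps;
  return opsPerSecond / 1e9;
}

// Figures quoted in the post: 1024x1024 canvas, 1024 iterations, ~8 ops, 10fps.
const estimate = estimateGflops(1024, 1024, 1024, 8, 10);
console.log(estimate.toFixed(1)); // prints "85.9"
```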

And another thing... I've written OpenGL code before, and WebGL is just better. First, there's no mucking around with video capabilities or window handles or bit planes. You put a canvas element on an HTML page. Then you ask for its OpenGL context. The browser does all the rest. Compared to how it used to be done, getting a GL context is sheer ease.
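Asking a canvas for its context looks roughly like the sketch below. The helper name `getGLContext` is made up. The fallback to the `"experimental-webgl"` name is an assumption about older browsers that shipped WebGL under that prefix.

```javascript
// Ask a canvas element for a WebGL context, trying the prefixed name as a fallback.
function getGLContext(canvas) {
  const gl = canvas.getContext("webgl") || canvas.getContext("experimental-webgl");
  if (!gl) {
    throw new Error("WebGL is not available in this browser");
  }
  return gl;
}

// In a page this would be used along these lines (guarded so the sketch also loads outside a browser):
if (typeof document !== "undefined") {
  const canvas = document.createElement("canvas");
  const gl = getGLContext(canvas);
  gl.clearColor(0.0, 0.0, 0.0, 1.0);
  gl.clear(gl.COLOR_BUFFER_BIT);
}
```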

Then it all gets confusing again when you can't find how to render simple triangles, until you realize the big normalization they did in the OpenGL 'ES' specs is to take away all the 'simplified' methods, and leave only the 'useful' methods, whenever there was a duplication of functionality. You can't draw triangles one vertex at a time with glBegin and glEnd anymore. Only buffers of vertices handed to a draw call. (A buffer can hold a single triangle, or a strip of them, so you can still get the same outcome, just with the 'more useful' method.)

Oddly enough, this is also a joy to use, once you get your head around it. It makes writing the equivalent of 'hello world' a little harder, but once you've coded minimal vertex and fragment shaders just to get a single quad on the screen, you're already most of the way towards your end goal. Since you're always using the 'hard shader language' and not 'easy shader language', you don't have to stop and re-code once your program exceeds the capabilities of the simple primitives.
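For a sense of what that 'hello world' involves, here is a minimal sketch of a quad drawn as a four-vertex triangle strip. All the names (`QUAD_VERTEX_SHADER`, `QUAD_FRAGMENT_SHADER`, `QUAD_STRIP`, `drawQuad`) are made up for illustration, and the flat orange colour is arbitrary. The `gl` argument is assumed to be a WebGL context obtained from a canvas.

```javascript
// Vertex shader: pass 2D positions straight through to clip space.
const QUAD_VERTEX_SHADER = `
attribute vec2 position;
void main() {
  gl_Position = vec4(position, 0.0, 1.0);
}`;

// Fragment shader: fill every covered pixel with one colour.
const QUAD_FRAGMENT_SHADER = `
precision mediump float;
void main() {
  gl_FragColor = vec4(1.0, 0.5, 0.0, 1.0);
}`;

// Four corners of the viewport, ordered as a triangle strip (two triangles).
const QUAD_STRIP = new Float32Array([-1, -1, 1, -1, -1, 1, 1, 1]);

function drawQuad(gl) {
  const program = gl.createProgram();
  const stages = [
    [gl.VERTEX_SHADER, QUAD_VERTEX_SHADER],
    [gl.FRAGMENT_SHADER, QUAD_FRAGMENT_SHADER],
  ];
  for (const [type, source] of stages) {
    const shader = gl.createShader(type);
    gl.shaderSource(shader, source);
    gl.compileShader(shader);
    if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
      throw new Error(gl.getShaderInfoLog(shader));
    }
    gl.attachShader(program, shader);
  }
  gl.linkProgram(program);
  if (!gl.getProgramParameter(program, gl.LINK_STATUS)) {
    throw new Error(gl.getProgramInfoLog(program));
  }
  gl.useProgram(program);

  // Upload the vertices and point the "position" attribute at them.
  const buffer = gl.createBuffer();
  gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
  gl.bufferData(gl.ARRAY_BUFFER, QUAD_STRIP, gl.STATIC_DRAW);
  const location = gl.getAttribLocation(program, "position");
  gl.enableVertexAttribArray(location);
  gl.vertexAttribPointer(location, 2, gl.FLOAT, false, 0, 0);

  gl.drawArrays(gl.TRIANGLE_STRIP, 0, 4);
}
```

Swapping the fragment shader source for something more ambitious is then the only change needed to go from a coloured quad to per-pixel computation.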

All shaders - simple and complex - fit the same basic pattern and become more interchangeable, while shortening the full spec. In the normally expanding world of standards (which grow to the point they can't be implemented in a single human lifetime anymore), this is an amazing and refreshing achievement.

It does, however, come at the cost of a bigger initial learning curve. GPUs compute in a different way. (That's kind of the point.) And it's not necessarily obvious why they function the way they do... GPU cards are more a collection of "hacks that worked" than any coherent plan to create a platform. I know a lot about the history of how they evolved to this point, and I've done a lot of parallel programming on SIMD machines too, so most of the shader concepts are already familiar to me.

WebGL is getting a 'Dark Arts' reputation because of this, and not without good reason. We leave the comforting worlds of paragraphs and CSS for matrix math and rendering pipelines.

What I'm discovering is this: it's a Dark Art that's well worth learning. You can perform some very powerful juju.

Assuming your browser can take it, look at some of these: