Tuesday, July 24, 2012

BananaBread (or any compiled codebase) Startup Experience

This post talks about startup experience improvements in BananaBread. BananaBread is a specific 3D game engine compiled from C++ to JavaScript, but the principles are general and could be applied to any large compiled codebase.

Starting up "nicely"

Starting up as quickly as possible is always best on every platform. This is a general issue so I won't focus on it, because on the web, there is also another important criterion, which is to start up in as asynchronous a way as possible. By asynchronous I mean to not run in a single large event on the main thread: Instead it is better for as much as possible to be done on background threads (web workers) and for what does run on the main thread to at least be broken up into small pieces.

Why is being asynchronous important? A single long-running event makes the page unresponsive - no input events are being handled and no output is being shown. This might seem not that important for startup, when there is little interaction anyhow. But even during startup you want to at least show a progress bar to give the user an indication that things are moving along, and also most browsers will eventually warn the user about nonresponsive web pages, showing a scary "slow script" dialog with an option to cancel the script or close the page.

Asynchronize all the things..?

If you're writing a new codebase, you would indeed make everything asynchronous. All pure startup calculations would be done in background threads, and main thread events would be very short. Here is an example of such a recently-launched product: The worst main thread stall during startup seems to be about half a second, not bad at all, and a friendly progress bar updates you on the current status. When you are writing a new codebase it is straightforward to design in a way that makes nice startup like that achievable.

When you're compiling an existing codebase, things are harder, though. A normal desktop application does not need to be written in an asynchronous manner, while it might have a main loop that can easily be made asynchronous (run each main loop iteration separately), startup can be just a single continuous process of execution with some periodic notifications to update a progress meter. That is exactly the situation with Cube 2, the game engine compiled to JavaScript in the BananaBread project.

Now, there is a way to have long-running JavaScript code in browsers: Run it in a web worker. That would be the perfect solution, however, workers do not have access to WebGL or audio, and there is no way for them to send synchronous messages to the main thread, so even proxying those APIs to them is not practical. So unless you can easily box off "pure calculation" parts into workers, you do need to run most or all of your code on the main thread.

But you can still asynchronize even such a codebase: Here is what startup was like until recently: BananaBread r13, and and here is what it looks like now: BananaBread r15. The worst main thread stall is 1.4 seconds on my laptop, which is not great but definitely enough to prevent "slow script" warnings on most machines, and there is now a progress bar.

Means of asynchronization

The first important thing is to find small chunks of computation that are easily done ahead of time and their results cached for later:
  • In BananaBread jpg and png images must be decoded into pixel data. Emscripten does that during the preloading phase, each one is decoded by a separate Image element. This not only breaks things up into small pieces, it also uses the browser's native decoders, so it happens faster than if we had compiled a decoding library with the rest of the game engine. (A clever browser might also do these decodings in parallel..)
  • Crunch files need to be decompressed using a compiled JavaScript decoder. We do that during preloading as well, with the decoder running in a web worker.
  • Cube 2 levels (or maps as they are called) are gzip compressed, and the engine decompresses them during startup. I refactored that and BananaBread now decompresses them using zee.js during preloading, also in a worker.
Taking these three points together, BananaBread can use three cores during the preloading phase. This is actually an improvement on the original game engine which is single-threaded!

After preloading, the compiled engine starts to run and we are necessarily single-threaded. The important thing to do at this stage is to at least break up the startup code into small-enough pieces to avoid freezing the main thread. This requires refactoring the original source code, and is not the most fun thing in the world, but definitely possible. Emscripten provides an API to help with this (emscripten_push_main_loop_blocker etc.), you can define a queue of functions to be called in sequence, each in a separate invocation. So the tricky part is just to deal with the codebase you are porting.

Over a few days I broke up the biggest functions called during startup, getting from a maximum of 6 seconds to 1.4 seconds. Browsers seem to complain after around 10 seconds, so 1.4 isn't perfect, but on machines 7x slower than my laptop things should still be ok. Further breaking up will be hard as it starts to get into intricate parts of the game engine - it's possible, but it would take serious time and effort.

Other notes

Of course, there are other big factors with startup:
  • Download time: My personal server that hosts the BananaBread links above is not that fast, and doesn't even do gzip compression. We hope to get a more serious hosting solution before BananaBread hits 1.0.
  • GPU factors: BananaBread compiles a lot of shaders and uploads a lot of textures during startup. On the plus side, the time these take is probably not much different than a native build of the engine, but it's noticeable there too.
  • Data: Smaller levels lead to faster startup and vice versa. Our levels aren't done yet, we'll optimize them some more.
  • Subjective factors: You can't render during long-running JavaScript calculations, but music will keep playing, which makes for a less-boring experience for the user. Also, a nice splash screen during startup is a good idea, we should do that... ;)

5 comments:

  1. Running BananaBread (and similar games) in Workers would be a big win. Other than WebGL or audio, what other synchronous APIs do you need in Web Workers to make them viable?

    ReplyDelete
  2. GL, 2D canvas, and Audio are the big ones I think. Aside from them, IndexedDB and small things like performance.now (I haven't checked if they are already available in workers or not, some of them might be).

    ReplyDelete
  3. IndexedDB would have to be converted to WebIDL. But that all sounds quite doable.

    ReplyDelete
  4. Does cloudpartytime also use Emscripten? The website is so cool~~~~

    ReplyDelete
  5. No, I was giving cloud party as an example of a modern WebGL site that is *not* compiled code (so it was presumably easier for them to make startup as nice as it is there).

    ReplyDelete