Starting up "nicely"
Starting up as quickly as possible is best on every platform. That is a general issue, so I won't focus on it here; on the web there is another important criterion, which is to start up as asynchronously as possible. By asynchronous I mean not running in a single large event on the main thread: it is better for as much as possible to be done on background threads (web workers), and for whatever does run on the main thread to at least be broken up into small pieces.
Why is being asynchronous important? A single long-running event makes the page unresponsive - no input events are being handled and no output is being shown. This might seem not that important for startup, when there is little interaction anyhow. But even during startup you want to at least show a progress bar to give the user an indication that things are moving along, and also most browsers will eventually warn the user about nonresponsive web pages, showing a scary "slow script" dialog with an option to cancel the script or close the page.
Asynchronize all the things..?
If you're writing a new codebase, you would indeed make everything asynchronous. All pure startup calculations would be done in background threads, and main thread events would be very short. Here is an example of such a recently-launched product: The worst main thread stall during startup seems to be about half a second, not bad at all, and a friendly progress bar updates you on the current status. When you are writing a new codebase it is straightforward to design in a way that makes nice startup like that achievable.
But you can still asynchronize even an existing codebase that you are porting: Here is what startup was like until recently: BananaBread r13, and here is what it looks like now: BananaBread r15. The worst main thread stall is 1.4 seconds on my laptop, which is not great but definitely short enough to prevent "slow script" warnings on most machines, and there is now a progress bar.
Means of asynchronization
The first important thing is to find small chunks of computation that can easily be done ahead of time, with their results cached for later:
- In BananaBread, jpg and png images must be decoded into pixel data. Emscripten does that during the preloading phase, with each image decoded by a separate Image element. This not only breaks things up into small pieces, it also uses the browser's native decoders, so it happens faster than if we had compiled a decoding library with the rest of the game engine. (A clever browser might also do these decodings in parallel..)
- Cube 2 levels (or maps as they are called) are gzip compressed, and the engine decompresses them during startup. I refactored that and BananaBread now decompresses them using zee.js during preloading, also in a worker.
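The pattern behind both items is "prepare ahead of time, cache the result, look it up synchronously later". Here is a minimal TypeScript sketch of that idea. It is not Emscripten's preloading code: `PreloadCache`, `Prepare`, `add`, `ready` and `get` are hypothetical names made up for illustration, and the actual decoding (an Image element, a worker running a decompressor) is assumed to be supplied by the caller as a callback-style function.

```typescript
// Hypothetical sketch: none of these names come from Emscripten or BananaBread.

// A "prepare" step turns raw bytes into something the engine can use directly
// (pixel data, a decompressed map). It reports its result through `done`,
// which may be called later, for example from an Image onload handler or a
// worker's message handler.
type Prepare<T> = (raw: Uint8Array, done: (result: T) => void) => void;

class PreloadCache<T> {
  private results = new Map<string, T>();
  private started = 0;
  private finished = 0;

  // `onProgress` is called each time one asset finishes, so a progress bar
  // can be updated between the small pieces of work.
  constructor(
    private onProgress: (finished: number, started: number) => void = () => {},
  ) {}

  // Kick off preparation of one asset. Nothing here waits for the result.
  add(name: string, raw: Uint8Array, prepare: Prepare<T>): void {
    this.started++;
    prepare(raw, (result) => {
      this.results.set(name, result);
      this.finished++;
      this.onProgress(this.finished, this.started);
    });
  }

  // True once every asset that was added has reported its result.
  ready(): boolean {
    return this.finished === this.started;
  }

  // Synchronous lookup, meant to be used only after preloading is complete.
  get(name: string): T {
    const result = this.results.get(name);
    if (result === undefined) {
      throw new Error(`asset was not preloaded: ${name}`);
    }
    return result;
  }
}
```

With something like this, the engine would only be started once `ready()` returns true, and could then call `get` wherever it used to decode or decompress on the spot.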
After preloading, the compiled engine starts to run and we are necessarily single-threaded. The important thing to do at this stage is to at least break up the startup code into small-enough pieces to avoid freezing the main thread. This requires refactoring the original source code, and is not the most fun thing in the world, but definitely possible. Emscripten provides an API to help with this (emscripten_push_main_loop_blocker etc.): you can define a queue of functions to be called in sequence, each in a separate invocation. So the tricky part is just to deal with the codebase you are porting.
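To make the queue idea concrete, here is a minimal TypeScript sketch of the general pattern: a list of startup functions, each run in its own main thread event so the browser can handle input and repaint in between. This is not Emscripten's implementation or its API. `StartupQueue`, `push`, `start` and the injectable `schedule` parameter are hypothetical names, and using `setTimeout(..., 0)` as the default way to yield is an assumption of the sketch.

```typescript
// Hypothetical sketch of a startup queue; not Emscripten's actual code.
type Task = () => void;
type Schedule = (callback: () => void) => void;

class StartupQueue {
  private tasks: Task[] = [];
  private total = 0;

  constructor(
    // Called after each task with (tasks completed, tasks pushed so far).
    private onProgress: (done: number, total: number) => void = () => {},
    // How to yield to the browser between tasks. Assumed default: a
    // zero-delay timer, so each task runs in a separate event.
    private schedule: Schedule = (cb) => { setTimeout(cb, 0); },
  ) {}

  // Add one piece of startup work. Each piece should be short.
  push(task: Task): void {
    this.tasks.push(task);
    this.total++;
  }

  // Run the queued tasks in order, one per scheduled callback.
  // `onDone` is called, also from a scheduled callback, once the queue is empty.
  start(onDone: () => void = () => {}): void {
    const step = (): void => {
      const task = this.tasks.shift();
      if (task === undefined) {
        onDone();
        return;
      }
      task();
      this.onProgress(this.total - this.tasks.length, this.total);
      this.schedule(step);
    };
    this.schedule(step);
  }
}
```

The worst stall is then the duration of the longest single task, which is why the refactoring effort goes into splitting the biggest startup functions. The scheduler is a parameter only so that the queue can be driven by hand, for example in a test.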
Over a few days I broke up the biggest functions called during startup, getting the worst stall down from 6 seconds to 1.4 seconds. Browsers seem to complain after around 10 seconds, so 1.4 isn't perfect, but machines up to about 7x slower than my laptop (10 / 1.4 ≈ 7) should still be ok. Further breaking up will be hard as it starts to get into intricate parts of the game engine - it's possible, but it would take serious time and effort.
Other notes
Of course, there are other big factors with startup:
- Download time: My personal server that hosts the BananaBread links above is not that fast, and doesn't even do gzip compression. We hope to get a more serious hosting solution before BananaBread hits 1.0.
- GPU factors: BananaBread compiles a lot of shaders and uploads a lot of textures during startup. On the plus side, the time these take is probably not much different than a native build of the engine, but it's noticeable there too.
- Data: Smaller levels lead to faster startup and vice versa. Our levels aren't done yet, we'll optimize them some more.