It's often said that the web needs a bytecode. For example, the very first comment in a very recent article on video codecs on the web says:
A proper standardized bytecode for browsers would (most likely) allow
developers a broader range of languages to choose from as well as hiding
the source code from the browser/viewer (if that's good or not is
subjective of course).
And other comments continue with:
Just to throw a random idea out there: LLVM bytecode. That
infrastructure already exists, and you get to use the ton of languages
that already have a frontend for it (and more in the future, I'm sure).
replacing it with a bytecode so we can use decent languages again.
Put a proper bytecode engine in the browser instead, and those people
the rest of us that use serious languages could use them too.
Honestly, .Net/Mono would probably be the best bet. It's mature, there
are tons of languages targeting it, and it runs pretty much everywhere
already as fast as native code
Ignoring the nonproductive JS-hating comments, the basic point is that people want to use various languages on the web, and they want those languages to run fast. Bytecode VMs have been very popular since Java in the 1990s, and they show that multiple languages can run in a single VM while maintaining good performance, so asking for a bytecode for the web seems to make sense at first glance.
But already in the quotes above we see the first problem: Some people want one bytecode, others want another, for various reasons. Some people just like the languages on one VM more than another. Some bytecode VMs are proprietary or patented or tightly controlled by a single corporation, and some people don't like some of those things. So we don't actually have a candidate for a single universal bytecode for the web. What we have is a hope for an ideal bytecode - and multiple potential candidates.
Perhaps, though, not all of the candidates are relevant? We need to pin down the criteria for determining what is a "web bytecode". The requirements as mentioned by those requesting it include:
- Support all the languages
- Run code at high speed
To those we can add two additional requirements that are not mentioned in the above quotations, but are often heard:
- Be a convenient compiler target
- Have a compact format for transfer
In addition we must add the requirements that anything that runs on the web must fulfill,
- Be standardized
- Be platform-independent
- Be secure
, as listed in the 7 requirements above. And of course this is not the first time that has been said, see here
Some of the motivation for a new bytecode appears to come from an elegance
(Note that I'm not saying we shouldn't try. We should. But at the same time we shouldn't stop trying to improve the current situation in a gradual way. My point is that the latter is more likely to succeed.)
"one bytecode to rule them all"
- Fast - runs all languages at their maximal speed
- Portable - runs on all CPUs and OSes
- Safe - sandboxable so it cannot be used to get control of users' machines
The elusive perfect universal bytecode would need to do all three, but it seems to me that we can only pick two.
, but that was done before the leaps in JS performance that came with CrankShaft, TypeInference, IonMonkey, DFG, etc.).
Yet another area where decisions must be made is garbage collection. Different languages have different patterns of usage, both determined by the language itself and the culture around the language. For example, the new garbage collector planned for LuaJIT 3.0, a complete redesign from scratch, is not going to be a copying GC, but in other VMs there are copying GCs. Another concern is finalization: Some languages allow hooking into object destruction, either before or after the object is GC'd, while others disallow such things entirely. A design decision on that matter has implications for performance. So it is doubtful that a single GC could be truly optimal for all languages, in the sense of being "perfect" and letting everything run at maximal speed.
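To make the finalization point concrete, here is a minimal Python sketch of "hooking into object destruction". The `Resource` class and `demo_finalizer` function are hypothetical names invented for this illustration; `weakref.finalize` and `gc.collect` are standard library calls. The timing shown relies on CPython's reference counting, and other Python implementations may run the callback later, which is exactly the kind of VM-specific behavior discussed above.

```python
import gc
import weakref


class Resource:
    """Hypothetical object that owns something needing cleanup."""

    def __init__(self, name, log):
        self.name = name
        # Register a callback that runs after this object becomes unreachable.
        # The callback must not reference `self`, or it would keep it alive.
        weakref.finalize(self, log.append, f"finalized {name}")


def demo_finalizer():
    log = []
    r = Resource("texture", log)
    del r          # drop the last reference
    gc.collect()   # ask the collector to run, for VMs without refcounting
    return log
```

A VM that promises such hooks has to track which objects carry finalizers and decide when to run them, a cost that a VM for a language without finalizers never pays.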
So any VM must make decisions and tradeoffs about fundamental features. There is no obvious optimal solution that is right for everything. If there were, all VMs would look the same, but they very much do not. Even relatively similar VMs like the JVM and CLR (which are similar for obvious historical reasons) have fundamental differences.
Perhaps a single VM could include all the possible basic types - both "normal" doubles and ints, and NaN-boxed doubles? Both Pascal-type strings and C-type strings? Both asynchronous and synchronous APIs for everything? Of course all these things are possible, but they make things much more complicated. If you really want to squeeze every last ounce of performance out of your VM, you should keep it simple - that's what LuaJIT does, and very well. Trying to support all the things will lead to compromises, which goes against the goal of a VM that "runs all languages at their maximal speed".
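For readers unfamiliar with NaN-boxing, the following Python sketch shows the basic idea: every value is one 64-bit word, real doubles are stored as-is, and other types hide in the unused payload bits of a NaN. The bit layout, the `TAG_INT` constant and all function names are assumptions made up for this illustration and do not match any particular engine.

```python
import struct

# Assumed layout for this sketch only (not any real engine's layout).
QNAN = 0x7FF8_0000_0000_0000      # all exponent bits set plus the quiet bit
TAG_INT = 0x0001_0000_0000_0000   # hypothetical tag: "payload is an int32"
PAYLOAD_MASK = 0xFFFF_FFFF


def box_double(x: float) -> int:
    """Store a real double as its raw 64-bit pattern."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def box_int32(n: int) -> int:
    """Hide a signed 32-bit integer in the payload of a quiet NaN."""
    if not -2**31 <= n < 2**31:
        raise ValueError("not an int32")
    return QNAN | TAG_INT | (n & PAYLOAD_MASK)


def is_int32(bits: int) -> bool:
    """True if the word carries the NaN pattern plus the int tag."""
    return (bits & (QNAN | TAG_INT)) == (QNAN | TAG_INT)


def unbox(bits: int):
    """Recover the Python value from a boxed 64-bit word."""
    if is_int32(bits):
        payload = bits & PAYLOAD_MASK
        # Undo the two's-complement truncation done in box_int32.
        return payload - 2**32 if payload >= 2**31 else payload
    return struct.unpack("<d", struct.pack("<Q", bits))[0]
```

Here `unbox(box_int32(-7))` gives back `-7` and `unbox(box_double(1.5))` gives back `1.5`. The tradeoff is visible even in this toy: every arithmetic operation must first check the tag, and NaNs produced by real arithmetic have to be kept distinguishable from tagged values. A VM with separately typed doubles and ints avoids those checks but needs type information from somewhere else.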
(Of course there is one way to support all the things at maximal speed: Use a native platform as your VM. x86 can run Java, LuaJIT and JS all at maximal speed almost by definition. It can even be sandboxed in various ways. But it has lost the third property of being platform-independent.)
and get the best of both worlds that way, instead of putting everything we need in one VM? That sounds like an interesting idea at first, but it has technical difficulties and downsides, is complex, and would likely regress existing performance.
Do we actually need "maximal speed"?
So I don't think there is much to gain, technically speaking, from considering a new bytecode for the web. The only clear advantage such an approach could give is perhaps a more elegant solution, if we started from scratch and designed a new solution with less baggage. That's an appealing idea, and in general elegance often leads to better results, but as argued earlier there would likely be no significant technical advantages to elegance in this particular case - so it would be elegance for elegance's sake.
I purposefully said we don't need a new
In summary, we already have what practically amounts to a bytecode VM in our browsers.