A proper standardized bytecode for browsers would (most likely) allow developers a broader range of languages to choose from as well as hiding the source code from the browser/viewer (if that's good or not is subjective of course).

And other comments continue with:
Just to throw a random idea out there: LLVM bytecode. That infrastructure already exists, and you get to use the ton of languages that already have a frontend for it (and more in the future, I'm sure).

Ignoring the nonproductive JS-hating comments, basically the point is that people want to use various languages on the web, and they want those languages to run fast. Bytecode VMs have been very popular since Java in the 90s, and they show that multiple languages can run in a single VM while maintaining good performance, so asking for a bytecode for the web seems to make sense at first glance.
Honestly, .Net/Mono would probably be the best bet. It's mature, there are tons of languages targeting it, and it runs pretty much everywhere already as fast as native code.
But already in the quotes above we see the first problem: Some people want one bytecode, others want another, for various reasons. Some people just like the languages on one VM more than another. Some bytecode VMs are proprietary or patented or tightly controlled by a single corporation, and some people don't like some of those things. So we don't actually have a candidate for a single universal bytecode for the web. What we have is a hope for an ideal bytecode - and multiple potential candidates.
Perhaps, though, not all of the candidates are relevant? We need to pin down the criteria for determining what is a "web bytecode". The requirements, as mentioned by those requesting it, include:
- Support all the languages
- Run code at high speed
- Be a convenient compiler target
- Have a compact format for transfer
- Be standardized
- Be platform-independent
- Be secure
(Note that I'm not saying we shouldn't try. We should. But we shouldn't stop trying at the same time to also improve the current situation in a gradual way. My point is that the latter is more likely to succeed.)
These requirements boil down to three essential properties:

- Fast - runs all languages at their maximal speed
- Portable - runs on all CPUs and OSes
- Safe - sandboxable so it cannot be used to get control of users' machines
Yet another area where decisions must be made is garbage collection. Different languages have different patterns of usage, determined both by the language itself and by the culture around it. For example, the new garbage collector planned for LuaJIT 3.0, a complete redesign from scratch, is not going to be a copying GC, whereas other VMs do use copying GCs. Another concern is finalization: some languages allow hooking into object destruction, either before or after the object is GC'd, while others disallow such things entirely. A design decision on that matter has implications for performance. So it is doubtful that a single GC could be truly optimal for all languages, in the sense of being "perfect" and letting everything run at maximal speed.
So any VM must make decisions and tradeoffs about fundamental features. There is no obvious optimal solution that is right for everything. If there were, all VMs would look the same, but they very much do not. Even relatively similar VMs like the JVM and CLR (which are similar for obvious historic reasons) have fundamental differences.
Perhaps a single VM could include all the possible basic types - both "normal" doubles and ints, and NaNboxed doubles? Both Pascal-type strings and C-type strings? Both asynchronous and synchronous APIs for everything? Of course all these things are possible, but they make things much more complicated. If you really want to squeeze every last ounce of performance out of your VM, you should keep it simple - that's what LuaJIT does, and very well. Trying to support all the things will lead to compromises, which goes against the goal of a VM that "runs all languages at their maximal speed".
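To make the NaN-boxing idea above concrete, here is a toy sketch - not any real VM's encoding, and the tag bit chosen is made up for illustration - of how a 32-bit payload can be hidden inside one of the many unused NaN bit patterns of a 64-bit double. It assumes a little-endian host (true of x86 and most ARM configurations):

```javascript
// A double has ~2^52 distinct NaN bit patterns, so a VM can stash a
// 32-bit payload (an int, a pointer) inside an otherwise-unused NaN
// and represent every value uniformly as a 64-bit double.
const f64 = new Float64Array(1);
const u32 = new Uint32Array(f64.buffer); // aliases the same 8 bytes

function boxInt(i) {
  u32[1] = 0x7ff80000 | 0x00010000; // quiet-NaN exponent bits + a made-up tag bit
  u32[0] = i >>> 0;                 // 32-bit payload in the low word
  return f64[0];                    // the result is a single double value
}

function unboxInt(d) {
  f64[0] = d;
  return u32[0] | 0; // recover the payload as a signed 32-bit int
}

const boxed = boxInt(42);
console.log(Number.isNaN(boxed)); // true - the boxed value still reads as a double (NaN)
console.log(unboxInt(boxed));     // 42   - but the payload is fully recoverable
```

This is exactly the kind of representation choice the paragraph above is about: a NaN-boxing VM gets compact uniform values, but a VM that instead uses separate "normal" ints and doubles makes a different tradeoff, and a single VM cannot be optimal for both styles at once.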
(Of course there is one way to support all the things at maximal speed: Use a native platform as your VM. x86 can run Java, LuaJIT and JS all at maximal speed almost by definition. It can even be sandboxed in various ways. But it has lost the third property of being platform-independent.)
So I don't think there is much to gain, technically speaking, from considering a new bytecode for the web. The only clear advantage such an approach could give is perhaps a more elegant solution, if we started from scratch and designed a new solution with less baggage. That's an appealing idea, and in general elegance often leads to better results, but as argued earlier there would likely be no significant technical advantages to elegance in this particular case - so it would be elegance for elegance's sake.
No mention of pNaCl at all?
The blogpost is long enough as it is ;) But what specifically do you think should be addressed regarding PNaCl?
The problem I have with JS and asm.js in particular is that the arguments for making it JS could be made for any of the alternatives. But JS has particular disadvantages that are being ignored just because.
- Given the enormous speed difference between normal JS and asm.js, and the kinds of things asm.js is being used for, I cannot imagine anyone wanting to run asm.js code in an unsupported browser: it would simply be too slow to be useful. It's like trying to play a modern game on a 5-year-old graphics card... possible in theory, unplayable in practice.
- Proposals for incorporating more types and e.g. SIMD instructions naturally cannot do so at a language level, and must instead use convoluted wrapper objects. Which means there are now real native types and faux-native types, and some types 'are more equal than others'.
I can only conclude that asm.js is a massive fallacy, of virtue-by-association, which picks the compromise that nobody really wanted, but only seems good because it is currently the least offensive.
I'm reminded of all the arguments of why XML was a great idea, and how none of those things ever materialized: XML was "human-friendly" but rarely written by hand, and the tools built upon it, such as XSLT, gained nothing from being written in XML, quite the opposite.
Instead, JSON took over, by virtue of including only the parts people actually wanted, and mapping cleanly to the types and structures of numerous languages rather than just the one thing we had before (SGML/HTML).
LLVM to me seems like the JSON of JITs. It would be a real intermediary format, not a hacked one; it's had many more years of research behind it than asm.js, and it's not reinventing the wheel just for the web's sake.
Unfortunately it seems asm.js-mania has already struck, and just like JSON, we'll probably have to wait 10 years before everyone finally admits they dove in head-first without really considering the alternatives seriously.
Don't dismiss the usefulness of asm.js being a subset of JS. For one thing, it means there is less work to specify and test asm.js than there would be for other bytecode formats. For another thing, asm.js code that runs well on a phone in an asm.js-supporting browser will probably also run well on a fast desktop in a browser that doesn't support asm.js. There are also use-cases involving porting of legacy code where raw performance is not a big issue.
Proposals for incorporating more types are mostly focused on BinaryData and other new features in ES6, so these are real language features, not "fake".
Furthermore, you haven't listed any real benefits for LLVM. "Not a hacked one" isn't a tangible benefit, and it's also not true, since LLVM bitcode was not designed to be portable and actually isn't. Which means anything LLVM-based, like PNaCl, has more work to do, apparently more work than asm.js based on comparing our asm.js efforts with Google's PNaCl effort.
> The problem I have with JS and asm.js in particular is that the arguments for making it JS could be made for any of the alternatives.
As I said in the article, yes, JS as a multilanguage VM is comparable to other multilanguage VMs (JVM, CLR). The main benefit it has is that it is already standardized and present in all web browsers. That is the one specific argument that cannot be made for the alternatives.
> Given the enormous speed difference between normal JS and asm.js, and the kinds of things asm.js is being used for, I cannot imagine anyone wanting to run asm.js code in an unsupported browser: it would simply be too slow to be useful
This is simply not true. Look at the benchmarks, where you can see v8 doing very well in many cases despite not having special asm.js optimizations. As another example, try running Epic Citadel in Chrome (it currently requires a special build, due to a memory bug and a network bug) - despite not having special asm.js optimizations, it runs quite well.
> LLVM to me seems like the JSON of JITs. It would be a real intermediary format, not a hacked one, it's had many more years of research behind it than asm.js, and it's not reinventing the wheel just for the web's sake.
LLVM is not portable. You can see asm.js as a portable variant of LLVM IR, in fact emscripten compiles LLVM into asm.js - so there is a clear equivalence between the two.
We can expect the speed gap between desktop and mobile to decrease, so this isn't really a relevant argument for the future of the web. Neither is it an attractive argument that asm.js is excellent for doing something today in a browser that native excelled at 10+ years ago. And whether asm.js's BinaryData is derived from ES6 or not doesn't change the fact that they are square pegs for round holes, which map poorly to other languages and are cumbersome to work with in ES6 itself.
The argument that asm.js is here today and PNaCl isn't strikes me as a classic open source "talk is silver, code is gold" argument and in line with what I've come to expect of Mozilla over the past 15 years. I'm not trying to troll, I've just seen this in open source communities over and over again: there is no room for truly big projects, so instead the only thing that gets done is that which can be done in incremental steps.
LLVM is a technology with proven potential. In fact, it's the LLVM-driven emscripten that makes asm.js viable in the first place, is it not? If that's not a strong sign that it's the LLVM technology and not the asm.js subset where the magic is, then I don't know what is. I admit I haven't worked with LLVM much directly, but it strikes me as exactly what asm.js pretends to be: a high-level assembly-like language.
Having LLVM infrastructure in the browser also has interesting implications for WebGL and GLSL. Indeed as far as I understand, Apple used LLVM as the JIT in CoreGraphics for on-demand hardware acceleration, which worked so well nobody really noticed. That's where the web is going if you really look forward, instead of trying to make demos from 1999 run well...
"so there is a clear equivalence between the two."
There is an obvious equivalence between any two Turing complete languages. That doesn't mean that that equivalence is elegant. Look at what the demoscene is doing today, rather than in 2006, and ask yourself if asm.js will get us closer to having that run in a browser any time soon...
> LLVM is a technology with proven potential. In fact, it's the LLVM-driven emscripten that makes asm.js viable in the first place, is it not? If that's not a strong sign that it's the LLVM technology and not the asm.js subset where the magic is, then I don't know what is.
Who said otherwise? Of course the "magic" is LLVM + the JS VM's backend optimizers (IonMonkey in Firefox, Crankshaft in Chrome, etc.).
Emscripten compiles LLVM into asm.js and optimizes it, and asm.js is just a subset that is easy to optimize. Most of the work is done by LLVM and the JS VMs.
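To show what "a subset that is easy to optimize" means, here is a minimal hand-written module in the asm.js style - a sketch, far smaller than real emscripten output. The `"use asm"` pragma plus the `|0` coercions declare static int types to a validating engine, while any ordinary JS engine simply runs it as plain JavaScript:

```javascript
function MiniModule(stdlib) {
  "use asm"; // opt-in pragma: a validating engine may compile this ahead of time
  function add(x, y) {
    x = x | 0;          // parameter type annotation: int
    y = y | 0;          // parameter type annotation: int
    return (x + y) | 0; // return type annotation: int
  }
  return { add: add };
}

// In a browser without asm.js support, this is just ordinary JavaScript:
const m = MiniModule(globalThis);
console.log(m.add(2, 3)); // 5
```

Note that the coercions are regular JS operators with their normal semantics, which is why the same source runs, unmodified, in both validating and non-validating engines.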
Should we directly put LLVM in the browser as opposed to first compiling it to something portable like asm.js? It would be elegant, but also nonportable. (See PNaCl for an effort to make it portable.)
> There is an obvious equivalence between any two Turing complete languages. That doesn't mean that that equivalence is elegant.
I agree, and made a point in the article to talk about how a solution-from-scratch could be more elegant. But the question is whether that elegance translates into benefits aside from aesthetics. I argued that it does not, in this very specific case.
I gave you two IMO important ones: types other than int32/double, and SIMD. For example, how well does asm.js auto-vectorize after being baked into JS form?
As I mentioned in the article, SIMD will be challenging to do in JS. There is no simple solution.
As for types other than int32 and double, the issue is with int64 and float32. This also has no simple solution; it will require new standardization work to fully optimize. I suspect SIMD is more important, though, based on the numbers I've seen so far (so I focused on that in the article and did not mention float32 and int64), but it would depend on the workload of course.
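The int64 limitation is easy to demonstrate: JS numbers are IEEE-754 doubles, which hold only 53 bits of integer precision, so 64-bit integer arithmetic cannot be expressed directly.

```javascript
// All JS numbers are doubles: integers are exact only up to 2^53.
const limit = Math.pow(2, 53);    // 9007199254740992
console.log(limit + 1 === limit); // true - the increment is silently lost
console.log(limit - 1 + 2);       // 9007199254740992, not ...993 (rounded)
```

This is why compiling 64-bit operations to JS currently requires emulating them with pairs of 32-bit values, at a real performance cost.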
"no room for truly big projects" - seriously? Have you seen what we're doing with Rust?
"LLVM is a technology with proven potential. In fact, it's the LLVM-driven emscripten that makes asm.js viable in the first place, is it not? If that's not a strong sign that it's the LLVM technology and not the asm.js subset where the magic is, then I don't know what is. I admit I haven't worked with LLVM much directly, but it strikes me as exactly what asm.js pretends to be: a high-level assembly-like language."
Speaking as someone who *does* work with LLVM on a daily basis, I think it would be unfortunate if it became part of the Web platform. There are very good arguments here:
* LLVM was never designed to be portable. It has lots of unportable stuff in it (TargetData optimizations, target-specific calling conventions such as struct passing by-value/by-pointer encoded in the IR). Trying to make it portable is an inelegant hack, just like asm.js.
* Formalization and specification of LLVM IR would be very difficult. It's not defined formally.
* Undefined behavior in LLVM IR is completely unspecified. We know what happens when undefined behavior creeps into the Web stack: content starts to rely on the quirks of a particular implementation, and nobody can upgrade without breaking content.
* The LLVM instruction set, bitcode format, semantics, intrinsics, etc. are not stable and change all the time. This is because it is a compiler IR.
"the tools built upon it, such as XSLT, gained nothing from being written in XML, quite the opposite"
I don't think you know how XSLT is used. It's quite common to use XSLT to generate XSLT (to generate XSLT) in software such as Apache Cocoon. This might sound convoluted, but it's really just code generation, which is a useful technique that can reduce the complexity of code -- it often allows programmers to maintain a separation of concerns by dynamically generating code that is tailored in some way. If XSLT were in a different language, code generation would have been significantly more complex.
Using a prefix format similar to Scheme, or a postfix one similar to Forth, would eliminate the parsing requirements and yield a compact format. Throw in a standard hash for the names of functions, variables, etc., and you could make it even more compact.
A binary representation could go a long way to being a "bytecode" without enforcing certain machine models.
The reason I mentioned pNaCl is that it aims to provide fast, portable, and safe execution for static languages (the same target domain as emscripten+asm.js). It seems odd to have written such a long post without mentioning a project that specifically aims to solve exactly the problem described.
Speaking of Rust gives me an idea: put short-term hacks in Firefox and long-term solutions in Servo...
True, I did mention the JVM and CLR but I could have also mentioned PNaCl, Flash, etc., since those also aim to run multiple static languages in a portable way.
Definitely PNaCl (and Flash) is interesting, and aims to solve a very similar problem. In terms of performance, I don't know where PNaCl currently stands (but I would be very curious to see numbers), and in terms of standardization, it has not even been specced as far as I know. So I am not sure what to comment about it, except that in general it is a very cool approach and I am impressed by the technical achievements of that team.
I suggest you watch David's I/O presentation on Thursday at 5:20. It should be live streamed.
How is pNaCl any different to the CLR or JVM? It appears to be, from a cynical point of view, a Google version of those two in an attempt for Google control.
Is it just a power play? I doubt any of the other players could adopt it, the same way Google couldn't really adopt Flash or Silverlight (except in limited circumstances).
these questions strike me as essential:
1. what do developers WANT to do?
2. what CAN they do?
(1) being a strategic issue, (2) being a tactical issue. mozilla seems to be projecting (2) into (1), i believe unwisely. consider the rate of churn on the web. have any of you maintained a site whose code remained stable over a four year period? in my experience, either the requirements and/or market changes, or simple bitrot sets in. so why the strategic bet on tools used to build mid-term code bases? well maybe es6, but it has taken far too long to arrive. c++, the world's most complex language, will likely complete two major standards revs before es6 arrives. meanwhile new tools like go, rust and even dart seem to be meeting the true demand of (1) - they are allowing developers to do what they WANT, which is build better things with better tools. no one seems to be griping about leaving perfectly adequate tools behind, they want a better future.
i suppose my point being that fast js is still js, which is still a weak tool, and developers seem keen on having something better, and that continuity on the web is a non-issue, the web rebuilds itself every four or five years anyway.
what about this candidate from 2006 that Mozilla never bothered to look into?
Actually I really hope Firefox loses its market share (looks like it's ongoing: http://gs.statcounter.com/) and loses its power.
And I will never use asm.js. Simply because it's too slow on browsers without asm.js support. Epic Citadel at 20 fps on the latest Core i7-3770K is a joke. Slower than Flash Player!
Thank you for this excellent and very informative post! Your arguments made a very convincing case for asm.js (or some similar solution) as the browser bytecode we're looking for.
How can this be considered remotely good enough? I'd be annoyed if apps were draining my battery 25% faster, never mind 200%.
No one considers that good enough, that's why JS engine devs are working hard to push things further.
You can track progress here:
Interesting read... thanks!
We already have the JVM that can take any language you throw at it.