Year in review: Spidermonkey in 2014 part 1

Spidermonkey, the JavaScript engine used in Mozilla Firefox, underwent big changes in 2014. Just like last year, I’ll go over the major changes that landed throughout the year, but this time I’ll split the post into a series of three or four posts. This allows me to explain the changes in a bit more depth and keeps each post lighter.

Looking back at 2014, most changes aimed to improve the garbage collection (GC) algorithm, improve our second-tier (Baseline) and third-tier (IonMonkey) compilers, and implement the EcmaScript 6 (ES6) specification. After months of preparation, 2014 was the year of our GC team: we finally reached a state where we could improve the garbage collection algorithm dramatically. We also kept watching the Octane 2.0 benchmark, while other benchmarks (BrowserMark, jsbench, dromaeo …) landed on our examination table more often. Throughout the year we again achieved a 40% improvement in our Octane score, and our Browsermark 2.1 score improved by 5%, showing the huge effort of everybody involved. Next to performance improvements, we also increased our efforts to implement the ES6 standard. The standard isn’t released yet, but it will be soon, and we want to support most of it before that happens.

In this first part I will look at the first three releases that were developed in 2014: Firefox 29 to Firefox 31.

Firefox 29 a.k.a Australis

ARM simulator

Our JavaScript engine is tested quite thoroughly on desktop machines (i686 and x86_64 chipsets), but less extensively on phones (ARM chipsets). Most developers don’t have spare systems running these chips, and those who do find it quite hard to set up a testing environment. To improve the situation, the ARM simulator of Chrome has been ported to Mozilla. This makes it possible to test, run or debug the biggest and most error-prone part of the JavaScript engine (JIT code) on a regular desktop by simulating the ARM instructions. As a result, phones should now get an even more stable JavaScript engine.

Recompile without invalidation

Our highest and fastest tier for running JavaScript code, IonMonkey, was not able to recompile (Ion) code for a JS function while the old Ion code was still present. One first had to throw the old Ion code away before starting a new IonMonkey compilation for that particular JS function. This was usually not an issue, since recompilation mostly happens when the old Ion code is no longer correct (e.g. when we observe a new type). For a few cases, however, it was not optimal. Since this release we can therefore recompile a JS function while keeping the old Ion code until the new compilation has finished. The benefits are twofold. First, recompiling a function so that it can be entered at a loop in addition to the function entry (on-stack replacement, OSR) is now less expensive. Second, this might bring us closer to “optimization levels”, where we recompile a JS function with more optimizations, which takes longer to compile, and in the meantime keep running the old Ion code.
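To make the loop-entry (OSR) case a bit more concrete, here is a small illustrative sketch. The function name sumTo is made up for this example, and whether and when an engine actually switches tiers depends on its internal heuristics. The function is called only once, so faster code can only help if the engine enters it at the loop while the function is already running.

```javascript
// Hypothetical example: the function is called a single time, but its
// loop body runs a million times. Entering optimized code at the function
// entry would never help here; entering it at the loop does.
function sumTo(n) {
  let total = 0;
  for (let i = 0; i < n; i++) {
    total += i;
  }
  return total;
}

const loopResult = sumTo(1000000);
console.log(loopResult);
```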

EcmaScript 6: Promises

In Firefox 29, ES6 Promises were also enabled by default, though this is the work of the DOM team rather than the Spidermonkey team. The Firefox frontend already uses this addition to the JavaScript language, proposed for Harmony (ECMAScript 6), extensively. It was now deemed that the Promises specification wouldn’t change much anymore, making it safe to release to the web.
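For readers who haven’t used Promises yet, a minimal sketch of how they chain asynchronous steps could look like this. The helper name delay and the values are made up for illustration; only Promise, then and catch come from the standard.

```javascript
// Hypothetical helper: returns a Promise that resolves with `value`
// after roughly `ms` milliseconds.
function delay(ms, value) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve(value);
    }, ms);
  });
}

// Each then() receives the result of the previous step and returns a
// new Promise; catch() handles an error from any earlier step.
const pending = delay(10, 21)
  .then(function (n) {
    return n * 2;
  })
  .then(function (n) {
    console.log("result:", n);
    return n;
  })
  .catch(function (err) {
    console.error("failed:", err);
  });
```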

Exact Stack Rooting

Most higher-level languages, including JavaScript, provide automatic memory management (garbage collection). This removes the need for developers to free resources themselves when they are no longer needed (i.e. when they are dead). Until now our garbage collection (GC) algorithm was approximate: it didn’t know exactly which objects were alive, so it had to be conservative. Exact stack rooting is all about changing that, making sure we know exactly which objects are live. This brings some overhead, since we must now keep track of every object using a container (rooting), but it is also the most important prerequisite before we can start experimenting with more advanced GC schemes like generational GC and compacting GC.

Firefox 30

EcmaScript 6: Generators

Basic support for ES6 generators has landed. This new feature adds generator functions and the yield keyword, and got many people excited. Normally a function runs to completion and returns its result. Generators are different: such a function stops at a yield and, when resumed, continues from that yield until the next yield or until the function terminates. In this release, support has landed in the Interpreter, our lowest tier.
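The stop-and-resume behaviour is easiest to see in a tiny example. This is only a sketch; the generator name countTo is made up for illustration.

```javascript
// Hypothetical generator: yields 1..limit, then returns "done".
function* countTo(limit) {
  for (let i = 1; i <= limit; i++) {
    yield i; // execution pauses here until next() is called again
  }
  return "done";
}

const gen = countTo(2);
const first = gen.next();  // { value: 1, done: false }
const second = gen.next(); // { value: 2, done: false }
const last = gen.next();   // { value: "done", done: true }
console.log(first, second, last);
```

Calling countTo(2) does not run any of the function body yet; each next() call runs it up to the following yield.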

Improve type information at branches

JavaScript is a weakly typed or even untyped language, so every instruction has to work with every type. To specialize an instruction and make it faster, instructions get annotated with the types observed so far (Type Inference). This mostly works fine, except at branches. Code often tests at a branch whether a variable is neither null nor undefined, or whether it has a specific type. That information used to get lost, and we kept using the full type information of the variable in the code following the test. This release adds infrastructure to reason about these branches in our highest optimization tier, IonMonkey, and to narrow the types accordingly.
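The kind of code that benefits looks roughly like the sketch below. The function name lengthOrZero is made up, and the comments describe the idea rather than what IonMonkey does for this exact snippet.

```javascript
// Hypothetical function: `value` has been observed as a string, null
// or undefined at this call site.
function lengthOrZero(value) {
  if (value !== null && value !== undefined) {
    // Inside this branch `value` can no longer be null or undefined.
    // A compiler that tracks the test only needs to handle the
    // remaining observed type (string) for the property access below.
    return value.length;
  }
  return 0;
}

console.log(lengthOrZero("abc"), lengthOrZero(null), lengthOrZero(undefined));
```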

Enable IonMonkey for chrome scripts by default.

IonMonkey was introduced in Firefox 18, more than a year ago, and it has been enabled by default for normal scripts for quite some time. But it wasn’t enabled for Firefox’s own UI and add-ons (chrome content); Firefox only used the Interpreter and Baseline in that case. With this release IonMonkey and Type Inference (TI) have sort of graduated, and both are now enabled by default for chrome scripts. This is a huge win for add-ons, since they can now get the full JavaScript performance potential of IonMonkey.

Firefox 31

EcmaScript 6: more math methods

ES6 proposes a lot of new Math methods such as Math.sign, Math.log2, Math.imul, Math.trunc, Math.cosh, Math.hypot … Most of these were already possible in JS with some extra code, but now there are standardized functions, making such operations easier to perform.
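A quick sketch of a few of these methods, next to the sort of hand-written helper they replace. The helper truncOld is made up for illustration.

```javascript
// A few of the new ES6 Math methods.
const mathResults = {
  sign: Math.sign(-3),     // -1
  imul: Math.imul(2, 4),   // 8 (32-bit integer multiplication)
  trunc: Math.trunc(-4.7), // -4 (drops the fractional part)
  hypot: Math.hypot(3, 4), // 5 (square root of 3*3 + 4*4)
  cosh: Math.cosh(0),      // 1
  log2: Math.log2(8),      // approximately 3
};

// Hypothetical pre-ES6 helper doing what Math.trunc now does.
function truncOld(x) {
  return x < 0 ? Math.ceil(x) : Math.floor(x);
}

console.log(mathResults, truncOld(-4.7));
```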

Faster ES6 arrow functions

This release also brought performance improvements to the arrow functions introduced by ES6. Arrow functions should now perform similarly to normal functions, which means an improvement of up to 64 times. This is important, since bad performance for new ES6 features would slow their uptake, and apparently our Firefox frontend developers already use this feature a lot.
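For reference, this is what the feature looks like compared to a normal function expression. It is only an illustrative sketch; the names numbers and Counter are made up.

```javascript
const numbers = [1, 2, 3];

// Normal function expression.
const doubledClassic = numbers.map(function (n) {
  return n * 2;
});

// Equivalent arrow function.
const doubledArrow = numbers.map(n => n * 2);

// Arrow functions also keep the `this` of the surrounding code, which
// makes them convenient for callbacks inside objects.
function Counter() {
  this.count = 0;
  this.increment = () => {
    this.count += 1;
    return this.count;
  };
}

const increment = new Counter().increment;
increment();
const countAfterTwoCalls = increment();
console.log(doubledClassic, doubledArrow, countAfterTwoCalls);
```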

Backtracking register allocator

Just like a regular compiler, IonMonkey needs to fit the variables in use into a limited set of registers. This is done by a register allocator. IonMonkey can run with different register allocators. By default it uses LSRA (linear scan register allocation), but it has also acquired a backtracking register allocator. This new allocator, based on the register allocator introduced in LLVM 3.0, should be faster and produce better register assignments. For octane-zlib this was an improvement of 21%. In this release the new allocator is enabled by default for asm.js code. Pending some bugs and performance issues on regular code, it will also get enabled by default for normal IonMonkey code in 2015.

Read more about the LLVM 3.0 register allocator.
Mozilla Firefox JS engine January 2012

There are again a lot of improvements in the JavaScript engine in Mozilla Firefox. Here I will list the speed improvements and regressions that happened in the month of January, this time with three benchmarks: sunspider (a popular benchmark released by WebKit), V8 (released by Google) and jslinux (a PC emulator running Linux). The improvements will eventually land in Firefox 12, together with the improvements from the last week of December.

Current JS engine (JaegerMonkey + Type Inference)

Benchmark    December 31st   January 26th   Change
jslinux      5306.0 ms       4596.0 ms      improvement of 15%
V8           7196.6          6953.4         regression in score of 3.5%
sunspider    165.87 ms       167.445 ms     regression of 0.9%

The biggest difference is the huge improvement on the jslinux benchmark. As Bhackett promised, this is due to the landing of bug 706914, which compiles scripts in smaller chunks and thereby decreases the compilation time.
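For anyone who wants to redo the arithmetic, the sketch below computes the relative change between two measurements. It assumes the change is expressed relative to the later measurement; that convention is not stated in the post, but it reproduces the jslinux and V8 percentages in the table. The function name relativeChange is made up.

```javascript
// Hypothetical helper: size of the change between two measurements,
// as a percentage of the later one (an assumed convention).
function relativeChange(before, after) {
  return (Math.abs(after - before) / after) * 100;
}

const jslinuxChange = relativeChange(5306.0, 4596.0); // about 15.4
const v8Change = relativeChange(7196.6, 6953.4);      // about 3.5
console.log(jslinuxChange.toFixed(1), v8Change.toFixed(1));
```

Note that for jslinux and sunspider a lower time is better, while for V8 a higher score is better, so the direction (improvement or regression) has to be read from the benchmark, not from the sign.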

Detailed overview
Note that this list only gives an indication of which revisions/bugs potentially increased or decreased performance. I may have listed bugs that don’t actually give the listed speed-up, and there may be revisions with a huge influence on speed that I failed to list here.

be81e5f7850f – Bug 714218 – Specialize some get* implementations to do property-type-specific handling, with their getGeneric forwarding to the appropriate specific implementation.
sunspider: regression of 0.8%

db8ea6327311 – Bug 607692 – Inline parseInt(, <0|10>) in JM.
sunspider: improvement of 0.5%

23f3b97f655a – Bug 710163: (part 2) fix EXT_context_loss semantics.
sunspider: regression of 0.4%

faf5f8842fec – Bug 703157 – Don’t modify dictionary shapes in place.
sunspider: improvement of 0.5%

78d17e22a223 – Bug 712714 – Remove JOF_CALLOP.
sunspider: regression of 0.6%

e12b877ae637 – Convert a couple always-true appends to infallibleAppend, since that’s what they should have been using
sunspider: improvement of 1%

d0c192e5bd41 – Bug 706914 – Compile large scripts in chunks.
jslinux: improvement of 17.4%
A huge improvement, which indeed fixes the regression introduced last month by compiling switch and try blocks.

96a9dffede07 – Bug 717494 – Pass scope chain explicitly to FindProperty
jslinux: improvement of 0.7%

a85cf7f0d235 – Bug 716512 – make sure that gcparam in shell cannot set MAX_GC_BYTES to a value les than the current GC_BYTES.
V8: improvement of 0.8%

7ab4f1ebc7cc – Backout 54cd89b0f1fa (Bug 712714 backout). Talos will probably report fake regressions for this patch, do not back out for this reason.
V8: regression of 2.5%
Note: As the commit message already states, this could be a fake regression, but I’m not sure. Every timing I do shows a regression …

Mozilla Firefox JS engine December 2011

In this post I want to go over the performance improvements and regressions of the JavaScript engine in Mozilla Firefox. I do this by building the cutting-edge JS shell of Mozilla (I use the mozilla-inbound branch). The improvements will not be visible in Firefox immediately: those that landed before December 23rd will go into Firefox 11, and the rest will go into Firefox 12.

Current JS engine (JaegerMonkey + Type Inference)

Currently the JavaScript engine in Firefox uses JaegerMonkey and Type Inference. While JaegerMonkey is relatively old, Type Inference is rather new: it only got enabled in the newest version (FF 9).

Benchmark    December 1st    December 31st   Change
jslinux      5298.0 ms       5476.5 ms       regression of 180 ms
V8           6811.7          7113.4          improvement in score of 300

One of the regressions I found was the following one:
eaac85c4c05f – Bug 704387: Generate SSA information for scripts containing switch and try blocks
regression of 130ms on jslinux
This increases the compilation time of scripts, with the possibility of creating faster code. In the case of jslinux the gained run time doesn’t outweigh the extra compilation time, so it runs slower. Bhackett informed me that the increased compilation time will be reduced with the help of bug 706914.