Tracelogger GUI updates

Tracelogger is one of the tools JIT devs (especially me) use to look into performance issues and to improve the JS engine of Firefox performance-wise. It traces which functions are executing, together with extra information such as which engine is running, how long compilation took, how many times we are GC’ing and whether we are calling VM functions …

I made the GUI a bit more powerful. First of all, I moved the computation of the overview to a web worker. This should help the usability of the tool. Next to that, I made it possible to toggle the categories on and off. That might make it easier to understand the graphs. I also introduced a settings popup, in which you can now choose to see absolute (CPU ticks) or relative (%) timings.

Screenshot from 2016-05-03 09:36:41

There are still a lot of improvements possible. Eventually it should be possible to zoom in on graphs, toggle scripts on/off, see full times of scripts (instead of self time only) and maybe show another kind of graph (like a flame chart). Hopefully one day.

This is of course open source and available at:
https://github.com/h4writer/tracelogger/tree/master/website

Year in review: Spidermonkey in 2014 part 3

In the first two parts I listed the major changes in the JavaScript engine of Mozilla Firefox in 2014, from Firefox 29 up to Firefox 34. In part 3 I will go over the major changes in the JavaScript engine in the last two releases that were developed in 2014.

If you haven’t read the first parts yet, I would encourage you to do that first.
<- Year in review: Spidermonkey in 2014 part 1
<- Year in review: Spidermonkey in 2014 part 2

Firefox 35

JIT RegExp.prototype.exec and RegExp.prototype.test

When Firefox 32 was released, the regular expression engine was replaced with Irregexp. Just like its predecessor, the new engine has a small JIT that compiles regular expressions to native code. And just like with its predecessor, the easiest way to embed the regular expression engine is through C code. Consequently a normal execution of a regular expression looked as follows: the JS code calls into C code, which prepares the regular expression JIT, after which the regular expression engine gets called. In this release we eliminated the middle step (the C code) and now jump directly from the JS JIT to the regular expression JIT, removing the overhead the C code added when calling RegExp.prototype.exec or RegExp.prototype.test.
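For reference, these are the two calls whose path got shorter. This is just a minimal sketch of how they are used from script; the pattern and the input string are made up for illustration.

```javascript
// Hypothetical pattern and input, chosen only for illustration.
const versionPattern = /Firefox (\d+)/;
const sampleText = "Irregexp landed in Firefox 32";

// RegExp.prototype.test returns a boolean.
const hasVersion = versionPattern.test(sampleText);

// RegExp.prototype.exec returns a match array, or null when nothing matches.
const versionMatch = versionPattern.exec(sampleText);
const versionNumber = versionMatch === null ? null : Number(versionMatch[1]);

console.log(hasVersion, versionNumber);
```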

Read more about this in the bug report

GVN + UCE combined

Just like most compilers, IonMonkey has an optimization called Global Value Numbering (GVN). It tries to remove or replace redundant instructions. In our implementation it is also the place where most replacements based on inputs happen, like constant folding, identity removal … After this pass we run Unreachable Code Elimination (UCE), which eliminates branches that are never taken. Optimizations taking place during GVN can improve the efficiency of UCE: more folded instructions can increase how much code is found to be dead. On the other hand, removed code can in turn make it possible for GVN to optimize some extra instructions. Before Firefox 35 we ran each pass only once, so we sometimes didn’t find the most optimal code. With this release GVN and UCE are combined, making it possible to get the same optimizations as running GVN and UCE multiple times after each other, but in only one pass.
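To make the interplay a bit more concrete, here is a small sketch of the kind of function where the two optimizations feed each other. The function and its names are made up, and whether IonMonkey treats this exact code the way the comments describe is an assumption; the comments only illustrate the idea.

```javascript
// Hypothetical example function, not taken from any real code base.
function scale(x) {
  const debug = false;   // a constant input
  let factor = 2;
  if (debug) {           // GVN-style folding turns the condition into "false" ...
    factor = x * 3;      // ... so UCE can remove this never-taken branch ...
  }
  // ... and once the branch is gone, factor is always 2, which gives
  // GVN another chance to simplify the multiplication below.
  return x * factor;
}

console.log(scale(5));
```

Running the passes only once would stop after the branch removal; the last simplification needs another round, which is what the combined pass gives us for free.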

Learn more about this in an explanatory video

Lazy linking when recompiling code

The compilation of Ion code happens in three phases. First we have the graph creation phase (happens in IonBuilder), afterwards we do all sorts of optimizations on that graph, and we end with a linking phase, which finishes the compilation. In this sequence only the optimizing part can be done off the main thread. The other two phases block execution of JavaScript code. Lazy linking is about improving the last phase. Until now linking happened eagerly: as soon as a graph was ready we tried to link it, even if we would never execute that code (again) or if it got invalidated due to better types. With lazy linking we wait until we want to execute that code before linking.

Read more about this in the bug report.

Compile non-CNG functions

Compile and Go (CNG) functions give extra performance since the caller cannot modify objects on the scope chain between compilation and execution [1]. With this guarantee compilers can better optimize access to these objects. Until now IonMonkey could only compile such functions; non-CNG functions were stuck in the baseline compiler. In Firefox 34 and 35 these limitations were mostly removed. Since our most important class of non-CNG functions lives in addons and chrome content, this will again give a nice boost to their performance.

Read more about this in the bug reports: bug 1064777, bug 1045529 and bug 911570

Firefox 36

Baseline compile generators

As mentioned in part 1, Firefox 30 saw the introduction of ES6 generators. In that release only support for the interpreter (our first tier) was added. This was because the initial implementation tried to support this feature while touching as little code as possible, but this method also ruled out support in the higher tiers. Six releases later we are proud to also support generators in our second tier, the baseline compiler, and that even before ES6 is released.

Wingo started this huge task and has written a blog post about it.
Read the blogpost about support of compiling generators in Baseline

ES6 Symbols

For the first time in a very long time a new primitive type was added to the engine. This all has to do with the upcoming ES6 specification of Symbols. Next to null, undefined, boolean, number and string, symbol is now present on that list. A Symbol is a unique and immutable primitive value. Without going too deep: it can enable hiding of properties or fix name clashes between properties. It can also help avoid breaking existing codebases when new property names are introduced to the language. It is not immediately something people need to use, but it can open new and maybe better ways to do some things.
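A minimal sketch of what that looks like in practice. The object and property names are made up for illustration.

```javascript
// Two symbols with the same description are still different values.
const hiddenId = Symbol("id");
const otherId = Symbol("id");

// Hypothetical object, only here to show a symbol-keyed property.
const user = { name: "example" };
user[hiddenId] = 42;

console.log(typeof hiddenId);      // the new primitive type
console.log(hiddenId === otherId); // uniqueness: no name clash possible
console.log(Object.keys(user));    // the symbol-keyed property is not listed
console.log(user[hiddenId]);       // but it is reachable with the symbol itself
```

Code that doesn’t hold the symbol won’t stumble over the property by accident, which is what makes this useful for hiding properties and avoiding clashes.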

Read more about this in the developer reference
Stackoverflow: why bring symbols to JS

Selfhosting String.prototype.substr, String.prototype.substring and String.prototype.slice

In Firefox 20 the selfhosting infrastructure landed. Since that release we can implement JavaScript features in JavaScript itself, instead of writing them in C. At runtime such a selfhosted function just gets executed as if somebody had written it in JavaScript. The major improvement here is that we remove the overhead of calling from JavaScript out to C and back. This gave improvements for e.g. “array.map(function() { /* … */})”, since the C step between calling the function and the function given as argument was fully eliminated. In this release substr, substring and slice are now also selfhosted. For these functions the speedup comes mostly from the edge case checks (start is positive and length is smaller than the string length) now being done in JavaScript, so IonMonkey can reason about them and potentially remove them!
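To give an idea of what “edge case checks in JavaScript” means, here is a rough sketch of a substring written in plain JavaScript. This is not the actual selfhosted code from SpiderMonkey; the function name and the way the checks are written are my own assumptions, loosely following the behaviour of String.prototype.substring.

```javascript
// Hypothetical stand-in for a selfhosted substring; NOT the real SpiderMonkey code.
function sketchSubstring(str, start, end) {
  const len = str.length;

  // Convert the arguments to integers; NaN becomes 0, a missing end means "until the end".
  const intStart = Math.trunc(Number(start)) || 0;
  const intEnd = end === undefined ? len : (Math.trunc(Number(end)) || 0);

  // Edge case checks: clamp both positions into [0, len].
  const finalStart = Math.min(Math.max(intStart, 0), len);
  const finalEnd = Math.min(Math.max(intEnd, 0), len);

  // substring swaps the positions when start is bigger than end.
  const from = Math.min(finalStart, finalEnd);
  const to = Math.max(finalStart, finalEnd);

  let result = "";
  for (let i = from; i < to; i++) {
    result += str[i];
  }
  return result;
}

console.log(sketchSubstring("SpiderMonkey", 6, 12));
```

Because the clamping is ordinary JavaScript, a JIT that knows the arguments are, say, small non-negative integers has a chance to prove some of these checks redundant.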

Read more about this in the bug report

Year in review: Spidermonkey in 2014 part 1

Spidermonkey, the JavaScript engine used in Mozilla Firefox, has undergone big changes in 2014. Just like I did last year I’ll go over the major changes that landed throughout the year. Though this time I’ll split the post into a series of three or four posts. This will allow me to explain the changes a bit more in depth and also keep the posts lighter.

Looking back at 2014, most changes were made to improve the garbage collection (GC) algorithm, improve our second (Baseline) and third tier (IonMonkey) compilers and implement the EcmaScript 6 (ES6) specification. After months of preparation, 2014 was the year for our GC team: we finally came to a state where we improved the garbage collection algorithm dramatically. We also kept looking at the Octane 2.0 benchmark, while other benchmarks (BrowserMark, jsbench, dromaeo …) started to land on our examination table more often. Throughout the year we again achieved a 40% improvement of our Octane score, and Browsermark 2.1 improved by 5%, showing the huge effort of everybody involved. Next to performance improvements we also increased our efforts to implement the ES6 standard. While the standard isn’t released yet, it will be released soon and we want to support most things before that happens.

In this first part I will look at the first three releases that were developed in 2014: Firefox 29 to Firefox 31.

Firefox 29 a.k.a Australis

ARM simulator

Our JavaScript engine is tested quite thoroughly on desktop machines (i686 and x86_64 chipset), but less extensively on phones (ARM chipset). Most developers don’t have spare systems running these chips, and if they do, it is quite hard to set up a testing environment. To improve the situation the ARM simulator of Chrome has been ported to Mozilla. This makes it possible to test, run or debug the biggest and most error prone part of the JavaScript engine (JIT code) on a regular desktop by simulating the ARM instructions. As a result phones should now experience an even more stable JavaScript engine.

Read more about this in the bug report.

Recompile without invalidation

Our highest and fastest tier to run JavaScript code, IonMonkey, was not able to recompile (Ion) code for a JS function while the old Ion code was still present. One had to throw the old Ion code away first, before starting a new IonMonkey compilation for that particular JS function. This was mostly not an issue, since recompilation mostly happens when the old Ion code is not correct anymore (e.g. when we observed a new type). For a few cases this was not optimal. Therefore, since this release we can recompile a JS function while keeping the old Ion code until the compilation is finished. The benefits are twofold. First, recompiling a function with an entry at a loop next to the function entry (OSR) is now less expensive. Secondly, this might bring us closer to “Optimization levels”, where we want to recompile a JS function with more optimizations, taking longer to compile, and in the meantime run the current old Ion code.

Read more about this in the bug report.

EcmaScript 6: Promises

In Firefox 29 ES6 Promises were also enabled by default, though this is not really the work of the Spidermonkey team, but of the DOM team. The Firefox frontend is already using this improvement to the JavaScript language, proposed for Harmony (ECMAScript 6), extensively. Now it was deemed that the specification of Promises wouldn’t change much anymore, making it safe to release on the web.

Read more about Promises.

Exact Stack Rooting

Most higher level languages, just like JavaScript, provide automatic memory management (garbage collection). This removes the need for developers to free resources themselves when they are not needed anymore (i.e. when they are dead). Our garbage collection (GC) algorithm was approximate: it didn’t know exactly which objects were alive, so it had to be conservative. Exact stack rooting was all about changing that to make sure we now know exactly which objects are live. This brings some overhead, since we now must keep track of every object using a container (rooting), but it is also the most important thing needed before we can start experimenting with more advanced GC schemes like Generational GC and Compacting GC.

Read more about this in the bug report.

Firefox 30

EcmaScript 6: Generators

Basic support of ES6 generators has landed. This new feature adds support for generator functions and the yield statement, and got many people excited. Normally a function runs to completion and returns the result. This is different for generators. Such a function will stop at a yield instruction and upon next execution resume from that yield until the next yield or the termination of that function. In this release the support for the Interpreter, our lowest tier, has landed.
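A minimal sketch of that stop-and-resume behaviour. The generator name and the values are made up for illustration.

```javascript
// Hypothetical generator: yields 1..limit, then returns "done".
function* countTo(limit) {
  for (let i = 1; i <= limit; i++) {
    yield i; // execution stops here until next() is called again
  }
  return "done";
}

const counter = countTo(2);
const firstStep = counter.next();  // runs until the first yield
const secondStep = counter.next(); // resumes after the first yield, stops at the second
const lastStep = counter.next();   // resumes, leaves the loop and returns

console.log(firstStep, secondStep, lastStep);
```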

Read more about Generators.

Improve type information at branches

JavaScript is a weakly typed or even untyped language, so every instruction has to work with every type. In order to specialize an instruction and make it faster, instructions get annotated with observed types (Type Inference). This works mostly fine, except at branches. It happens a lot at branches that a variable gets tested for not being null or undefined, or for having a specific type. This information used to get lost and we kept using the full type information of that variable in the succeeding code. With this release an infrastructure has been added to reason about these branches in our highest optimization tier, IonMonkey, and narrow the types accordingly.
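The pattern this is about looks roughly like the sketch below. The function is made up, and the comments describe the idea rather than what IonMonkey is guaranteed to do for this exact code.

```javascript
// Hypothetical function that is called with strings, null and undefined.
function lengthOrZero(value) {
  // Observed types of value at this point: string, null, undefined.
  if (value === null || value === undefined) {
    return 0;
  }
  // After the test only "string" is left. Without branch information the
  // compiler would still have to treat value as string | null | undefined here.
  return value.length;
}

console.log(lengthOrZero("abc"), lengthOrZero(null), lengthOrZero(undefined));
```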

Read more about this in the bug report.

Enable IonMonkey for chrome scripts by default

IonMonkey was introduced in Firefox 18, more than a year ago, and it has been enabled by default for quite some time for normal scripts. But it wasn’t enabled for Firefox’s own UI and addons (chrome content). Firefox only used the interpreter and baseline in that case. With this release IonMonkey and TI sort of graduated, and both are now enabled by default for chrome scripts. A huge win for addons, since they now can get the full JavaScript performance potential of IonMonkey.

Read more about this in the bug report.

Firefox 31

EcmaScript 6: more math methods

ES6 proposes a lot of new Math methods like Math.sign, Math.log2, Math.imul, Math.trunc, Math.cosh, Math.hypot … Most of these were already possible in JS with some extra code, but now there are standardized functions, making it easier to perform such operations.
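A small sketch comparing a few of the new methods with the kind of “extra code” one had to write before. The helper names (truncOld, signOld, hypotOld) are made up for this example.

```javascript
// Hand-written equivalents, as one might have written them before ES6.
const truncOld = x => (x < 0 ? Math.ceil(x) : Math.floor(x));
const signOld = x => (x > 0 ? 1 : x < 0 ? -1 : x);
const hypotOld = (a, b) => Math.sqrt(a * a + b * b);

// The standardized ES6 versions.
const truncated = Math.trunc(-4.7);         // drops the fractional part
const sign = Math.sign(-3);                 // -1, 0 or 1 depending on the sign
const hypotenuse = Math.hypot(3, 4);        // length of the hypotenuse
const product = Math.imul(0xffffffff, 5);   // 32-bit integer multiplication
const exponent = Math.log2(8);              // base 2 logarithm

console.log(truncated, sign, hypotenuse, product, exponent);
```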

Faster ES6 arrow functions

This release also brought performance improvements to the arrow functions introduced by ES6. Using arrow functions should now have performance similar to normal functions, which means an improvement of up to 64 times. This is important, since bad performance for new ES6 features will also decrease their uptake, and apparently our Firefox frontend developers already use this new feature a lot.
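For readers who haven’t used them yet, here is a minimal sketch of an arrow function next to a normal function. The data and object names are made up for illustration.

```javascript
// Hypothetical data, only for illustration.
const releases = [29, 30, 31];

// A normal function expression and the equivalent arrow function.
const classicLabels = releases.map(function (version) { return "Firefox " + version; });
const arrowLabels = releases.map(version => "Firefox " + version);

// Arrow functions take `this` from the surrounding code, which is a big
// reason why they are popular for callbacks.
const tally = {
  count: 0,
  addAll(values) {
    values.forEach(v => { this.count += v; });
    return this.count;
  }
};

console.log(classicLabels, arrowLabels, tally.addAll([1, 2, 3]));
```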

Read more about this.

Backtracking register allocator

Just like a regular compiler, IonMonkey needs to fit the used variables into a limited set of registers. This is done using a register allocator. IonMonkey has the ability to run with different register allocators. It uses LSRA by default, but also acquired a backtracking register allocator. This new allocator, based on the new register allocator used in LLVM 3.0, should be faster and produce better register assignments. For octane-zlib this was an improvement of 21%. In this release the new allocator is enabled by default for asm.js code. Pending some bugs and performance issues on regular code, it will also get enabled by default for normal IonMonkey code in 2015.

Read more about the LLVM 3.0 register allocator.
Read more about this in the bug report.

Update: The next post in this series has been published:
Read part 2 of ‘Year in review: spidermonkey in 2014’ ->