• Making `this` a real binding in SpiderMonkey

    Last week I landed bug 1132183, a pretty large patch rewriting the implementation of this in SpiderMonkey.

    How this Works in JS

    In JS, when a function is called, an implicit this argument is passed to it. In strict mode, this inside the function just returns that value:

    function f() { "use strict"; return this; }
    f.call(123); // 123

    In non-strict functions, this always returns an object. If the this-argument is a primitive value, it's boxed (converted to an object); if it's undefined or null, the global object is used instead:

    function f() { return this; }
    f.call(123); // returns an object: new Number(123)

    Arrow functions don't have their own this. They inherit the this value from their enclosing scope:

    function f() {
        "use strict";
        () => this; // `this` is 123
    }
    f.call(123);

    And, of course, this can be used inside eval:

    function f() {
        "use strict";
        eval("this"); // 123
    }
    f.call(123);

    Finally, this can also be used in top-level code. In that case it's usually the global object (lots of hand waving here).

    How this Was Implemented

    Until last week, here's how this worked in SpiderMonkey:

    • Every stack frame had a this-argument,
    • Each this expression in JS code resulted in a single bytecode op (JSOP_THIS),
    • This bytecode op boxed the frame's this-argument if needed and then returned the result.

    Special case: to support the lexical this behavior of arrow functions, we emitted JSOP_THIS when we defined (cloned) the arrow function and then copied the result to a slot on the function. Inside the arrow function, JSOP_THIS would then load the value from that slot.

    There was some more complexity around eval: eval-frames also had their own this-slot, so whenever we did a direct eval we'd ensure the outer frame had a boxed (if needed) this-value and then we'd copy it to the eval frame.

    The Problem

    The most serious problem was that this design is fundamentally incompatible with ES6 derived class constructors, because they initialize their 'this' value dynamically when they call super(). Nested arrow functions (and eval) then have to 'see' the initialized this value, but that was impossible to support: arrow functions and eval frames used their own (copied) this value instead of the updated one.

    Here's a worst-case example:

    class Derived extends Base {
        constructor() {
            var arrow = () => this;

            // Runtime error: `this` is not initialized inside `arrow`.
            arrow();

            // Call Base constructor, initialize our `this` value.
            super();

            // The arrow function now returns the initialized `this`.
            arrow();
        }
    }

    We currently (temporarily!) throw an exception when arrow functions or eval are used in derived class constructors in Firefox Nightly.
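    The spec behavior this rewrite enables can be observed in any modern engine: `this` inside a derived constructor (and any arrow closing over it) is inaccessible until super() has run. A small sketch of that semantics (Base/Derived and the flag parameter are made-up names for illustration):

```javascript
class Base {
  constructor() { this.initialized = true; }
}

class Derived extends Base {
  constructor(callArrowEarly) {
    // The arrow closes over the constructor's (not yet initialized) `this`.
    var arrow = () => this;
    if (callArrowEarly)
      arrow(); // ReferenceError: `this` is not initialized yet
    super();   // runs Base, which initializes our `this` value
    this.viaArrow = arrow(); // now returns the initialized `this`
  }
}
```

    After super(), the arrow sees the very same, now-initialized binding, which is exactly what a copied this-slot could not express.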

    Boxing this lazily also added extra complexity and overhead. I already mentioned how we had to compute this whenever we used eval.

    The Solution

    To fix these issues, I made this a real binding:

    • Non-arrow functions that use this or eval define a special .this variable,
    • In the function prologue, we get the this-argument, box it if needed (with a new op, JSOP_FUNCTIONTHIS) and store it in .this,
    • Then we simply use that variable each time this is used.
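    The boxing that JSOP_FUNCTIONTHIS performs matches the usual sloppy-mode coercion of the this-argument; a rough model in plain JS (boxThisIfNeeded is a made-up name):

```javascript
function boxThisIfNeeded(thisArg) {
  if (thisArg === undefined || thisArg === null)
    return globalThis;            // sloppy mode: substitute the global object
  if (typeof thisArg === "object" || typeof thisArg === "function")
    return thisArg;               // already an object, nothing to do
  return Object(thisArg);         // box primitives: 123 -> new Number(123)
}
```

    In the engine this runs once, in the function prologue, and the result is stored in the .this variable.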

    Arrow functions and eval frames no longer have their own this-slot, they just reference the .this variable of the outer function. For instance, consider the function below:

    function f() {
        return () => this.foo();
    }

    We generate bytecode similar to the following pseudo-JS:

    function f() {
        var .this = BoxThisIfNeeded(this);
        return () => (.this).foo();
    }

    I decided to call this variable .this, because it nicely matches the other magic 'dot-variable' we already had, .generator. Note that these are not valid variable names so JS code can't access them. I only had to make sure with-statements don't intercept the .this lookup when this is used inside a with-statement...

    Doing it this way has a number of benefits: we only have to check for primitive this values at the start of the function, instead of each time this is accessed (although in most cases our optimizing JIT could/can eliminate these checks, when it knows the this-argument must be an object). Furthermore, we no longer have to do anything special for arrow functions or eval; they simply access a 'variable' in the enclosing scope and the engine already knows how to do that.

    In the global scope (and in eval or arrow functions in the global scope), we don't use a binding for this (I tried this initially but it turned out to be pretty complicated). There we emit JSOP_GLOBALTHIS for each this-expression, then that op gets the this value from a reserved slot on the lexical scope. This global this value never changes, so the JITs can get it from the global lexical scope at compile time and bake it in as a constant :) (Well.. in most cases. The embedding can run scripts with a non-syntactic scope chain, in that case we have to do a scope walk to find the nearest lexical scope. This should be uncommon and can be optimized/cached if needed.)

    The Debugger

    The main nuisance was fixing the debugger: because we only give (non-arrow) functions that use this or eval their own this-binding, what do we do when the debugger wants to know the this-value of a frame without a this-binding?

    Fortunately, the debugger (DebugScopeProxy, actually) already knew how to solve a similar problem that came up with arguments (functions that don't use arguments don't get an arguments-object, but the debugger can request one anyway), so I was able to cargo-cult and do something similar for this.

    Other Changes

    Some other changes I made in this area:

    • In bug 1125423 I got rid of the innerObject/outerObject/thisValue Class hooks (also known as the holy grail). Some scope objects had a (potentially effectful) thisValue hook to override their this behavior, which made it hard to see what was going on. Getting rid of that made it much easier to understand and rewrite the code.
    • I posted patches in bug 1227263 to remove the this slot from generator objects, eval frames and global frames.
    • IonMonkey was unable to compile top-level scripts that used this. As I mentioned above, compiling the new JSOP_GLOBALTHIS op is pretty simple in most cases; I wrote a small patch to fix this (bug 922406).


    We changed the implementation of this in Firefox 45. The difference is (hopefully!) not observable, so these changes should not break anything or affect code directly. They do, however, pave the way for more performance work and fully compliant ES6 Classes! :)

  • Bye Wordpress, hello Jekyll!

    This week I migrated this blog from Wordpress to Jekyll, a popular static site generator. This post explains why and how I did this, maybe it will be useful to someone.


    Wordpress powers thousands of websites, is regularly updated, and has a ton of features. So why did I abandon it?

    I still think Wordpress is great for a lot of websites and blogs, but I felt it was overkill for my simple website. It had so many features I never used and this came at a price: it was hard to understand how everything worked, it was hard to make changes and it required regular security updates.

    This is what I like most about Jekyll compared to Wordpress:

    • Maintenance, security: I don't blog often, yet I still had to update Wordpress every few weeks or months. Even though the process is pretty straightforward, it got cumbersome after a while.
    • Setup: Setting up a local Wordpress instance with the same content and configuration was annoying. I never bothered, so the little development I did was directly on the webserver. This didn't feel very good or safe. Now I just have to install Jekyll, clone my repository and generate plain HTML files. No database to set up. No webserver to install (Jekyll comes with a little webserver, see below).
    • Transparency: With Wordpress, the blog posts were stored somewhere in a MySQL database. With Jekyll, I have Markdown files in a Git repository. This makes it trivial to backup, view diffs, etc.
    • Customizability: After I started using Jekyll, customizing this blog (see below) was very straightforward. It took me just a few hours. With Wordpress I'm sure it'd have taken longer and I'd have introduced a few security bugs in the process.
    • Performance: The website is just some static HTML files, so it's fast. Also, when writing a blog post, I like to preview it after writing a paragraph or so. With Wordpress it was always a bit tedious to wait for the website to save the blog post and reload the page. With Jekyll, I save the markdown file in my text editor and, in the background, jekyll serve immediately updates the site, so I can just refresh the page in the browser. Everything runs locally.
    • Hosting: In the future I may move this blog to GitHub Pages or another free/cheaper host.

    Why Jekyll?

    I went with Jekyll because it's widely used, so there's a lot of documentation and it'll likely still be around in a year or two. Octopress is also popular but under the hood it's just Jekyll with some plugins and changes, and it seems to be updated less frequently.


    Customization

    I decided to use the default template and customize it where needed. I made the following changes:

    • Links to previous/next post at the end of each post, see post.html
    • Pagination on the homepage, based on the docs. I also changed the home page to include the contents instead of just the post title.
    • Archive page, a list of posts grouped by year, see archive.html
    • Category pages. I wrote a small plugin to generate a page + feed for each category. This is based on the example in the plugin documentation. See _plugins/category-generator.rb and _layouts/category.html
    • List of categories in the header of each post (with a link to the category page), see post.html
    • Disqus comments and number of comments in the header of each post, based on the docs, see post.html. I was able to export the Wordpress comments to Disqus.
    • In _config.yml I changed the post URL format ("permalink" option) to not include the category names. This way links to my posts still work.
    • Some minor tweaks here and there.

    I still want to change the code highlighting style, but that can wait for now.


    After using Jekyll for a few hours, I'm a big fan. It's simple, it's fun, it's powerful. If you're tired of Wordpress and Blogger, or just want to experiment with something else, I highly recommend giving it a try.

  • Using Rust to generate Mercurial short-hash collisions

    At Mozilla, we use Mercurial for the main Firefox repository. Mercurial, like Git, uses SHA1 hashes to identify a commit.

    Short hashes

    SHA1 hashes are fairly long, a string of 40 hex characters (160 bits), so Mercurial and Git allow using a prefix of that, as long as the prefix is unambiguous. Mercurial also typically shows only the first 12 characters (let’s call them short hashes), for instance:

    $ hg id
    312707328997 tip
    $ hg log -r tip
    changeset:   242221:312707328997
    tag:         tip

    And those are the hashes most Mercurial users use, for instance they are posted in Bugzilla whenever we land a patch etc.

    Collisions with short hashes are much more likely than full SHA1 collisions, because the short hashes are only 48 bits long. As the Mercurial FAQ states, such collisions don’t really matter, because Mercurial will check if the hash is unambiguous and if it’s not it will require more than 12 characters.

    So, short hash collisions are not the end of the world, but they are inconvenient because the standard 12-character hg commit ids become ambiguous and unusable. Fortunately, the mozilla-central repository at this point does not contain any short hash collisions (it has about 242,000 commits).

    Finding short-hash collisions

    I’ve wondered for a while, can we create a commit that has the same short hash as another commit in the repository?

    A brute force attack that works by committing and then reverting changes to the repository should work, but it’d be super slow. I haven’t tried it, but it’d probably take years to find a collision. Fortunately, there’s a much faster way to brute force this. Mercurial computes the commit id/hash like this:

    hash = sha1(min(p1, p2) + max(p1, p2) + contents)

    Here p1 and p2 are the hashes of the parent commits, or a null hash (all zeroes) if there’s only one parent. To see what contents is, we can use the hg debugdata command:

    $ hg debugdata -c 34828fed1639
    Carsten "Tomcat" Book <cbook@mozilla.com>
    1430739274 -7200
    ...list of changed files...
    merge mozilla-inbound to mozilla-central a=merge

    Perfect! This contains the commit message, so all we have to do is append some random data to the commit message, compute the (short) hash, check if there’s a collision and repeat until we find a match.

    I wrote a small Rust program to brute-force this. You can use it like this (I used the popular mq extension, there are other ways to do it):

    $ cd mozilla-central
    $ echo "Foo" >> CLOBBER # make a random change
    $ hg qnew patch -m "Some message"
    $ hgcollision
    Got 242223 prefixes
    Generated random prefix: 1631965792_
    Tried 242483200 hashes
    Found collision! Prefix: b991f0726738, hash: b991f072673876a64c7a36f920b2ad2885a84fac
    Add this to the end of your commit message: 1631965792_24262171

    After about 2 minutes it’s done and tells us we have to append “1631965792_24262171” to our commit message to get a collision! Let’s try it (we have to be careful to preserve the original date/time, or we’ll get a different hash):

    $ hg log -r tip --template "{date|isodatesec}"
    2015-05-05 20:21:59 +0200
    $ hg qref -m "Some message1631965792_24262171" -d "2015-05-05 20:21:59 +0200"
    $ hg id
    b991f0726738 patch/qbase/qtip/tip
    $ hg log -r b991f0726738
    abort: 00changelog.i@b991f0726738: ambiguous identifier!

    Voilà! We successfully created a Mercurial short hash collision!

    And no, I didn’t use this on any patches I pushed to mozilla-central...


    The Rust source code is available here. It was my first, quick-and-dirty Rust program but writing it was a nice way to get more familiar with the language. I used the rust-crypto crate to calculate SHA1 hashes, installing and using it was much easier than I expected. Pretty nice experience.

    The program can check about 100 million hashes in one minute on my laptop. It usually takes about 1-5 minutes to find a collision, this also depends on the size of the repository (mozilla-central has about 242,000 commits). It’d be easy to use multiple threads (you can also just use X processes though) and there are probably a lot of other ways to improve it. For this experiment it was good and fast enough to get the job done :)

  • Fast arrow functions in Firefox 31

    Last week I spent some time optimizing ES6 arrow functions. Arrow functions allow you to write function expressions like this:

    a.map(s => s.length);

    Instead of the much more verbose:

    a.map(function(s){ return s.length });

    Arrow functions are not just syntactic sugar though, they also bind their this-value lexically. This means that, unlike normal functions, arrow functions use the same this-value as the script in which they are defined. See the documentation for more info.
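    A quick illustration of the difference (obj and the method names are made up): the arrow sees the enclosing method's this-value, a normal callback does not:

```javascript
const obj = {
  name: "obj",
  // Arrow function: inherits `this` from arrowName, so `this` is `obj`.
  arrowName() {
    return [0].map(() => this && this.name)[0];
  },
  // Normal function: gets its own `this` (undefined in strict mode, the
  // global object in sloppy mode), so it does NOT see `obj`.
  functionName() {
    return [0].map(function () { return this && this.name; })[0];
  }
};
```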

    Firefox has had support for arrow functions since Firefox 22, but they used to be slower than normal functions for two reasons:

    1. Bound functions: SpiderMonkey used to do the equivalent of |arrow.bind(this)| whenever it evaluated an arrow expression. This made arrow functions slower than normal functions because calls to bound functions are currently not optimized or inlined in the JITs. It also used more memory because we’d allocate two function objects instead of one for arrow expressions.
      In bug 989204 I changed this so that we treat arrow functions exactly like normal function expressions, except that we also store the lexical this-value in an extended function slot. Then, whenever this is used inside the arrow function, we get it from the function’s extended slot. This means that arrow functions behave a lot more like normal functions now. For instance, the JITs will optimize calls to them and they can be inlined.
    2. Ion compilation: IonMonkey could not compile scripts containing arrow functions. I fixed this in bug 988993.
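    The extended-slot scheme from point 1 can be modeled in plain JS (LEXICAL_THIS and makeArrow are made-up names; the real slot lives on the function object inside the engine):

```javascript
// An "arrow" is an ordinary function plus one extra slot holding the
// lexical this-value, captured once when the arrow is created.
const LEXICAL_THIS = Symbol("lexical this");

function makeArrow(enclosingThis, body) {
  const arrow = function (...args) {
    // Every `this` inside the arrow reads the stored slot; the actual
    // receiver of the call is ignored.
    return body.call(arrow[LEXICAL_THIS], ...args);
  };
  arrow[LEXICAL_THIS] = enclosingThis;
  return arrow;
}
```

    Because the arrow itself is now an ordinary function, the usual call optimizations and inlining apply to it.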

    With these changes, arrow functions are about as fast as normal functions. I verified this with the following micro-benchmark:

    function test(arr) {
        var t = new Date;
        arr.reduce((prev, cur) => prev + cur);
        alert(new Date - t);
    }
    var arr = [];
    for (var i=0; i<10000000; i++)
        arr.push(i);
    test(arr);

    I compared a nightly build from April 1st to today’s nightly and got the following results:

    We’re 64x faster because Ion is now able to inline the arrow function directly without going through relatively slow bound function code on every call.

    Other browsers don’t support arrow functions yet, so they are not used a lot on the web, but it’s important to offer good performance for new features if we want people to start using them. Also, Firefox frontend developers love arrow functions (grepping for “=>” in browser/ shows hundreds of them) so these changes should also help the browser itself :)

  • Using segfaults to interrupt JIT code

    (I just installed my own blog so I decided to try it out by writing a bit about interrupt checks :) )

    Most browsers allow the user to interrupt JS code that runs too long, for instance because it’s stuck in an infinite loop. This is especially important for Firefox as it uses a single process for chrome and content (though that’s about to change), so without this dialog a website could hang the browser forever and the user is forced to kill the browser and could lose work. Firefox will show the slow script dialog when a script runs for more than 10 seconds (power users can customize this).


    Firefox uses a separate (watchdog) thread to interrupt script execution. It triggers the “operation callback” (by calling JS_TriggerOperationCallback) every second. Whenever this happens, SpiderMonkey promises to call the operation callback as soon as possible. The browser’s operation callback then checks the execution limit, shows the dialog if necessary and returns true to continue execution or false to stop the script.

    How this works internally is that JS_TriggerOperationCallback sets a flag on the JSRuntime, and the main thread is responsible for checking this flag every now and then and invoking the operation callback if it's set. We check this flag for instance on JS function calls and loop headers. We have to do this both for scripts running in the interpreter and in the JITs, of course. For example, consider this function:

    function f() {
        for (var i=0; i<100000000; i++) {
        }
    }

    Until Firefox 26, IonMonkey would emit the following code for this loop:

    Note that the loop itself is only 4 instructions, but we need 2 more instructions for the interrupt check. These 2 instructions can measurably slow down tight loops like this one. Can we do better?
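    What those 2 extra instructions do can be modeled in plain JS (the names are made up; the real check is just a load and a conditional branch in the jitted loop):

```javascript
let interruptRequested = false;        // the flag on the JSRuntime

function triggerOperationCallback() {  // what the watchdog thread does
  interruptRequested = true;
}

// Returns true if the loop finished, false if the operation callback
// asked us to stop the script.
function runLoop(iterations, operationCallback, body) {
  for (let i = 0; i < iterations; i++) {
    if (interruptRequested) {          // the per-iteration overhead
      interruptRequested = false;
      if (!operationCallback()) return false;
    }
    body(i);
  }
  return true;
}
```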


    OdinMonkey is our ahead-of-time (AOT) compiler for asm.js. When developing Odin, we (well, mostly Luke) tried to shave off as much overhead as possible, for instance we want to get rid of bounds checks and interrupt checks if possible to close the gap with native code. The result is that Odin does not emit loop interrupt checks at all! Instead, it makes clever use of signal handlers.

    When the watchdog thread wants to interrupt Odin execution on the main thread, it uses mprotect (Unix) or VirtualProtect (Windows) to clear the executable bit of the asm.js code that’s currently executing. This means any asm.js code running on the main thread will immediately segfault. However, before the kernel terminates the process, it gives us one last chance to interfere: because we installed our own signal handler, we can trap the segfault and, if the address is inside asm.js code, we can make the signal handler return to a little trampoline that calls the operation callback. Then we can either jump back to the faulting pc or stop execution by returning from asm.js code. (Note that handling segfaults is serious business: if the faulting address is not inside asm.js code, we have an unrelated, “real” crash and we must be careful not to interfere in any way, so that we don’t sweep real crashes under the rug.)

    This works really well and is pretty cool: asm.js code has no runtime interrupt checks, just like native code, but we can still interrupt it and show our slow script dialog.


    A while later, Brian Hackett wanted to see if we could make IonMonkey (our optimizing JIT) as fast as OdinMonkey on asm.js code. This means he also had to eliminate interrupt checks for normal JS code running in Ion (we don’t bother doing this for our Baseline JIT as we’ll spend most time in Ion code anyway).

    The first thought is to do exactly what Odin does: mprotect all Ion-code, trigger a segfault and return to some trampoline where we handle the interrupt. It’s not that simple though, because Ion-code can modify the GC heap. For instance, when we store a boxed Value, we emit two machine instructions on 32-bit platforms, to store the type tag and the payload. If we use signal handlers the same way Odin does, it’s possible we store the type tag but are interrupted before we can store the payload. Everything will be fine until the GC traces the heap and crashes horribly. Even worse, an attacker could use this to access arbitrary memory.

    There’s another problem: when we call into C++ from Ion code, the register allocator tracks GC pointers stored in registers or on the stack, so that the garbage collector can mark them. If we call the operation callback at arbitrary points though, we don’t have this information. This is a problem because the operation callback is also used to trigger garbage collections, so it has to know where all GC pointers are.

    What Brian implemented instead is the following:

    1. The watchdog thread will mprotect all Ion code (we had to use a separate allocator for Ion code so that we can do this efficiently).
    2. The main thread will segfault and call our signal handler.
    3. The signal handler unprotects all Ion-code again and patches all loop backedges (jump instructions) to jump to a slow, out-of-line path instead.
    4. We return from the signal handler and continue execution until we reach the next (patched) loop backedge and call the operation callback, show the slow script dialog, etc.
    5. All loop backedges are patched again to jump to the loop header.

    Note that this only applies to the loop interrupt check: there’s another interrupt check when we enter a script, but for JIT code we combine it with the stack overflow check: when we trigger the operation callback, the watchdog thread also sets the stack limit to a very high value so that the stack check always fails and we also end up in the VM where we can handle the operation callback and reset the stack limit :)


    Firefox 26 and newer uses signal handlers and segfaults for interrupting Ion code. This was a measurable speedup, especially for tight loops. For example, the empty for-loop I posted earlier runs 33% faster (43 ms to 29 ms). It helps more interesting loops as well, for instance Octane-crypto got ~8% faster.