Firefox Front-End Performance Update #17

Hello, folks. I wanted to give a quick update on what the Firefox Front-end Performance team is up to, so let’s get into it.

The name of the game continues to be start-up performance. We made some really solid in-roads last quarter, and this year we want to continue to apply pressure. Specifically, we want to focus on reducing IO (specifically, main-thread IO) during browser start-up.

Reducing main thread IO during start-up

There are lots of ways to reduce IO – in the best case, we can avoid start-up IO altogether by not doing something (or deferring it until much later). In other cases, when the browser might be servicing events on the main thread, we can move IO onto another thread. We can also re-organize, pack or compress files differently so that they’re read off of the disk more efficiently.

If you want to change something, the first step is measuring it. Thankfully, my colleague Florian has written a rather brilliant test that lets us take accounting of how much IO is going on during start-up. The test is deterministic enough that he’s been able to write a whitelist for the various ways we touch the disk on the main thread during start-up, and that whitelist means we’ve made it much more difficult for new IO to be introduced on that thread.

That whitelist has been processed by the team, and have been turned into bugs, bucketed by the start-up phase where the IO is occurring. The next step is to estimate the effort and potential payoff of fixing those bugs, and then try to whittle down the whitelist.

And that’s effectively where we’re at. We’re at the point now where we’ve got a big list of work in front of us, and we have the fun task of burning that list down!

Being better at loading DLLs on Windows

While investigating the warm-up service for Windows, Doug Thayer noticed that we were loading DLLs during start-up oddly. Specifically, using a tool called RAMMap, he noticed that we were loading DLLs using “read ahead” (eagerly reading the entirety of the DLL into memory) into a region of memory marked as not-executable. This means that anytime we actually wanted to call a library function within that DLL, we needed to load it again into an executable region of memory.

Doug also noticed that we were unnecessarily doing ReadAhead for the same libraries in the content process. This wasn’t necessary, because by the time the content process wanted to load these libraries, the parent process would have already done it and it’d still be “warm” in the system file cache.

We’re not sure why we were doing this ReadAhead-into-unexecutable-memory work – it’s existence in the Firefox source code goes back many many years, and the information we’ve been able to gather about the change is pretty scant at best, even with version control. Our top hypothesis is that this was a performance optimization that made more sense in the Windows XP era, but has since stopped making sense as Windows has evolved.

UPDATE: Ehsan pointed us to this bug where the change likely first landed. It’s a long and wind-y bug, but it seems as if this was indeed a performance optimization, and efforts were put in to side-step effects from Prefetch. I suspect that later changes to how Prefetch and SuperFetch work ultimately negated this optimization.

Doug hacked together a quick prototype to try loading DLLs in a more sensible way, and the he was able to capture quite an improvement in start-up time on our reference hardware:

This graph measures various start-up metrics. The scatter of datapoints on the left show the “control” build, and they tighten up on the right with the “test” build. Lower is better.

At this point, we all got pretty excited. The next step was to confirm Doug’s findings, so I took his control and test builds, and tested them independently on the reference hardware using frame recording. There was a smaller1, but still detectable improvement in the test build. At this point, we decided it was worth pursuing.

Doug put together a patch, got it reviewed and landed, and we immediately saw an impact in our internal benchmarks.

We’re also seeing the impact reflected in Telemetry. The first Nightly build with Doug Thayer’s patch went out on April 14th, and we’re starting to see a nice dip in some of our graphs here:

This graph measures the time at which the browser window reports that it has first painted. April 14th is the second last date on the X axis, and the Y axis is time. The top-most line is plotting the 95th percentile, and there’s a nice dip appearing around April 14th.

There are other graphs that I’d normally show for improvements like this, except that we started tracking an unrelated regression on April 16th which kind of muddies the visualization. Bad timing, I guess!

We expect this improvement to have the greatest impact on weaker hardware with slower disks, but we’ll be avoiding some unnecessary work for all Windows users, and that gets a thumbs-up in my books.

If all goes well, this fix should roll out in Firefox 68, which reaches our release audience on July 9th!


  1. My test machine has SuperFetch disabled to help reduce noise and inconsistency with start-up tests, and we suspect SuperFetch is able to optimize start-up better in the test build 

Firefox Front-End Performance Update #16

With Firefox 67 only a few short weeks away, I thought it might be interesting to take a step back and talk about some of the work that the Firefox Front-end Performance team is shipping to users in that particular release.

To be clear, this is not an exhaustive list of the great performance work that’s gone into Firefox 67 – but I picked a few things that the front-end team has been focused on to talk about.

Stop loading things we don’t need right away

The fastest code is the code that doesn’t run at all. Sometimes, as the browser evolves, we realize that there are components that don’t need to be loaded right away during start-up, and can instead of deferred until sometime after start-up. Sometimes, that means we can wait until the very last moment to initialize some component – that’s called loading something lazily.

Here’s a list of things that either got deferred until sometime after start-up, or made lazy:

FormAutofillContent and FormValidationChild

These are modules that support, you guessed it, Form Autofill – that part of the browser that helps you fill in web forms, and makes sure forms are passing validation. We were loading these modules too early, and now we load them only when there are forms to auto-fill or validate on a page.

The hidden window

The Hidden Window is a mysterious chunk of code that manages the state of the global menu bar on macOS when there are no open windows. The Hidden Window is also sometimes used as a singleton DOM window where various operations can take place. On Linux and Windows, it turns out we were creating this Hidden Window far early than needs be, and now it’s quite a bit lazier.

Page style

Page Style is a menu you can find under View in the main menu bar, and it’s used to switch between alternative style sheets on a page. It’s a pretty rarely used feature from what we can tell, but we were scanning pages for their alternative stylesheets far earlier than we needed to. We were also scanning pages that we know don’t have alternative stylesheets, like the about:home / about:newtab page. Now we only scan normal web pages, and we do so only after we service the idle event queue.

Cache invalidation

The Startup Cache is an important part of Firefox startup performance. It’s primary job is to cache computations that occur during each startup so that they only have to happen every once in a while. For example, the mark-up of the browser UI often doesn’t change from startup to startup, so we can cache a version of the mark-up that’s faster to read from disk, and only invalidate that during upgrades.

We were invalidating the whole startup cache every time a WebExtension was installed or uninstalled. This used to be necessary for old-style XUL add-ons (since those could cause changes to the UI that would need to go into the cache), but with those add-ons no longer available, we were able to remove the invalidation. This means faster startups more often.

Don’t touch the disk

The disk is almost always the slowest part of the system. Reading and writing to the disk can take a long time, especially on spinning magnetic drives. The less we can read and write, the better. And if we’re going to read, best to do it off of the main thread so that the UI remains responsive.

Old XUL icons code

We were reading from the disk on the main thread to search for window-specific icons to display in the window titlebar.

Firefox doesn’t use window-specific icons, so we made it so that we skip these checks. This means less disk activity, which is great for responsiveness and start-up!

Hitting every directory on the way down

We noticed that when we were checking that a directory exists on Windows (to write a file to it), we were using the CreateDirectoryW Windows API. This API checks each folder on the way down to the last one to see if they exist. That’s a lot of disk IO! We can avoid this if we assume that the parent directories exist, and only fall into the slow path if we fail to write our file. This means that we hit the faster path with less IO more often, which is good for responsiveness and start-up time.

Enjoy your Faster Fox!

Firefox 67 is slated to ship with these improvements on May 14th – just a little over a month away. Enjoy!

Firefox Front-End Performance Update #15

Firefox 66 has been released, Firefox 67 is out on the beta channel, and Firefox 68 is cooking for the folks on the Nightly channel! These trains don’t stop!

With that, let’s take a quick peek at what the Firefox Front-end Performance team has been doing these past few weeks…

Volunteer Contributor Highlight: Nikki!

I first wanted to call out some great work from Nikki, who’s a volunteer contributor. Nikki fixed a bug where we’d stall the parent process horribly if ever hovering a link with a really really long URL (like a base64 encoded Data URL). Stalling the parent process is the worst, because it makes everything else seem slow as a result.

Thank you for your work, Nikki!

Document Splitting Foundations for WebRender (In-Progress by Doug Thayer)

An impressive set of patches were recently queued to land, which should bring document splitting to WebRender, but in a disabled state. The gfx.webrender.split-render-roots pref is what controls it, but I don’t think we can reap the full benefits of document splitting until we get retained display lists enabled in the parent process for the UI. I believe, at that point, we can start enabling document splitting, which means that updating the browser UI area will not involve sending updates to the content area for WebRender.

In other WebRender news, it looks like it should be enabled by default for some of our users on the release channel in Firefox 67, due to be released in mid-May!

Warm-up Service (In-Progress by Doug Thayer)

Doug has written the bits of code that tie a Firefox preference to an HKLM registry key, which can be read by the warm-up service at start-up. The next step is to add a mode to the Firefox executable that loads its core DLLs and then exits, and then have the warm-up service call into that mode if enabled.

Once this is done, we should be in a state where we can user test this feature.

Startup Cache Telemetry (In-Progress by Doug Thayer)

Two things of note here:

  1. With the probes having now uplifted to Beta, data will slowly trickle in these next few days that will show us how the Firefox startup cache is behaving in the wild for users that aren’t receiving two updates a day (like our Nightly users). This important, because oftentimes, those updates cause some or all of the startup cache to be invalidated. We’re eager to see how the startup caches are behaving in the wild on Beta.
  2. One of the tests that was landed for the startup cache Telemetry appears to have caught an issue with how the QuantumBar code works with it – this is useful, because up until now, we’ve had very little automated testing to ensure that the startup cache is working as expected.

Smoother Tab Animations (Paused by Felipe Gomes)

UX, Product and Engineering have been having discussions about how the new tab animations work, and one thing has been decided upon: we want our User Research team to run some studies to see how tab animations are perceived before we fully commit to changing one of the fundamental interactions in the browser. So, at this time, Felipe is pausing his efforts here until User Research comes back with some information on guidance.

Browser Adjustment Project (Concluded by Gijs Kruitbosch)

We originally set out to see whether or not we could do something for users running weaker hardware to improve their browsing experience. Our initial hypothesis was that by lowering the frame rate of the browser on weaker hardware, we could improve the overall page load time.

This hypothesis was bolstered by measurements done in late 2018, where it appeared that by manually lowering the frame rate on a weaker reference laptop, we could improve our internal page load benchmarks by a significant degree. This measurement was reproduced by Denis Palmeiro on Vicky Chin’s team, and so Gijs started implementing a runtime detection mechanism to do that lowering of the frame rate for machines with 2 or fewer cores where each core’s clockspeed was 1.8Ghz or slower1.

However, since then, we’ve been unable to reproduce the same positive effect on page load time. Neither has Denis. We suspect that recent work on the RefreshDriver, which changes how often the RefreshDriver runs during the page load window, is effectively getting the same kind of win2.

We did one final experiment to see whether or not lowering the frame rate would improve battery life, and it appeared to, but not to a very high degree. We might revisit that route were we tasked with trying to improve power usage in Firefox.

So, to reduce code complexity, Gijs landed patches to remove the low-end hardware switches and frame rate lowering code today. This experiment and project is now concluded. It’s not a satisfying end with a slum dunk perf win, but you can’t win them all.

Better about:newtab Preloading (Completed by Gijs Kruitbosch)

The patch to preload about:newtab in an idle callback has landed and stuck! This means that we don’t preload about:newtab immediately after opening a new tab (which is good for responsiveness right around the time when you’re likely to want to do something), and also means that we have the possibility of preloading the first new tab in new windows! Great job, Gijs!

Experiments with the Process Priority Manager (In-Progress by Mike Conley)

I had a meeting today with Saptarshi, one of our illustrious Data Scientists, to talk about the upcoming experiment. One of the things he led me to conclude was that this experiment is going to have a lot of confounds, and it will be difficult to conclude things from.

Part of the reason for that is because there are often times when a background tab won’t actually have its content process priority lowered. The potential reasons for this are:

  1. The tab is running in a content process which is also hosting a tab that is running in the foreground of either the same or some other browser window.
  2. The tab is playing audio or video.

Because of this, we can’t actually do things like measure how page load is being impacted by this feature because we don’t have a great sense of how many tabs have their content process priorities lowered. That’s just not a thing we collect with Telemetry. It’s theoretically possible, either due to how many windows or videos or tabs our Beta users have open, that very few of them will ever actually have their content process priorities lowered, and then the data we’d draw from Telemetry would be useless.

I’m working with Saptarshi now to try to find ways of either altering the process priority manager or adding new probes to reduce the number of potential confounds.

Grab bag of other performance improvements


  1. These criteria for what makes “weak hardware” was mostly plucked from the air, but we had to start somewhere. 

  2. But for all users, not just users on weaker hardware. 

Firefox Front-End Performance Update #14

We’re only a few weeks away from Firefox 67 merging from the Nightly channel to Beta, and since my last update, a number of things have landed.

It’s the end of a long week for me, so I apologize for the brevity here. Let’s check it out!

Document Splitting Foundations for WebRender (In-Progress by Doug Thayer)

dthayer is still trucking along here – he’s ironed out a number of glitches, and kats is giving feedback on some APZ-related changes. dthayer is also working on a WebRender API endpoint for generating frames for multiple documents in a single transaction, which should help reduce the window of opportunity for nasty synchronization bugs.

Warm-up Service (In-Progress by Doug Thayer)

dthayer is pressing ahead with this experiment to warm up a number of critical files for Firefox shortly after the OS boots. He is working on a prototype that can be controlled via a pref that we’ll be able to test on users in a lab-setting (and perhaps in the wild as a SHIELD experiment).

Startup Cache Telemetry (In-Progress by Doug Thayer)

dthayer landed this Telemetry early in the week, and data has started to trickle in. After a few more days, it should be easier for us to make inferences on how the startup caches are operating out in the wild for our Nightly users.

Smoother Tab Animations (In-Progress by Felipe Gomes)

UX, Product and Engineering are currently hashing out the remainder of the work here. Felipe is also aiming to have the non-responsive tab strip bug fixed soon.

Lazier Hidden Window (Completed by Felipe Gomes)

After a few rounds of landings and backouts, this appears to have stuck! The hidden window is now created after the main window has finished painting, and this has resulted in a nice ts_paint (startup paint) win on our Talos benchmark!

This is a graph of the ts_paint startup paint Talos benchmark. The highlighted node is the first mozilla-central build with the hidden window work. Lower is better, so this looks like a nice win!

There’s still potential for more improvements on the hidden window, but that’s been split out to a separate project / bug.

Browser Adjustment Project (In-Progress by Gijs Kruitbosch)

This project appears to be reaching its conclusion, but with rather unsatisfying results. Denis Palmeiro from Vicky Chin’s team has done a bunch of testing of both the original set of patches that Gijs landed to lower the global frame rate (painting and compositing) from 60fps to 30fps for low-end machines, as well as the new patches that decrease the frequency of main-thread painting (but not compositing) to 30fps. Unfortunately, this has not yielded the page load wins that we wanted1. We’re still waiting to see if there’s a least a power-usage win here worth pursuing, but we’re almost ready the pull the plug on this one.

Better about:newtab Preloading (In-Progress by Gijs Kruitbosch)

Gijs has a set of patches that should make this possible, which will mean (in theory) that we’ll present a ready-to-roll about:newtab when users request one more often than not.

Unfortunately, there’s a small snag with a test failure in automation, but Gijs is on the case.

Experiments with the Process Priority Manager (In-Progress by Mike Conley)

The Process Priority Manager has been enabled in Nightly for a number of weeks now, and no new bugs have been filed against it. I filed a bug earlier this week to run a pref-flip experiment on Beta after the Process Priority Manager patches are uplifted later this month. Our hope is that this has a neutral or positive impact on both page load time and user retention!

Make the PageStyleChild load lazily (Completed by Mike Conley)

There’s an infrequently used feature in Firefox that allows users to switch between different CSS stylesheets that a page might offer. I’ve made the component that scans the document for alternative stylesheets much lazier, and also made it skip non web-pages, which means (at the very least) less code running when loading about:home and about:newtab



  1. This was unexpected – we ran an experiment late in 2018 where we noticed that lowering the frame rate manually via the layout.frame_rate pref had a positive impact on page load time… unfortunately, this effect is no longer being observed. This might be due to other refresh driver work that has occurred in the meantime. 

Firefox Front-End Performance Update #13

It’s been just a little over two weeks since my last update, so let’s see where we are!

A number of our projects are centered around trying to improve start-up time. Start-up can mean a lot of things, so we’re focused specifically on cold start-up on the Windows 10 2018 reference device when the machine is at rest.

If you want to improve something, the first thing to do is measure it. There are lots of ways to measure start-up time, and one of the ways we’ve been starting to measure is by doing frame recording analysis. This is when we capture display output from a testing device, and then analyze the videos.

This animated GIF shows eight videos. The four on the left are Firefox Nightly, and the four on the right are Google Chrome (71.0.3578.98). The videos are aligned so that both browsers are started at the same time.

The four on the left are Firefox Nightly, and the four on the right are Google Chrome (71.0.3578.98)

Some immediate observations:

  • Firefox Nightly is consistently faster to reach first paint (that’s the big white window)
  • Firefox Nightly is consistently faster to draw its toolbar and browser UI
  • Google Chrome is faster at painting its initial content

This last bullet is where the team will be focusing its efforts – we want to have the initial content painted and settled much sooner than we currently do.

Document Splitting Foundations (In-Progress by Doug Thayer)

After some pretty significant refactorings to work better with APZ, Doug posted a new stack of patches late last week which will sit upon the already large stack of patches that have already landed. There are still a number of reviews pending on the main stack, but this work appears to be getting pretty close to conclusion, as the patches are in the final review and polish stage.

After this, once retained display lists are enabled in the parent process, and an API is introduced to WebRender to generate frames for multiple documents in a single transaction, we can start thinking about enabling document splitting by default.

Warm-up Service (In-Progress by Doug Thayer)

A Heartbeat survey went out a week or so back to get some user feedback about a service that would speed up the launching of Firefox at the cost of adding some boot time to Windows. The responses we’ve gotten back have been quite varied, but can be generally bucketed into three (unsurprising) groups:

  • Users who say they do not want to make this trade
  • Users who say they would love to make this trade
  • Users who don’t care at all about this trade

Each group is sufficiently large to warrant further exploration. Our next step is to build a version of this service that we can turn on and off with a pref and test either in a lab and/or out in the wild with a SHIELD study.

Startup Cache Telemetry (In-Progress by Doug Thayer)

We do a number of things to try to improve real and perceived start-up time. One of those things is to cache things that we calculate at runtime during start-up to the disk, so that for subsequent start-ups, we don’t have to do those calculations again.

There are a number of mechanisms that use this technique, and Doug is currently adding some Telemetry to see how they’re behaving in the wild. We want to measure cache hits and misses, so that we know how healthy our cache system is out in the wild. If we get signals back that our start-up caches are missing more than we expect, this will highlight an important area for us to focus on.

Smoother Tab Animations (In-Progress by Felipe Gomes)

UX has gotten back to us with valuable feedback on the current implementation, and Felipe is going through it and trying to find the simplest way forward to address their concerns.

Having been available (though disabled by default) on Nightly, we’ve discovered one bug where the tab strip can become unresponsive to mouse events. Felipe is currently working on this.

Lazy Hidden Window (In-Progress by Felipe Gomes)

Under the hood, Firefox’s front-end has a notion of a “hidden window”. This mysterious hidden window was originally introduced long long ago1 for MacOS, where it’s possible to close all windows yet keep the application running.

Since then, it’s been (ab)used for Linux and Windows as well, as a safe-ish place to do various operations that require a window (since that window will always be around, and not go away until shutdown).

That window opens pretty early during start-up, and Felipe found an old patch that was written, and then abandoned to make its construction lazier. Felipe thinks we can still make this idea work, and has noted that in our internal benchmarks, this shaves off a few percentage points on our start-up tests

Activity Stream seems to depend on the hidden window early enough that we think we’re going to have to find an alternative there, but once we do, we should get a bit of a win on start-up time.

Browser Adjustment Project (In-Progress by Gijs Kruitbosch)

Gijs updated the patch so that the adjustment causes the main thread to skip every other VSync rather than swithing us to 30fps globally2.

We passed the patch off to Denis Palmeiro, who has a sophisticated set-up that allows him to measure a pageload benchmark using frame recording. Unfortunately, the results we got back suggested that the new approach regressed visual page load time significantly in the majority of cases.

We’re in the midst of using the same testing rig to test the original global 30fps patch to get a sense of the magnitude of any improvements we could get here. Denis is also graciously measuring the newer patch to see if it has any positive benefits towards power consumption.

Better about:newtab Preloading (In-Progress by Gijs Kruitbosch)

By default, users see about:newtab / a.k.a Activity Stream when they open new tabs. One of the perceived performance optimizations we’ve done for many years now is to preload the next about:newtab in the background so that the next time that the user opens a tab, the about:newtab is all ready to roll.

This is a perceived performance optimization where we’re moving work around rather than doing less work.

Right now, we preload a tab almost immediately after the first tab is opened in a window. That means that the first opened tab is never preloaded, but the second one is. This is for historical reasons, but we think we can do better.

Gijs is working on making it so that we choose a better time to preload the tab – namely, when we’ve found an idle pocket of time where the user doesn’t appear to be doing anything. This should also mean that the first new tab that gets opened might also be preloaded, assuming that enough idle time was made available to trigger the preload. And if there wasn’t any idle time, that’s also good news – we never got in the users way by preloading when it’s clear they were busy doing something else

Experiments with the Process Priority Manager (In-Progress by Mike Conley)

The Process Priority Manager has been enabled on Nightly for a few weeks now. Except for a (now fixed) issue where audio playing in background tabs would drop samples periodically, it’s been all quiet for regression reports.

The next step is to file a bug to run an experiment on Beta to see how this work impacts page load time.

Enable the separate Activity Stream content process by default (Stalled by Mike Conley)

This work is temporarily stalled while I work on other things, so there’s not too much to report here.

Grab bag of notable performance work


  1. Check out that commit date – 2003! 

  2. The idea here being that we can then continue to composite scrolling and video at 60fps, but main thread paints will only be updated at 30fps