Firefox Performance Update #9

Hello, Internet! Here we are with yet another Firefox Performance Update for your consumption. Hold onto your hats – we’re going in!

But first a word from our sponsor: ScriptPreloader!

A lot of the Firefox front-end is written using JavaScript. With the possible exception of system add-ons that update outside of the normal release cycle, these scripts tend to be the same until you update.

About a year ago, Mozilla developer Kris Maglione had an idea: let’s try to optimize browser start time by noticing which scripts are being loaded during start-up, and then converting those scripts into a binary representation ((XDR, I think?)) that we can cache on disk. That way, next time we start up, we can just grab the cached binaries off of the disk, skip the parsing step and start executing the JavaScript right away.

Long-time Mozillians might know that we already do some aggressive caching to improve start time for things like XUL, XBL, manifests and other things that are read at start-up. I think we actually were already caching JavaScript files too – but I don’t think we were storing them pre-parsed. And the old caching stuff was definitely not caching scripts that were loading in content processes (since content processes didn’t exist when the old caching stuff was designed).

At any rate, my understanding is that the ScriptPreloader pays attention to script loads between main process start and the point where the first browser window fires the “browser-delayed-startup-finished” observer notification (after the window paints and does post-painting script loading). At that point, the ScriptPreloader examines the list of scripts that the parent and content processes have loaded, and ((My understanding breaks down here a little)) writes their pre-parsed bytecode representation to disk.

After that cache is written, the next time the main process or content processes start up, the cache is checked for the binary data. If it exists, this means that we can skip the parsing step. The ScriptPreloader goes one step further and starts to “decode” ((I assume that’s a type of de-serialization)) that binary format off of the main thread, even before those scripts are requested. Then, when the scripts are finally requested, they’re very much ready to execute right away.

When the ScriptPreloader landed, we saw some really nice wins in our start-up performance!

I’m now working on a series of patches in this bug that will widen the window of time where we note scripts that we can cache. This will hopefully improve the speed of privileged scripts that run up until the idle point of the first browser window.

And now for some Performance Project updates!

Early first blank paint (lead by Florian Quèze)

User Research has hired a contractor to perform a study to validate our hypothesis that the early first blank paint perceived performance optimization will make Firefox seem like it’s starting faster. More data to come out of that soon!

Faster content process start-up time (lead by Felipe Gomes)

The patches that Felipe wrote a few weeks back have landed and have had a positive impact! The proof is in the pudding – let’s look at some graphs:

The cpstartup impact. Those two clusters are test runs “before” and “after” Felipe’s patches landed, respectively.

The above graph shows a nice drop in the cpstartup Talos test. The cpstartup test measures the time it takes to boot up the content process and have it be ready to show you web pages.

This is a screen capture of a Base Content JS improvement in the AreWeSlimYet test. This graph measures the amount of memory that content processes consume via JavaScript not long after starting up.

In the graph above, we can see that the patches also helped reduce the memory that content processes use by default, by making more scripts only load when they’re needed.

It’s always nice to see our work have an impact in our graphs. Great work, Felipe! Keep it up!

LRU cache for tab layers (lead by Doug Thayer)

The patch to introduce the LRU cache landed last week, and was enabled for a few days so we could collect some data on its performance impact.

The good news is that it appears that this has had a significant and positive impact on tab switch times – tab switch times went down, and the number of Nightly instances reporting tab switch spinners went down by about 10%. Great work, Doug!

A number of bugs were filed against the original bug due to some glitchy edge-cases that we don’t handle well just yet.

We also detected a ~8% resident memory regression in our automated testing suites. This was expected (keeping layers around isn’t free!) and gave us a sense of how much memory we might consume were we to enable this by default.

The experiment is concluded for now, and we’re going to disable the cache for a bit while we think about ways to improve the implementation.

ClientStorageTextureSource for macOS (lead by Doug Thayer)

This project should allow us to be more efficient when uploading layers to the compositor on macOS. Doug has solved the crashing issues he was getting in automation(yay!), and is now attempting to figure out some Talos regressions on the MotionMark test suite. Deeper profiling is likely required to untangle what’s happening there.

Swapping DataURLs for Blobs in Activity Stream (lead by Jay Lim)

Jay’s patch to swap out DataURLs for Blobs for Activity Stream images has passed a first round of review from Mardak! He’s now waiting for a second review from k88hudson, and then hopefully this can land and give us a bit of a memory win. Having done some analysis, we expect this buy back quite a bit of memory that was being contained within those long DataURL strings.

Caching Activity Stream JS in the JS Bytecode Cache (lead by Jay Lim)

After examining the JavaScript Bytecode Cache that’s used for Web Content, Jay has determined that it’s really not the right mechanism for caching the Activity Steam scripts.

However, that ScriptPreloader that I was talking about earlier sounds like a much more reasonable candidate. Jay is now doing a deep dive on the ScriptPreloader to see whether or not the Activity Stream scripts are already being cached – and if not, why not.

Tab warming (lead by Mike Conley)

No news is good news here. Tab warming continues to ride and no new bugs have been filed. The work to reduce the number of paints when warming tabs has stalled a bit while I dealt with a rather strange cpstartup Talos regression. Ultimately, I think I can get rid of the second paint when warming by keeping background tabs display port suppressed ((This is an optimization that we do that shrinks the painted area to just the region that’s visible to the browser. We normally paint a bit outside the viewable area so that it’s ready when a user starts scrolling)), and then only triggering the display port unsuppression after a tab switch. This will happily take advantage of a painting mechanism that Doug Thayer put in as part of the LRU cache experiment.

Firefox’s Most Wanted: Performance Wins (lead by YOU!)

Before we go into the grab-bag list of performance-related fixes – have you seen any patches landing that should positively impact Firefox’s performance? Let me know about it so I can include it in the list, and give appropriate shout-outs to all of the great work going on! That link again!

Grab-bag time

And now, without further ado, a list of performance work that took place in the tree:

(🌟 indicates a volunteer contributor)

Thanks, folks!