Tag Archives: firefox

Firefox Front-End Performance Update #12

Well, here I am again – apologizing about a late update. Lots of stuff has been going on performance-wise in the Firefox code-base, and I’ll just be covering a small section of it here.

You might also notice that I changed the title of the blog series from “Firefox Performance Update” to “Firefox Front-end Performance Update”, to reflect that the things the Firefox Front-end Performance team is doing to keep Firefox speedy (though I’ll still add a grab-bag of other performance related work at the end).

So what are we waiting for? What’s been going on?

Migrate consumers to the new Places Observer system (Paused by Doug Thayer)

Doug was working on this later in 2018, and successfully ported a good chunk of our bookmarks code to use the new batched Places Observer system. There’s still a long-tail of other call sites that need to be updated to the new system, but Doug has shifted focus from this to other things in the meantime.

Document Splitting (In-Progress by Doug Thayer)

With WebRender becoming an ever-closer reality to our general user population, Doug has been focusing on “Document Splitting”, which makes WebRender more efficient by splitting updates that occur in the browser UI from updates that occur in the content area.

This has been a pretty long-haul task, but Doug has been plugging away, and landed a significant chunk of the infrastructure for this. At this time, Doug is working with kats to make Document Splitting integrate nicely with Async-Pan-Zooming (APZ).

The current plan is for Document Splitting to land disabled by default, since it’s blocked by parent-process retained display lists (which still have a few bugs to shake out).

Warm-up Service (In-Progress by Doug Thayer)

Doug is investigating the practicalities of having a service run during Windows start-up to preload various files that Firefox will need when started.

Doug’s prototype shows that this can save us something like 1 second of net start-up time, at least on the reference hardware.

We’re still researching this at multiple levels, and haven’t yet determined if this is a thing that we’d eventually want to ship. Stay tuned.

Smoother Tab Animations (In-Progress by Felipe Gomes)

After much ado, simplification, and review back-and-forth, the initial set of new tab animations have landed in Nightly. You can enable them by setting browser.tabs.newanimations to true in about:config and then restarting the browser. These new animations run entirely on the compositor, instead of painting at each refresh driver tick, so they should be smoother than the current animations that we ship.

There are still some cases that need new animations, and Felipe is waiting on UX for those.

Overhauling about:performance (V1 Completed by Florian Quèze)

The new about:performance shipped late last year, and now shows both energy as well as memory usage of your tabs and add-ons.

The current iteration allows you to close the tabs that are hogging your resources. Current plans should allow users to pause JavaScript execution in busy background tabs as well.

Browser Adjustment Project (In-Progress by Gijs Kruitbosch)

Gijs has landed some patches in Nightly (which have recently uplifted to Beta, and are only enabled on early Betas), which lowers the default frame rate of Firefox from 60fps to 30fps on devices that are considered “low-end”1.

This has been on Nightly for a while, but as our Nightly population tends to skew to more powerful hardware, we expect not a lot of users have experienced the impact there.

At least one user has noticed the lowered frame rate on Beta, and this has highlighted that our CPU sampling code doesn’t take dynamic changes to clock speed into account.

While the lowered frame rate seemed to have a positive impact on page load time in the lab on our “low-end” reference hardware, we’re having a much harder time measuring any appreciable improvement in CI. We have scheduled an experiment to see if improvements are detectable via our Telemetry system on Beta.

We need to be prepared that this particular adjustment will either not have the desired page load improvement, or will result in a poorer quality of experience that is not worth any page load improvement. If that’s the case, we still have a few ideas to try, including:

  • Lowering the refresh driver tick, rather than the global frame rate. This would mean things like scrolling and videos would still render at 60fps, but painting the UI and web content would occur at a lower frequency.
  • Use the hardware vsync again (switching to 30fps turns hardware vsync off), but just paint every other time. This is to test whether or not software vsync results in worse page load times than hardware vsync.

Avoiding spurious about:blank loads in the parent process (Completed by Gijs Kruitbosch)

Gijs short-circuited a bunch of places where we were needlessly creating about:blank documents that we were just going to throw away (see this bug and dependencies). There are still a long tail of cases where we still do this in some cases, but they’re not the common cases, and we’ve decided to apply effort for other initiatives in the meantime.

Experiments with the Process Priority Manager (In-Progress by Mike Conley)

This was originally Doug Thayer’s project, but I’ve taken it on while Doug focuses on the epic mountain that is WebRender Document Splitting.

If you recall, the goal of this project is to lower the process priority for tabs that are only sitting in the background. This means that if you have tabs in the background that are attempting to use system resources (running JavaScript for example), those tabs will have less priority at the operating system level than tabs that are in the foreground. This should make it harder for background tabs to cause foreground tabs to be starved of processing resources.

After clearing a few final blockers, we enabled the Process Priority Manager by default last week. We also filed a bug to keep background tabs at a higher priority if they’re playing audio and video, and the fix for that just landed in Nightly today.

So if you’re on Windows on Nightly, and you’re curious about this, you can observe the behaviour by opening up the Windows Task Manager, switching to the “Details” tab, and watching the “Base priority” reading on your firefox.exe processes as you switch tabs.

Cheaper tabs in titlebar (Completed by Mike Conley)

After an epic round of review (thanks, Dao!), the patches to move our tabs-in-titlebar logic out of JS and into CSS landed late last year.

Along with simplifying our code, and hammering out at least one pretty nasty layout bug, this also had the benefit of reducing the number of synchronous reflows caused when opening new windows to zero.

This project is done!

Enable the separate Activity Stream content process by default (In-Progress by Mike Conley

There’s one known bug remaining that’s preventing us from letting the privileged content process from being enabled by default.

Thankfully, the cause is understood, and a fix is being worked on. Unfortunately, this is one of those bugs where the proper solution involves refactoring a bit of old crufty stuff, so it’s taking longer than I’d like.

Still, if all goes well, this bug should be closed out soon, and we can see about letting the privileged content process ride the trains.

Grab bag of notable performance work

This is an informal list of things that I’ve seen land in the tree lately that I believe will have a positive performance impact for our users. Have you seen something that you’d like to nominate for a future list? Submit the bug here!

Also, keep in mind that some of these landed months ago and already shipped to release. That’s what I get for taking so long to write a blog post.


  1. For now, “low-end” means a machine with 2 or fewer cores, and a clock speed of 1.8Ghz or slower 

Photon Engineering Newsletter #14

Just like jaws did last week, I’m taking over for dolske this week to talk about stuff going on with Photon Engineering. So sit back, strap in, and absorb Photon Engineering Newsletter #14!

If you’ve got the release calendar at hand, you’ll note that Nightly 57 merges to Beta on September 20th. Given that there’s usually a soft-freeze before the merge, this means that there are less than 4 weeks remaining for Photon development. That’s right – in less than a month’s time, folks on the Beta channel who might not be following Nightly development are going to get their first Photon experience. That’ll be pretty exciting!

So with the clock winding down, the Photon team has started to shift more towards polish and bug-fixing. At this point, all of the major changes should have landed, and now we need to buff the code to a sparkling sheen.

The first thing you may have noticed is that, after a solid run of dogefox, the icon has shifted again:

The new Nightly icon

We now return you to your regularly scheduled programming

The second big change are our new 60fps1 loading throbbers in the tabs, coming straight to you from the Photon Animations team!

The new loading throbber in Nightly

I think it’s fair to say that Photon Animations are giving Firefox a turbo boost!

Other recent changes

Menus and structure

Animations

  • Did we mention the new tab loading throbber?

Preferences

  • All MVP work is completed! The team is now fixing polish bugs. Outstanding!

Visual redesign

Onboarding

Performance


  1. The screen capturing software I used here is only capturing at 30fps, so it’s really not doing it justice. This tweet might capture it better. 

A printing story and a PSA: outparams over the IPC layer might not behave like you’d expect

Here’s a story about a printing bug on OS X, and a lesson about how our IPC layer works.

Last week, :ehsan came up to my desk and said “Mike… printing is broken on OS X with e10s enabled. Did you know this?”

It was sad to hear. Nobody, as far as I knew, had been touching printing code (besides bobowen, but he hadn’t landed his patches yet), which meant that some mystery change had landed and it was my responsibility (as the one who had implemented printing on OS X for e10s) to fix it.

The first step was to confirm it. Yep, it looked like I couldn’t print on my Nightly. Crap1. So we get the bug filed, and then I fired up my local build and lldb, and tried to trace out where the problem was occurring.

Drilling down to the regressing changeset

The problem was that my local build printed just fine. Eyebrow raised, I tried using my default profile with my local build to see if something about my profile was causing printing to break. Printing continued to work with my local build2.

Scratching my head, I used mozregression to bisect the problem down to a single changeset.

Here’s the bug it spat out:

Bug 1209930 Update Mac clang to match the version in use everywhere else

The hair stood up on the back of my neck. Printing got broken by a compiler upgrade? This had bad news written all over it.

Many try builds

So that, I guess, explained why I couldn’t reproduce the problem locally. I needed to build with a particular compiler in order to experience the bug.

The bad news was that I found out that the version of clang that we’d upgraded to didn’t work on my version of OS X (10.10.5). It was supposed to work on machines running OS X 10.7, but I didn’t have access to any3.

I was able to get debug builds off of try, but I had no luck convincing lldb to let me step through it, despite having the symbols and the right revision checked out4.

So this meant a lot of try builds. The good news is that I was able to add a bunch of logging to try, and just kinda walk away while it built. A few hours later, I’d have my build, and I’d get my results. I wasn’t blocked waiting on it to build, so I could work on other things.

What did the logging reveal?

Our printing code sometimes shows this progress dialog when printing starts. It’s put up so that while we’re laying out the page for printing, the user knows that stuff is happening. We show this progress dialog on Linux and Windows, but not on OS X (I guess OS X users don’t expect such a progress window… this was a decision made way back in the day).

The printing engine in the content process sends up a message to the parent saying, “I’m all set to print, let’s show the progress dialog!”. On OS X, the parent responds, “Uh, nope” (which is a return code of NS_ERROR_NOT_IMPLEMENTED), and the child is supposed to go, “Oh okay, I’ll just print right away”.

But here’s the kicker – along with message to show the progress dialog, the child sends an outparam5. That outparam is sent because for some platforms that show the progress dialog, we want to wait until the dialog closes before starting the actual print. On other platforms, we may just want to start printing right away when laying out the page finishes without waiting for the dialog to close.

That outparam is a bool initialized to a value of false in the print engine, and in the non-e10s case where we return “Sorry, no progress dialog” with NS_ERROR_NOT_IMPLEMENTED, that bool stays false, and when it stays false, it means that we start printing right away.

My logging showed that for some reason, that value was getting flipped to true despite not being touched by (seemingly) anything in the IPC sender / receiver nor the print progress dialog backend (which still just returns NS_ERROR_NOT_IMPLEMENTED when asked to open the printing dialog).

What the hell?

Here’s the PSA

Ehsan helped me dig into what was going on, and we figured it out. Here’s the big lesson / PSA here:

outparams over IPC, when untouched on the receiver side, get filled with values from uninitialized memory when returned to the sender.

I’ll give you an example. Here’s some sorta-pseudocode:

#include <stdio.h>

void someFunc(bool* aOutParam) {
}

int main(int argc, char** argv) {
  bool myBool = false;
  someFunc(&myBool);
  if (myBool) {
    printf("Wait, WHAT?");
    return 1;
  }
  return 0;
}

We should never enter that “Wait, WHAT?” conditional block because myBool remains false throughout. It’s initialized to false, and even though we pass a pointer to it to someFunc, someFunc never touches it, so it stays false, and so we skip the conditional and return 0 as expected.

If, however, you did the same thing, but over IPC… well, now you’re in for a fun treat.

On the receiver side of the IPC message, here’s a chunk of the generated code that calls into RecvShowPrintProgress on the parent side:

            bool notifyOnOpen;
            bool success;
            int32_t id__ = mId;
            if \((!(RecvShowProgress(mozilla::Move(browser), mozilla::Move(printProgressDialog), mozilla::Move(isForPrinting), (&(notifyOnOpen)), (&(success)))))) {
                mozilla::ipc::ProtocolErrorBreakpoint("Handler for ShowProgress returned error code");
                return MsgProcessingError;
            }

            reply__ = new PPrinting::Reply_ShowProgress(id__);

            Write(notifyOnOpen, reply__);
            Write(success, reply__);
            (reply__)->set_sync();
            (reply__)->set_reply();

That notifyOnOpen bool is the one that is the outparam in the child. Notice that it’s not being initialized to any value! That means its current value could be, well, anything. When we call into RecvShowProgress, we pass a pointer to that anything value. If RecvShowProgress doesn’t do anything with that pointer (like in the OS X case), well… that random value is what gets written to the IPC message that gets sent back to the child.

And somehow a clang upgrade made it far more likely that this value would evaluate to TRUE as a boolean instead of FALSE.

And so the child assumed that it’d have to wait for a dialog to close before starting to print – a dialog that would never open, because we don’t show it on OS X.

So the patch I eventually wrote initializes the outparam to false before calling into the OS X widget code that just returns NS_ERROR_NOT_IMPLEMENTED, and that fixed the bug.

It’s a small patch, but in my mind, it’s a big lesson, at least for me. I had assumed that outparams would work the same over IPC as they do in the same process, and that’s simply not the case.

I’ve filed bug 1220168 to try to make this sort of thing easier to spot in the future.


  1. Our automated testing story for printing is pretty bad. We test some of the built-in UI for printing, but as for actually sending stuff to printer drivers… zero automated tests, which explains why this had gone uncaught for so long. 

  2. If you’re wondering, I wasn’t printing out reams of paper to test this. XCode comes with a Printer Simulator, which is like a PDF printer, except that I can get a better sense of the communications being sent to the printer 

  3. Okay, there’s one in the office, but apparently trying to build on it is a slow nightmare 

  4. Happy to hear and share how to do this if someone will show me 

  5. Read this about passing multiple values back from a method call if you’re curious about what outparams are 

The Joy of Coding – Ep’s 23 – 29

Wow! I’ve been a way from this blog for too long. I also haven’t posted any new episodes for The Joy of Coding. I also haven’t been keeping up with my Things I’ve Learned posts.

Time to get back in the saddle. First thing’s first, here are 6 episodes of The Joy of Coding that have aired. Unfortunately, I haven’t put together summaries for any of them, but I’ve put their agendas near the videos so that might give some clues.

Here we go!

Episode 23

Agenda

Episode 24

Agenda

Episode 25

Agenda

Episode 26

Agenda

Episode 27

Agenda

Episode 28

Agenda

Episode 29

Agenda

The Joy of Coding (Ep. 20): Reviewin’ and Mystery Solvin’

After a two week hiatus, we’re back with Episode 20!

In this episode, I start off by demonstrating my new green screen1, and then dive right into reviewing some code to make the Lightweight Theme web installer work with e10s.

After that, I start investigating a mystery that my intern ran into a few days back, where for some reason, preloaded about:newtab pages were behaving really strangely when they were loaded in the content process. Strangely, as in, the pages wouldn’t do simple things, like reload when the user pressed the Reload button.

Something strange was afoot.

Do we solve the mystery? Do we figure out what’s going on? Do we find a solution? Tune in and find out!

Episode agenda.

References

Bug 653065 – Make the lightweight theme web installer ready for e10s
Bug 1181601 – Unable to receive messages from preloaded, remote newtab pageNotes
@mrrrgn hacks together a WebSocket server implementation in Go. To techno!


  1. Although throughout the video, the lag between the audio and the video gets worse and worse – sorry about that. I’ll see what I can do to fix that for next time.