The Joy of Coding (Ep. 13): Printing. Again!

Had to deal with some network issues during this video – sorry if people were getting dropped frames during the live show! I have personally checked this recording, and almost all frames are there.

The only frames that are missing are the ones where I scramble around to connect to the wired network, which was boring anyhow.

In this episode, I worked on proxying the print dialog from the content process on OS X. It was a wild ride, and I learned quite a bit about Cocoa stuff. It was also a throwback to my very first episode, where I essentially did the same thing for Linux!

We’ll probably polish this off in the next episode, or in the episode after.

Episode Agenda

References

Bug 1091112 – Print dialog doesn’t get focus automatically, if e10s is enabled – Notes

Things I’ve Learned This Week (May 4 – May 8, 2015)

How to convert an NSString to a Gecko nsAString

I actually discovered this during my most recent Joy of Coding episode – there is a static utility method to convert between native Cocoa NSStrings and Gecko nsAStrings – nsCocoaUtils::GetStringForNSString. Very handy, and works exactly as advertised.

An “Attach to Process by pid” Keyboard Shortcut for XCode

I actually have colleague Garvan Keeley to thank for this one, and technically I learned this on April 24th. It was only this week that I remembered I had learned it!

When I’m debugging Firefox on OS X, I tend to use XCode, and I usually attach to Firefox after it has started running. I have to navigate some menus in order to bring up the dialog to attach to a process by pid, and I was getting tired of doing that over and over again.

So, as usual, I tweeted my frustration:

AND LO, THE INTERNET SPOKE BACK:

It seems small, but the savings in time for something that I do so frequently quickly adds up. And it always feels good to go faster!

Electrolysis and the Big Tab Spinner of Doom

Have you been using Firefox Nightly and seen this big annoying spinner?

Big Tab Spinner of Doom in an e10s tab

Aw, crap. You again.

I hate that thing. I hate it.

Me, internally, when I see the spinner.

And while we’re working on making the spinner itself less ugly, I’d like to eliminate, or at least reduce its presence to the absolute minimum.

How do I do that? Well, first, know your enemy.

What does it even mean?

That big spinner means that the graphics part of Gecko hasn’t given us a frame yet to paint for this browser tab. That means we have nothing yet to show for the tab you’ve selected.

In the single-process Firefox that we ship today, this graphics operation of preparing a frame is something that Firefox will block on, so the tab will just not switch until the frame is ready. In fact, I’m pretty sure the whole browser will become unresponsive until the frame is ready.

With Electrolysis / multi-process Firefox, things are a bit different. The main browser process tells the content process, “Hey, I want to show the content associated with the tab that the user just selected”, and the content process computes what should be shown, and when the frame is ready, the parent process hears about it and the switch is complete. During that waiting time, the rest of the browser is still responsive – we do not block on it.

So there’s this window of time where the tab switch has been requested, and when the frame is ready.

During that window of time, we keep showing the currently selected tab. If, however, 300ms passes, and we still haven’t gotten a frame to paint, that’s when we show the big spinner.

So that’s what the big spinner means – we waited 300ms, and we still have no frame to draw to the screen.

How bad is it?

I suspect it varies. I see the spinner a lot less on my Windows machine than on my MacBook, so I suspect that performance is somehow worse on OS X than on Windows. But that’s purely subjective. We’ve recently landed some Telemetry probes to try to get a better sense of how often the spinner is showing up, and how laggy our tab switching really is. Hopefully we’ll get some useful data out of that, and as we work to improve tab switch times, we’ll see improvement in our Telemetry numbers as well.

Where is the badness coming from?

This is still unclear. And I don’t think it’s a single thing – many things might be causing this problem. Anything that blocks up the main thread of the content process, like slow JavaScript running on a web-site, can cause the spinner.

I also seem to see the spinner when I have “many” tabs open (~30), and have a build going on in the background (so my machine is under heavy load).

Maybe we’re just doing things inefficiently in the multi-process case. I recently landed profile markers for the Gecko Profiler for async tab switching, to help figure out what’s going on when I experience slow tab switch. Maybe there are optimizations we can make there.

One thing I’ve noticed is that there’s this function in the graphics layer, “ClientTiledLayerBuffer::ValidateTile”, that takes much, much longer in the content process than in the single-process case. I’ve filed a bug on that, and I’ll ask folks from the Graphics Team this week.

How you can help

UPDATE (June 1, 2015): Getting profiles from Windows is currently broken because the symbol server appears to be busted. Any profiles from Windows machines will be useless until this bug is fixed. Alternatively, set profiler.symbolicationUrl to http://symbolapi.mocotoolsstaging.net in about:config.

If you’d like to help me find more potential causes, Profiles are very useful! NOTE – I don’t mean “user profiles”, as in, your bookmarks / customizations / history, etc, in the profile folder. I don’t mean this thing. I mean a performance profile.

A performance profile is a read-out of everything that Firefox / Gecko is doing over a particular span of time. When the profiler is running, Firefox / Gecko will record where the process is in the stack every 1ms or so. It’ll also record information about how long since it’s serviced the event loop, which helps us find jank.

To help, grab the Gecko Profiler add-on, make sure it’s enabled, and then dump a profile when you see the big spinner of doom. The interesting part will be between two markers, “AsyncTabSwitch:Start” and “AsyncTabSwitch:Finish”. There are also markers for when the parent process displays the spinner – “AsyncTabSwitch:SpinnerShown” and “AsyncTabSwitch:SpinnerHidden”. The interesting stuff, I believe, will be in the “Content” section of the profile between those markers. Here are more comprehensive instructions on using the Gecko Profiler add-on.

And here’s a video of me demonstrating how to use the profiler, and how to attach a profile to the bug where we’re working on improving tab switch times:

And here’s the link I refer you to in the video for getting the add-on.

So hopefully we’ll get some useful data, and we can drive instances of this spinner into the ground.

I’d really like that.

Things I’ve Learned This Week (April 27 – May 1, 2015)

Another short one this week.

You can pass DOM Promises back through XPIDL

XPIDL is what we use to define XPCOM interfaces in Gecko. I think we’re trying to avoid XPCOM where we can, but sometimes you have to work with pre-existing XPCOM interfaces, and, well, you’re just stuck using it unless you want to rewrite what you’re working on.

What I’m working on lately is nsIProfiler, which is the interface to “SPS”, AKA the Gecko Profiler. nsIProfiler allows me to turn profiling on and off with various features, and then retrieve those profiles to send to a file, or to Cleopatra1.

What I’ve been working on recently is Bug 1116188 – [e10s] Stop using sync messages for Gecko profiler, which will probably have me adding new methods to nsIProfiler for async retrieval of profiles.

In the past, doing async stuff through XPCOM / XPIDL has meant using (or defining a new) callback interface which can be passed as an argument to the async method.

I was just about to go down that road, when ehsan (or was it jrmuizel? One of them, anyhow) suggested that I just pass a DOM Promise back.

I find that Promises are excellent. I really like them, and if I could pass a Promise back, that’d be incredible. But I had no idea how to do it.

It turns out that if I can ensure that the async methods are called such that there is a JS context on the stack, I can generate a DOM Promise, and pass it back to the caller as an “nsISupports”. According to ehsan, XPConnect will do the necessary magic so that the caller, upon receiving the return value, doesn’t just get this opaque nsISupports thing, but an actual DOM Promise. This is because, I believe, that DOM Promise is something that is defined via WebIDL. I think. I can’t say I fully understand the mechanics of XPConnect2, but this all sounded wonderful.

I even found an example in our new Service Worker code:

From dom/workers/ServiceWorkerManager.cpp (I’ve edited the method to highlight the Promise stuff):

// If we return an error code here, the ServiceWorkerContainer will
// automatically reject the Promise.
NS_IMETHODIMP
ServiceWorkerManager::Register(nsIDOMWindow* aWindow,
                               nsIURI* aScopeURI,
                               nsIURI* aScriptURI,
                               nsISupports** aPromise)
{
  AssertIsOnMainThread();

  // XXXnsm Don't allow chrome callers for now, we don't support chrome
  // ServiceWorkers.
  MOZ_ASSERT(!nsContentUtils::IsCallerChrome());

  nsCOMPtr<nsPIDOMWindow> window = do_QueryInterface(aWindow);

  // ...

  nsCOMPtr<nsIGlobalObject> sgo = do_QueryInterface(window);
  ErrorResult result;
  nsRefPtr<Promise> promise = Promise::Create(sgo, result);
  if (result.Failed()) {
    return result.StealNSResult();
  }

  // ...

  nsRefPtr<ServiceWorkerResolveWindowPromiseOnUpdateCallback> cb =
    new ServiceWorkerResolveWindowPromiseOnUpdateCallback(window, promise);

  nsRefPtr<ServiceWorkerRegisterJob> job =
    new ServiceWorkerRegisterJob(queue, cleanedScope, spec, cb, documentPrincipal);
  queue->Append(job);

  promise.forget(aPromise);
  return NS_OK;
}

Notice that the outparam aPromise is an nsISupports**, and yet, I do believe the caller will end up handling a DOM Promise. Wicked!


  1. Cleopatra is the web application that can be used to browse a profile retrieved via nsIProfiler 

  2. Like being able to read the black speech of Mordor, there are few who can. 

The Joy of Coding (Ep. 12): Making “Save Page As” Work

After giving some updates on the last bug we were working on together, I started a new bug: Bug 1128050 – [e10s] Save page as… doesn’t always load from cache. The problem here is that if the user were to reach a page via a POST request, attempting to save that page from the Save Page item in the menu would result in silent failure1.

Luckily, the last bug we were working on was related to this – we had a lot of context about cache keys swapped in already.

The other important thing to realize is that fixing this bug is a bandage fix, or a wallpaper fix. I don’t think those are official terms, but it’s what I use. Basically, we’re fixing a thing with the minimum required effort because something else is going to fix it properly down the line. So we just need to do what we can to get the feature to limp along until such time as the proper fix lands.

My proposed solution was to serialize an nsISHEntry on the content process side, deserialize it on the parent side, and pass it off to nsIWebBrowserPersist.

So did it work? Watch the episode and find out!

I also want to briefly apologize for some construction noise during the video – I think it occurs somewhere halfway through minute 20 of the video. It doesn’t last long, I promise!

Episode Agenda

References

Bug 1128050 – [e10s] Save page as… doesn’t always load from cache – Notes


  1. Well, it’d show something in the Browser Console, but for a typical user, I think that’s still a silent failure.