
The Joy of Coding (Episode 3)

The third episode is up! My machine was a little sluggish this time, since I had OBS chugging in the background attempting to do a hi-res screen recording simultaneously.

Richard Milewski and I are going to try an experiment where I try to stream with OBS next week, which should result in a much higher-resolution stream. We’re also thinking about having recording occur on a separate machine, so that it doesn’t bog me down while I’m working. Hopefully we’ll have that set up for next week.

So this third episode was pretty interesting. Probably the most interesting part was when I discovered in the last quarter that I’d accidentally shipped a regression in Firefox 36. Luckily, I’ve got a patch that fixes the problem, and it has been approved for uplift to Aurora and Beta. A point release is also planned for 36, so I’ve got approval to get the fix in there too. \o/

Here are the notes for the bug I was working on. The review feedback from karlt is in this bug, since I kinda screwed up where I posted the review request with MozReview.


The Joy of Coding (Episode 2)

The second episode is up! We seem to have solved the resolution problem this time around – big thanks to Richard Milewski for his work there. This time, however, my microphone levels were just a bit low for the first half-hour. That’s my bad – I’ll make sure my gain is at the right level next time before I air.

Here are the notes for the bug I was working on.

And let me know if there’s anything else I can do to make these episodes more useful or interesting.


On unsafe CPOW usage, and “why is my Nightly so sluggish with e10s enabled?”

If you’ve opened the Browser Console lately while running Nightly with e10s enabled, you might have noticed a warning message – “unsafe CPOW usage” – showing up periodically.

I wanted to talk a little bit about what that means, and what’s being done about it. Brad Lassey already wrote a bit about this, but I wanted to expand upon it (especially since one of my goals this quarter is to get a handle on unsafe CPOW usage in core browser code).

I also wanted to talk about sluggishness that some of our brave Nightly testers with e10s enabled have been experiencing, and where that sluggishness is coming from, and what can be done about it.

What is a CPOW?

“CPOW” stands for “Cross-process Object Wrapper”[1], and is part of the glue that has allowed e10s to be enabled on Nightly without requiring a full re-write of the front-end code. It’s also part of the magic that’s allowing a good number of our most popular add-ons to continue working (albeit slowly).

In short, a CPOW is a way for one process to synchronously access and manipulate something in another process, as if they were running in the same process. Anything that can be considered a JavaScript Object can be represented as a CPOW.

Let me give you an example.

In single-process Firefox, easy and synchronous access to the DOM of web content was more or less assumed. For example, in browser code, one could do this from the scope of a browser window:

let doc = gBrowser.selectedBrowser.contentDocument;
let contentBody = doc.body;

Here contentBody corresponds to the <body> element of the document in the currently selected browser. In single-process Firefox, querying for and manipulating web content like this is quick and easy.

In multi-process Firefox, where content is processed and rendered in a completely separate process, how does something like this work? This is where CPOWs come in[2].

With a CPOW, one can synchronously access and manipulate these items, just as if they were in the same process. We expose a CPOW for the content document in a remote browser with contentDocumentAsCPOW, so the above could be rewritten as:

let doc = gBrowser.selectedBrowser.contentDocumentAsCPOW;
let contentBody = doc.body;

I should point out that contentDocumentAsCPOW and contentWindowAsCPOW are exposed on <xul:browser> objects, and that we don’t make every accessor of a CPOW have the “AsCPOW” suffix. This is just our way of making sure that consumers of the contentWindow and contentDocument on the main process side know that they’re probably working with CPOWs[3]. contentBody.firstChild would also be a CPOW, since CPOWs can only beget more CPOWs.

So for the most part, with CPOWs, we can continue to query and manipulate the <body> of the document loaded in the current browser just like we used to. It’s like an invisible compatibility layer that hops us right over that process barrier.
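To make the idea concrete, here is a toy model in plain JavaScript. It is not how Firefox implements CPOWs; it assumes a hypothetical makeFakeCPOW helper built on a standard Proxy. Every property read stands in for one synchronous round trip to the content process, and any object that comes back is wrapped again, which is why CPOWs only beget more CPOWs.

```javascript
// Toy model only: the real CPOW machinery lives in Firefox's IPC layer.
// makeFakeCPOW and stats are hypothetical names used for illustration.
const stats = { roundTrips: 0 };

function makeFakeCPOW(target) {
  return new Proxy(target, {
    get(obj, prop) {
      // Each property read stands in for one blocking message to the
      // content process.
      stats.roundTrips++;
      const value = obj[prop];
      // Objects that come back are wrapped too; primitives are copied.
      if (value !== null && typeof value === "object") {
        return makeFakeCPOW(value);
      }
      return value;
    },
  });
}

// Pretend this object lives in the content process.
const fakeContentDocument = {
  body: { firstChild: { nodeName: "DIV" } },
};

const doc = makeFakeCPOW(fakeContentDocument);
const firstChildName = doc.body.firstChild.nodeName;
console.log(firstChildName, stats.roundTrips); // prints "DIV 3"
```

Three innocent-looking property reads cost three blocking round trips in this model, which hints at where the trouble starts.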

Great, right?

Well, not really.

CPOWs are really a crutch to help add-ons and browser code exist in this multi-process world, but they’ve got some drawbacks. Most noticeably, there are performance drawbacks.

Why is my Nightly so sluggish with e10s enabled?

Have you been noticing sluggish performance on Nightly with e10s? Chances are this is caused by an add-on making use of CPOWs (either knowingly or unknowingly). Because CPOWs are used for synchronous reading and manipulation of objects in other processes, they send messages to other processes to do that work, and block the main process while they wait for a response. We call this “CPOW traffic”, and if you’re experiencing a sluggish Nightly, this is probably where the sluggishness is coming from.

Instead of using CPOWs, add-ons and browser code should be updated to use frame scripts sent over the message manager. Frame scripts cannot block the main process, and can be optimized to send only the bare minimum of information required to perform an action in content and return a result.
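As a rough sketch of what that migration can look like, the CPOW example from earlier could become an asynchronous round trip. The message name “Example:BodyInfo” and the function names are made up for illustration. loadFrameScript, addMessageListener, removeMessageListener and sendAsyncMessage are the message manager calls, and content is the global a frame script gets for the content window. Packing the frame script into a data: URL is just a way to keep this sketch in one piece; real code would more likely ship the frame script as its own file.

```javascript
// Runs in the content process. In a frame script, `content` and
// `sendAsyncMessage` are provided as globals.
function frameScript() {
  let body = content.document.body;
  // Send back only the bare minimum the parent needs.
  sendAsyncMessage("Example:BodyInfo", {
    childCount: body ? body.childNodes.length : 0,
  });
}

// Runs in the parent process. `browser` is a <xul:browser>.
// Returns a Promise so the parent never blocks waiting for content.
function getBodyChildCount(browser) {
  return new Promise(resolve => {
    let mm = browser.messageManager;
    mm.addMessageListener("Example:BodyInfo", function onReply(msg) {
      mm.removeMessageListener("Example:BodyInfo", onReply);
      resolve(msg.data.childCount);
    });
    // Serialize the function into a data: URL so the sketch stays in one file.
    let source = "(" + frameScript.toString() + ")();";
    mm.loadFrameScript("data:," + encodeURIComponent(source), false);
  });
}
```

From a browser window, getBodyChildCount(gBrowser.selectedBrowser).then(count => ...) would hand back the number whenever the content process gets around to answering, and the parent stays responsive in the meantime.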

Add-ons built with the Add-on SDK should already be using “content scripts” to manipulate content, and therefore should inherit a bunch of fixes from the SDK as e10s gets closer to shipping. These add-ons should not require too many changes. Old-style add-ons, however, will need to be updated to use frame scripts unless they want to be super-sluggish and bog the browser down with CPOW traffic.

And what constitutes “unsafe CPOW usage”?

“unsafe” might be too strong a word. “unexpected” might be a better term. Brad Lassey laid this out in his blog post already, but I’ll quickly rehash it.

There are two main cases to consider when working with CPOWs:

  1. The content process is already blocked sending up a synchronous message to the parent process
  2. The content process is not blocked

The first case is what we consider “the good case”. The content process is in a known good state, and it’s primed to receive IPC traffic (since it’s otherwise just idling). The only bad part about this is the IPC traffic.

The second case is what we consider the bad case. This is when the parent is sending down CPOW messages to the child (by reading or manipulating objects in the content process) when the child process might be off processing other things. This case is far more likely than the first case to cause noticeable performance problems, as the main thread of the content process might be bogged down doing other things before it can handle the CPOW traffic – and the parent will be blocked waiting for the messages to be responded to!

There’s also a more speculative fear that the parent might send down CPOW traffic at a time when it’s “unsafe” to communicate with the content process. There are potentially times when it’s not safe to run JS code in the content process, but CPOW traffic requires both processes to execute JS. This is a concern that was expressed to me by someone over IRC, and I don’t exactly understand what the implications are – but if somebody wants to comment and let me know, I’ll happily update this post.

So, anyhow, to sum – unsafe CPOW usage is when CPOW traffic is initiated on the parent process side while the content process is not blocked. When this unsafe CPOW usage occurs, we log an “unsafe CPOW usage” message to the Browser Console, along with the script and line number where the CPOW traffic was initiated from.
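Put as a tiny decision function, the rule looks roughly like this. It is only a model of the policy described above: classifyCPOWUse and contentBlockedOnSyncMessage are names invented here, the warning string is illustrative rather than the exact console format, and none of it corresponds to Firefox internals.

```javascript
// Model of the rule only; not Firefox's actual check.
function classifyCPOWUse({ contentBlockedOnSyncMessage, script, line }) {
  if (contentBlockedOnSyncMessage) {
    // Case 1: the child is idle, waiting on the parent. Only IPC cost.
    return { unsafe: false, warning: null };
  }
  // Case 2: the child may be busy, so the parent can end up blocked
  // until the child gets around to answering.
  return {
    unsafe: true,
    warning: "unsafe CPOW usage (" + script + ":" + line + ")",
  };
}

// Hypothetical call sites, purely for illustration.
let goodCase = classifyCPOWUse({
  contentBlockedOnSyncMessage: true, script: "example.js", line: 10,
});
let badCase = classifyCPOWUse({
  contentBlockedOnSyncMessage: false, script: "example.js", line: 20,
});
console.log(goodCase.unsafe, badCase.unsafe); // prints "false true"
```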

Measuring

We need to measure and understand CPOW usage in Firefox, as well as in add-ons running in Firefox, and over time we need to reduce this CPOW usage. The priority should be on reducing “unsafe CPOW usage” in core browser code.

If there’s anything that working on the Australis project taught me, it’s that in order to change something, you need to know how to measure it first. That way, you can make sure your efforts are having an effect.

We now have a way of measuring the amount of time that Firefox code and add-ons spend processing CPOW messages. You can look at it yourself – just go to about:compartments.

It’s not the prettiest interface, but it’s a start. The second column is the time processing CPOW traffic, and the higher the number, the longer it’s been doing it. Naturally, we’ll be working to bring those numbers down over time.

A possible quick fix for a slow Nightly with e10s

As I mentioned, we also list add-ons in about:compartments, so if you’re experiencing a slow Nightly, check out about:compartments and see if there’s an add-on with a high number in the second column. Then, try disabling that add-on to see if your performance problem is reduced.

If so, great! Please file a bug on Bugzilla in this component for the add-on, mention the name of the add-on[4], describe the performance problem, and mark it blocking e10s-addons if you can.

We’re hoping to automate this process by exposing some UI that informs the user when an add-on is causing too much CPOW traffic. This will be landing in Nightly near you very soon.

PKE Meter, a CPOW Geiger Counter

Logging “unsafe CPOW usage” is all fine and dandy if you’re constantly looking at the Browser Console… but who is constantly looking at the Browser Console? Certainly not me.

Instead, I whipped up a quick and dirty add-on that plays a click, like a Geiger Counter, anytime “unsafe CPOW usage” is put into the Browser Console. This has already highlighted some places where we can reduce unsafe CPOW usage in core Firefox code – particularly:

  1. The Page Info dialog. This is probably the worst offender I’ve found so far – humongous unsafe CPOW traffic just by opening the dialog, and it’s really sluggish.
  2. Closing tabs. SessionStore synchronously communicates with the content process in order to read the tab state before the tab is closed.
  3. Back / forward gestures, at least on my MacBook
  4. Typing into an editable HTML element after the Findbar has been opened.

If you’re interested in helping me find more, install this add-on[5], and listen for clicks. At this point, I’m only interested in unsafe CPOW usage caused by core Firefox code, so you might want to disable any other add-ons that might try to synchronously communicate with content.

If you find an “unsafe CPOW usage” that’s not already blocking this bug, please file a new one! And cc me on it! I’m mconley at mozilla dot com.
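For the curious, the heart of a Geiger-counter listener like this can be quite small. What follows is a sketch, not the add-on’s actual source: makeCPOWClickListener is a hypothetical helper, and it assumes the warning reaches console listeners with “unsafe CPOW usage” somewhere in its message text. In privileged Firefox code the listener would be registered with the console service (nsIConsoleService.registerListener); here it is fed two fake messages so the logic can be seen on its own.

```javascript
// Sketch only: not the actual add-on's source. makeCPOWClickListener is a
// hypothetical helper; playClick is whatever makes the noise.
function makeCPOWClickListener(playClick) {
  return {
    // The console service calls observe() once per console message.
    observe(consoleMessage) {
      let text = (consoleMessage && consoleMessage.message) || "";
      if (text.includes("unsafe CPOW usage")) {
        playClick(text);
      }
    },
  };
}

// In privileged Firefox code this would be hooked up with something like:
//   Services.console.registerListener(listener);
// Here we feed it two fake messages instead.
let clicks = [];
let listener = makeCPOWClickListener(text => clicks.push(text));
listener.observe({ message: "unsafe CPOW usage" });
listener.observe({ message: "some unrelated warning" });
console.log(clicks.length); // prints 1
```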


  1. I pronounce CPOW as “kah-POW”, although I’ve also heard people use “SEE-pow”. To each his or her own. 

  2. For further reading, Bill McCloskey discusses CPOWs in greater detail in this blog post. There’s also this handy documentation.

  3. I say probably, because in the single-process case, they’re not working with CPOWs – they’re accessing the objects directly as they used to. 

  4. And say where to get it from, especially if it’s not on AMO. 

  5. Source code is here 

The Joy of Coding (Episode 1)

Here’s the first episode! I streamed it last Wednesday, and it was mostly concerned with bug 1090439, which is about making the print dialog and progress calls from the child process asynchronous.

Here are the notes for that bug. I still haven’t closed it yet, so perhaps I’ll keep pressing on this next Wednesday when I stream Episode 2. We’ll see!

A note that I did struggle with some resolution issues in this episode. I’m working with Richard Milewski from the Air Mozilla team to make this better for the next episode. Sorry about that!


The Joy of Coding (or, Firefox Hacking Live!)

A few months back, I started publishing my bug notes online, as a way of showing people what goes on inside a Firefox engineer’s head while fixing a bug.

This week, I’m upping the ante a bit: I’m going to live-hack on Firefox for an hour and a half for the next few Wednesdays on Air Mozilla. I’m calling it The Joy of Coding[1]. I’ll be working on real Firefox bugs[2] – not some toy exercise-bug where I’ve pre-planned where I’m going. It will be unscripted, unedited, and uncensored. But hopefully not uninteresting[3]!

Anyhow, the first episode airs this Wednesday. I’ll be using #livehacking on irc.mozilla.org as a backchannel. Not sure what bug(s) I’ll be hacking on – I guess it depends on what I get done on Monday and Tuesday.

Anyhow, we’ll try it for a few weeks to see if folks are interested in watching. Who knows, maybe we can get a few more developers doing this too – I’d enjoy seeing what other folks do to fix their bugs!

Anyhow, I hope to see you there!


  1. Maybe I’ll wear an afro wig while I stream 

  2. Specifically, I’ll be working on Electrolysis bugs, since that’s what my focus is on these days. 

  3. I’ve actually piloted this for the past few weeks, streaming on YouTube Live. Here’s a playlist of the pilot episodes.