I think in order to truly understand what the DocShell currently is, we have to find out where the idea of creating it came from. That means going way, way back to its inception, and figuring out what its original purpose was.
So I’ve gone back, peered through various archived wiki pages, newsgroup and mailing list posts, and I think I’ve figured out that original purpose.1
The original purpose can be, I believe, summed up in a single word: embedding.
Embedding
Back in the late ’90s, sometime after the Mozilla codebase was open-sourced, it became clear to some folks that the web was “going places”. It was the “bee’s knees”. It was the “cat’s pajamas”. As such, it was likely that more and more desktop applications were going to need to be able to access and render web content.
The thing is, accessing and rendering web content is hard. Really hard. One does not simply write web browsing capabilities into their application from scratch hoping for the best. Heartbreak is in that direction.
Instead, the idea was that pre-existing web engines could be embedded into other applications. For example, Steam, Valve’s game distribution platform, displays a ton of web content in its user interface. All of those Steam store pages? Those are web pages! They’re using an embedded web engine in order to display that stuff.2
So making Gecko easily embeddable was, at the time, a real goal, and a real project.
nsWebShell
The problem was that embedding Gecko was painful. The top-level component that embedders needed to instantiate and communicate with was called “nsWebShell”, and it was pretty unwieldy. Lots of knowledge about the internal workings of Gecko was leaked through the nsWebShell component, and its interface changed far too often.
It was also inefficient – the nsWebShell didn’t just represent the top-level “thing that loads web content”. Instances of nsWebShell were also used recursively for subdocuments within those documents – for example, (i)frames within a webpage. These nested nsWebShells formed a tree. That’s all well and good, except for the fact that there were things that the nsWebShell loaded or did that only the top-level nsWebShell really needed to load or do. So there was definitely room for some performance improvement.
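To make that inefficiency a little more concrete, here’s a toy model of a tree of shells. This is only a sketch: the Shell class, its methods, and the idea of counting “setups” are all made up for illustration and are not Gecko’s actual API. It just compares how many shells would do the top-level-only work when every node does it versus when only the root does.

```python
class Shell:
    """Hypothetical node in a shell tree; a toy model, not Gecko's real API."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

    def add_child(self, name):
        # A subdocument (e.g. an iframe) gets its own nested shell.
        child = Shell(name, parent=self)
        self.children.append(child)
        return child

    def is_top_level(self):
        return self.parent is None


def count_top_level_setups(root, only_root_does_setup):
    """Count how many shells would perform the (assumed) top-level-only work."""
    count = 0
    stack = [root]
    while stack:
        shell = stack.pop()
        if shell.is_top_level() or not only_root_does_setup:
            count += 1
        stack.extend(shell.children)
    return count


page = Shell("top-level page")
ad = page.add_child("ad iframe")
page.add_child("comments iframe")
ad.add_child("nested iframe")

wasteful = count_top_level_setups(page, only_root_does_setup=False)
lean = count_top_level_setups(page, only_root_does_setup=True)
print(wasteful, lean)  # prints: 4 1
```

With four shells in the tree, the “everyone does everything” approach repeats the work four times, where once would have done.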
In order to correct all of these issues, a plan was concocted to retire nsWebShell in favour of several new components and a slew of new interfaces. Two of those new components were nsDocShell and nsWebBrowser.
nsWebBrowser
nsWebBrowser would be the thing that embedders would drop into their applications – it would be the browser, and would do all of the loading / doing of things that only the top-level web browser needed to do.
The interface for nsWebBrowser would be minimal, just exposing enough so that an embedder could drop one into their application with little fuss, point it at a URL, set up some listeners, and watch it dance.
nsDocShell
nsDocShell would be… well, everything else that nsWebBrowser wasn’t. So that dumping ground that was nsWebShell would get dumped into nsDocShell instead. However, a number of new, logically separated interfaces would be created for nsDocShell.
Examples of those interfaces were:
- nsIDocShell
- nsIDocShellTreeItem
- nsIDocShellTreeNode
- nsIWebNavigation
- nsIWebProgress
- nsIBaseWindow
- nsIScrollable
- nsITextScroll
- nsIContentViewerContainer
- nsIInterfaceRequestor
- nsIScriptGlobalObjectOwner
- nsIRefreshURI
So instead of a gigantic, ever-changing interface, you had lots of smaller interfaces, many of which could eventually be frozen over time (which is good for embedders).
These interfaces also made it possible to shield embedders from various internals of the nsDocShell component that embedders shouldn’t have to worry about.
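Here’s a rough sketch of why lots of small interfaces help. The class and method names below are loosely inspired by nsIWebNavigation, nsIScrollable and nsIInterfaceRequestor, but they are my own simplifications in Python, not the real signatures. The point is only that an embedder can ask for one narrow interface and never see the rest of the component.

```python
from abc import ABC, abstractmethod


class WebNavigation(ABC):
    """Toy interface: navigation only (hypothetical, not the real IDL)."""

    @abstractmethod
    def load_uri(self, uri): ...


class Scrollable(ABC):
    """Toy interface: scrolling only (hypothetical, not the real IDL)."""

    @abstractmethod
    def scroll_to(self, x, y): ...


class InterfaceRequestor(ABC):
    """Toy interface: hand back an object implementing a requested interface."""

    @abstractmethod
    def get_interface(self, interface): ...


class ToyDocShell(WebNavigation, Scrollable, InterfaceRequestor):
    """One component implementing several small, separately stable interfaces."""

    def __init__(self):
        self.current_uri = None
        self.scroll_position = (0, 0)

    def load_uri(self, uri):
        self.current_uri = uri

    def scroll_to(self, x, y):
        self.scroll_position = (x, y)

    def get_interface(self, interface):
        if isinstance(self, interface):
            return self
        raise LookupError(f"{interface.__name__} is not supported")


def embedder_navigate(nav: WebNavigation, uri):
    # The embedder depends only on the narrow navigation interface.
    nav.load_uri(uri)


shell = ToyDocShell()
embedder_navigate(shell.get_interface(WebNavigation), "https://example.org/")
print(shell.current_uri)
```

If the scrolling interface changed tomorrow, embedder_navigate wouldn’t need to care, which is roughly the benefit the freezing plan was after.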
Ok, but… what was it?
But I still haven’t answered the question – what was the DocShell at this point? What was it supposed to do now that it was created?
This ancient wiki page spells it out nicely:
This class is responsible for initiating the loading and viewing of a document.
This document also does a good job of describing what a DocShell is and does.
Basically, any time a document is to be viewed, a DocShell needs to be created to view it. We create the DocShell, and then we point that DocShell at the URL, and it does the job of kicking off communications via the network layer, and dealing with the content once it comes back.
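As a very rough sketch of that flow (create the shell, point it at a URL, let it talk to the network layer, then deal with what comes back), here’s a toy version. FakeNetwork and LoadingShell are invented for this example, the “network” is just a dictionary, and the real thing is asynchronous and vastly more involved.

```python
class FakeNetwork:
    """Stand-in for the network layer: maps URLs to canned response bodies."""

    def __init__(self, pages):
        self._pages = pages

    def fetch(self, url):
        return self._pages[url]


class LoadingShell:
    """Hypothetical shell that initiates loading and viewing of one document."""

    def __init__(self, network):
        self._network = network
        self.events = []      # record of what happened, in order
        self.document = None  # the content currently being "viewed"

    def load(self, url):
        # 1. Kick off communication via the network layer.
        self.events.append(("start", url))
        body = self._network.fetch(url)
        # 2. Deal with the content once it comes back.
        self.events.append(("received", len(body)))
        self.document = body
        self.events.append(("displayed", url))


network = FakeNetwork({"https://example.org/": "<p>Hello, DocShell</p>"})
shell = LoadingShell(network)
shell.load("https://example.org/")
print([kind for kind, _ in shell.events])
```

Everything that wants to know about or influence a load ends up talking to the object in the middle, which is the “central nexus” problem in miniature.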
So it’s no wonder that it was (and still is!) a dumping ground – when it comes to loading and displaying content, nsDocShell is the central nexus point of communications for all components that make that stuff happen.
I believe that was the original purpose of nsDocShell, anyhow.
And why “shell”?
This is a simple question that has been on my mind since I started this. What does the “shell” mean in nsDocShell?
Y’know, I think it’s actually a fragment left over from the embedding work, and that it really has no meaning anymore. Originally, nsWebShell was the separation point between an embedder and the Gecko web engine – so I think I can understand what “shell” means in that context – it’s the touch-point between the embedder and the embeddee.
I think nsDocShell was given the “shell” moniker because it did the job of taking over most of nsWebShell’s duties. However, since nsWebBrowser was now the touch-point between the embedder and embeddee… maybe shell makes less sense. I wonder if we missed an opportunity to name nsDocShell something better.
In some ways, “shell” might make some sense because it is the separation between various documents (the root document, any sibling documents, and child documents)… but I think that’s a bit of a stretch.
But now I’m racking my brain for a better name (even though a rename is certainly not worth it at this point), and I can’t think of one.
What would you rename it, if you had the chance?
What is nsDocShell doing now?
I’m not sure what’s happened to nsDocShell over the years, and that’s the point of the next few posts in this series. I’m going to be going through the commits hitting nsDocShell from 1999 until the present day to see how nsDocShell has changed and evolved.
Hold on to your butts.
Further reading
The above was gleaned from the following sources:
- http://www-archive.mozilla.org/projects/blackwood/webclient/extra/03-02-00-travis.txt
- http://www-archive.mozilla.org/projects/embedding/webbrowser.html
- https://developer.mozilla.org/en-US/docs/Gecko/Embedding_Mozilla/
- http://www-archive.mozilla.org/projects/webshell/
- http://www-archive.mozilla.org/projects/embedding/embedapiref/embedapiIX.html
- http://www-archive.mozilla.org/projects/embedding/embedshell/design.html
- http://www-archive.mozilla.org/projects/webshell/design.html
- http://www-archive.mozilla.org/projects/embedding/docshell.html
- http://www-archive.mozilla.org/newlayout/doc/webwidget.html
- http://mxr.mozilla.org/mozilla-central/source/docshell/base/nsIDocShell.idl
1. I’m very much prepared to be wrong about any / all of this. I’m making assertions and drawing conclusions by reading and interpreting things that other people have written about DocShell – and if the telephone game is any indication, this indirect analysis can be lossy. If I have misinterpreted, misunderstood, or completely missed the point in any of the above, please don’t hesitate to comment, and I will correct it forthwith.
2. Originally I wrote that they happen to be using WebKit, the same web engine that powers Safari, and (until recently) Chromium. According to this, though, they’re using the Chromium Embedded Framework to display this web content. There are a number of applications that embed Gecko. Firefox is the primary consumer of Gecko. Thunderbird is another obvious one – when you display HTML email, it’s using the Gecko web engine in order to lay it out and display it. WINE uses Gecko to allow Windows binaries to browse the web. Making your web engine embeddable, however, has a development cost, and over the years, making Gecko embeddable seems to have become less of a priority. Servo is a next-generation web browser engine from Mozilla Research that aims to be embeddable.
My understanding was that Valve is actually embedding Chromium (possibly via the Chromium Embedded Framework) rather than using WebKit directly.
Hm, possibly, yes – though I was going off of this:
from this.
Where did you read about embedding Chromium directly? I’d be happy to update my post with additional sources.
Looks like Steam uses “Chromium Embedded Framework”, according to:
https://developer.valvesoftware.com/wiki/Chromium_Embedded_Framework
Here’s a pretty authoritative source:
https://developer.valvesoftware.com/wiki/Chromium_Embedded_Framework
Excellent – post updated with the correction and new source. Thanks Daniel and Peter!
It’s worth noting that DocShell and “outer” nsGlobalWindow have a 1-to-1 correspondence and jointly implement the concept of a Browsing Context [1] in HTML5. The latter was basically reverse-engineered from the former though, so it doesn’t say much from a historical perspective.
The diagrams at [2] can be useful. Ping me if you need any explanations on this stuff. Thanks for writing this all up!
[1] http://www.whatwg.org/specs/web-apps/current-work/multipage/browsers.html#browsing-context
[2] https://developer.mozilla.org/en-US/docs/Inner_and_outer_windows
Hm, I’m kind of sad that embedding has lost its priority.
Especially now that WebKit has such a monopoly on embedding, it would be good for the ecosystem to have some competition in this area as well.
So are there plans to improve the embedding in Gecko, or is the plan simply to “wait for Servo”?
I think it’s “wait for Servo”.