Category Archives: Computer Science

Python Metaclasses in Review Board

So, after diving into the Review Board extension code, I hit a little snag.

It turns out I never learned about Python metaclasses.  And the extension code in Djblets / Review Board uses them.  In the map I made of the Review Board extension code, I represented my confusion with an image of some kind of quantum-divide-by-zero implosion.

I’ll get to the why for metaclasses in a second.  First, I’ll demonstrate how it’s currently being used in the code.

The Way Things Are

This snippit is from the Djblets library, in the extensions folder, in base.py:

...

class ExtensionHook(object):
    def __init__(self, extension):
        self.extension = extension
        self.extension.hooks.add(self)
        self.__class__.add_hook(self)

    def shutdown(self):
        self.__class__.remove_hook(self)

class ExtensionHookPoint(type):
    def __init__(cls, name, bases, attrs):
        super(ExtensionHookPoint, cls).__init__(name, bases, attrs)

        if not hasattr(cls, "hooks"):
            cls.hooks = []

    def add_hook(cls, hook):
        cls.hooks.append(hook)

    def remove_hook(cls, hook):
        cls.hooks.remove(hook)

...

And here are those classes being used in the Review Board extension directory, where it defines its hooks:

...

class DashboardHook(ExtensionHook):
    __metaclass__ = ExtensionHookPoint

    def get_entries(self):
        raise NotImplemented

class NavigationBarHook(ExtensionHook):
    """
    A hook for adding entries to the main navigation bar.
    """
    __metaclass__ = ExtensionHookPoint

    def get_entry(self):
        """
        Returns the entry to add to the navigation bar.

        This should be a dict with the following keys:

            * `label`: The label to display
            * `url`:   The URL to point to.
        """
        raise NotImplemented
...

The idea here is that, if someone wanted to write an extension, they could add a hook to the navigation bar by subclassing the hooks, like so:

(from the rbreports prototype extension)

...

class ReportsDashboardHook(DashboardHook):
    def get_entries(self):
        return [{
            'label': 'Reports',
            'url': settings.SITE_ROOT + 'reports/',
        }]

class ReportsExtension(Extension):
    is_configurable = True

    def __init__(self):
        Extension.__init__(self)

        self.url_hook = URLHook(self, patterns('',
            (r'^reports/', include('rbreports.urls'))))

        self.dashboard_hook = ReportsDashboardHook(self)
...

So you can see the __metaclass__ thing up there in the DashboardHookDashboardHook subclasses ExtensionHook, and has ExtensionHookPoint as a metaclass.

Why? And what is a metaclass anyways?  Why is this useful at all?

Well, luckily, I think I’ve figured it out.

Behold – Metaclasses

Metaclasses are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don’t (the people who actually need them know with certainty that they need them, and don’t need an explanation about why).
— Python Guru Tim Peters

There are plenty of resources available regarding Python’s metaclasses.  And I ended up reading a ton of them trying to figure out just what the hell metaclasses actually do.

As it turns out, the best explanation I got was from Wikipedia:

In object-oriented programming, a metaclass is a class whose instances are classes. Just as an ordinary class defines the behavior of certain objects, a metaclass defines the behavior of certain classes and their instances.

In particular, check out their example on Cars.  That, for me,  more or less set the record straight on how metaclasses are used.

But why does Review Board use them when defining those hooks, like DashboardHook?

Why?

Forget the metaclasses for just a second.

Something is missing from that code that I posted up.

I’ll pose it as a question:  How exactly is Review Board supposed to know about ReportsDashboardHook?  Like…if we’re rendering the Dashboard, and want to fire off calls to all of our DashboardHooks…how do we do it?  How can we find all of the active extension subclasses for DashboardHook?

One way you could do it, is to dive into the extensions directories, and use regular expressions to find defined classes.  That sounds like a lot of work, brittle, and prone to error.  What else?

Next, you could go back to those extension files, “eval” them (shudder), and extract the globals(), looking for ExtensionHook subclasses.  That also sounds like lots of work, brittle, and prone to error.

You could go for class.__subclasses__…but this gives you every subclass, and we just want the hooks that are for active extensions.  We could filter out all of the ones that aren’t activated, but we’d have to do that for every hook, and that blows.

Next, we could have developers add their hooks to a registry after defining them.  So, we could have:

class ReportsDashboardHook(ExtensionHook):
  # ...
  # code goes here
  # ...

dashboard_hook_list.append(ReportsDashboardHook)

And then when an extension is shut down, we just remove it from the global hook list.  That doesn’t look so bad, does it?  The only problem, is now we have to trust that extension developers will know to add those hooks to the global_hook_list?  It’s another step, another thing to forget, and another point of failure.

And it seems silly – I mean, why go through all this effort?  Review Board has already seen these hooks…why should it be so hard to just access the list activated hooks easily?

And there it is. That’s why I think we’re using metaclasses.

If you go back to ExtensionHook and ExtensionHookPoint, you’ll notice that ExtensionHookPoint has a list as an instance variable called hooks.  That’s the global hook list for that class of hook (DashboardHook).  And then ExtensionHook, in its constructor, automatically adds itself to the Extension’s internal list of hooks, as well as the global hook list, with the add_hook method.  Shutting down the hook removes that hook from the global hook list.

So now, all we need to do to add our implemented hook to the list of hooks, is to simply subclass DashboardHook, or NavigationBarHook, or any of those Review Board hooks.  No need to add the hook to the global list manually – Djblets / Review Board will take care of it.  So now, we can get all of the DashboardHook subclasses for activated hooks, easily and consistently, with no extra work for the extension developer.  Same with the NavigationBarHook subclasses.

Smooth as glass.  Nice.

Anyhow, that’s my understanding of it.  If I’m totally off, I’ll let you know when I find out.

The Review Board Extension Life-Cycle

According to the timeline, I’m still in the community-bonding period for GSoC.  Coding for my project is supposed to start sometime towards the end of May.

So I’m using the time to do the following:

  • Close as many small, easy tickets as I can for the  upcoming Review Board release.  I’ve already posted some patches for review.  More forthcoming.
  • Get to know the tools I’ll be using.  Review Board is hosted on Github, and I’m relatively new to the whole DVCS thing.  I’ve been figuring out how to use Git, how to post patches, merging, branching, etc.
  • Get to know the area I’ll be working in.  I’ve been figuring out how Django apps organize themselves.  I’ve also drawn up a map of the current state of the extension framework to help me visualize it.
  • Get to know the other developers working on Review Board.  I’ve been hanging out in the #reviewboard-soc FreeNode IRC channel.  Very nice, and helpful people to work with.
  • Develop a plan of attack for my project

And this last point is the one I want to talk about.

The Review Board Extension Life-Cycle

An extension isn’t just some isolated piece of code that gets crammed into an application.  When you’ve got multiple extensions already installed and running, installing and activating a new extension is like introducing a new animal into an ecosystem.  You have to make sure that your new animal plays nice with the others, and that, in the morning, there won’t be a pile of rotting corpses where your application used to be.

I may have gotten carried away with my metaphor.

So let’s look at what I’m envisioning as the life-cycle for a Review Board extension.  I’ll start right from the top.

Getting the Extension

Ideally, this will work as nicely as WordPress’s implementation:  a Review Board administrator is given a catalog of extensions to choose from within the Administrator interface, and one click later, the desired extensions are downloaded and ready to be installed.

For all you system administrators out there, that last idea might make your toes curl.  A Review Board administrator is not necessarily a system administrator, and the system administrator knows what he/she likes on their machine.  An application that can go and download other applications can be dangerous.  We have to ensure that the application that we’re downloading is the one we’re trying to download.  The last thing we need is some man-in-the-middle to do something cute and bork the code review machine.  And we want to ensure that the extension functions as advertised.  No hidden features.  No self-destruct mechanisms.  No back doors.

Do I have a plan for this part?  Well….no, not really.  I don’t imagine I’ll get that far – I consider it a little out of my scope.  So, for my project, I think it’ll satisfy if the system admin (or Review Board admin) can manually download the extension, decompress it, and place it where Review Board can work with it.  That other stuff can come later (and I’ll try to design so that it can come later easily).

Installing the Extension

Ok, so at this point, we’ve got our extension downloaded in a place where Review Board can see it.

So now what?

Now we need to install the extension.  To me, that means letting the extension put its roots into the RB install by creating database tables, preparing initial data, and generally doing everything to make conditions suitable for the extension to function.

When an extension install begins, I imagine it will consider the following questions (in no particular order):

  1. Do I (the extension) depend on other extensions to function?  If so, are those extensions present?  If not, let the user know so that they can go get them.
  2. Are there some extensions already installed that will conflict with me, or make me behave badly?  If so, let the user know so that they can either remove that conflict, or find an alternative extension.
  3. Is the user entirely aware of what I can do?  Make sure that my capabilities, limitations, and behavioural quirks are known to the user.

Once those 3 questions are answered, and everything is looking good for the install, the extension will create the database tables it needs (if any) in order to function.  If anything goes wrong during this process, the database changes will be rolled back, and the user will get a full read out about what went wrong.

If nothing goes wrong, the extension will be installed.  The user might then be asked to set some initial operating parameters for the extension.

Ok great – the extension is installed.  Now what?

Activating the Extension

For something that sounds so dramatic, the explanation about what happens is pretty short:  Review Board simply becomes aware that the extension is activated, and passes data through the necessary hooks in order for the extension to function properly.  Upon activation, the extension should do a quick double-check to ensure that all prerequisites for the extension have been met.  This is because we can’t trust users to activate an extension immediately after downloading them.

The Extension Runs

While it’s activated, the extension will probably react to various events that happen on Review Board.  Tables will be updated.  View methods will be run.  Templates will be rendered.

If anything, ever, goes horribly wrong with an extension, the following will happen:

  1. A log entry will be written, dumping the error, and information about what the user was trying to do
  2. An error message will be displayed to the user, trying to tell them what exactly happened
  3. The extension (and its dependents) will be deactivated.  They will no longer react to events on Review Board.  Their tables will still be there, the settings will still, but the extension will be, in essence, asleep.  Dormant.  Non-reactive.

The administrator could try to reactivate the extension at this point.  They might try to contact the extension developer for support.

The Extension is Un-installed

If the user wants to rid themselves of an extension, they must first deactivate it.  This will put it (and its dependents) in the dormant state.  It just switches them off, nothing else.

Deactivated extensions can then be un-installed.  If the user chooses to un-install the extension, the extension database tables and settings will be wiped out.

The extension itself won’t be deleted though – at least, not within the scope of my project.  The extension files will need to be removed from Review Board manually.

At this point, any dependents that this uninstalled extension had will no longer be able to be activated.

Anyhow, that’s how I envision the life-cycle.  It’s my first go at it, so I’d love to hear some feedback if you have any.

Code Spelunking: Review Board Extensions

So this summer, I’m working on Review Board for the Google Summer of Code.

Until my GSoC acceptance, my romps into the code had been relatively shallow.  But with my proposal being given the green light, I’ve started doing more extensive explorations.

Review Board is built using the Django web framework.  I haven’t worked with Django before, but I have quite a bit of experience with Rails, so that should be an asset.  Using a web framework means having (relatively) predictable source code layout, and Review Board is no exception.

Djblets

At one point or another, the Review Board developers realized that a lot of their code wasn’t Review Board specific, and could be abstracted out into an external library.

That library is called Djblets.

Among other things, Djblets adds a DataGrid component for easy record sorting and pagination.  There are improvements to Django’s Authentication system.  Functions for easily displaying a user’s Gravatar.

And, low and behold, there is a branch of Djblets that provides classes and functions for giving a Django application an extension framework.  The classes are abstract enough so that, in your Django application, you can specify different types and behaviours for your Hooks.

Djblets -> Review Board

The Review Board extension branch takes these Djblets extension classes, and extends them into DashboardHooks, NavigationBarHooks, ReviewRequestDetailHooks…lots of different hooks.

So, Djblets creates the foundation abstractions.  Review Board makes these abstractions a little more specific.  And then an extension writer needs to instantiate and use these classes to design their extensions.  It sounds complicated, I know.

So Let’s Map It Out

When I start learning a new code base, I do a lot of drawing.  To me, getting to now a code base is like getting to know a city, and that means walking around it, and mapping it out.

So I’ve taken the liberty of mapping out the extension classes that I’ve found, and how they relate to one another.  Note that at the bottom of my map, a simple extension (RB Reports) is using some of those classes to hook itself into Review Board.  You can find this, and other extensions,here.

My map of the extension framework

Click here to check out my map of the current state of the extension framework

Now, before someone in the department starts complaining about my misuse of UML:  I’m not a UML guy.  I just wanted an easy piece of diagramming software, and the one that I found (Dia), did UML.  I just wanted something to draw boxes and lines. So please don’t freak out if you think I’m using the wrong symbols.

One symbol you might be wondering about is the blue quantum-flux-capacitor-implosion.

I’ll save that for a future post.

Some Notes About My Experiment Design

This is mostly a reminder note to myself, but I thought I’d post it publicly.

So, remember the experiment I’m conducting?  I’ve been testing out various components of it while I wait for ethics board approval, and some interesting questions have come up.  Some of these questions have already been raised in other posts (and in comment threads – thanks for the discussion, all!), but I want to summarize them here.

Access to the Internet

When participants are working on the assignments, they will be given full access to the Internet.

I had a bunch of conversations with undergrad instructors here at UofT, and across the board, during programming exercises, students have full access to the Internet.  They’re prohibited from just copying and pasting code from somewhere else into their assignments, but they can certainly look at online documentation and examples to get some ideas.  So I’m going to allow this as well, in order to better model an actual undergraduate assignment.

I’m also considering writing a script that will take a screen capture every 30 seconds or so in the background.  This way,  I can quickly get a sense of the participants activity during the assignments.  This should hopefully give me some idea of how much time they’re spending in the browser, and how much time they’re spending writing Python.

Participant Programming Ability

The only prerequisite for my participants is that they must have 4+ months of Python programming experience.  I’m not filtering out the stronger or weaker programmers.  I’m taking them all.

Why?  Because I think this will give me a more accurate model of the composition of a first or second year programming course.  From my experience, students in first and second year programming courses have a wide spectrum of strengths and abilities.

But what if they don’t complete any of my assignments?  What if they stare blankly at the screen, or give up, or just surf Reddit because they can’t understand what I’m asking from them?  That’s OK – completion of the assignments is not necessary, and again – this relaxation helps to better model a real programming course (because there are invariably students who don’t start, or don’t finish a programming assignment).

And it’s OK if they don’t do well.  I’m taking a reading of their programming abilities with the first assignment, getting them to do peer grading, and then taking a reading with a second assignment.  I don’t care what absolute grades they get, I care about what happens after they’ve done the grading.

I might also see some interesting trends – for example, participants who perform poorly on the first assignment might benefit greatly from the peer grading, whereas participants who perform strongly on the first assignment might see little benefit.  Or vice-versa.  Or neither.  Granted, I probably won’t be able to collect enough participants to make such statements with statistical strength, but if a signal appears to be there, it’d be an interesting direction for future work.

30 Minute Time Limit

Students will have 30 minutes maximum to complete each assignment.  And that’s absolute – it includes reading the assignment, surfing the Internet, and coding it up.

This is because a hard time limit like this, again, better models the state of classrooms as they are.  In an open-book exam, you’re not timed on how long you’re spent only writing, with the clock paused as you glance at your textbook.  You have a firm time limit, and that’s it.

In an ideal world, students would be given a more personal level of teaching, as opposed to this mechanical, factory-floor approach.  But if I (or anybody) is ever going to implement these ideas into a real classroom setting, I’ll want to make sure it can be easily adapted into the teaching environment as it already is.