Tag Archives: gsoc

MoMo All-Hands: Day 3 (Data-Driven, Don’t Be Creepy, Italian-Chinese Dinner, Hipster-slamming)

At around 7:30AM, I rolled out of bed, cleaned myself up, and headed down to breakfast.

Breakfast that day was similar to the day before:  yogurt and granola.  Coffee and juice.  The cakes, however, had gotten the axe, and had been replaced by scones.

Very tasty.  A bunch of us ate breakfast out on the meeting room patio.  Once again, it was a gorgeous morning.

After breakfast, we all went inside to talk about data. Specifically, about how we aim to be data-driven.  This means that if we’re making a big decision about Thunderbird, or any of the other stuff we’re working on, we should probably have some solid data to back up that decision.  It’s a good idea; the road to bad design is paved with good intentions, and lack of data.

But how exactly are we going to get this data?  Are we simply going to monitor our users without their knowledge, like Big Brother, and study them like lab rats?  Are we going to collect reams of data about them secretly and silently in the background, without telling our users or giving them a choice?

Of course not, because that’d be evil.  And creepy.  Don’t track me, bro.

Instead, we will always ask the user if they’re interested in submitting data for study.  In general, our data collection is opt-in – and instead of tracking individuals, we aggregate the data, so that we never have a single person as a data point.  Nice.

A lot of ideas got tossed around about how we can ask the users for data, and what type of data we were interested in.  Some very interesting discussions took place regarding the Thunderbird “funnel” (the action path from visiting the Mozilla Thunderbird website, to downloading TB, to installing TB, to running TB, to making TB something commonly used).  Our funnel is pretty wide, but some website tweaks might make it even wider.  I’m excited to hear more about it.

After that, lunch.  Roasted chicken, mashed potatoes, veggies…once again, very tasty.  Cake for dessert.  We were getting pretty spoiled.

Following lunch, a bunch of us went outside to hear Andrew Sutherland talk about Wmsy – his constraint-based widgeting framework.  This was one of the talks that took place out on the patio, and the sun was blazing.  Much sunscreen had to go on, and I wish I’d brought sunglasses, because the image of the giant yellow pads of paper-on-easels that Andrew was drawing on was slowly being burnt into my retinas.  And then, sunscreen started getting into my eyes.  And yet, despite the blazing heat, the blinding sun, and the burning chemicals in my eyes, I was able to get a lot out of the talk.  Wmsy is pretty cool, and you should check it out.

After that, we went inside, and there was a bunch of GSoC talk.  Mentors talked about how it was working with GSoC students, and what kind of GSoC students we’d be looking for.  Then, a big brainstorm happened where we came up with potential GSoC projects.


As a former GSoC student, I have to say, it’s a really worthwhile program.  I had an awesome summer doing GSoC.  Highly recommended.  Thumbs up, Google.

After that, the meetings were over.  I headed upstairs to talk to my parents and Emily on Skype for a bit, and then headed down to the lobby for dinner.  A group of us were eating at “Chow Mein”, an Italian-Chinese fusion restaurant.

It was pretty good. Fettuccine on one side of my plate, barbecue pork fried rice on the other, and some salad…a delicious and eclectic meal.  As an added bonus, while refilling our glasses, our waiter told us in excruciating detail about how he got pulled over for DUI on his birthday.  On that note, we had a fantastic dessert, and then left.

The sun was down, and we walked slowly along the beach back towards the hotel.  We stopped off at the beach-side patio to hang out a bit first.


We raced Mai Tai umbrellas, and trash-talked hipsters.  It was probably the most hipster thing I did in Hawaii.

And speaking of hipsters (mildly NSFW):

Eventually, I made it back to my hotel room, and fell asleep.

Review Board Statistics Extensions: Karma, Stopwatch, and FixIt

I just spent the long weekend in Ottawa and Québec City with my parents and my girlfriend Em.

During the long drive back to Toronto from Québec City, I had plenty of time to think about my GSoC project, and where I want to go with it once GSoC is done.

Here’s what I came up with.

Detach Reviewing Time from Statistics

I think it’s a safe assumption that my reviewing-time extension isn’t going to be the only one to generate useful statistical data.

So why not give extension developers an easy mechanism to display statistical data for their extension?

First, I’m going to extract the reviewing-time recording portion of the extension. Then, RB-Stats (or whatever I end up calling it) will introduce its own set of hooks for other extensions to register with.  This way, if users want some stats, there will be one place to go to get them.  And if an extension developer wants to make some statistics available, a lot of the hard work will already be done for them.

And if an extension has the capability of combining its data with another extension’s data to create a new statistic, we’ll let RB-Stats manage all of that business.
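
To make the registration idea a bit more concrete, here’s a minimal sketch of what such a hook might look like.  Everything in it (StatsRegistry, register, collect) is a hypothetical name I made up for illustration; none of it is actual Review Board or RB-Stats code.

```python
class StatsRegistry:
    """Hypothetical central place where extensions publish statistics."""

    def __init__(self):
        # Maps (extension_name, stat_name) to a zero-argument callable.
        self._providers = {}

    def register(self, extension_name, stat_name, provider):
        """Register a callable that returns the current value of a statistic."""
        key = (extension_name, stat_name)
        if key in self._providers:
            raise ValueError("%s already registered %s" % key)
        self._providers[key] = provider

    def unregister_extension(self, extension_name):
        """Drop every statistic owned by an extension (e.g. when it is disabled)."""
        for key in [k for k in self._providers if k[0] == extension_name]:
            del self._providers[key]

    def collect(self):
        """Call every provider and return {"extension.stat": value}."""
        return {
            "%s.%s" % key: provider()
            for key, provider in sorted(self._providers.items())
        }
```

An extension would call register() when it is enabled, and the statistics view would only ever need to call collect().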

Stopwatch

The reviewing-time feature of RB-Stats will become an extension on its own, and register its data with RB-Stats.  Once RB-Stats and Stopwatch are done, we should be feature equivalent with my demo.

Review Karma

I kind of breezed past this in my demo, but I’m interested in displaying “review karma”.  Review karma is the reviews/review-requests ratio.

But I’m not sure karma is the right word.  It suggests that a low ratio (many review requests, few reviews) is a bad thing.  I’m not so sure that’s true.

Still, I wonder what impact displaying review karma will have.  Not just in the RB-Stats statistics view, but next to user names?  Will there be an impact on review activity when we display this “reputation” value?
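
For the record, the ratio itself is trivial to compute.  Here’s a small sketch; the function name and the choice to return None for someone with no review requests are my own assumptions, not settled design.

```python
def review_karma(reviews_given, review_requests_posted):
    """Return the reviews / review-requests ratio described above.

    Returns None when the user has posted no review requests, since the
    ratio is undefined in that case (an assumption made for this sketch).
    """
    if review_requests_posted == 0:
        return None
    return reviews_given / review_requests_posted
```

Someone who has done 6 reviews and posted 3 review requests would have a karma of 2.0; someone with 1 review and 4 requests would sit at 0.25.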

FixIt

This is a big one.

Most code review tools allow reviewers to register “defects”, “todos” or “problems” with the code up for review.  This makes it easier for reviewees to keep track of things to fix, and things that have already been taken care of.  It’s also useful in that it helps generate interesting statistics like defect density and defect detection rate (assuming Stopwatch is installed and enabled).
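
Here’s a rough sketch of how those two statistics might be computed, assuming the usual definitions: defect density as defects per thousand lines reviewed, and defect detection rate as defects found per hour of reviewing.  The function names and parameters are hypothetical; the reviewing time would come from something like Stopwatch.

```python
def defect_density(defects_found, lines_reviewed):
    """Defects per 1000 lines of code reviewed."""
    if lines_reviewed <= 0:
        raise ValueError("lines_reviewed must be positive")
    return defects_found / (lines_reviewed / 1000.0)


def defect_detection_rate(defects_found, review_seconds):
    """Defects found per hour of reviewing time."""
    if review_seconds <= 0:
        raise ValueError("review_seconds must be positive")
    return defects_found / (review_seconds / 3600.0)
```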

I’m going to tackle this extension as soon as RB-Stats, Stopwatch and Karma are done.  At this point, I’m quite confident that the current extension framework can more or less handle this.

Got any more ideas for me?  Or maybe an extension wish-list?  Let me know.

Python Eggs: Sunny Side Up, and Other Goodies (or How I Learned to Stop Worrying and Start Coding)

Cooking with Eggs

Every now and then, the computer gods smile and give me a freebie.

I’ve been worrying my mind out over a few problems / obstacles for my Review Board extensions GSoC project.  In particular, I’ve been worrying about dealing with extension dependencies, conflicts, and installation.

I racked my brain.  I came up with scenarios.  I drew lots of big scary diagrams on a wipe board.

And then light dawned.

Batteries Come Included

Enter Setuptools and Python Eggs.

All of those things I was worried about having to build and account for?  When using Python Eggs, it’s all built in. Dependencies?  Taken care of. Conflicts?  Don’t worry about it.  Installation?  That’s what Setuptools and Python Eggs were built for!

In fact, it even looks like Setuptools was designed with extensible applications in mind.

Wait, really?  How?

Here’s the setup.py file for the rb-reports extension in the rb-extensions-pack on Github:

from setuptools import setup, find_packages

PACKAGE="RB-Reports"
VERSION="0.1"

setup(
    name=PACKAGE,
    version=VERSION,
    description="""Reports extension for Review Board""",
    author="Christian Hammond",
    packages=["rbreports"],
    entry_points={
        'reviewboard.extensions':
        '%s = rbreports.extension:ReportsExtension' % PACKAGE,
    },
    package_data={
        'rbreports': [
            'htdocs/css/*.css',
            'htdocs/js/*.js',
            'templates/rbreports/*.html',
            'templates/rbreports/*.txt',
        ],
    }
)

Pay particular attention to the “entry_points” parameter.  It registers rbreports.extension:ReportsExtension under the entry point “reviewboard.extensions”.

“Hold up!”, I hear you asking. “What’s an entry point?”

Entry Points

An entry point is a named reference to an object inside an installed package (here, rbreports.extension:ReportsExtension).  Entry points are registered under a group, which is a unique identifier associated with an application that can accept extensions.

The group for Review Board extensions is “reviewboard.extensions”.

This is the first handshake, more or less, between Review Board and any extensions:  in order for Review Board to “see” the extension, the extension must register an entry point under “reviewboard.extensions”.

This blog post shows how extensions can be found and loaded up.
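
As a rough illustration of the discovery step, here’s a sketch that uses the standard library’s importlib.metadata (a newer alternative to the pkg_resources API that ships with Setuptools).  The group name comes from the setup.py above; the helper names find_extensions and load_extensions are made up for this example, and this is not how Review Board itself is guaranteed to do it.

```python
from importlib.metadata import entry_points

# The group name comes from the setup.py above; everything else is illustrative.
EXTENSION_GROUP = "reviewboard.extensions"


def find_extensions(group=EXTENSION_GROUP):
    """Return {name: EntryPoint} for everything registered under `group`."""
    all_entry_points = entry_points()
    if hasattr(all_entry_points, "select"):
        # Python 3.10+ exposes a select() method for filtering by group.
        matches = all_entry_points.select(group=group)
    else:
        # Older versions return a plain dict keyed by group name.
        matches = all_entry_points.get(group, [])
    return {ep.name: ep for ep in matches}


def load_extensions(group=EXTENSION_GROUP):
    """Import each registered object, skipping any that fail to load."""
    loaded = {}
    for name, ep in find_extensions(group).items():
        try:
            loaded[name] = ep.load()
        except Exception as exc:  # a broken extension should not take down the host
            print("Could not load extension %r: %s" % (name, exc))
    return loaded
```

With RB-Reports installed, load_extensions() would be expected to return a dictionary mapping “RB-Reports” to the ReportsExtension class; with nothing installed, it simply returns an empty dictionary.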

Other Goodies

INSTALLED_APPS and Django

I remember also being worried about how to create tables in Django for extension models.  I thought “holy smokes, I’m going to have to either shoehorn some raw SQL into the extension manager, or maybe even trust the extension developers to write the CREATE TABLE queries themselves!”.

Luckily, there’s a better alternative.

Django knows about its applications through a setting called INSTALLED_APPS, which is a list (or tuple) of app names. When you add a new model to a Django project, you simply add the model’s app to INSTALLED_APPS, and run “manage.py syncdb”.  Django does the magic, bingo-bango, and boom – tables created.

So if a new extension has some tables it needs created, I simply insert the app name of the extension into INSTALLED_APPS when the extension is installed, and call syncdb programmatically.  Tables created:  no sweat.
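
Here’s a small sketch of that idea, kept free of Django imports so it can run anywhere.  The function name and the sync parameter are hypothetical; in a Django project of this vintage, sync could be something like lambda: call_command("syncdb", interactive=False), using call_command from django.core.management.  A real implementation also has to make Django notice the newly added app, which this sketch glosses over.

```python
def enable_extension_app(installed_apps, app_name, sync=None):
    """Return a new app list that includes app_name.

    `sync` is an optional zero-argument callable that creates the tables;
    it is only called when the app was not already listed.
    """
    if app_name in installed_apps:
        # Already installed: nothing to add, nothing to sync.
        return list(installed_apps)

    updated = list(installed_apps) + [app_name]
    if sync is not None:
        sync()
    return updated
```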

django-evolution

Creating tables is easy.  But what if an extension gets updated, and the table needs to be modified?  Sounds like we’ve got a mess on our hands.

And don’t expect Django to save you.  When you modify a model in Django, you’re expected to go into that DB and alter that table by hand:

[syncdb] creates the tables if they don’t yet exist. Note that syncdb does not sync changes in models or deletions of models; if you make a change to a model or delete a model, and you want to update the database, syncdb will not handle that.
From The Django Book, Chapter 5: Models

Thankfully, there’s a mechanism that’s already built into Review Board that makes this trouble go away:  django-evolution.  Django-evolution, when used properly, will automatically detect changes in application models, and alter the database tables accordingly.  This is how Review Board does upgrades.

And to top that off, RB co-founder Christian Hammond just became the django-evolution maintainer.

Wow.  Everything is falling neatly into place.

My GSoC Project: Review Board Extensions

If you didn’t already know, Review Board is an open-source web-based code review tool.  The MarkUs Team has been using Review Board for pre-commit code review for about a year now.  This has given the team a number of advantages:

  1. For a team that usually has a 4-month turnover, this allows us to quickly get new team members up to speed with how to contribute to MarkUs.  We review every change that they propose, and give them tips/guidance on how to make it fit in well with the application.  They learn, and the application’s code stays healthy.
  2. We catch defects before they enter the code base.  Simple as that.
  3. We get a good sense of what other people are working on, and what is going on in the code.  Review Board has become a central conversation and learning hub for the developers on the MarkUs team.

So, the long and the short of it:  I like Review Board.  Review Board helps us write better code.  I want to make Review Board better.

So what am I proposing?

How to Avoid A Bloated Software Monster

You can never make some people happy.

No matter how decent your software is, someone will eventually come up to you and say:

Wow!  Your software would be perfect if only it had feature XYZ!  Sadly, because you don’t have feature XYZ, I can’t use it.  Please implement, k thx!

And so you either have to politely say “no”, and lose that user, or say “yes”, and add feature XYZ to the application.  And for users out there who don’t need, or don’t care about feature XYZ, that new feature just becomes a distraction and adds no value.  Make this happen a bunch of times, and you’ve got yourself a bloated mutha for a piece of software.

And we don’t want a bloated piece of software.  But we do want to make our users happy, and provide feature XYZ for them if they want it.

So what’s the solution?  We provide an extension framework (which is also sometimes called a plug-in architecture).

An extension framework allows developers to easily expand a piece of software to do new things.  So, if a user wants feature XYZ, we (or someone else) just create and make available an extension that implements the feature.  The user installs the extension, activates it, and bam – our user is happy as a clam with their new feature.

And if we make it super-easy to develop them, third-party developers can write new, wonderful, interesting extensions to do things that…well, we wouldn’t have considered in the first place. It’s a new place for innovation.  What’s that old cliché?

If you build it [the plug-in framework], they will come [the third-party developers who write awesome things]

And the developers do come.  Just look at Firefox add-ons or WordPress plugins.  Entire ecosystems of extensions, doing things that the original developers would probably have never dreamed of doing on their own.  Hell, I’ve even written a Firefox add-on. And users love customizing their Firefox / WordPress with those extensions.  It adds value.

So we get wins all over the place:

  • Our user gets their feature
  • The software gets more attractive because it’s flexible and customizable
  • The original software developers get to focus on the core piece of software, and let the third-party developers focus on the fringe features

And this is where I think I can help Review Board.

(Before I go on, if you’re interested, here’s another article on the how and the why of plug-in architectures)

Review Board Extensions

So if you look at the Review Board Wiki, or glance at the mailing lists, you’ll see numerous requests from users for new features.  For example:

It would be nice if the review board had a “next comment” button that is always available to click, or had a collapse/expand button. This would make it easier to see other people’s comments in cases like this.

It will be nice to have post-commit support. Instead of every post-commit review being a separate URL, if we could setup default rules for post-commit reviews to update an existing review providing the diff-between-diff features, it would be very useful.

The Review Board developers could smell the threat of bloated feature-creep from a mile away.  So, in a separate branch, they began working on integrating an extension framework into Review Board.

The extension branch, however, has been gathering dust, while the developers focus on more critical patches and releases.

My GSoC proposal is to finish off a draft of the extension framework, document it, and build a very simple extension for it.  My simple extension will allow me to record basic statistics about Review Board reviewers – for example, how long they spend on a particular review, their inspection rate, etc.

Having been a project lead on MarkUs for so long, it’s going to be a good experience to be back on “the bottom” – to be the new developer who doesn’t entirely have a sense of the application code yet.  It’s going to be good to go code spelunking again.  I’ve done some preliminary explorations, and it’s reminding me of my first experiences with MarkUs.  Like a submarine using its sonar, I’m slowly getting a sense of the code terrain.

I’ll let you know what my first few sweeps find.

Ping!

I’ve done it again:  I’ve let dust gather on my blog.

Quick update:

  • I’ve finished my courses for this semester, and have gone into full-blown research mode.
  • My research proposal is going through ethics review, in order to make sure that I’m not going to blow things up (or hurt anybody if I do)
  • While my paperwork is reviewed, I’m refining my procedure and apparatus.  Better and better.
  • I’ve been accepted into Google Summer of Code this year – I’ll be working on Review Board.  Details about my project will be the subject of an upcoming post, which I will toss up shortly.
  • I may or may not be co-directing a radio play.  I’ll let you know.
  • The MarkUs team is about to release version 0.7, and a fresh batch of Summer students will soon be here at UofT to work on it!
  • I have not forgotten about the UCDP trip to Poland.  I still have to tell you what we saw and did at Auschwitz.  Cripes – it’s almost a year since I returned, and I’m only half-way through the whole story.  And there’s a ton more to tell.  Coming soon.

Stay tuned.