Finding Good Answers Quickly

I recently had a difficult problem to solve: assigning 94 students into teams of four. The first reason it’s difficult is that you end up with 23.5 teams, and since we can’t chop students in half, we have to go with 22 teams of four, and two teams of three.

But we’re not done yet: the competition should be fair, in the sense that teams should be roughly equally matched: we don’t want to have one really good team that easily wins. So we get some estimations of students’ abilities— let’s say they get a score of 1 (worst), 2 or 3 (best). Then we can define a team’s strength as the sum of the students’ ability scores, and we want teams strengths to be roughly equal.

But we might even still have further requirements: maybe we want students to not be in teams with students they’ve already worked with (so they meet more people), or not put students who have done the competition together (as experience isn’t quite the same thing as ability).

Now this problem isn’t difficult because it’s hard to solve, but because it’s difficult to figure out exactly what the requirements are. The way I’d like to solve this problem is by specifying a small part of the problem, having something give me a possible solution, and then iterating to further define the problem (to avoid the bad cases I see).

I’ve had to solve a few problems of this nature, and after writing almost the same Python script each time, I refactored out the common code into Hufflepuff, an open-source project I released today. Like any good UNIX tool, it’s not especially interesting or complex itself, but lets you glue together other simple tools to produce something useful.

Let’s see how it works with a cut-down version of the example above. First, create a working directory, install a Python virtualenv and install Hufflepuff from pypi, and test it works:

$ mkdir -p ~/tmp/teams; cd ~/tmp/teams
$ virtualenv .
$ . bin/activate
(teams)$ pip install hufflepuff
(teams)$ hufflepuff -h
usage: hufflepuff [-h] [-c CONFIG_FILE] [--initial-binary INITIAL_BINARY]
                  [--initial-file INITIAL_FILE]
                  [--mutate-binary MUTATE_BINARY]
                  [--score-binary SCORE_BINARY] [--beam-width BEAM_WIDTH]
                  [--expand-mantissa EXPAND_MANTISSA]
                  [--expand-multiplier EXPAND_MULTIPLIER] [--target TARGET]
                  [--num-iterations NUM_ITERATIONS]
                  [--parallelism PARALLELISM]

We’ll also need some test data to play with, and for simplicity, we’ll just have a few students:

(teams)$ python
>>> import json, random
>>> with open('students.json', 'w') as f:
...   json.dump([
...     {
...       'name': name,
...       'skill': random.randint(1, 3),
...       'experience': random.random() < 0.3,
...     }
...     for name in 'ABCDEFGHIJKLMN'
...   ], f)
>>> [Ctrl-D]

You should now have a students.json file with some students, each with a unique name, skill level, and experience indicating whether they’ve done the competition before or not.

Now we need to provide three things to Hufflepuff: an initial assignment of students to teams, a program to mutate those teams, and a program to score a team assignment.

For the initial assignment, we need to specify the team sizes, and provide a valid assignment: so let’s just put the students in order. Here’s, which goes into your working directory:

import json

# For 14 students, two teams of four, and two of three.
team_sizes = [4] * 2 + [3] * 2

with open('students.json') as f:
    students = json.load(f)
assert sum(team_sizes) == len(students)

print json.dumps([
    [students.pop(0)['name'] for _ in range(team_size)]
    for team_size in team_sizes
assert len(students) == 0

Now when you run it, you should get the following output. It should be fairly obvious: the first team has four people (A, B, C, and D).

(teams)$ python
[["A", "B", "C", "D"], ["E", "F", "G", "H"], ["I", "J", "K"], ["L", "M", "N"]]

The next thing to provide to Hufflepuff is a way of mutating the state. In this example, the easiest thing to do is to swap a pair of students in different teams. We need to be carefuly to only produce valid states (every student should be in exactly one team), and ideally we shouldn’t duplicate states— so since teams are a set of students, let’s keep them in sorted order within the team.

Here’s, also to be placed in the working directory:

import json
import random
import sys

# Read lines from stdin (ending cleanly on EOF):
for line in iter(sys.stdin.readline, ''):
    teams = json.loads(line)

    # Swap between 1-3 pairs.
    for _ in range(random.randint(1, 3)):
        # Select two teams to swap between.
        team_a, team_b = random.sample(teams, 2)

        # Choose a person from each team.
        person_a = random.choice(team_a)
        person_b = random.choice(team_b)

        # Swap them.
        team_a[:] = sorted(
            person_b if name == person_a else name
            for name in team_a
        team_b[:] = sorted(
            person_a if name == person_b else name
            for name in team_b

    print json.dumps(teams)

You should be able to run it as below. The first JSON list is input, the second is output, and it should continue accepting input until you send EOF (by pressing Ctrl-D).

(teams)$ python
[["A", "B", "C", "D"], ["E", "F", "G", "H"], ["I", "J", "K"], ["L", "M", "N"]]
[["B", "C", "J", "M"], ["E", "F", "G", "L"], ["D", "I", "K"], ["A", "H", "N"]]

You’ll have noticed that it swaps a few pairs of people around. The initial state will be pretty bad, so we want to make a lot of swaps, to quickly head towards the better state. Towards the end of the search, lots of swaps will usually be bad, and we want single swaps to settle on the local optimum. There are more sophisticated things you could do, but I’ve found this is easy and works well enough in practice.

The last thing we need to provide is some way of scoring the state— nicely congregating all your experimental decisions of what to optimise into a simple problem: giving a score for a team assignment.

In this case, it’s easiest to specify the score as a “cost”, and say how bad an assignment is. We’ll inflict penalties for teams with more than one experienced student, and for the difference between a team’s strength and the average team strength. Here’s, also to be placed in the working directory:

import json
import sys

with open('students.json') as f:
    students_by_name = {
        student['name']: student
        for student in json.load(f)

def experience_penalty(teams, penalty=100):
    score = 0
    for team in teams:
        n_experienced = sum(1 for student in team if student['experience'])
        if n_experienced > 1:
            score += penalty
    return score

def skill_penalty(teams):
    strengths = [
        sum(student['skill'] for student in team)
        for team in teams

    average_strength = sum(strengths) / float(len(strengths))

    return sum(
        abs(average_strength - strength)
        for strength in strengths

for line in iter(sys.stdin.readline, ''):
    teams = json.loads(line)

    # Replace students' names with their full info.
    teams = [
        [students_by_name[name] for name in team]
        for team in teams

    score = experience_penalty(teams) + skill_penalty(teams)
    print score

If you run it, and provide it the initial team JSON, it should print out some score (which will vary based on your students.json file):

(teams)$ python
[["A", "B", "C", "D"], ["E", "F", "G", "H"], ["I", "J", "K"], ["L", "M", "N"]]

A side note: it’s handy to put a “debug mode” into the scoring program, so you can give it a solution, and quickly see how well it does on each of the conditions. This can be useful for distinguishing between multiple equally-scoring solutions, and also for telling if the search found the optimum.

Finally, we’ll write a search.cfg file to specify what we want Hufflepuff to do (we could also just pass the parameters as command line args, but this is nicer):

initial-binary = python
mutate-binary = python
score-binary = python
target = min

Then, all we need to do is to run Hufflepuff, which will take all those things, and run a beam search. As it runs, it prints out the scores of the items in it’s current search beam to stderr, and finally prints out [score, state] JSON-encoded lists to stdout of the best states it ever saw.

(teams)$ hufflepuff -c search.cfg
[2.0, [["A", "C", "D", "K"], ["B", "G", "J", "L"], ["F", "H", "I"], ["E", "M", "N"]]]
[2.0, [["A", "C", "D", "K"], ["B", "G", "J", "L"], ["F", "H", "N"], ["E", "I", "M"]]]
[2.0, [["A", "C", "E", "G"], ["D", "H", "J", "M"], ["B", "I", "L"], ["F", "K", "N"]]]
[2.0, [["A", "C", "G", "L"], ["I", "J", "K", "M"], ["D", "F", "N"], ["B", "E", "H"]]]
[2.0, [["A", "C", "L", "N"], ["I", "J", "K", "M"], ["D", "F", "G"], ["B", "E", "H"]]]

As you can see, it found a bunch of possible solutions, all of which had score 2.0. I can then manually inspect those states, determine if there’s any major problems with them, and refine the scoring program if necessary.

A cute little trick is that you can take those states (without the score), put them into best.json, and then run hufflepuff -c search.cfg --initial-binary='cat $initial-file' --initial-file=best.json to start the search from those states (and you can make it less ugly by defining initial-binary in search.cfg since that’s the default).

Pull requests, bugs and feature requests are welcome, and I’m also happy to answer questions in the comments below. And if case you were wondering, here’s the inspiration for the name.