Enumerating the ways of choosing teams in a group of players


Suppose you have a bunch of people, and you want to break them up into m teams of size n. (Therefore you have a total of nm people.) Today’s Little Program will enumerate the ways this can be done.

Formally, let’s say that you have a collection of size nm, and you want to enumerate the ways of partitioning the collection into m subsets, each subset of size n. The order of elements within each subset does not matter, and the order of the subsets doesn’t matter. That’s saying that a team of Alice and Bob is the same as a team of Bob and Alice, and Alice-Bob versus Charlie-David is the same as Charlie-David versus Alice-Bob.

The number of ways of doing this is (nm)!/n!mm!. You can see this by first taking all permutations of the players, then dividing out by the things that cause us to overcount: The number of ways of ordering players within each team is n!, and there are m teams, and there are m! ways of ordering the teams themselves. (Note that this is a cute way of expressing the result, but you shouldn’t use it for computation. A slightly better way for computation would be (Π1 ≤ knC(mk, m))/m!.

Okay, but how do you generate the teams themeselves?

Let’s first see how to generate the first team. Well, that’s easy. You just select n players and call them Team 1.

This leaves you n(m − 1) players with which to form m − 1 teams, which you can do recursively.

function Teams(n, m, f) {
 var a = [];
 for (var i = 1; i <= n * m; i++) {
  a.push(i);
 }

 if (m == 1) { f([a]); return; }

 Subsets(n * m, n, function(s) {
  var rest = a.filter(function(i) { return s.indexOf(i) < 0; });
  Teams(n, m - 1, function(t) {
    f([s].concat(t.map(function(team) {
        return team.map(function(i) { return rest[i-1]; });
       })));
  });
 });
}

Teams(2, 3, logToConsole);

The first part of this function builds an array of the form [1, 2, 3, ..., n * m]. If we are asking for only one team, then everybody is on the same team. Otherwise, for all possible choices of n-member teams, first see which people haven’t yet been picked for a team. Then generate all remaining possible team arrangements for those leftovers, and combine them to form the final team rosters.

The combination step is tricky because the recursive call generates subsets in the range [1, 2, 3, ..., n * (m-1)], and we need to convert those values into indices into the array of people waiting to be picked.

Note that this algorithm over-counts the possibilities since it generates both [[1,2],[3,4]] and [[3,4],[1,2]]. In other words, it assumes that team order is important (say, because the first team will wear red jerseys and the second team will wear blue jerseys). In the original problem statement, the order of the teams is not significant. (Maybe we’ll let them pick their own jersey colors.)

To solve that, we impose a way of choosing one such arrangement as the one we enumerate, and ignore the rest. The natural way to do this is to select a representative player from each team in a predictable manner (say, the one whose name comes first alphabetically), and then arranging the representatives in a predictable manner (say, by sorting them alphabetically).

The revised version of our algorithm goes like this:

function Teams(n, m, f) {
 var a = [];
 for (var i = 1; i <= n * m; i++) {
  a.push(i);
 }

 if (m == 1) { f([a]); return; }

 a.shift();
 Subsets(n * m - 1, n - 1, function(s) {
  var firstTeam = [1].concat(s.map(function(i) { return i+1; }))
  var rest = a.filter(function(i) { return s.indexOf(i) < 0; });
  Teams(n, m - 1, function(t) {
    f([firstTeam].concat(t.map(function(team) {
      return team.map(function(i) { return rest[i-1]; });
     })));
  });
 });
}

Teams(2, 3, logToConsole);

The first part of the function is the same as before, but the recursive step changes.

We remove the first element from the array. That guy needs to belong to some team, and since he’s the smallest-numbered guy, he will be nominated as the team representative of whatever team he ends up with, and since he’s the smallest-numbered guy of all, he will also be the first team representative when they are placed in sorted order. So we pick him right up front.

We then ask for his n - 1 teammates, and together they make up the first team. The combination is a little tricky because the Subsets function assumes that the underlying set is [1, 2, ..., n-1] but we actually want the subset to be of the form [2, 3, ..., n]; we fix that by adding 1 to each element of the subset.

We then find all the people who have yet to be assigned to a team and recursively ask for m - 1 more teams to be generated from them. We then combine the first team with the recursively-generated teams. Again, since the recursively-generated teams are numbered starting from 1, we need to convert the returned subsets into the original values we saved away in the rest variable.

Renumbering elements is turning into a bit of a bother, so let’s tweak our original Subsets function. For example, we would prefer to pass the set explicitly rather than letting Subsets assume that the set is [1, 2, 3, ..., n], forcing us to convert the indices back to the original set members. It’s also convenient if the callback also included the elements that are not in the subset.

function NamedSubsets(a, k, f) {
 if (k == 0) { f([], a); return; }
 if (a.length == 0) { return; }
 var n = a[a.length - 1];
 var rest = a.slice(0, -1);
 NamedSubsets(rest, k, function(chosen, rejected) {
  f(chosen, rejected.concat(n));
 });
 NamedSubsets(rest, k-1, function(chosen, rejected) {
  f(chosen.concat(n), rejected);
 });
}

function takeAndLeave(chosen,rejected) {
 console.log("take " + chosen + ", leave " + rejected);
}

NamedSubsets(["alice", "bob", "charlie"], 2, takeAndLeave);

The Named­Subsets function takes the last element from the source set and either rejects it (adds it to the “rejected” parameter) or accepts it (adds it to the “chosen” parameter).

With the Named­Subsets variant, we can write the Teams function much more easily.

function Teams(a, m, f) {
 var n = a.length / m;
 if (m == 1) { f([a]); return; }

 var p = a[0];
 NamedSubsets(a.slice(1), n - 1, function(teammates, rest) {
  var team = [p].concat(teammates);
  Teams(rest, m - 1, function(teams) {
    f([team].concat(teams));
  });
 });
}

Teams([1,2,3,4,5,6], 3, logToConsole);

Assuming we’re not in one of the base cases, we grab the first person p so he can be captain of the first team. We then ask Named­Subsets to generate his teammates and add them to p‘s team. We then recursively generate all the other teams from the people who haven’t yet been picked, and our result is our first team plus the recursively-generated teams.

There is a lot of potential for style points with the Named­Subsets function. For example, we can avoid generating temporary copies of the a array just to remove an element by instead passing slices (an array and indices marking the start and end of the elements we care about).

function NamedSubsetsSlice(a, begin, end, k, f) {
 if (k == 0) { f([], a.slice(begin, end)); return; }
 if (begin == end) { return; }
 var n = a[end - 1];
 NamedSubsetsSlice(a, begin, end - 1, k, function(chosen, rejected) {
  f(chosen, rejected.concat(n));
 });
 NamedSubsetsSlice(a, begin, end - 1, k-1, function(chosen, rejected) {
  f(chosen.concat(n), rejected);
 });
}

function NamedSubsets(a, k, f) {
 NamedSubsetsSlice(a, 0, a.length, k, f);
}

We could use an accumulator to avoid having to generate closures.

function AccumulateNamedSubsets(a, begin, end, k, f, chosen, rejected) {
 if (k == 0) { f(chosen, rejected.concat(a.slice(begin, end))); return; }
 if (begin == end) { return; }
 var n = a[begin];
 AccumulateNamedSubsets(a, begin + 1, end, k-1, f, chosen.concat(n), rejected);
 AccumulateNamedSubsets(a, begin + 1, end, k, f, chosen, rejected.concat(n));
}

function NamedSubsetsSlice(a, begin, end, k, f) {
 AccumulateNamedSubsets(a, begin, end, k, f, [], []);
}

function NamedSubsets(a, k, f) {
 NamedSubsetsSlice(a, 0, a.length, k, f);
}

For bonus style points, I recurse on the start of the range rather than the beginning so that the results are in a prettier order.

We can also get rid of the temporary accumulator objects by manipulating the accumulators destructively.

function AccumulateNamedSubsets(a, begin, end, k, f, chosen, rejected) {
 if (k == 0) { f(chosen, rejected.concat(a.slice(begin, end))); return; }
 if (begin == end) { return; }
 var n = a[begin];
 chosen.push(n);
 AccumulateNamedSubsets(a, begin + 1, end, k-1, f, chosen, rejected);
 chosen.pop();
 rejected.push(n);
 AccumulateNamedSubsets(a, begin + 1, end, k, f, chosen, rejected);
 rejected.pop();
}

And then we can take advantage of the accumlator version to pre-select the first player when building teams.

function Teams(a, m, f) {
 var n = a.length / m;
 if (m == 1) { f([a]); return; }

 AccumulateNamedSubsetsSlice(a, 1, a.length, n - 1, function(team, rest) {
  Teams(rest, m - 1, function(teams) {
    f([team].concat(teams));
  });
 }, [a[0]], []);
}

There is still a lot of potential for improvement here. For example, you can switch to the iterative version of Subsets to avoid the recursion on subset generation. You can use an accumulator in Teams to avoid generating closures. And if you are really clever, you can eliminate many more temporary arrays by reusing the elements in the various recursively-generated arrays by shuffling them around. But I’ve sort of lost interest in the puzzle by now, so I won’t bother.

Comments (16)
  1. Brian_EE says:

    All well and good, but this fails to take into account desireability of individual players, because, you know, no one wants that dorky kid who spends all his time writing little programs on their sports team.

  2. nathan_works says:

    Well, what's also been interesting about these set of enumeration problems is that at least one or two of them are a variant of some interview questions I was asked when applying to work at MSFT. This one in particular, though I didn't work out in the time provided how to eliminate the duplicates (alice-bob same as bob-alice)

  3. Joe says:

    This solution seems vastly overcomplicated. Once you know the number of the teams, you know the number of people on each team and the rest is cake.

    [There are m teams and n players on each team. Now go bake some cake! -Raymond]
  4. Joe says:

    My point is that you make the calculation of m and n and then chop up the array, which is certainly going to be much faster than the solutions shown.

    [But deciding how to chop up the array is the hard part. Maybe you can just post a blog entry showing the easy way, then link to it here. -Raymond]
  5. pc says:

    @Brian_EE: But this can be used for much more than for sports teams! Perhaps one wants to make teams for a math or trivia competition, or even a really exciting sport like Robotics.

  6. 12BitSlab says:

    @ pc:

    The members of the team need not be humans.  Not all  "games" are sports designed to be played by people.

  7. Brian_EE says:

    @12BitSlab: Presumably, non-humans don't wear colored jerseys. Following the basic tenants of Raymond's "Psychic Debugging" technique, I equated colored jerseys with sports (for most cases). When it comes to sports, the nerdy kids always get picked last.

  8. Maurits says:

    It would be interesting to generalize this to the situation where you are choosing from a pool of p > mn players, for which the number of solutions is C(p, mn) * (number of solutions for problem as stated).

  9. Katie says:

    I may be misunderstanding him, but I think Joe is ignoring the "enumerate" part of the problem, and suggesting something like this:

    for i = 1..m

     print "Players %((m-1)*n+1) through %(m*n) are on team %(m)."

    I guess he is right about it being simpler and faster…

  10. Joe says:

    The problem states that the set equals m * n. The player issue is irrelevant, so is the enumerate issue; all that matters is calculating the values for m and n. Reduce the problem, folks.

    [Not sure why you're saying that the enumeration is irrelevant. The problem is to "enumerate all the ways this can be done." The values of m and n are part of the problem statement. "Given m and n, enumerate all the ways of breaking nm players into m teams of size n." For example, given n=2 and m=2, the answer is "There are three ways of breaking 4 players into two equal teams: (1+2 vs 3+4), (1+3 vs 2+4), and (1+4 vs 2+3)." -Raymond]
  11. Joshua says:

    I think he's right about faster but *that's* not simpler. Maybe he thinks is a constant.

  12. Joe says:

    My fault. I completely misinterpreted the challenge. I hang my head in shame. So, forget everything I said. Hey look, a squirrel!

  13. Neil says:

    @Maruits I think you would just vary the first team member over the range 0 .. p – mn, then recurse as before; it looks like the code would cope with that quite happily.

    [You just wrap the function inside another loop that uses Subsets to generate all ways of choosing nm items from a pool of p. -Raymond]
  14. frenchguy says:

    @Neil: you need to recurse that first choice as long as the number of players to choose from is greater than the number of players remaining to choose. Say you have 11 players and want to make 2 teams of 4. If you choose player #2 initially, you still have to choose 7 players out of 9.

  15. RandomInternetDude says:

    First thing I thought when I saw this: use MiniZinc.  Hakan Kjellerstrand posted two different models in response (choosing_teams, under 'Combinatorial Problems').  http://www.hakank.org/minizinc

  16. Simon Buchan says:

    Looks like a bug in the first version: rest is taking everyone not in 's', when it should be looking in 'first_team' or looking for 'i + 1'.

    I've been trying to convert these to C++14, lots of interesting tradeoffs with value by default and the way lambdas work there. Sometimes the C++ comes out ahead!

Comments are closed.