Contest phases (all times 00:00:00 UTC):

  Darkness   1998-12-14
  Twilight   1998-12-15
  Daylight   1998-12-16
  Finish     1998-12-21

Binpack - Statistics

Analysis

To wrap up this inaugural on-line edition of the MATLAB Contest, we offer two different views of the contest.

First, we have a collection of graphical views of the contest as it evolved over the course of the week. Here you can see the strategic breakthroughs and the optimization wars as they played out in the contest rankings.

Second, we offer a detailed analysis of the winning entry of the competition. We point out some of the clever algorithmic tricks used to leap ahead of the competition and offer notes on some of the optimizations that you may be able to apply to your own work.

Thanks for joining us. Please send your comments, questions, suggestions, and stories to us at contest@mathworks.com. Look for the next contest to be announced in the MATLAB Digest and comp.soft-sys.matlab. We look forward to seeing you next time.


Contest Evolution

The programming contest was a tremendous success from multiple points of view: many people from around the world participated, people competed and collaborated to write superlative streamlined code, and beyond that it was fun to watch. Races to win the various prizes kept us here at the MathWorks hitting the reload button on the results page.

The contest is over now, but how did it go? What stories can the results tell us? One pleasant byproduct of the contest is the abundance of data available for analysis. It is surprisingly rewarding to comb through the data (using MATLAB, of course) and identify some of the trends and dramas that occurred as the contest progressed.

First, let's look at how the 1455 separate entries arrived at the contest. The figure below shows the cumulative number of entries (divided into passing and failing entries) as they came in.

[Figure: pass-fail2.gif (cumulative passing and failing entries over time)]

In general, a little less than half of the submissions failed to pass the test suite. The periods of high activity are evident in this plot, such as the run-up to the T-shirt giveaway at the end of the second day, and of course the excitement at the very end of the contest. Here is a histogram view of the time of day (by hour) when the entries were arriving.

[Figure: histogram.gif (entries submitted by hour of day)]

If we take a look at how the entries scored as they came in, we get a picture like this.

[Figure: score-v-time-notes.gif (score vs. submission time, with leaders annotated)]

Of course these are only the entries that passed the test suite. Entries that led the contest at any time (there were 62 of them) are shown in red along the bottom, so you can see how the best score improved as a function of time. We have called out the names of some of the longest-lived leaders. The entry that held the lead the longest (and, incidentally, won the first T-shirt giveaway) was Doug Pedersen's CDkNow, which ruled from 4:18 Tuesday afternoon until Roger Stuckey's CDkStream came along at 7:45 the next morning. In general, the contest was cruel to front runners, since follow-up tweaks tended to displace them quickly.

Note that the y-axis is a log-scale to emphasize the improvement in score. Even so, in some of the brutal tweaking battles that occurred, the score improvements were so small that they are still difficult to spot.

One of the interesting aspects of this contest is the fact that you needed to minimize two things at once: the blank space on the CD, and the overall CPU time. This suggests that we can plot the average CPU time vs. average % Gap (the amount of unused space on the CD, divided by the target time). This yields a very different picture of the contest.

[Figure: gap-v-cpu.gif (average % gap vs. average CPU time)]

The leaders are still shown in red in this picture. In general, the best score is somewhere along the lower-left frontier of best time and smallest gap, but it ends up being a little difficult to interpret because the values we're plotting are average gap and average CPU times, whereas the score is based on a linear function of gap and CPU time taken for each test case. Nevertheless, the trends are fascinating. Big improvements tend to move down and to the right, and they are followed by tweaking battles in which the new algorithm claws its way back down the time axis.

In the following series of plots, we're looking at the entries for each day (in red) plotted against the backdrop of all the entries (in light blue). You can clearly see the activity crowding into the lower left corner of highest performance. Despite this specialization, there is still heavy activity all over the map even on the last day.

 

[Figure: five-days.gif (each day's entries against the full field)]

The contest encouraged collaboration by making it easy to copy someone else's code and modify it slightly. By tracing the parentage of the various entries, we can see which were more influential than others. One obvious pattern is that when one entry stays at number one for very long, it is frequently copied in the ensuing tweaking battle. If the tweaked code is better, then it displaces number one and becomes the new tweaksters' favorite. If, on the other hand, the new code doesn't measure up, then it's back to the old code. In this way, TweakStream V by Peter Acklam spawned 30 direct spin-offs and Paulo Uribe's influential last-day effort TweakingIsBad4U!23 generated some 31 different direct follow-ons late in the game.

Here is an image of all the descendants (not just first generation) of Yves Piguet's influential CDCrusher S3 plotted on a score vs. time chart. The black circle shows the original entry, and the red dots show subsequent entries based on the original. Lines connect direct relatives.

[Figure: piguet-score.gif (descendants of CDCrusher S3)]

The light blue dots in the background show all the other entries in the contest, and the light blue lines show heritage connections between those entries. Let's zoom in to the middle of this chart.

[Figure: piguet-score-zoom.gif (detail of the CDCrusher S3 family)]

Here we can see that Doug Pedersen's T-shirt-winning CDkNow was derived in part from CDCrusher S3. The highest-scoring direct descendant of CDCrusher S3 is songs2blackhole it=inf, also by Yves Piguet. The portion of the chart above shows a quick flurry of submissions, all members of the songs2blackhole family. These clearly improved the original CDCrusher algorithm, but were not up to the level of Peter Acklam's Tweak Tweak. No (official) descendants of CDCrusher S3 were to take the lead again. (However, the additional credits in the comments of these entries do lead backwards from Tweak Tweak, an ancestor of the ultimate winner, back to CDCrusher S3.)

For the sake of comparison, here is the plot of CDCrusher S3 and its descendants shown on the % Gap vs. CPU time chart. Again, the black circle shows the original. One simple observation based on this plot is that it is easy to make the algorithm worse, hard to make it better!

[Figure: cdcrusher.gif (CDCrusher S3 family, % gap vs. CPU time)]

Now we show a time-score plot of all the descendants of Peter Acklam's Tweak Tweak. There are more than 400 descendants, among them the ultimate winner of the contest. (In the comments of Tweak Tweak, Peter Acklam in turn credits CDTest2 by Holger Bosch, so the real lineage reaches even farther back!) This diagram makes clear the importance of the strategy of working from a pre-existing code base.

[Figure: acklam-score.gif (descendants of Tweak Tweak)]

To get an idea how brutal the final tweaking battle was, take a look at the last hours of the contest portrayed below. A remarkable 31 entries were based directly on TweakingIsBad4U!23 by Paulo Uribe between 1:23 PM and 4:35 PM before it was finally bested by Scott Radcliffe's CutSortShort5.

[Figure: uribe-score.gif (the final tweaking battle)]

Stop and consider the remarkable series of innovations that led to the winning piece of code. It's fair to say that no single person on the planet could have written such an optimized algorithm. Yet it appeared at the end of the contest, sculpted out of thin air by people from around the world, most of whom had never met before. One way to visualize this is as a genetic algorithm in which really smart people play the role of gene-mixer, swapping good ideas back and forth until something optimal appears. The contest also has echoes of open-source software development, in that motivated people from all over contributed to create the best possible code. No matter how you look at it, it is an exciting new way to develop algorithms. And it is also endlessly entertaining to watch and analyze.

We'd like to thank everybody who participated in the MATLAB online programming contest, and we look forward to seeing you again at the next one. Please send your comments, questions, suggestions, and stories to us at contest@mathworks.com.

Ned Gulley
The MathWorks, Inc.
gulley@mathworks.com


The Winning Entry of the 1998 MATLAB Programming Contest

The week-long MATLAB programming contest has come to a successful finish. There were over 1400 entries and many interesting stories to tell about them; I cannot describe them all in this short message. Instead, I will focus on the algorithmic details of the contest entries and give a brief discussion of the final winning entry, ScottWinsIfThisDoes, by Taylor Sherman.

Problem And Rules

To restate the problem briefly: you were given a list of song times and the length of a CD to fill. Not all of the songs would fit on the CD. Your function had to return a single vector of indices of the songs chosen for the CD. The goal was to maximize the usage of the CD, in other words, to leave as small a gap as possible. The complete statement of the rules and the complete test suite are available.

The contest scoring system considered not only the gap (the smaller the better) but also the CPU time your code took to execute (the faster the better). More precisely,

score = gap*(cpu+constant)

The smaller the score, the better. The constant is introduced so that a super-fast (zero CPU time) but super-stupid (large gap) entry cannot win.
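A toy calculation in Python (not the contest's actual scoring code; the constant 12 here is made up purely for illustration) shows the effect:

```python
def score(gap, cpu, constant=12.0):
    """Contest-style score: smaller is better. The additive constant
    keeps a zero-CPU but huge-gap entry from winning automatically."""
    return gap * (cpu + constant)

# A fast but sloppy entry vs. a slower but much tighter one:
fast_sloppy = score(gap=50.0, cpu=0.0)   # 50 * (0 + 12)  = 600
slow_tight  = score(gap=2.0,  cpu=30.0)  # 2  * (30 + 12) = 84
```

With constant = 0, the fast entry's score would be 0 and it would win despite its huge gap; the constant closes that loophole.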

Strategies

Our contest problem is a variant of the classic "knapsack" problem, a close relative of bin packing. Everybody knows how to write a brute-force search to find the best-fitting song list. However, since brute-force search is slow and our scoring system punishes algorithms with long CPU times, exhaustive search is ruled out. That leaves two other choices: limited (heuristic) search and multiple tries.

Limited search means using some heuristic criteria to select a subset of all the combinations and testing only the ones that pass.
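A minimal example of such a heuristic, sketched in Python (first-fit decreasing, which is one simple criterion; the contest entries used more elaborate ones):

```python
def greedy_pack(songs, media_length):
    """Limited-search heuristic: consider songs longest-first and
    take each one that still fits on the CD (first-fit decreasing)."""
    order = sorted(range(len(songs)), key=lambda i: -songs[i])
    chosen, used = [], 0.0
    for i in order:
        if used + songs[i] <= media_length:
            chosen.append(i)
            used += songs[i]
    return chosen

songs = [12, 7, 31, 5, 19, 8, 24]
chosen = greedy_pack(songs, media_length=60)   # happens to fill the CD exactly
```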

Multiple tries means generating many random combinations of the song list and choosing the one with the best fit. This can also be considered a kind of limited search without any heuristic criteria: simply take the combination with the smallest gap among all those tried. This approach has a fancy technical name: the Monte-Carlo method.

Monte Carlo Method

Earlier entries mostly used the limited search approach. However, they either had a large gap (due to limitations in their heuristics) or spent too much CPU time. They were soon outrun by entries using the Monte-Carlo method; all of the top 100 entries used it.

The Monte-Carlo method for this problem can be summarized by the following three steps:

I. Generate some random permutations of the song list
II. Find the best packing among these permutations
III. Return the index corresponding to the best packing
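The three steps might be sketched as follows, in Python/NumPy rather than the contest's MATLAB, with made-up song data:

```python
import numpy as np

def monte_carlo_pack(songs, media_length, n_perms=100, seed=None):
    """I. generate random permutations, II. find the best packing
    among them, III. return the indices of that packing."""
    rng = np.random.default_rng(seed)
    songs = np.asarray(songs, dtype=float)
    best_total, best_idx = -1.0, np.array([], dtype=int)
    for _ in range(n_perms):                      # I. random orders
        perm = rng.permutation(len(songs))
        cum = np.cumsum(songs[perm])
        k = int(np.sum(cum <= media_length))      # leading prefix that fits
        total = cum[k - 1] if k else 0.0
        if total > best_total:                    # II. keep the best packing
            best_total, best_idx = total, perm[:k]
    return best_idx                               # III. its song indices

songs = [12, 7, 31, 5, 19, 8, 24]
picked = monte_carlo_pack(songs, media_length=60, n_perms=200, seed=0)
```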

Given these three steps, many strategies can be applied and many parameters can be adjusted. As one of the MathWorkers here said, the contest is like a genetic algorithm, except that it's the humans who are doing the mixing and selection. There were many cases where someone found a new strategy to apply to one of the steps, achieving a quantum leap in the score, which was then followed by a series of marginal improvements as other people tweaked (optimized) the strategy to its full power. This tweaking continued until a new strategy was found and another tuning war began.

The Winning Entry

Here's the complete listing of the winning entry. I've numbered the lines and added some indentation to make the code a little easier to read. The lines in the winning entry corresponding to the above 3 steps are:

I. Create a permutation matrix, choosing the size based on the size input song vector.
    7 [d,e] = sort(rand(b,ceil(c/2)));
II. Select the permutation with the best song packing.
    19 [d,h] = max(f(g+(0:b:b*c-b)));
III. Return the indices of the best permutation.
    20 indexList = t(e(1:g(h),h));
The other lines of code are supplementary to these three. In addition, this winning entry combined the Monte-Carlo method with a limited search, which I'll discuss below as step IV.

IV. Substitute a left-out song for one in the included list to improve the fit (lines 21-49).
ScottWinsIfThisDoes, by Taylor Sherman
1   function indexList = binpack(songList,mediaLength)
2   [s,t] = sort(songList);
3   a = size(s,2);
4   b = min(a,round(4*a*mediaLength/sum(s)));
5   c = 90+0.1*a;
6   %[d,e] = sort(rand(b,c));
7   [d,e] = sort(rand(b,ceil(c/2)));
8   e = [e e(b:-1:1,:)];
9   e = e(:,1:c);
10  q = 0.25*(a-b);
11  if q > 1
12     r = floor(q*rand(1,c));
13     e = e+r(ones(1,b),:);
14  else
15     r = rand(c,1);
16  end
17  f = cumsum(s(e));
18  g = sum(f<=mediaLength);
19  [d,h] = max(f(g+(0:b:b*c-b)));
20  indexList = t(e(1:g(h),h));
21  leftOff = t(e(g(h)+1:end,h));
22  remTime = mediaLength-d;
23  onez1 = ones(1,g(h));
24  onez2 = ones(b-g(h),1);
25  newFree = remTime + songList(onez2,indexList) ...
                      - songList(onez1,leftOff)';
26  newFree(newFree<0) = 9e9;
27  [m,i] = min(newFree);
28  [m,j] = min(m);
29  if m < remTime
30     aux = indexList(j);
31     indexList(j) = leftOff(i(j));
32     leftOff(i(j)) = aux;
33     newFree = m + songList(onez2,indexList) ...
                   - songList(onez1,leftOff)';
34     newFree(newFree<0) = 9e9;
35     [m1,i] = min(newFree);
36     [m1,j] = min(m1);
37     if m1 < m
38        aux = indexList(j);
39        indexList(j) = leftOff(i(j));
40        leftOff(i(j)) = aux;
41        newFree = m1 + songList(onez2,indexList) ...
                       - songList(onez1,leftOff)';
42        newFree(newFree<0) = 9e9;
43        [m2,i] = min(newFree);
44        [m2,j] = min(m2);
45        if m2 < m1
46           indexList(j) = leftOff(i(j));
47        end
48     end
49  end
Now, let's look at the full code line by line.

Step I. Generate Permutations (Lines 2-16)

First we sort the songs in the song list.

2 [s,t] = sort(songList);

Since we are generating random permutations, this sort should not be needed: a random permutation of a sorted list is still random. However, for this particular code, which permutes only a portion of the list plus a shift, the sort may help a little, though it also costs some CPU time. The internal winning entry (we ran the same contest problem internally within The MathWorks) doesn't have this line and works just as well.

In lines 3-5, we use some simple rules to determine the size of the permutation matrix based on the size of the input song list.

3 a = size(s,2);

Another way to compute a, the length of the song list, is to use the LENGTH function, i.e. a = length(s). This form does not assume that the input is a row vector and it is slightly faster.

4 b = min(a,round(4*a*mediaLength/sum(s)));

The parameter b determines how many entries of the list are to be permuted. Ideally, to make the best use of the songList, we would permute the whole thing. However, sorting a long list is costly, and, more importantly, the tail of each permutation is thrown away anyway, because we only use the leading songs in each permutation that fit on the CD. It is a very good idea to permute only as many entries as needed and shift them to cover a bigger range. Of course, this approach misses part of the songList; we will come back to that in later steps.

a*mediaLength/sum(s) is equivalent to mediaLength/mean(s), which gives the average number of songs that fit on the CD. The constant 4 is a tuning parameter; for this particular test suite, 4 seemed to work well, and another entry used 4.1 with similar success.

5 c = 90+0.1*a;

c decides how many columns of random permutations are generated. Obviously, the bigger c is, the smaller the gap will be. However, a larger c means more sorting in the next step and thus a longer CPU time, which the scoring penalizes, so finding an optimal value for c is tricky. At first, people used a fixed c regardless of the length of the songList; later, people realized it should be a function of a. This is the line that went through the most changes, from 3*a+4 to the current 90+0.1*a. Since a in the contest test suite could be as high as 100, 3*a is very costly. Contest entries tuned this line to the test suite; the final form isn't necessarily the best possible, but it is very close.

7 [d,e] = sort(rand(b,ceil(c/2)));

This is where the random permutations, e, are generated. There are other ways to generate random permutations, but they require for loops and are not as fast as the sort approach.
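The trick carries over directly to NumPy: argsort on a matrix of uniform random numbers produces one random permutation per column with no loop (sizes b and c below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)
b, c = 6, 10              # permutation length and number of permutations
# Sorting random numbers column-wise: the sort order of each column
# is a uniformly random permutation of 0..b-1.
e = np.argsort(rng.random((b, c)), axis=0)
```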

Generating More Permutations

Since the sort in line 7 is an expensive operation, we can improve our score if we can generate more permutations without increasing the workload for our sort.


8 e = [e e(b:-1:1,:)];

This line, which didn't appear until very late in the contest, is a revolutionary idea: it reverses the row order of the existing permutations to produce a second group of permutations, halving the workload of SORT and saving a lot of CPU time.

There are additional optimizations we could apply to this line. The square bracket (concatenation) operator in MATLAB is very costly, because it must be prepared to handle arbitrary contents. A faster way is to pre-allocate the space in another variable, say E, put e into the first half of the columns of E, and put e reversed into the second half. Pre-allocation and avoiding square brackets can have a big impact on CPU time.

    C = ceil(c/2);
    E = zeros(b,2*C);            % pre-allocation
    E(:,1:C) = e;                % first half: the permutations
    E(:,C+1:end) = e(b:-1:1,:);  % second half: rows reversed

In the internal winning entry, in addition to this reversal trick, another similar optimization was used: a+1-e. This makes a symmetric image of e with respect to the center of 1:a. The trick halves the sort workload one more time, and it also has a very big positive side effect (see below).
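Both cheap-duplication tricks are easy to sketch in NumPy (1-based values to mirror the MATLAB code; the sizes a, b, c are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, c = 20, 6, 8           # list length, permuted prefix, columns wanted
e = np.argsort(rng.random((b, c // 2)), axis=0) + 1   # 1-based indices
e = np.hstack([e, e[::-1, :]])   # reversal trick: double the columns
                                 # for half the sorting work
mirror = a + 1 - e               # symmetric image about the center of
                                 # 1..a: reaches indices near the end of
                                 # the list that e itself never touches
```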


9 e = e(:,1:c);

This line is not very useful: it only trims e by one column when c is odd, and its CPU cost is not rewarded.

10 q = 0.25*(a-b);

q decides how far the current permutations may be shifted. Since we only permuted 1:b, we may miss many other songs in the songList; shifting the permutation catches some of the missing entries. However, since we are only using a quarter of what's left, we are still missing a lot. This is why the symmetric-image trick mentioned above is so important: it not only halves the sort load, it also grabs those missing song entries.

11  if q > 1
12     r = floor(q*rand(1,c));
13     e = e+r(ones(1,b),:);
14  else
15     r = rand(c,1);
16  end

Earlier entries didn't have this IF block. Later on, someone realized that when q is less than or equal to 1, the shift adds only zeros, so it can be skipped entirely.

However, if you read the code carefully, you will notice that the 'else r=rand(c,1)' branch is not needed: r is never used in the rest of the code. Somehow nobody caught this except the final entry, submitted by Hakan Erdogon after the official contest close. His entry scored almost a full point ahead of the winning entry; some of that margin may be credited to deleting these two lines.

Step II. Select The Best Permutation (Lines 17-19)

17 f = cumsum(s(e));

This is the best way to check the default packing (i.e., starting with the first song in the permutation, keep adding songs to the CD until no more fit). One thing worth mentioning: since s is sorted (line 2), we can easily find the maximum number of songs that could ever fit on the CD, i.e., maxnum = sum(cumsum(s)<=mediaLength). Thus we only need to run cumsum over the first maxnum rows: f = cumsum(s(e(1:maxnum,:))). Computing maxnum costs a little CPU, but it may save a lot in the cumsum when e is large.
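In NumPy terms, the cumulative-sum check looks like this (toy data; here e holds 0-based column permutations):

```python
import numpy as np

s = np.array([3., 4., 6., 7., 9.])   # sorted song lengths
e = np.array([[2, 0],                # two candidate orderings,
              [4, 1],                # one per column
              [0, 3],
              [1, 2],
              [3, 4]])
media = 20.0
f = np.cumsum(s[e], axis=0)          # running total down each column
g = (f <= media).sum(axis=0)         # how many leading songs fit
# Column 0 packs 3 songs totalling 18; column 1 packs 4 totalling 20.
```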

18 g = sum(f<=mediaLength);
19 [d,h] = max(f(g+(0:b:b*c-b)));

Earlier entries used DIAG to get d and h, but DIAG builds a big matrix, which is very costly. Line 19 instead uses 1-D linear indexing into a 2-D array, which is a very efficient strategy.
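The NumPy analogue of that linear-indexing pick is plain fancy indexing (toy values for f and g):

```python
import numpy as np

f = np.array([[3.,  5.],
              [7.,  9.],
              [8., 14.]])   # cumulative sums: b=3 songs, c=2 permutations
g = np.array([2, 3])        # songs that fit, per column
# MATLAB's f(g + (0:b:b*c-b)) picks element g(j) of column j through
# column-major linear indexing; in NumPy that is one indexing expression:
totals = f[g - 1, np.arange(f.shape[1])]   # packed length per column
h = int(np.argmax(totals))                 # best column
d = totals[h]                              # its packed length
```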

Step III. Return The Index List (Line 20)

Now, the final step.

20 indexList = t(e(1:g(h),h));

This indexList represents the results of the Monte-Carlo method, and most of the earlier entries ended here.

Step IV. Combining Strategies: Adding Limited Search (Lines 21-49)

Some people figured out that the limited search and Monte-Carlo approaches could be combined. The rest of the code does a column-wise exhaustive search to see whether replacing one of the selected songs with one that was left out improves the fit. This part looks messier than what we've seen so far, but it turns out to be the key to winning: the internal winning entry did not include this limited search, and thus did not score as well.

Now, let's take a look at these final tuning lines.

21 leftOff = t(e(g(h)+1:end,h));

If we hadn't sorted the songList, the index translations through t (here and in line 20) would not be needed. Also, since g(h) is used four times in the code, assigning it to a temporary variable instead of indexing into the array every time would have saved some CPU time.

22  remTime = mediaLength-d;
23  onez1 = ones(1,g(h));
24  onez2 = ones(b-g(h),1);
25  newFree = remTime + songList(onez2,indexList) ...
                      - songList(onez1,leftOff)';
26  newFree(newFree<0) = 9e9;

Lines 22-26 use some clever indexing and matrix arithmetic to compute, all at once, every possible swap of a left-out song for one included in the indexList. This vectorized form is very fast.
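The same outer-difference computation, sketched in NumPy with toy data (broadcasting plays the role of the ones-vector indexing):

```python
import numpy as np

song = np.array([5., 8., 3., 9., 6.])   # all song lengths
included = np.array([0, 2, 4])          # indices currently on the CD
left_off = np.array([1, 3])             # indices left out
rem = 2.0                               # free time remaining on the CD
# new_free[i, j] = free time if left-out song i replaces included song j
new_free = rem + song[included][None, :] - song[left_off][:, None]
new_free[new_free < 0] = 9e9            # mark swaps that overflow the CD
```

Here the best swap is new_free[0, 2] == 0: trading the included song of length 6 for the left-out song of length 8 fills the CD exactly.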

However, we can make two additional optimizations in these lines. First, the addition of remTime in line 25 is not necessary. Instead, we could compare
    newFree < -remTime
rather than comparing against 0. This avoids adding a constant to every element of a large matrix and saves some CPU. Notice that the same construction appears twice more below (lines 33-34 and 41-42); each could be optimized the same way.
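A quick NumPy check (on random stand-in data) confirms that shifting the threshold marks the same entries and yields the same best swap:

```python
import numpy as np

rng = np.random.default_rng(1)
rem = 2.0
diff = rng.random((200, 300)) * 10 - 5   # stand-in for length differences
# Original form: add rem to every element, then threshold at 0.
orig = rem + diff
orig[orig < 0] = 9e9
# Cheaper form: leave the big matrix alone and shift the threshold.
cheap = diff.copy()
cheap[cheap < -rem] = 9e9
# Same rejected entries, same location of the best (smallest) value.
```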

Secondly, line 25 and line 41 compute the same transpose of a big matrix. If we assigned the transposed songList to a temporary variable, say st, then songList(onez1,leftOff)' could be written as st(leftOff,onez1), saving a repeated transpose of a big matrix.

35     [m1,i] = min(newFree);
36     [m1,j] = min(m1);

These two lines are problematic. They are fine most of the time, which is why the winning entry passed, but there are cases where they produce errors; several entries failed because of them. For some random permutations, b-g(h) may equal 1, making newFree a row vector. In that case MATLAB's MIN does not take the minimum of each column; it takes the minimum along the row, producing i and j such that i(j) falls outside the range of leftOff (whose length is 1), so leftOff(i(j)) errors out. The right way to call MIN here is min(newFree, [], 1), which forces column-wise operation.

In lines 29-49, we simply repeat this one-song substitution up to two more times. Lines 29-49 were originally inside a WHILE m < remTime loop, until someone found that it is faster to use IF instead. Since it would be very rare to refine the selection more than three times, these nested IF statements do as well as the WHILE loop in most cases.

29  if m < remTime
30     aux = indexList(j);
31     indexList(j) = leftOff(i(j));
32     leftOff(i(j)) = aux;
33     newFree = m + songList(onez2,indexList) ...
                   - songList(onez1,leftOff)';
34     newFree(newFree<0) = 9e9;
35     [m1,i] = min(newFree);
36     [m1,j] = min(m1);
37     if m1 < m
38        aux = indexList(j);
39        indexList(j) = leftOff(i(j));
40        leftOff(i(j)) = aux;
41        newFree = m1 + songList(onez2,indexList) ...
                       - songList(onez1,leftOff)';
42        newFree(newFree<0) = 9e9;
43        [m2,i] = min(newFree);
44        [m2,j] = min(m2);
45        if m2 < m1
46           indexList(j) = leftOff(i(j));
47        end
48     end
49  end

That's all for the analysis of the final winning entry. Some of the CPU savings mentioned above may be very small. However, consider that in the last half hour before the contest closed, the top two entries had scores differing only in the fourth digit after the decimal point. All of the optimizations mentioned above would definitely have put you ahead.

Final Challenge

Let's end this discussion with a couple of fun puzzles related to the contest.

Hundreds of entries were submitted to the contest, and from their results I suspect people were able to learn many things about the test suite. Let's see what we can find out.

  1. What is the shortest songList (length(songList)) in the test suite? Suppose you don't know the answer; how can you design your entries to find out? (You may need to submit multiple entries.)
  2. What is the longest songList (length(songList))? How would you design your entries to find out?

Thanks for participating in the contest. See you next time.

Zhiping You
The MathWorks Inc.
zyou@mathworks.com