The week long MATLAB programming contest has successfully finished. There were over 1400 entries and lots of interesting stories to tell about them. I cannot describe them all in this short message. I will focus on the algorithmic details of the contest entries and give a brief discussion of the final winning entry of the contest, ScottWinsIfThisDoes, by Taylor Sherman.
To restate the problem briefly: You were provided with a list of song times and the length of a CD that you were to fill. Not all the songs will fit on the CD. The job of your function is to return a single vector of indices of the songs you have chosen for this CD. The goal was to maximize the usage of the CD, in other words, to leave a gap as small as possible. The complete statement of the rules and the complete test suite are available.
The contest scoring system considered not only the gap (the smaller the better), it also took into account the CPU time of the execution of your code (the faster the better). To be more precise,
score = gap*(cpu+constant)
The smaller the score, the better. The constant is introduced so that a super fast (zero cpu time) but super stupid (large gap) entry would not win.
Our contest problem is also known as the "knapsack" problem, a special case of the bin-packing problem. Everybody knows how to write a brute force search algorithm to find the best fit song list. However, since brute force search is obviously slow and our scoring system punishes algorithms with long CPU times, exhaustive search is ruled out. This leaves us with two other choices: limited (heuristic) search and multiple tries.
Limited search means using some heuristic judging criteria to check a selected subset of all the combinations, and choose the ones that pass the criteria to test.
Multiple tries means generating many random combination of the song list, and then choosing the one with the best fit. This can also be considered as a kind of limited search, except there is no heuristic judging criteria. Simply choose the combination with the shortest gap among all the selected combinations. This approach has a fancy technical name, namely, the Monte-Carlo method.
Earlier entries mostly used the limited search approach. However, they either had a large gap (due to limitations in their heuristics) or spent too much CPU time. They were soon out run by entries using the Monte-Carlo method. All of the top 100 entries used the Monte-Carlo method.
The Monte-Carlo method for this problem can be summarized by the following three steps:
I. Generate some random permutations of the song list
II. Find the best packing among these permutations
III. Return the index corresponding to the best packing
Given these three steps, many strategies can be applied and many parameters can be adjusted. As one of the MathWorkers here said, the contest is like a genetic algorithm, except that it's the humans who are doing the mixing and selection. There were many cases where someone found a new strategy to apply to one of the steps, achieving a quantum leap in the score, which was then followed by a series of marginal improvements as other people tweaked (optimized) the strategy to its full power. This tweaking continued until a new strategy was found and another tuning war began.
Here's the complete listing of the winning entry. I've numbered the lines and added some indentation to make the code a little easier to read. The lines in the winning entry corresponding to the above 3 steps are:
I. Create a permutation matrix, choosing the size based on the size input song vector.The other lines of codes are supplementary to these 3 lines. In addition, this winning entry combined the Monte-Carlo method with a limited search, which I'll discuss below as step IV.
7 [d,e] = sort(rand(b,ceil(c/2)));
II. Select the permutation with the best song packing.
19 [d,h] = max(f(g+(0:b:b*c-b)));
III. Return the indices of the best permutation.
20 indexList = t(e(1:g(h),h));
IV. Lines 21-49 Substitute a left out song for an one in the included list to improve the score.
ScottWinsIfThisDoes, by Taylor Sherman|
First we sort the songs in the song list.
2 [s,t] = sort(songList);
Since we are generating random permutation, this sort should not be needed. A random permutation of sorted list is still random. However, for this particular code, which only permutes a portion of the list plus a shift, the sort may help a little, but it also costs some CPU time. The internal winning entry (we ran the same contest problem internally within the MathWorks) doesn't have this line and it works as well.
In lines 3-5, we use some simple rules to determine the size of the permutation matrix based on the size of the input song list.
3 a = size(s,2);
Another way to compute a, the length of the song list, is to use the LENGTH function, i.e. a = length(s). This form does not assume that the input is a row vector and it is slightly faster.
4 b = min(a,round(4*a*mediaLength/sum(s)));
The parameter b determines how many entries of the list are to be permuted. Ideally, in order to have a better use of the songList, we should permute the whole songList. However, sorting a long list is costly. More importantly, the later part of the permutation will be thrown away anyway because we will only use the first songs in each permutation that fit on the CD. It is a very good idea to just permute an amount that is needed and shift them to cover a bigger range. Of course, we will miss some part of the songList in this approach. We will talk about that in later steps.
a*mediaLength/sum(s) is equivalent to
mediaLength/mean(s), which gives
you the average number of songs that can be put on the CD. The
constant 4 can be adjusted arbitrarily. For this particular test
suite, 4 seemed to work well. Another entry tried 4.1 which also
5 c = 90+0.1*a;
c decides how many columns of random permutations are to be generated. Obviously the bigger c is, the smaller the gap will be. However, a larger c means more sorting to do in the next step and thus a longer CPU time, which is penalized by the scoring. Thus, finding an optimal value for c is a very tricky thing. At first, people used a fixed c, regardless of the length of the songList. Later people realized it should be a function of 'a'. This is the line that went through the most changes, from 3*a+4 to the current 90+0.1*a. Since 'a' in the contest test suite could be as high as 100, 3*a is very costly. The codes in the contest tuned this line for the test suite into this final version, 90+0.1*a. It doesn't mean this is the best one, but it is very close.
7 [d,e] = sort(rand(b,ceil(c/2)));
This is where the random permutations, e, are generated. There are other ways to generate random permutations, but they require for loops, and are not as fast as the sort approach.
Since the sort in line 7 is an expensive operation, we can improve our score if we can generate more permutations without increasing the workload for our sort.
8 e = [e e(b:-1:1,:)];
This line didn't come in until very late in the contest. This is a revolutionary idea. It reverses the order of the permutations and generates another group of permutations, and thus halves the work load of SORT, saving a lot of CPU time.
There are additional optimizations we can apply to this line. The square bracket operator (concatenation operator) in MATLAB is very costly, as it must be prepared to deal with all kinds of things in it. A faster way to do this operation is to pre-allocate the space in another variable, let's say E, and put e into the first half columns of E, and put e reversed into the second half of E. Pre-allocation and avoiding using square brackets would have big impact on the CPU time.
C = ceil(c/2); E = zeros(b,C*2); % Pre-allocation E(1:C) = e; E(C+1:end) = e(b:-1:1,:);
In the internal winning entry, in addition to this reverse trick, another similar optimization was used. Namely, a+1-e. This makes a symmetric image of e respect to the center of 1:a. This trick halves the work load of sort one more time, and it also has a very big positive side effect (see below).
9 e = e(:,1:c);
This line is not very useful. It may reduce e by one column if c is odd. Its cost in CPU time is not rewarded.
10 q = 0.25*(a-b);
q decides how much to shift for the current permutation. Since we only did permutations of 1:b, we may miss many other songs in the songlist. Shifting the permutation can catch some of the missing song entries. However, since we are only using a quarter of what's left, we are still missing a lot. This is why the middle line symmetric image trick mentioned above is very important. It not only halves the sort load, it also grabs those missing song entries.
11 if q > 1 12 r = floor(q*rand(1,c)); 13 e = e+r(ones(1,b),:); 14 else 15 r = rand(c,1); 16 end
Earlier entries didn't have this IF block. Later on, someone realized that if q is less than or equal to 1, we are adding zeros, which can be completely avoided.
However, if you read the code carefully, you will notice that the 'else r=rand(c,1)' is not needed: r is never used in the rest of the code. Somehow nobody caught this, except in the final entry submitted by Hakan Erdogon after official contest closing. His entry is almost a full point ahead of the winning entry, some of this full point may be credited to the deletion of these two lines.
17 f = cumsum(s(e));
This is the best way to check the default packing (i.e., beginning with the first song in the permutation list, add to the CD until we can't add any more). One thing that needs to be mentioned is that since s is sorted (line 2), we can easily find the maximum number of songs that can be put on the CD, i.e. maxnum = sum(cumsum(s)<=mediaLength); Thus, we only need to do cumsum up to maxnum. i.e., f = cumsum(s(e(1:maxnum,:))) The extra computation of maxnum costs some CPU time, but it may save a lot of CPU in the computation of cumsum when e is large (row or column).
18 g = sum(f<=mediaLength);
19 [d,h] = max(f(g+(0:b:b*c-b)));
Earlier entries used DIAG to get d and h. But DIAG uses a big matrix, which is very costly. Line 19 uses 1-d indexing into a 2-d array which is a very efficient strategy.
20 indexList = t(e(1:g(h),h));
This indexList represents the results of the Monte-Carlo method, and most of the earlier entries ended here.
Some people figured out that maybe we should combine the limited search approach and Monte-Carlo approach. The rest of the code does a columnwise exhaustive search to see if we can replace one of the selected songs by one left over and come out better. This part looks messier than what we've just looked at, but it turns out to be the winning key. The internal winning entry does not include this limited search, and thus did not score as well as this winning entry.
Now, let's take a look at these final tuning lines
21 leftOff = t(e(g(h)+1:end,h));
If we didn't sort the songlist, the t (here and in the above line) are not needed. Also, since we are using g(h) 4 times in the code, assigning g(h) to a temporary variable, instead of referring to an entry in a array every time, would have saved some CPU time.
22 remTime = mediaLength-d; 23 onez1 = ones(1,g(h)); 24 onez2 = ones(b-g(h),1); 25 newFree = remTime + songList(onez2,indexList) ... - songList(onez1,leftOff)'; 26 newFree(newFree<0) = 9e9;
Lines 22-26 make use of some clever indexing and matrix arithmetic to compute all possible one song substitutions of a song that was left out for one that was included in the indexList. This vectorized form is very fast.
However, we can make two additional optimizations in these
lines. First of all, the addition of the first term, remTime, in line
25 is not necessary. Instead, we could simply compare
newFree < -remTime
instead of 0. This way, we can avoid adding a constant to each element of a large matrix and save some CPU. Notice that there are two more instances of this same construction in the following code (33-34, 41-42). Each could be optimized this way.
Secondly, in line 25 and in line 41, we use the same transpose of a big matrix. If we temporarily assign this transpose of the songlist to another variable, let's say, st, then songList(onez1,leftOff)' can be written as st(leftOff,onez1), thus saving the transpose of a big matrix.
35 [m1,i] = min(newFree); 36 [m1,j] = min(m1);
These two lines are problematic. They are okay most of the time, and that is why the winning entry passed. However, there are cases for which these two lines will produce errors. Several entries failed because of these two lines. For some random permutations, b-g(h) may equal one, and thus make the newFree matrix a row vector. In such cases, the MIN function in MATLAB does not find the min of each column, instead, it finds the min of the row, and it will produce i and j that makes i(j) outside the range of leftOff, which is 1. This will make leftOff(i(j)) error out. The right way to call MIN in this case is min(newFree, , 1);
In lines 29-49, we simply repeat this one song substitution two more times if we can. Lines 29-49 were originally in a WHILE m < remTime loop, until someone found that it is faster to use IF instead of using WHILE. Since it would be very rare to be able to refine the selection more than three times, these embedded IF statements would do as well as the WHILE loop in most cases.
29 if m < remTime 30 aux = indexList(j); 31 indexList(j) = leftOff(i(j)); 32 leftOff(i(j)) = aux; 33 newFree = m + songList(onez2,indexList) ... - songList(onez1,leftOff)'; 34 newFree(newFree<0) = 9e9; 35 [m1,i] = min(newFree); 36 [m1,j] = min(m1); 37 if m1 < m 38 aux = indexList(j); 39 indexList(j) = leftOff(i(j)); 40 leftOff(i(j)) = aux; 41 newFree = m1 + songList(onez2,indexList) ... - songList(onez1,leftOff)'; 42 newFree(newFree<0) = 9e9; 43 [m2,i] = min(newFree); 44 [m2,j] = min(m2); 45 if m2 < m1 46 indexList(j) = leftOff(i(j)); 47 end 48 end 49 end
That's all for the analysis of the final winning entry. Some of the CPU savings mentioned above may be very small. However, consider the fact that in the last half hour before the closing of the contest, the top two entries had scores differing only at the 4th digits after the decimal period. All the above mentioned optimizations would definitely have put you ahead.
Let's end this discussion with a couple of fun puzzles related to the contest.
There are hundreds of submissions in the contest. From the results of these entries, I suspect people were able to find out many things about the test suite. Let's see what we can find out.
Thanks for participating the contest. See you next time.
The MathWorks Inc.