CounterInBase
This document demonstrates the use of the CounterInBase pseudo-class and shows its time demands compared to time demands of the simple method of generating decimal numbers and converting them to a different base.
Contents
Motivation
Sometimes a counter in a different base than 10 is needed, e.g. when we need to enumerate numbers in a base different from 10. This can be accomplished by counting in the decimal base and using the MATLAB function dec2base. The dec2base function, however, produces strings as outputs. It also uses the decimal digits (0-9) and the characters (A-Z) to express the numbers, so that it is limited to bases from 2 to 36.
If we need a numerical array as the output instead of the strings, we have to built our own function. Here is one that is based on the original MATLAB dec2base function, but produces arrays of numbers as the output (the documentation is skipped).
dbtype mydec2base 1 dbtype mydec2base 22:32
1 function nb = mydec2base(nd,base) 22 nd = nd(:); 23 ndigits = max(1,round(log(max(nd)+1)/log(base))); 24 while any(base.^ndigits <= nd) 25 ndigits = ndigits + 1; 26 end 27 nb = zeros( numel(nd), ndigits ); 28 for i = ndigits:-1:1, 29 nb(:,i) = mod(nd,base); 30 nd = floor(nd/base); 31 end 32 end
If we want to enumarate e.g. the numbers from 0 to 10 in base 3, we can do it with the following code
for i = 0:10, num = mydec2base(i,3); fprintf('%d ',num); fprintf('\n'); end
0 1 2 1 0 1 1 1 2 2 0 2 1 2 2 1 0 0 1 0 1
Of course, we can also generate all the numbers at once and convert them to the required base as a whole. This would give us slightly different results:
nums = 0:10; nums3 = mydec2base(nums,3)
nums3 =
0 0 0
0 0 1
0 0 2
0 1 0
0 1 1
0 1 2
0 2 0
0 2 1
0 2 2
1 0 0
1 0 1
And, of course, there are many ways in between the above mentioned ways - we can generate the numbers in chunks.
Now, assume that we need to enumerate large set of numbers, so that we cannot (or we do not want to) hold them all in the memory as one large array that can be subsequently enumerated. In this case, we have to use the above for loop and process the numbers one by one, or in chunks.
In that situation, it may be pretty time consuming to convert a (small set of) large number(s) to a different base, if all what is needed is to incrementaly build the output by incrementing the previous number by 1.
And here the CounterInBase pseudo-class comes in, its updates should be faster than the repeating conversions to a certain base.
The CounterInBase pseudo-class
CounterInBase pseudo-class realizes a counter in arbitrary base. You can
- set its state (by calling the set(decnum) method),
- ask its state (by calling the next(0) method),
- ask for the next number in the particular base and advance the counter (by calling the next() method),
- ask for the next n numbers and advance the counter by n (by calling the next(n) method),
You can also
- save the state of the counter to hard drive (by calling save([fname]) method), and
- restore the state (by calling load([fname]) method).
Examples:
Create the object and set the base:
counter = CounterInBase(3);
Several calls to next() method:
counter.next() counter.next() counter.next() counter.next()
ans =
0
ans =
1
ans =
2
ans =
1 0
Ask for more than 1 number at a time:
counter.next(5)
ans =
1 1
1 2
2 0
2 1
2 2
Set the state of the counter:
counter.set(26)
Ask for the next 3 numbers (number 26 in base 10 set in the previous step is 222 in base 3, so the first number returned by the next() call should be 222):
counter.next(3)
ans =
0 2 2 2
1 0 0 0
1 0 0 1
Time demands
In this section, we will compare the CounterInBase class with the method of enumerating numbers using the for loop and the mydec2base function. We will compare their time demands when enumerating all numbers up to a certain maximal number (maxnum) in chunks of various sizes (blsize) and using various bases (base).
Time demands: Method 1 benchmarking
The time demands of enumerating the numbers using the for loop and the mydec2base function are measured in the following way. The method differs for blsize=1 and blsize>1.
If the chunk size blsize=1, then we want to generate the numbers one by one and actually no chunks are generated. The following code is used in this case:
dbtype measuredec2base 21:25
21 tic; 22 for i = 1:maxnum, 23 n = mydec2base(i,base); 24 end 25 t(ibase, iblsize) = toc;
If the chunk size blsize>1, then we need to construct the array representing the respective chunk of numbers. The following code is used here:
dbtype measuredec2base 27:33
27 tic; 28 d = 0:blsize-1; 29 for i = 1:round(maxnum/blsize), 30 n = mydec2base(d,base); 31 d = d + blsize; 32 end 33 t(ibase, iblsize) = toc;
Time demands: Method 2 benchmarking
The code for measuring the time demands of enumerating the numbers using the CounterInBase class is the same, independent of the chunk size blsize:
dbtype measureCounterInBase 20:25
20 tic; 21 c = CounterInBase(base); 22 for i = 1:round(maxnum/blsize), 23 n = c.next(blsize); 24 end 25 t(ibase, iblsize) = toc;
Time demands: Comparison
Let's measure the time needed to enumerate the maxnum numbers when varying the base and the chunk size blsize.
maxnum = 1e5; bases = space125(2,100); blsizes = space125(1,1000); tD = measuredec2base(maxnum, bases, blsizes) tC = measureCounterInBase(maxnum, bases, blsizes)
tD =
4.2289 2.4874 1.0969 0.6264 0.3961 0.2562 0.2058 0.1839 0.1932 0.1636
3.8217 2.0557 0.8820 0.4918 0.2913 0.1723 0.1342 0.1135 0.1003 0.1024
3.4145 1.9043 0.8144 0.4356 0.2526 0.1396 0.1033 0.0885 0.0736 0.0697
3.3694 1.8698 0.7816 0.4163 0.2348 0.1254 0.0895 0.0735 0.0606 0.0572
3.2603 1.8103 0.7414 0.3960 0.2164 0.1113 0.0768 0.0605 0.0562 0.0441
3.3445 1.9029 0.7767 0.4130 0.2245 0.1147 0.0769 0.0598 0.0478 0.0439
tC =
2.0581 1.1255 0.5433 0.3503 0.2588 0.1978 0.1762 0.1672 0.1630 0.1589
2.0236 1.1394 0.5420 0.3280 0.2334 0.1743 0.1562 0.1471 0.1413 0.1380
2.0310 1.1224 0.5140 0.3241 0.2256 0.1707 0.1472 0.1384 0.1309 0.1294
2.0152 1.0893 0.5135 0.3198 0.2253 0.1692 0.1481 0.1373 0.1295 0.1290
2.0104 1.0778 0.5115 0.3438 0.2216 0.1645 0.1449 0.1350 0.1301 0.1267
2.0127 1.0799 0.5151 0.3184 0.2284 0.1669 0.1708 0.1390 0.1272 0.1302
As can be seen from the results, the CounterInBase is indeed faster than the for loop with conversion function for a broad range of settings. Let's look at the speedup graphically:
plotSpeedup(tD./tC, bases, blsizes); print -dpng -r72 screenshot.png
The maximal speedup is around 2.3 for counting in base 2 with chunk size 2. For low chunk size and low base, the speedup is substantial. For counting in large bases (base>100), the enumeration using CounterInBase becomes inefficient for chunk sizes larger than 20; from that point the CounterInBase class is actually slower than using the base-conversion function in a for loop.
We can look at the boundary where both methods are almost equivalent from another perspective (see the bold yellow line in th next picture). The CounterInBase is more efficient for the parameter combinations on the left of the yellow boundary, the for loop and base-conversion function win on the right side of the boundary.
plotSpeedupContour(tD./tC, bases, blsizes);
Conclusions
If you are about to enumerate numbers in certain base and you cannot process them in chunks larger than, say, 20, you should use the CounterInBase class.
If your primary concern is not the speed, but rather a clear and readable code, you may profit from the friendly interface of the class as well.
The speedup (or slowdown) factor as reported in this document holds, when nothing time consuming is done with the generated numbers (which is not a realistic assumption). In the case when you need to use the generated number for something non-trivial, the difference between both methods will become negligible. In that situation, the friendly interface may persuade you to use this class.