Finding the shortest binary string in a given interval


Today's Little Program answers the question, "Given an interval of real numbers, find the shortest binary string that fits in that interval." For example, if you are given the range 0.1…0.3, then the answer is 0.25, whose binary representation is 0.01. Another way of phrasing the problem is that you are to find the simplest dyadic rational number in the range, where a dyadic rational number is one whose denominator is a power of two. In our case, the simplest such fraction is ¼.

This sounds like a pointless exercise, but it actually solves a real problem: The shortest binary string in the interval (a, b) is the value of the surreal number { a | b }.¹

The idea here is that we start with an integer candidate, and the refine the candidate, making it longer and longer by one bit (alternatively: doubling the denominator), until we come up with a candidate that fits between a and b. Here is my first attempt:

function nextHigherInteger(a) {
 var t = Math.ceil(a);
 if (t == a) t++;
 return t;
}

// Preconditions: 0 ≤ a < b
function shortest(a, b) {
 var t = nextHigherInteger(a);
 var bit = 1;
 while (t >= b) {
  bit /= 2;
  t -= bit;
 }
 return t;
}

You may have noticed that the code is italicized. That's because it's wrong, but let's walk through it anyway so we can understand what it is trying to do:

We start by calculating the smallest integer greater than or equal to a. That is our candidate. If that is less than b, then we have a winner. Otherwise, we subtract a bit and try again. Each time we try, the amount we subtract is cut in half, so that it remains a single bit.

This seems to work for shortest(0.1, 0.3), but it doesn't work for shortest(0.3, 0.4) because we forgot to check whether subtracting the next bit took us below a and out of the range.

Let's try again:

function shortest(a, b) {
 var t = nextHigherInteger(a);
 var bit = 1;
 while (bit) {
  bit /= 2;
  if (t <= a) t += bit;
  else if (t >= b) t-= bit;
  else return t;
 }
 throw "Unable to find a value in range";
}

This time, after making our initial guess, we see if it lies in the target range. If it's too low, we bump it up a little; if it's too high, we drop it a little. Each time we adjust the value, we do so by a smaller and smaller amount, which lets us hit a smaller and smaller target range. The total adjustment is the Zeno sum ½ + ¼ + ⅛ + ... → 1. As a worst case, our initial guess overshoots a by 1 − ε, so our repeated adjustments will eventually bring us as close to a as desired, if you just wait long enough. Since we assume that b > a, we will eventually adjust enough so that the adjusted value is between a and b.

Another way of looking at this problem is by zooming the range up rather than zooming the trial value down; i.e., we keep shifting the binary point to the right (doubling the size of the range) until we can fit an integer into the range. Once we find such an integer, we can then scale it back down to the original interval.

function shortest(a, b) {
 var den = 1;
 var t = nextHigherInteger(a);
 while (t >= b) {
  a *= 2;
  b *= 2;
  den *= 2;
  t = nextHigherInteger(a);
 }
 return t / den;
}

¹ Therefore, it doesn't so much solve a real problem as it solves a surreal problem. Yes, it's a bad math pun. I don't apologize.

Comments (13)
  1. Scarlet Manuka says:

    This seems like a case where, assuming you don’t care about the output format, it’d be easier to deal with the inputs as fixed point binary strings. Then you just have to go along the strings and look for the first place where they don’t match, put a 1 in that position and you’re done.

    Now performance wise this may not be too great, depending on how your language of choice deals with strings, but conceptually it seems simpler to me.

    1. Xinthia Douglas says:

      If the numbers are fixed-point, you can do that by first xorring the two inputs, and then finding the first 1. Here are some ideas on how to do that: http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious
      The most obvious one would be to left shift the number to test, while simultaneously right shifting our result number.
      And I agree that conceptually, it does seem simpler.

    2. Neil says:

      I don’t think that algorithm agrees with Raymond’s for inputs 0.25, 0.5 – his returns 0.25 but you return 0.5.

      1. Neil says:

        Hmm, thinking about it, I’d expect the surreal number for { 0.25 | 0.5 } would be 0.375, but maybe I miscalculated Raymond’s algorithm.

        1. More likely is that I miscalculated my algorithm.

      2. Xinthia Douglas says:

        But isn’t 0.5 shorter than 0.25? And expressed in binary, 0.01 <= 0.1 <= 0.1.

    3. If you’re going to do that, then skip the string and just do bit operations directly on the mantissa.

  2. Steinar H. Gunderson says:

    I believe it’s possible to do this without a for loop, in just a few instructions, if you’re willing to assume numbers are IEEE 754 floats. It’s more complicated, but here’s an outline (ignoring NaNs and inf for now):

    – If the signs differ, either the numbers are -0.0 and +0.0 (in which case there is no solution), or +0.0 is your solution.
    – If the exponents are the same, find the first digit of the mantissa where they differ (the smaller will be 0 and the larger will be 1, at least assuming positive numbers). You can do this in constant time with XOR and a bit-search instruction. Try setting the 0-bit in the smaller to 1 and truncating the rest; if that’s not equal to the larger number, that’s your solution. If it _is_ equal to the larger, set the end to …01 instead of …1. That’s your solution.
    – If the exponents are different, I’m not sure if there’s sufficient information in your problem. (What’s the right solution for 2.0 and 32.0; are 4, 8 and 16 all good solutions? Or 2.5? Or are these numbers implicitly fixed-point with the 1.0 bit as the first one?) Anyway, for one possible interpretation, you can take the smaller number (again assuming they are positive), find the first 0 bit, flip it to 1 and truncate as above. If the mantissa is all-ones, the solution is to just add epsilon, which takes you up to the next exponent.

    If the numbers are to be interpreted as fixed-point between 0.0 and 1.0, you can convert them to fixed-point by just adding 1.0 (making them between 1.0 and 2.0), and then you can follow case #2 above. That’s probably cheating, though, since it chops off precision :-)

    1. For (2.0, 32.0) the answer is 3. The smallest denominator is 1, and the smallest numerator for that denominator is 3.

  3. Xinthia Douglas says:

    ‘Your comment is awaiting moderation.’
    Can I ask why it is taking so long?

    1. I’m kind of busy. Also, comments with fake email addresses are not accepted.

  4. Lev says:

    The practical usage of this exercise is in Fano coding ( https://en.wikipedia.org/wiki/Shannon%E2%80%93Fano_coding ).

Comments are closed.

Skip to main content