The Solution To The Simple Puzzle

The first time I ran my histogram visualizer I asked for a Cauchy distribution with a minimum of -10 and a maximum of 10, and of course I got a graph that looks much like the one from my article of last week:

Looks perfectly reasonable; I guess my program is correct right out of the gate, because I am that awesome!

Then I went to make a graph of a uniform distribution with a minimum of zero and a maximum of one, but I forgot to update the actual query; it still gave me a Cauchy distribution. Here's that same Cauchy distribution this time graphed only from 0 to 1. Oh, the pain:

Which is obviously neither uniform nor Cauchy. Equally obvious: I am not sufficiently awesome to write a twenty-line program without a trivial floating point bug the first time.

The bug, which is very subtle in the first graph, was now obvious: the calculation to determine what the count is for the leftmost bucket is wrong. Why? Because converting a double to an integer simply discards the fractional part, effectively truncating towards zero, and "towards zero" is not downwards if any datum is negative. That means that the leftmost bucket got everything that was supposed to be in it, and everything that was supposed to be in the bucket to its left as well! The solution is either to take the floor of the number before turning it into an int, or to check to see if the double is in the right range before truncating it, not after.
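Here is a minimal sketch of that fix, combining both suggestions: take the floor first, and do the range check on the double before converting it. The class name `HistogramSketch`, the method name `CreateHistogram`, and its signature are assumptions for illustration; this is not necessarily how the original visualizer was written.

```csharp
using System;
using System.Collections.Generic;

static class HistogramSketch
{
    // Hypothetical helper: counts how many data points fall into each of
    // `buckets` equal-width buckets covering the half-open range [min, max).
    public static int[] CreateHistogram(
        IEnumerable<double> data, int buckets, double min, double max)
    {
        int[] results = new int[buckets];
        double width = (max - min) / buckets;
        foreach (double datum in data)
        {
            // Math.Floor rounds toward negative infinity, so a datum just
            // below min produces -1 here. A plain (int) cast would truncate
            // toward zero and produce 0, putting it in the leftmost bucket.
            double position = Math.Floor((datum - min) / width);

            // Check the range while the value is still a double; only
            // in-range values are ever converted to int.
            if (0 <= position && position < buckets)
                results[(int)position] += 1;
        }
        return results;
    }
}
```

With two buckets over the range 0 to 1, the data -0.25, 0.25 and 0.75 produce one count in each bucket. A truncating cast would have counted -0.25 in the leftmost bucket as well, since -0.5 truncates to 0.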

1. NickLarsen says:

Equally obvious: comments like these are what make this blog awesome.

2. Brian says:

If you aren't sufficiently awesome, then what hope do the rest of us have? I guess that means we will have to test our software (only when working with graphs or floating point numbers, of course).

3. Eugene says:

Now it suddenly becomes obvious why remainder is not a useful operation and why division should round downwards (re blogs.msdn.com/…/what-s-the-difference-remainder-vs-modulus.aspx)

4. I assure you that my sample programs have even worse stupid errors. Your genius is that it occurred to you to make a blog entry out of them.

5. Eamon Nerbonne says:

This bug is exactly the reason that the behavior you describe in blogs.msdn.com/…/what-s-the-difference-remainder-vs-modulus.aspx is a design flaw: The misdesigned semantics _encourage_ these kinds of mistakes.

6. voo says:

@Eamon I agree. I can't think of a single case where I'd want the modulus to be negative, but there are lots of situations where the converse (always positive) is a useful attribute.

7. CodeInChaos says:

One more puzzle involving random numbers:

var r = new Random();
const int n = 1000000;
Console.WriteLine(Enumerable.Range(0, n).Count(_ => r.Next(1431655765) % 2 == 0) / (double)n);

What does this output when you run it on .net 4?

8. Ted says:

A common truncation error. This is on par with overloading the '==' operator in C# and forgetting to check for null left-hand and right-hand operands. MSDN help topics have several examples lacking null checks.

9. MalayKPandey says:

This logic works fine for me:

private static int[] CreateHistogram(IEnumerable<double> data, int buckets, double min, double max)
{
    int[] results = new int[buckets];
    double divisor = (max - min) / buckets;
    foreach (double datum in data)
    {
        int index = (int)((datum - min) / divisor);
        if (0 <= index && index < buckets)
            results[index] += 1;
    }
    return results;
}