Today, another episode of my ongoing series "What's the difference?" Today, what's the difference between a remainder and a modulus, and which, if either, does the % operator represent in C#?

A powerful idea that you see come up in mathematics and computer programming over and over again is the idea of an **equivalence relation**.

First off, let's define (again) what a "relation" is; a relation is a function that takes two things, say, integers, and returns a Boolean that says whether the two integers are "related" or not. For example, "is greater than" is a relation. It takes two integers, say 4 and 2, and returns a Boolean that indicates whether 4 "is greater than" 2 or not; in this case, yes, it is.

An "equivalence relation" is a relation which is reflexive, symmetric, and transitive. That is, if ∾ is an equivalence relation, then:

Reflexive: X∾X is true for any X

Symmetric: X∾Y = Y∾X for any X and Y

Transitive: if X∾Y and Y∾Z are both true then X∾Z is also true.

Clearly "is greater than" is not an equivalence relation; though it is transitive, it is neither symmetric nor reflexive. "Have the same sign" on the other hand, is an equivalence relation; all negative numbers are equivalent to each other, all positive numbers are equivalent to each other, and zero is equivalent to itself.

An equivalence relation partitions the set of integers into (possibly infinitely many) *equivalence classes*. For example, let's consider the equivalence relation A∾B if and only if A-B is evenly divisible by 4. So 0∾4 is true and -86∾2 is true, and so on. We can make four "equivalence classes" where every integer is in exactly one class, and every integer in each class is related to every other integer in that class. The classes are {0, 4, -4, 8, -8, 12, -12, ... }, {1, -3, 5, -7, 9, -11, ... }, {2, -2, 6, -6, 10, -10} and {3, -1, 7, -5, 11, -9, ... }.

These classes are classes of "equivalent" integers, where "equivalence" means "is equally good in terms of determining whether the relation holds". Equivalence classes are often identified by a "canonical" member; in the case of the equivalence classes of "integers that are congruent modulo four", the canonical members are usually taken to be 0, 1, 2 and 3.

One can of course generalize this; given any positive integer n you can create the equivalence relation "A∾B if and only if A-B is evenly divisible by n". This defines n equivalence classes, which are usually identified by the canonical elements 0, 1, ..., n-1. The relation is usually written (somewhat clumsily, in my opinion) as A ≡ B mod n and is read as "A is congruent to B modulo n".

It is very tempting to think of the A % M operator in C# as meaning "partition the integers into M equivalence classes and give me the canonical element associated with the class that contains A" operator. That is, if you say 123 % 4, the answer you get is 3 because 123 ≡ 3 mod 4, and 3 is the "canonical" element associated with the equivalence class that contains 123.

However, that is not at all what the % operator actually does in C#. The % operator is not the *canonical modulus operator*, it is the *remainder operator*. The A % B operator actually answer the question "If I divided A by B using integer arithmetic, what would the remainder be?"

In order to work out what the remainder is, we need to first clearly define our terms. The divisor, dividend, remainder and quotient must obey the following identity:

dividend = quotient * divisor + remainder

Now, that is not enough to uniquely determine an answer to the question of what, say, 123 / 4 is. 123 = 30 * 4 + 3 is the desired solution, but 123 = 6000 * 4 - 23877 is also a possible solution. We need to put restrictions on the quotient and remainder.

Naively we might say that the lesser quotient is the better quotient. But of course that does not help either, because 123 = 1 * 4 + 119 is also perfectly good, and 1 is less than 30, but 1 is not a better quotient. We need some better rules; the rules we've come up with to determine the quotient and remainder (assuming that the divisor and dividend are valid, that is, we're not dividing by zero and so on) are:

* If there is any quotient that makes the remainder zero then that's the quotient and the remainder is zero, and we're done.

* Otherwise, do the division in all non-negative integers. Take the **largest** quotient that keeps the remainder **positive**.

* If the divisor and dividend have opposite signs then make the quotient negative.

* The remainder can now be worked out so as to obey the identity.

The net result here is that integer division rounds towards zero. That should make sense; we want 123/4 to be 30 with a remainder of 3, not 31 with a remainder of -1. Similarly, we want -123/4 to be -30, not -31! It would be bizarre to say that 4 goes into 123 30 times but goes into -123 -31 times! One expects that changing the sign of a term changes the *sign* of the result; it does not change the *magnitude* of the result.

If -123/4 is -30, then what is the remainder? It must obey the identity, so the remainder is -3. That is not the canonical item associated with the equivalence class that contains -123; that canonical item is 1.

It's important to remember this fact when doing things like balancing a hash table:

int bucketIndex = item.GetHashCode() % bucketCount;

or determining if a number is odd:

var oddNumbers = someSequence.Where(x => x % 2 == 1);

The first is wrong because the array index can be negative if the hash code is negative; the second does not classify -123 as an odd number.** The % operator does not give the canonical modulus, it gives the remainder.**

Wow.

After so many years of C#, it's impressive to realize that I was wrong on the behaviour of such a simple operator Oo

What is the motivation for % to be remainder operator instead of modulus?

Or, in other words, in what situations remainder is more useful?

almost every instance of 60 in your arithmetic examples should be replaced with 30. 4 * 60 is 240. ðŸ™‚

Whoops; that's what I get for editing the example after it was written. (I was originally going to talk about parity, so the multiplicands were twos.) You've got to apply the edits *consistently* it turns out. Thanks for the note. -- Eric

Nitpick: By the given definition of Reflexiveness (Xâˆ¾X is true for any X), "Are both greater than zero" would not be an equivalence relation, as it would not hold for any X <= 0.

I'm not a mathematician, so I'm not sure if there is actually a slightly different true definition of reflexiveness, or if this is an incorrect example. Looking at Wikipedia I believe it's the latter? en.wikipedia.org/.../Equivalence_relation

But please don't get me wrong - this was a great post, Eric. ðŸ™‚

For reference, a short summary of the sign of the remainder in various languages:

1) Both same sign as divisor and same sign as dividend operators:

Ada, Common Lisp, Haskell, Matlab, Prolog and VHDL use â€˜modâ€™ and â€˜remâ€™; Ruby, Standard ML, Euphoria, Fortran, SenseTalk and Scheme use other names.

2) Same sign as dividend:

ActionScript, bc, bash, C99, C#, Clarion, ColdFusion, D, Go, Java, JavaScript, PHP, PowerShell, Torque Game Engine and Verilog use â€˜%â€™. AMPL, AppleScript, Game Maker, Objective Caml, Delphi, SQL99 and most Basics use various capitalisations of â€˜Modâ€™. Occam, Eiffel, PIC Basic Pro, RPG, Progress and Erlang use others.

This is the same as used by the x86 line of processors.

3) Same sign as divisor:

Lua 5, Python, Tcl and Perl use â€˜%â€™. Maple, Clojure, FileMaker, Mathematica, Minitab, Oberon, Turing, old-style Lua, MathCad and PL/I use various capitalisations of â€˜Modâ€™. Smalltalk, R and J use others.

4) Always positive:

Used by ALGOL-68, Pascal, Stata and Scheme R6RS. The last one also has a closest to zero remainder.

5) Implementation defined:

C90 and C++ use â€˜%â€™.

6) Not defined:

ASP and old-style Basic, although I've never come across something that ancient. Even GW-Basic exhibits the same behaviour as modern Basics.

(Loosely based on the â€˜modulo operationâ€™ Wikipedia article, one or two web searches and a quick test.)

P.S. It's pretty rude that the blogging software just throws your post away if you edit too long. Fortunately I saved it to the clipboard.

Some may find this paper, Division and Modulus for Computer Scientists, helpful:

legacy.cs.uu.nl/.../divmodnote.pdf

@Vlad - there's no good reason for it to be that way except that early hardware where C first ran defined the operation that way. The remainder function on negative numbers is pretty useless and one in practice invariably winds up fixing the result if negative numbers are a possibility. For all the reasons "it's important to remember this fact," it's clearly the wrong decision to make when designing a programming language's semantics.

@Matt

The reflexive property isn't dependent on just one comparison, but on the comparison of two comparisons. For example, It doesn't matter if a > 0 or b > 0, but that (a > 0) = (b > 0).

Here is a pseudo-implementation (in something that looks a lot like javascript) that might help:

function bothGreaterThanZero(a, b) {

return a > 0 && b > 0;

}

function isReflexive(a, b, func) {

x = func(a, b);

y = func(b, a);

return x === y;

}

var a = -1, b = -5;

var result = isReflexive(a, b, bothGreaterThanZero);

Since first and second are equal, the relation is reflexive.

@Darren: the PDPs for which C was originally cooked up had a very limited instruction set. Both kinds of remainders were available as common library functions though. Insofar as C#'s % definition is historically inspired, it is more likely due to the influence of the x86 idiv instruction.

Also note that the first major .Net languages were C# and VB.Net which had a C++ and VB heritage. Since C++'s % was implementation defined it may have made sense to choose VB's Mod behaviour.

DOH! I need a rewind on that example. That is what I get for posting without running.

function greaterThanZero(num) {

return num > 0;

}

function isReflexive(a, b, func) {

x = func(a);

y = func(b);

return x === y;

}

var a = -1, b = -5;

var result = isReflexive(a, b, greaterThanZero);

Surely, the first rule on the reminder you want is abs(reminder) < abs(divisor).

> Similarly, we want -123/4 to be -30, not -31!

We actually want it to be -30.75, and if you round that down (which is what you want in most cases), you get -31. This is what you get in languages inspired by math, rather than CPU instruction sets.

For a good treatment of the subject see Daan's classical paper: legacy.cs.uu.nl/.../divmodnote.pdf

Or even Wikipedia: en.wikipedia.org/.../Modulo_operation

good to know!

for some reason i thought that % operator always yields the canonical number ( a positive number). It turns out I was wrong and negative numbers are possible too.

@Eugene

> Surely, the first rule on the reminder you want is abs(reminder) < abs(divisor).

So? This requirement is observed by both the canonical modulus operation and by the x86-based remainder operation.

>> Similarly, we want -123/4 to be -30, not -31!

> We actually want it to be -30.75, and if you round that down

> (which is what you want in most cases)

Why? Why is "round toward negative infinity" a more useful operation than "round toward zero"?

Alternatively, why is it good to have a system where (-x)/y != -(x/y)?

This is one of my bugbears.

I don't think I have *EVER* found the remainder operation useful, except as a means to calculate the modulus. I've had to write a modulus function for pretty much every language I've ever used, and this isn't always entirely straightforward; handling edge cases (like INT_MIN) can be tricky.

Modulus really ought to be a member of the Math class.

This is an interesting article. I am like many of the other posters--in 25 years of practical programming, I have never used any language's "remainder" operator for anything other than modulo on positive integers.

Mandatory nitpick: not all relations are functions, although all functions are relations. And even more importantly, not all relations are binary.

@Dale

> So?

Well, if you take care to define reminder, do it properly.

> Why?

There's a long list of arguments in e.g. Knuth and Daan's paper which I linked has some good references.

For one, it makes x%2==1 for all odd x, and hashBucket = hash%numBuckets without casts to unsigned.

> Transitive: if Xâˆ¾Y and Yâˆ¾Z are both true then Xâˆ¾Z is also true; otherwise it is false

It seems there's an inaccuracy here, namely in the 'otherwise it is false' part. For equivalence there's no difference, but it is not part of 'transitive' definition. Assuming it is correct, we have that !(Xâˆ¾Y && Yâˆ¾Z) -> !X?Z. Contraposition of implication is equivalent transition, so Xâˆ¾Z -> Xâˆ¾Y && Yâˆ¾Z. So, put simply, we have that if Xâˆ¾Z is true, then for any Y should also be true Xâˆ¾Y && Yâˆ¾Z which is definitely incorrect for any transitive relation.

For simple example let's take > relation which is transitive as we know. Let X=3, Y=3, Z=1. 3>3 && 3>1 (X>Y && Y>Z) is not true, so according to your definition 3>1 (X>Z) should be false. Bah.

Dale:

> Alternatively, why is it good to have a system where (-x)/y != -(x/y)?

Suppose you're implementing a game, simulation or renderer where you have objects positioned in a 2D or 3D space. There is nothing special about the 0 coordinate, and you don't want a discontinuity as objects travel between negative and positive coordinates. Say you measure position in units and split space up into 3 unit square cells. If you use division and round-towards-zero integer division to determine cell, you get a discontinuity as an object moves across zero:

Position = ... -11, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7...

Cell = ... -3, -3, -3, -2, -2, -2, -1, -1, -1, 0, 0, 0, 0, 0, 1, 1, 1, 2, 2...

So your object spends longer in cell 0 and there's a discontinuity between negative and positive space.

Here's a similar bug where rounding direction errors have caused unintentionally different behaviour in a game for negative coordinate positions than positive ones:

gaming.stackexchange.com/.../do-positive-coordinate-locations-give-more-ore-in-minecraft

I'd say it's not that the property "For all x, y: if y!=0 then (-x)/y == -(x/y)" is undesirable, but that in practice it's not as useful a property as "For all x, y, k: if y!=0 then x / y + k == (x + ky) / y"

I would concur with the others and say that in 6 years as a professional programmer and maybe 20 years as an amateur I cannot think of a single instance of wanting the remainder operation, but I can think of a great many instances of wanting the modulus operator. (However, I also don't think I can remember an instance where I cared about the behaviour for a negative *divisor*, so I'm equally happy with floored division or Euclidian division, as described in the Wikipedia article.)

As an introductory reference on the subject, your article is missing one detail: an example telling how to get Modulus in C# for those cases where that IS what you want. What is the "best" way to do this? (Assuming you want a formula that works even in corner cases like INT_MIN).

public int Modulo (int x, int y)

{

if ( y == 0) return 0;

int reduced = (x / y);

int absReduced = reduced < 0 ? -1 * reduced : reduced;

return absReduced % y;

}

>> Transitive: if Xâˆ¾Y and Yâˆ¾Z are both true then Xâˆ¾Z is also true; otherwise it is false

Ivan> It seems there's an inaccuracy here, namely in the 'otherwise it is false' part. For equivalence there's no difference, [...] For simple example let's take > relation which is transitive as we know. Let X=3, Y=3, Z=1. 3>3 && 3>1 (X>Y && Y>Z) is not true, so according to your definition 3>1 (X>Z) should be false. Bah.

Actually, it fails for equivalences too. Take integer equality, X=1, Y=2, Z=1. X=Y is false, therefore X=Z, 1=1, is false.

More formally, the author wrote "(X~Y)&&(Y~Z)=(X~Z)" when he only meant to say "(X~Y)&&(Y~Z)->(X~Z)".

Overall though, well writen. It's interesting to contrast this further with other languages, as Anonymous Coward has. I was under the false impression for some time that C/C++ held to the remainder-quotient identity while rounding toward negative infinity, but this was just an artifact of the compiler/architecture. I have not performed % on a signed number since I learned the truth.

An easy way to get the Conical Modulus is to convert to uint. Using Eric's example:

int bucketIndex = (int)((uint)item.GetHashCode() % (uint)bucketCount);

@Aaron: yeah, good catch, thanks. ðŸ™‚

@DRBlaise: This doesn't work. For example, (-1 mod 4) is 3, but your code gives 1.

To calculate the modulus "x mod n" I tend to use one of the following methods:

1) general: y = x%n; if y<0 { y = y+n; }

2) when n is a power of 2 use a bit mask: y = x & (n-1)

3) in situations I know that x and n are not negative: y = x%n; (remainder and modulus are the same in this case)

And like many other respondents, I also never had any need for the "remainder" operation in my 20+ years of programming (and very frequent need of the modulus). The existing '%' operation cannot be changed in C# anymore, but isn't it time to add a proper modulus operator to the language? For example '%%'.

I tend to be really lazy:

1. int mod = (x % n + n) % n;

2. int mod = (x + bigConstant) % n; (if I know that x will never be smaller than -bigConstant, chosen to be a multiple of n)

> (Note that if a relation is an equivalence relation then the converse also holds: if Xâˆ¾Y and Yâˆ¾Z are not both true the Xâˆ¾Z is false.)

If X = 1, Y = 2, Z = 1, and the equivalence relation is "=", does this mean that if 1 = 2 and 2 = 1 are not both true (which they aren't), then 1 = 1 is false?

I think you meant if exactly one of Xâˆ¾Y and Yâˆ¾Z is true, then Xâˆ¾Z is false. Or I could be missing something.

@carlos: Converting to uint DOES work-Did you run your example through code (-1 mod 4)?

int x = -1; int y = 4; Console.WriteLine((uint)x % (uint)y);

This prints 3.

"(Note that if a relation is an equivalence relation then the converse also holds: if Xâˆ¾Y and Yâˆ¾Z are not both true the Xâˆ¾Z is false.)"

For example, 1=2 and 2=1 are not both true, so 1=1 is false.

@DRBlaise: Sorry, I had a brain fart. I was thinking (uint) is the same as Abs(), when I know very well it isn't.

However, your solution still fails unless 2 divides the dividend. e.g. for -1 mod 5 you get 0.

So I guess this has the same problem as everyone else as to what this operator does:

msdn.microsoft.com/.../ydwa9zh0.aspx

+1 to weeble0's comment. I have also been a professional programmer for 6-7 years and an amateur for 20 years, and I am also not aware of a single instance of wanting the "remainder" operation involving negative numbers, but I can think of several instances of wanting the modulus operator (i.e. where I want 'mod' to produce a positive result).

Normalizing angles comes to mind immediately. It would sure be SO nice if I could just write "angle % 360.0".

@DRBlais

Converting to uint only works when the divisor is a power of two. I think in early Delphi versions the modulo behavior worked by casting to uint, but I don't have such a version here to verify.

For hash tables it doesn't matter, because you're not interested in the canonical modulus, but just in a well distributed number 0<=i<n.

For an easy modulus that includes negatives, I use the below (in the call to WriteLine):

int n = 5;

for (int i = -10; i < 11; i++)

{

Console.WriteLine(i % n < 0 ? n+(i % n) : i % n);

}

It's wonderful to see mathematics still being talked about as it relates to programming. Too often it's just a matter of trying to see what give the answer that's useful to you in that situation and not bothering to look at the "real" story. Great article.

Just one month ago you recommended "Java Puzzlers: Traps, Pitfalls, And Corner Cases" on your C# reading list. The unintuitive behaviour of '%' is the very first "puzzler" in the book. You are proving your own point that it is a useful book for C# devs to read as well.

Eric, why use x % 2 == 1 instead of just x % 2 != 0 ?

Surely thi is a classic example of Principle of Greatest Surprise:

How to make sure to surprise the greatest number of programmers in the worst possible way at the worst possible time.

Haivng been bitten by this in C, but understanding why C++ had to remain backwards compatible, I would have bet real money that C# would make a more sensible decision. Wrong! DO we have to wait another 30 year for a programming language to get it right?