What is the historical reason for MulDiv(1, -0x80000000, -0x80000000) returning 2?

Commenter rs asks, "Why does Windows (historically) return 2 for MulDiv(1, -0x80000000, -0x80000000) while Wine returns zero?"

The MulDiv function multiplies the first two parameters and divides by the third. Therefore, the mathematically correct answer for MulDiv(1, -0x80000000, -0x80000000) is 1, because a × b ÷ b = a for all nonzero b.

So both Windows and Wine get it wrong. I don't know why Wine gets it wrong, but I dug through the archives to figure out what happened to Windows.

First, some background. What's the point of the MulDiv function anyway?

Back in the days of 16-bit Windows, floating point was very expensive. Most people did not have math coprocessors, so floating point was performed via software emulation. And the software emulation was slow. First, you issued a floating point operation on the assumption that you had a floating point coprocessor. If you didn't, then a coprocessor not available exception was raised. This exception handler had a lot of work to do.

It decoded the instruction that caused the exception and then emulated the operation. For example, if the bytes at the point of the exception were d9 45 08, the exception handler would have to figure out that the instruction was fld dword ptr ds:[di][8]. It then had to simulate the operation of that instruction. In this case, it would retrieve the caller's di register, add 8 to that value, load four bytes from that address (relative to the caller's ds register), expand them from 32-bit floating point to 80-bit floating point, and push them onto a pretend floating point stack. Then it advanced the instruction pointer three bytes and resumed execution.

This took an instruction that with a coprocessor would take around 40 cycles (already slow) and ballooned its total execution time to a few hundred, probably a few thousand, cycles. (I didn't bother counting. Those who are offended by this horrific laziness on my part can apply for a refund.)

It was in this sort of floating point-hostile environment that Windows was originally developed. As a result, Windows has historically avoided using floating point and preferred to use integers. And one of the things you often have to do with integers is scale them by some ratio. For example, a horizontal dialog unit is ¼ of the average character width, and a vertical dialog unit is 1/8 of the average character height. If you have a value of, say, 15 horizontal dlu, the corresponding number of pixels is 15 × average character width ÷ 4. This multiply-then-divide operation is quite common, and that's the model that the MulDiv function is designed to help out with.
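To make the dialog unit arithmetic concrete, here is a minimal sketch in C. The helper name and the sample average character width of 7 pixels are made up for illustration; real code would ask the system for the dialog base units rather than hard-coding them.

```c
/* Hypothetical helper (not a Windows API): convert horizontal dialog
   units to pixels. A horizontal dlu is 1/4 of the average character
   width, so pixels = dlu * average character width / 4. */
static int horizontal_dlu_to_pixels(int dlu, int avg_char_width)
{
    /* Multiply first, then divide, so the fraction is not lost early.
       Plain integer division truncates; it does not round. */
    return dlu * avg_char_width / 4;
}

/* Example: 15 dlu at an assumed average character width of 7 pixels
   is 15 * 7 / 4 = 105 / 4, which truncates to 26 pixels. */
```

Note that this naive version has exactly the weaknesses described next: the product can overflow, and the division truncates instead of rounding.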

In particular, MulDiv took care of three things that a simple a × b ÷ c didn't. (And remember, we're in 16-bit Windows, so a, b and c are all 16-bit signed values.)

  • The intermediate product a × b was computed as a 32-bit value, thereby avoiding overflow.

  • The result was rounded to the nearest integer instead of truncated toward zero.

  • If c = 0 or if the result did not fit in a signed 16-bit integer, it returned INT_MAX or INT_MIN as appropriate.
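In portable C, those three behaviors might look something like the sketch below. This is only an illustration of the rules listed above, not the original code: the function name is invented, and the way the sign is chosen when c = 0 is my assumption.

```c
#include <stdint.h>

/* Hypothetical sketch of the three 16-bit MulDiv behaviors. */
static int16_t muldiv16_sketch(int16_t a, int16_t b, int16_t c)
{
    /* Sign of the mathematical result. For c == 0 only a and b
       contribute (an assumption made for this sketch). */
    int negative = ((a < 0) != (b < 0)) != (c < 0);
    if (c == 0) return negative ? INT16_MIN : INT16_MAX;

    /* 1. Widen before multiplying so a * b cannot overflow. */
    int32_t prod = (int32_t)a * (int32_t)b;
    int32_t den = c;

    /* Work with magnitudes; both fit comfortably in 32 bits. */
    if (prod < 0) prod = -prod;
    if (den < 0) den = -den;

    /* 2. Add half the denominator so the quotient rounds to nearest. */
    int32_t q = (prod + den / 2) / den;
    if (negative) q = -q;

    /* 3. Saturate if the result does not fit in 16 bits. */
    if (q > INT16_MAX) return INT16_MAX;
    if (q < INT16_MIN) return INT16_MIN;
    return (int16_t)q;
}
```

For example, muldiv16_sketch(300, 300, 100) gives 900 even though 300 × 300 does not fit in 16 bits, and muldiv16_sketch(7, 1, 2) gives 4 rather than the truncated 3.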

The MulDiv function was written in assembly language, as was most of GDI at the time. Oh right, the MulDiv function was exported by GDI in 16-bit Windows. Why? Probably because they were the people who needed the function first, so they ended up writing it.

Anyway, after I studied the assembly language for the function, I found the bug. A shr instruction was accidentally coded as sar. The problem manifests itself only for the denominator −0x8000, because that's the only one whose absolute value has the high bit set.

The purpose of the sar instruction was to divide the denominator by two, so it can get the appropriate rounding behavior when there is a remainder. Reverse-compiling back into C, the function goes like this:

int16 MulDiv(int16 a, int16 b, int16 c)
{
 int16 sign = a ^ b ^ c; // sign of result

 // make everything positive; we will apply sign at the end
 if (a < 0) a = -a;
 if (b < 0) b = -b;
 if (c < 0) c = -c;

 //  add half the denominator to get rounding behavior
 uint32 prod = UInt16x16To32(a, b) + c / 2;
 if (HIWORD(prod) >= c) goto overflow;
 int16 result = UInt32Div16To16(prod, c);
 if (result < 0) goto overflow;
 if (sign < 0) result = -result;
 return result;

overflow:
 return sign < 0 ? INT_MIN : INT_MAX;
}

Given that I've already told you where the bug is, it should be pretty easy to spot in the code above.
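For those who would rather run it than squint at it, here is a rough model in standard C. It is a sketch under my own assumptions, not the original assembly: the function name and the buggy_sar switch are invented, and the 16-bit registers are modeled with uint16_t so that the wraparound and the two kinds of shift are explicit.

```c
#include <stdint.h>

/* Hypothetical model of the 16-bit routine. Pass buggy_sar = 1 to halve
   the denominator with an arithmetic shift (the bug), or 0 to use a
   logical shift (the intent). */
static int16_t muldiv16_model(int16_t a, int16_t b, int16_t c, int buggy_sar)
{
    uint16_t sign = (uint16_t)(a ^ b ^ c);   /* high bit = sign of result */
    int16_t saturated = (sign & 0x8000) ? INT16_MIN : INT16_MAX;

    /* 16-bit negation wraps around: -(-0x8000) is still 0x8000. */
    uint16_t ua = (uint16_t)a, ub = (uint16_t)b, uc = (uint16_t)c;
    if (ua & 0x8000) ua = (uint16_t)(0u - ua);
    if (ub & 0x8000) ub = (uint16_t)(0u - ub);
    if (uc & 0x8000) uc = (uint16_t)(0u - uc);

    uint16_t half = uc >> 1;                          /* shr: 0x8000 -> 0x4000 */
    if (buggy_sar && (uc & 0x8000)) half |= 0x8000;   /* sar: 0x8000 -> 0xC000 */

    /* 16x16 -> 32 multiply, plus "half the denominator" for rounding. */
    uint32_t prod = (uint32_t)ua * ub + half;

    /* If the high word is >= the divisor, the quotient needs more than
       16 bits. This also catches a zero divisor. */
    if ((prod >> 16) >= uc) return saturated;

    uint16_t q = (uint16_t)(prod / uc);
    if (q & 0x8000) return saturated;                 /* too big for signed 16 bits */

    return (sign & 0x8000) ? (int16_t)-(int32_t)q : (int16_t)q;
}
```

With a = 1 and b = c = −0x8000, the arithmetic shift turns 0x8000 into 0xC000 instead of 0x4000, so the dividend becomes 0x8000 + 0xC000 = 0x14000, and 0x14000 ÷ 0x8000 truncates to 2. With the logical shift, the dividend is 0xC000 and the quotient is 1.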

Anyway, when this assembly language function was ported to Win32, it was ported as, well, an assembly language function. And the port was so successful, it even preserved (probably by accident) the sign extension bug.

Mind you, it's a bug with amazing seniority.

Comments (47)
  1. I wonder if there is a tool for searching a computer for PE images that import a given function.  I would be curious to know what software might still be importing this esoteric function!

  2. henke37 says:

    I have seen ports like that before. New fancy wrapping, same old code.

  3. Odder still: if MSDN is correct, why are they carrying these functions over to Windows 8 Metro-style apps, which supposedly dispense with the old Win32 way of doing things?  I can't imagine a situation where you'd need them – every compiler-writer is probably going to be using built-in processor instructions for this or providing their own emulation…

    ["In particular, MulDiv took care of three things that a simple a × b ÷ c didn't." Two of those three things still apply today. -Raymond]
  4. SimonRev says:


    Considering that the MSDN docs on LOGFONT suggest using MulDiv to convert a point size to lfHeight, I suspect a lot of software is using that function.

  5. S says:

    @JamesJohnston – You'll find a lot of apps still use this function. Although its origins are quite esoteric, because the win32 api tends to use all integers, and a lot of devs still believe that floating point maths is really slow even when it isn't, people still use it. Plus this msdn page (msdn.microsoft.com/…) shows the canonical method to initialize a LOGFONT structure, which I know I have written lots of times in lots of apps.

  6. Beard says:

    I imagine Wine gets it wrong for compatibility reasons ;)

  7. Are you sure? says:

    @Beard: Then wouldn't Wine return 2, not 0?

  8. Surely you mean 0x8000 rather than 0x80000000

  9. Oh, I see, it started out as 0x8000 but then made the jump to 0x80000000 when it was ported

  10. kog999 says:

    I would like to apply for my refund. When should i expect to receive it?

  11. Adam Rosenfield says:

    The Wine source code can be found here, for the curious: source.winehq.org/…/kernel_main.c

  12. alegr1 says:


    MulDiv is a very convenient function, for example, to scale coordinates from some app units to pixels (or the other way around). Nothing esoteric about it.

  13. Mike Mol says:

    At a guess, I'd suspect WINE gets it wrong because it's trying to duplicate Windows behaviors, so that apps which depend on Windows behaviors don't break under WINE.

  14. voo says:

    Another person to add to the ever growing list of people who never thought that abs(x) < 0 was a possibility. Well can't blame them..

    @Mike Mol: Keeping backwards compatibility by introducing some completely different bug? Interesting idea. Note that the wine version is actually worse than the original windows version, because we're relying on signed overflow because it's written in C, not asm and that's undefined behavior in C.

  15. Cesar says:

    @Adam Rosenfield: I prefer reading the source using LXR: source.winehq.org/…/kernel_main.c

    I believe Wine based the function on the documentation, and no real-world program depended on the buggy result in such a way that it would break under Wine, so they did not know (or did not care) about the difference. A lot of Wine code seems to be written on demand (a function is written because some program uses it), unlike Windows which has to write the code before programs start using it.

  16. Evan says:

    I like how the Wine folks apparently haven't heard of INT_MIN and INT_MAX and actually wrote out 2147483647.

  17. @Evan says:

    INT_MIN != -INT_MAX  (INT_MIN = 0x80000000 = -2147483648, INT_MAX = 0x7fffffff =  +2147483647)

  18. rs says:

    Thank you for answering my question! I ran into this behavior when I used MulDiv for some coordinate scaling. I was using -0x80000000 as a sentinel value and first assumed no special handling was needed in the MulDiv call since it would behave as an identity function. However, some simple testing showed this was not true.

    The historical information you collected is awesome and makes this perfectly comprehensible. If a=1, b=c=-0x8000, and (-c)/2=0x8000/2 is computed as 0xc000 (=(uint16)-0x4000) instead of 0x4000, the result is (0x8000+0xc000) / 0x8000, which would explain the value 2.

  19. dave says:

    but on the opposite just think on how hard it would be to implement fork() on Windows..)

    Easy enough,  I suppose, given that NtCreateProcess has an input argument that tells it to use an existing address space (and of course it was designed that way to make fork possible).

    The now-just-about-obsolete book Windows NT/2000 Native API Reference (Nebbett, 2000) has an example sketching out how you'd implement fork for Win32 processes.

  20. Cesar says:

    @dave: AFAIK, you lose the Win32 subsystem if you do it that way.

  21. Ian Boyd says:

    "Those who are offended by this horrific laziness on my part can apply for a refund."  <3 this blog

  22. @mmm, @dave, @Cesar:

    I'll just leave this here…

    #5.5: cygwin.com/faq-nochunks.html

    #4.44: cygwin.com/faq-nochunks.html

    Process Creation: cygwin.com/…/highlights.html

  23. Dealing with the overflow in the intermediate calculation is really useful, but I found the rounding behaviour of MulDiv introduced bugs into code which assumed integer division would always round down (like it normally does), so I ended up replacing MulDiv with my own simple MulDivRoundDown.

    It depends what you use it for whether the rounding is good or bad, but it's worth keeping in mind that it is not like normal integer multiplication followed by division, even ignoring this quirk and the handling of division by zero and overflows. Don't blindly drop MulDiv into some code just to avoid the overflow problem.

  24. Adam Rosenfield says:

    @JamesJohnston: This should work under Cygwin, though it will be rather slow:

    grep -r --include=*.{exe,dll} -l -Z '\<MulDiv\>' /cygdrive/* | (while read -r -d '' x; do dumpbin /imports "$x" | grep -l --label="$x" '\<MulDiv\>'; done)

    You could also replace the initial grep with a "find /cygdrive/* -iname *.dll -o -iname *.exe -print0"; it'll do less I/O but require more dumpbin processes to be created (one for every executable, instead of one for every potential match), so it's not clear to me which is faster.

  25. Anonymous Coward says:

    Note for those of us puzzled by the minus signs:

    -x = ~x + 1

    -80000000h = 7FFFFFFFh + 1 = 80000000h

    Because of overflow roll-around, there is no difference whether the minus signs are there or not.

  26. Evan says:

    Oh, by the way, the overflow issue reminds me of this cute little puzzle: write a function to average two C 'int's. If the average is not an integer you can round up or down, it doesn't matter; in fact, you don't even have to be consistent between different numbers. (This problem arose for me in doing a binary search over the space of ints.)

    There are a couple different "levels" of answers. The one that made it into code was "cast to a long long, average, then cast back" along with a static assertion that sizeof(long long) > sizeof(int). But that's intellectually unsatisfying; for instance, it doesn't tell you how to average two long longs besides "use an arbitrary-precision arithmetic library."

    So one step up from "use a larger integer" is to do it while making other mostly-reasonable assumptions about the platform, e.g. that 'int' is a 32-bit, two's complement number that wraps on overflow. The step up from that is to do it so it'll work on any conforming C platform.

  27. Joshua says:

    @Evan: (t1) + ((t2 - t1) / 2)

    This works on two's complement wrap-around but bitness doesn't matter (they can even be pointers). It also works on ones' complement if you cast to unsigned-type and back.

    [Yup, this is an old problem. -Raymond]
  28. mmm says:

    >> and no real-world program depended on the buggy result in such a way that it would break under Wine

    To be honest, so many programs break under Wine (*) that there is a higher than zero chance that this might be one of the thousands of little issues that make apps break.

    (*) => from the above statement it might seem that I do not appreciate Wine. On the contrary – just their job is unbelievably difficult: to emulate a moving target, supporting apps relying on all kinds of undocumented behavior, all this running on systems which (by virtue of having a different kernel) might make some simple tasks immensely difficult (I can't come up with an example in that direction, but on the opposite just think on how hard it would be to implement fork() on Windows..).

  29. Ivo says:

    How about (a>>1)+(b>>1)+(a&b&1)? I believe it will always round down. (a>>1)+(b>>1)+((a|b)&1) to round up.

  30. Evan says:

    Yes, I know.

    First, I'd have to work through the math and what happens with overflow and stuff to know this for sure (because they sure didn't comment it well if it's correct) but from just glancing at things it seems like the check *should* have been against INT_MIN rather than INT_MIN+1.

    Second, even if the code is right, interestingly enough, -2147483647 = -2147483648+1 = INT_MIN+1, which is probably a better way of writing it.

  31. steven says:

    How to reverse-compile assembly into C?

  32. Jim says:

    Steven: use an appropriately skilled (and willing) human being.

  33. Paul M. Parks says:

    @steven: I usually describe it as turning hamburger back into a cow.

  34. Evan says:

    [I've tried posting this before, but I think it was dropped. Sorry if it's a double (or triple!) post.]

    @Joshua:  (t1) + ((t2 - t1) / 2)

    "The average of INT_MIN and INT_MAX is -2147483648". (t2-t1 can overflow.) That can be the basis of a solution, but isn't completely correct itself.

    @Ivo: (a>>1)+(b>>1)+(a&b&1)

    '>>' is allowed to do a logical shift (giving a positive number, of course), but IIRC that's a works-in-practice solution.

  35. ThomasX says:

    Is this bug already fixed? Life in this universe cannot possibly go on otherwise.

  36. Neil says:

    I would have implemented MulDiv(a, b, c) as a * b * 2 / c / 2 (with appropriate rounding) and it wouldn't have occurred to me to check the behaviour of INT_MIN either.

  37. Dave Totzke says:

    >the exception handler would have to figure out that the instruction was fld dword ptr ds:[di][8]

    It is things like this that keep me constantly amazed that any of this computer stuff even works at all given what goes on close to the metal. I mean this in a complimentary way of course.

    It is also the very reason that I am thankful that I work many layers of abstraction removed from the metal.

  38. Matt Graeber says:

    @JamesJohnston, @Adam Rosenfield

    Here's a quick powershell script to dump all dlls or exes that import MulDiv:

    Get-ChildItem $env:windir -Recurse -Include *.dll,*.exe -ErrorAction SilentlyContinue |
      ForEach-Object {
         $Assembly = $_
         Invoke-Expression "dumpbin /IMPORTS $Assembly" |
            Where-Object { $_ -match '(?i:MulDiv)' } |
            ForEach-Object { $Assembly }
      }


  39. meh says:

    it should be pretty easy to spot in the code above

    :( I saw the 2's complement overflow, but couldn't work out whether 2, 1.5 (rounded to 2), or something else would be returned in the end. My knowledge of C (ANSI or pre-ANSI) integer promotion, arithmetic signed/unsigned conversion, etc, being rusty, I tried to see what would happen when copying the code to a compiler and changing all the variables to 32/64 width to hopefully produce the same result. Anyways, I was surprised to discover that UInt32x32To64 is defined in WinNT.h!

  40. @Cesar:

    No you don't lose the Win32 subsystem that way. To a certain extent, calling native functions through LoadLibrary/GetModuleHandle and GetProcAddress is supported. That is the reason for the existence of winternl.h and for pages like msdn.microsoft.com/…/bb432200(VS.85).aspx in the msdn. Also, to stop your application from being Win32, you have to link with /subsystem:native. If you use /subsystem:windows or /subsystem:console then your application will be seen as a Win32 process regardless.

    The only thing to get in the way is the lack of documentation for NtCreateProcess.

  41. Yuhong Bao says:

    BTW, why was MulDiv not extended to 64-bit in Win64?

  42. Cesar says:

    @Crescens2k: search on your favorite search engine for "NtCreateProcess cygwin fork" (or similar queries). You will find mailing list posts where they discussed why it does not work, and one of the reasons is that yes, you lose the connection to csrss (which is where part of Win32 resides).

  43. Evan says:

    BTW, in case anyone cares and is still reading this, here's one of the most satisfying answers to my average question: a/2 + (a%2 + b%2)/2 + b/2. It's fully standards-compliant AFAICT, as the standard requires / and % to be consistent. -5/2 is allowed to return either -2 or -3, but if -5/2==-2 then -5%2 must be -1, and if -5/2==-3 then -5%2 must be +1. That works either way it's set up. (Incidentally, now you should also be able to say why the order of operations matters and doing a/2 + b/2 + (a%2 + b%2)/2 isn't correct.)

    That solution isn't mine to have come up with, but I like it quite a bit.

  44. @Cesar:

    And this is where the lack of documentation on NtCreateProcess comes in. If you actually go and reverse engineer CreateProcess you will notice that it is more than a thin wrapper around NtCreateProcess.

    CreateProcess itself actually calls RtlCreateProcessParams which does the proper setting up and when necessary, it calls NtCreateProcess. As a bit of a guess, I'm thinking that NtCreateProcess just creates the process object. So again, it isn't that you lose Win32 that way, it is the lack of documentation on NtCreateProcess that is the problem.

  45. Gabe says:

    Crescens2k: Indeed, NtCreateProcess is merely a tiny cog inside the giant Win32 CreateProcess machine. I once looked at the source code for CreateProcess to understand how to write something like fork(), and it was several pages of code. It has to create the address space, parse the command line, load the executable, create the initial thread, create the environment, and so on. There are dozens of system calls, of which just one is NtCreateProcess.

    In true microkernel fashion, the NtCreateProcess function does nothing but create a container with some inherited handles, so I don't think that lack of documentation on it is as important as the lack of documentation on everything else that CreateProcess has to do.

    It's hard to implement fork() on Win32 because the whole system wasn't designed to support cloning processes.

  46. Anonymous Coward says:

    Note that you should avoid fork() even on Linux/Unix/Posix. The reason is that if you use fork() from a non-trivial program, you get a choice between either turning overcommit on (which is dangerous, and I'm sad to say the default on many operating systems, although fortunately not on Windows) or risk poor performance and high memory and swap consumption.

    Don't use fork(), there are better alternatives, especially for fork-exec, which is from the code I've seen the most common use of fork().

Comments are closed.