Find the index of the smallest element in a JavaScript array


Today's Little Program isn't even a program. It's just a function.

The problem statement is as follows: Given a nonempty JavaScript array of numbers, find the index of the smallest value. (If the smallest value appears more than once, then any such index is acceptable.)

One solution is simply to do the operation manually, simulating how you would perform the operation with paper and pencil: You start by saying that the first element is the winner, and then you go through the rest of the elements. If the next element is smaller than the one you have, you declare that element the new provisional winner.

function indexOfSmallest(a) {
 var lowest = 0;
 for (var i = 1; i < a.length; i++) {
  if (a[i] < a[lowest]) lowest = i;
 }
 return lowest;
}

Another solution is to use the reduce intrinsic to run the loop, so you merely have to provide the business logic of the initial guess and the if statement.

function indexOfSmallest(a) {
 return a.reduce(function(lowest, next, index) {
                   return next < a[lowest] ? index : lowest; },
                 0);
}

A third solution is to use JavaScript intrinsics to find the smallest element and then convert the element to its index.

function indexOfSmallest(a) {
 return a.indexOf(Math.min.apply(Math, a));
}

Which one is fastest?

Okay, well, first, before you decide which one is fastest, you need to make sure they are all correct. One thing you discover is that the min/indexOf technique fails once the array gets really, really large, or at least it does in Internet Explorer and Firefox. (In my case, Internet Explorer and Firefox gave up at around 250,000 and 500,000 elements, respectively.) That's because you start hitting engine limits on the number of parameters you can pass to a single function. Invoking apply on an array of 250,000 elements is the equivalent of calling min with 250,000 function parameters.
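(As an aside, if you really did need the min/indexOf technique on larger arrays, one workaround, sketched below purely for illustration and not benchmarked here, is to apply min in chunks small enough to stay under the argument limit. The helper name and chunk size are invented for the example.)

function chunkedMin(a) {
 // Sketch only: keep each apply call under the engine's argument limit.
 // The chunk size is arbitrary; anything comfortably below the limit works.
 var CHUNK = 32768;
 var best = Infinity;
 for (var i = 0; i < a.length; i += CHUNK) {
  best = Math.min(best, Math.min.apply(Math, a.slice(i, i + CHUNK)));
 }
 return best;
}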

So we'll limit ourselves to arrays of length at most 250,000.

Before I share the results, I want you to guess which algorithm you think will be the fastest and which will be the slowest.

Still waiting.

I expected the manual version to come in last place, because, after all, it's doing everything manually. I expected the reduce version to be slightly faster, because it offloads some of the work into an intrinsic (though the function call overhead may have negated any of that improvement). I expected the min/indexOf version to be fastest because nearly all the work is being done in intrinsics, and the cost of making two passes over the data would be made up by the improved performance of the intrinsics.

Here are the timings of the three versions with arrays of different size, running on random data. I've normalized run times so that the results are independent of CPU speed.

Relative running time per array element

Elements   manual  reduce  min/indexOf
Internet Explorer 9
 100,000    1.000   2.155   2.739
 200,000    1.014   2.324   3.099
 250,000    1.023   2.200   2.330
Internet Explorer 10
 100,000    1.000   4.057   4.302
 200,000    1.028   4.057   4.642
 250,000    1.019   4.091   4.068

Are you surprised? I sure was!

Not only did I have it completely backwards, but the margin of victory for the manual version was way beyond what I imagined.

(This shows that the only way to know your program's performance characteristics for sure is to sit down and measure it.)

What I think is going on is that the JavaScript optimizer can do a really good job of optimizing the manual code since it is very simple. There are no function calls, the loop body is just one line, it's all right out there in the open. The versions that use intrinsics end up hiding some of the information from the optimizer. (After all, the optimizer cannot predict ahead of time whether somebody has overridden the default implementation of Array.prototype.reduce or Math.min, so it cannot blindly inline the calls.) The result is that the manual version can run over twice as fast on IE9 and over four times as fast on IE10.
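To make that last point concrete: any script on the page is free to replace the intrinsic, so the engine must account for that possibility at run time. A contrived illustration:

// Contrived illustration (not part of the benchmark): once a script does this,
// a call to reduce no longer means what the optimizer might have hoped.
Array.prototype.reduce = function() { return "surprise"; };
console.log([3, 1, 2].reduce(function(x, y) { return x + y; })); // "surprise"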

I got it wrong because I thought of JavaScript too much like an interpreted language. In a purely interpreted language, the overhead of the interpreter is roughly proportional to the number of things you ask it to do, as opposed to how hard it was to do any of those things. It's like a fixed service fee imposed on every transaction, regardless of whether the transaction was for $100 or 50 cents. You therefore try to make one big purchase (call a complex intrinsic) instead of a lot of small ones (read an array element, compare two values, increment a variable, copy one variable to another).

Bonus chatter: I ran the test on Firefox, too, because I happened to have it handy.

Relative running time per array element

Elements   manual  reduce  min/indexOf
Firefox 16
 100,000    1.000  21.598   3.958
 200,000    0.848  21.701   2.515
 250,000    0.839  21.788   2.090

The same data collected on Firefox 16 (which sounds ridiculously old because Firefox will be on version 523 by the time this article reaches the head of the queue) shows a different profile, although the winner is the same. The manual loop and the min/indexOf version get more efficient as the array size increases. This suggests that there is a fixed overhead that becomes gradually less significant as you increase the size of the data set.

One thing that jumps out is that the reduce method way underperforms the others. My guess is that setting up the function call (in order to transition between the intrinsic and the script) is very expensive, and that implementors of the JavaScript engines haven't spent any time optimizing this case because reduce is not used much in real-world code.

Update: I exaggerated my naïveté to make for a better narrative. As I point out in the preface to my book, my stories may not be completely true, but they are true enough. Of course I know that JavaScript is jitted nowadays, and that changes the calculus. (Also, the hidden array copy.)

Comments (49)
  1. Mc says:

    I'm always telling my programmers to keep things as simple as possible and not to start optimizing code until you find it's necessary, and then to measure which bit is causing the problem and focus your efforts there.

  2. David Cuccia says:

    Reminds me of Herb Sutter's discussion in his recent TechEd talk about cache lines and prefetchers.

  3. anyfoo says:

    > Firefox 16 (which sounds ridiculously old because Firefox will be on version 523 by the time this article reaches the head of the queue)

    And predictably, you were right! It's not quite at 523 yet, but at 29 nonetheless. Almost twice the version number and 16 certainly seems old by now.

  4. Mike Caron says:

    My guess was aligned with the results, which makes me feel good, because every similar operation I've done in Javascript (look for biggest/smallest, sum, etc) looks like the loop version, because I didn't even know there was a reduce function in Array!

    On an unrelated note, the last version seems insane to me. Not a criticism, of course, since it was written in the interests of performance testing/exploration. However, I would definitely flag it if I saw it in a code review.

  5. RRR says:

    Unless I'm missing something here (and I might, cause I don't know crap about javascript), not surprised one bit.

    This is one case where the obvious way is obviously the best way.

    There is no way in hell you can find out the answer without going through the entire list, which is exactly what the "manual" way does.

    Anything else done by the reduce or math stuff can only add more execution time in this case unless they do some magic; and we all know magic does not exist unless you do some multithreading or something (which I assumed and which I later found to be true when I asked the google about this).

    Again, I might be missing something…

  6. RRR says:

    Correction:

    (which I assumed and which I later found to be true when I asked the google about this).

    should be

    (which I assumed about javascript in the browser and which I later found to be true when I asked google about this).

  7. Mike Caron says:

    @RRR:

    The thinking was that the reduce/min functions were written in native x86 code, not in javascript. Therefore, they are going to be much faster than the equivalent loop in javascript. Of course, as noted, there is overhead to function calls, and optimizations that can't take place when doing such things.

    I wonder if using Math.min.call(Math, a) would be any faster, since it would need to unpack the argument array…

    [Exactly. The core operation is minimal; most of the overhead is loop control. And native loop control is faster than interpreted loop control right? -Raymond]
  8. ZoBook says:

    I also think that the manual version is the fastest (not taking into account multi-threading) because, as RRR noted, the ONLY way to find out is to go through the entire list… how you do that determines the "speed" of the process. If you could, say, split the list in 4 and check the parts simultaneously, then you'd do about (total / 4) + 2 tests: when you have the results of the 4 sublists, you do one more round of tests (2 simultaneously, like the semi-finals of a soccer tournament) and then finally one more test (the final match).

  9. acq says:

    The reason you are surprised is that the JIT mechanisms in modern JavaScript engines have gotten really, really effective. The JIT does eventually produce native code for the loop, but typically it first has to see that the loop is worth compiling (by running the first passes in the interpreter). That's why you observe initial costs. Once the loop is running as native code, it is very effective.

    Here's one nice text about the internal architecture of one JIT multi-tier system.

    My favorite demonstration of the JavaScript's modern JIT speeds is http://bellard.org/jslinux/ As far as I remember, my fruit-flavored tablet v3 (AFAIK still without the last tier demonstrated above) is as fast to boot the Linux in JavaScript as my i5 650 running IE 11: around 18 seconds. JIT engines really fly today.

  10. Wladimir Palant says:

    The results weren't really surprising to me – but then again, I deal with JavaScript all the time. And I've learned a while ago that function calls are really expensive in JavaScript. So functions like Array.reduce() that will do a function call for each array element are neat but should never be used on performance-critical paths. As to the variant using Math.min() – the issue here should be the implicit copying of the original array into the arguments array. Simple loops like the one in the "manual" approach, on the other hand, will be optimized really well by the current JIT compilers; the difference from native code won't really be significant there.

  11. acq says:

    The text I've mentioned in the previous post, somehow link was missing: http://www.webkit.org/…/introducing-the-webkit-ftl-jit

  12. John Doe says:

    I'd expect the reduce version to be linearly slower than the inline loop, just because of the function call.  Just how much, only by benchmarking.

    I wouldn't expect anything from the min/indexOf version, because it can easily fail due to the possible implementation-dependent limit of arguments.

    Even so, I was expecting it to be about twice the manual version, because it must scan the whole array twice in the worst case.  However, it does both less-than comparisons (<, min) and strict equality comparisons (===, indexOf), which may explain the greater difference.

    Here's an interactive way to test for yourself and provide stats from your browser(s): jsperf.com/index-of-smallest

  13. Dan Bugglin says:

    I write JS for a living and have had this exact problem.  I take the manual route every time.

    Using math.min actually never occurred to me.  Clever.  I might try that in the future just to reduce the code written (typically speed isn't important except in specific instances where a piece of code is run a lot).

    And then there is the fact that I have to write for IE7 and IE8 at work.  There is no .reduce or .indexOf, so that limits your options.  You can write shims of course but then you are still pursuing a manual method, just with more code and function call overhead!  Ironically you're writing your code to be faster in non-IE and slower in IE.  Not that I have too much of a problem with this.

    PS: Raymond: Try using the IE Dev Tools profiler (and Chrome's own profiler) to time functions.  Very interesting to look at and pretty easy to see where you need to optimize things… though typically for me all my time is spent in native DOM functions thus I end up needing to just rethink how I'm doing the DOM manipulation. :(

    @Mc Exactly.  I worked with a guy who wanted to optimize everything up front.  Smart guy, but he ended up WRITING HIS OWN STACK to get around a problem with IE6's ~2000 iteration stack limit for a specific recursive function… and we might not have needed to.  Needless to say whenever that function was called you could not trust the real stack trace since it was destroyed.  Made debugging a LOT harder.

    Same reasoning was used to reject jQuery.  I am on a smaller project now and I was able to call some shots this time, so I said we're using jQuery.  It has helped so much more than I'd dreamed.  Much neater code when dealing with the DOM, auto-cross platform, etc.  And where there have been inefficiencies I can identify them and drop back to normal DOM functions in a few specific places.

  14. John Doe says:

    @Wladimir Palant, the array copying explains the difference even further.

    So, the min/indexOf version may be at least 3 times slower than the manual version because:

    – It copies the array to an arguments object inside Function.prototype.apply

    – It scans the arguments to find the smallest element with < inside Math.min

    – It scans the array to find and return the smallest element's index with === inside Array.prototype.indexOf

    I believe the manual version can be further optimized (a tiny bit) by remembering the smallest element and comparing to it, thus avoiding many a[lowest] accesses.
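
    (Sketching that variant for concreteness: cache the current smallest value so the loop body reads a[i] only once per iteration.)

    function indexOfSmallest(a) {
     var lowest = 0;
     var smallest = a[0]; // remember the value as well as the index
     for (var i = 1; i < a.length; i++) {
      if (a[i] < smallest) {
       smallest = a[i];
       lowest = i;
      }
     }
     return lowest;
    }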

  15. Dan Bugglin says:

    BTW Raymond you didn't mention whether the randomly generated data is ints or doubles.  Some JS engines, like Chrome's, will optimize for integers.  In my results below you can see a big difference in some of IE's tests.

    Also your ternary in the second function is backwards (only noticed when I went to run it).

    I did my own tests.  Average of 100 runs.  I used new Date() to time things so I only got MS granularity, at least in Chrome.  Didn't check the others.

    Relative running time per array element

    Elements        manual  reduce  min/indexOf

    Chrome 37 dev – Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2008.2 Safari/537.36
    100000 doubles   0.41   10.56   15.98
    100000 ints      0.41   11.18   14.14
    200000 doubles   0.89   23.26   ERROR
    200000 ints      0.93   23.18   ERROR
    250000 doubles   1.08   30.29   ERROR
    250000 ints      1.02   29.97   ERROR

    IE11 – Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET CLR 1.1.4322; .NET4.0C; .NET4.0E; rv:11.0) like Gecko
    100000 doubles   0.47    6.76    8.24
    100000 ints      0.48    5.61    8.24
    200000 doubles   1.04   13.84   24.38
    200000 ints      0.93   11.76   11.24
    250000 doubles   1.34   17.37   29.38
    250000 ints      1.47   15.74   27.47

    Firefox Nightly 32.0a1 (2014-05-26) – Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0
    100000 doubles   0.21    1.05    1.46
    100000 ints      0.2     1.06    1.44
    200000 doubles   0.35    2.11    3.2
    200000 ints      0.36    2.23    2.9
    250000 doubles   0.54    2.79    4.05
    250000 ints      0.45    3.2     4

    * – Chrome encountered stack overflow errors

    Firefox has gotten stupid fast, it seems.  I remember when 3.5 was so slow I jumped ship to Chrome first chance I got; was even worth losing add-ons at the time.  Chrome is interestingly oddly slow in some cases, either the other browsers are catching up or one of my browser extensions is slowing things down a bit, who knows…

    Here's the code I used: http://jsfiddle.net/G83mW/1/

    Could probably use some code to throw up highest/lowest outlying test results.  Thankfully we have a few functions for at least one of those purposes already!

    Also could use shims for .indexOf, and .reduce if you want to test early IE versions.

  16. Marvin says:

    Raymond, read about Asm.js and weep. The fastest way to use JS is to treat it as assembly and use C++ on top of it :)

  17. Marvin says:

    acq: Mine was a general comment about JS efficiency, not this particular problem. If you want to write high-level code, doing things manually will become painful very quickly. JS today is efficient at processing simple operators on vars and accessing arrays. Using this as a primitive, you can build other languages on top that will give you higher-level abstractions in a fast way. JS itself is not optimization friendly.

    BTW: once you know this you could predict outcome of Raymond's experiment right away – wasn't surprising to me.

  18. acq says:

    Finding the array length only once: var n=a.length; for (var i = 1; i < n; i++)  speeds up both Firefox and IE 11 in "manual." On 250k FF is 7 times faster than reduce version, and IE 11 even 31 times faster. The absolute time for FF manual is apparently 0.25s and IE 11 manual 0.31s.
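
    (Spelled out, that hoisted-length variant looks like the sketch below; it's the same algorithm as the "manual" version, just with a.length read once.)

    function indexOfSmallest(a) {
     var lowest = 0;
     var n = a.length; // read the length once instead of on every iteration
     for (var i = 1; i < n; i++) {
      if (a[i] < a[lowest]) lowest = i;
     }
     return lowest;
    }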

  19. acq says:

    Correction, it's apparently "milliseconds:" here are number of full executions per second (a.length in loop, outside, reduce, apply):

    Firefox
    250000 doubles  3333  4347  549  476
    250000 ints     3333  4347  558  450

    IE 11
    250000 doubles  1190  3225  108   67
    250000 ints     1190  3333  110  120

  20. acq says:

    Instead of quoting results, my version of The MAZZTer's bench: http://jsfiddle.net/G83mW/4/

  21. Jeffrey Bosboom says:

    It's refreshing when the simple and obvious implementation is also the fastest.

  22. acq says:

    Matt, I agree. I've already read a presentation by some "smart" guy who proposed "the OS written in JavaScript." To make really compact and efficient code we still need C (and real asm), even though every now and then the programmers who depend on GCs and VMs think that everything is a nail for their hammer.

    On the other hand, JavaScript engines have gotten immensely faster in recent years. It all started with one company that made a new engine that was, at the time, much faster than any from the older players…

  23. Shocking says:

    As usual, IE is the slowest and most buggy browser. Who would have thought.

  24. lucidfox says:

    Number of executions per second.

    Elements        manual  manualn  manualcmp  manualncmp  reduce  min/indexOf

    Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2011.0 Safari/537.36
    250000 doubles   2631    2631     3333       3333        74     ERROR (RangeError: Maximum call stack size exceeded)
    250000 ints      2702    2702     3333       3333        75     ERROR (RangeError: Maximum call stack size exceeded)

  25. Joshua says:

    I am not surprised at the results and was predicting them before reading half the page (basically just long enough for my own thinking to run ahead of my reading after the problem statement). JavaScript is jitted now so it has the same behavior as other jitted code. If you want it fast don't bother to use higher order functions. I thought everybody knew this by now.

  26. Crescens2k says:

    @Shocking:

    Eh? Slowest? Most buggy? I think you got lost, the IE blog is elsewhere. That is where you bash IE isn't it?

    Anyway, because I was bored, I also decided to get the results of IE x64.

    Relative running time per array element

    Elements        manual  reduce  min/indexOf

    Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; .NET4.0E; .NET4.0C; .NET CLR 3.5.30729; .NET CLR 2.0.50727; .NET CLR 3.0.30729; InfoPath.3; rv:11.0) like Gecko
    100000 doubles   0.31    3.69    5.02
    100000 ints      0.29    2.92    3.03
    200000 doubles   0.58    6.72    9.77
    200000 ints      0.58    5.78    4.7
    250000 doubles   0.73    8.35    9.17
    250000 ints      0.73    7.23   11.64

    Mozilla/5.0 (Windows NT 6.3; Win64; x64; Trident/7.0; .NET4.0E; .NET4.0C; .NET CLR 3.5.30729; .NET CLR 2.0.50727; .NET CLR 3.0.30729; rv:11.0) like Gecko
    100000 doubles   0.79    2.35    1.69
    100000 ints      0.31    1.98    2.07
    200000 doubles   0.59    3.8     ERROR
    200000 ints      0.58    3.76    ERROR
    250000 doubles   0.74    4.77    ERROR
    250000 ints      0.73    4.69    ERROR

    Number of executions per second.

    Elements        manual  manualn  manualcmp  manualncmp  reduce  min/indexOf

    Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; .NET4.0E; .NET4.0C; .NET CLR 3.5.30729; .NET CLR 2.0.50727; .NET CLR 3.0.30729; InfoPath.3; rv:11.0) like Gecko
    250000 doubles    943    1282     2325       1315        99      72
    250000 ints      1298    1298     2325       2439       130     102

    Mozilla/5.0 (Windows NT 6.3; Win64; x64; Trident/7.0; .NET4.0E; .NET4.0C; .NET CLR 3.5.30729; .NET CLR 2.0.50727; .NET CLR 3.0.30729; rv:11.0) like Gecko
    250000 doubles    900    1265     2000       2777       194     ERROR
    250000 ints       689    1176     2040       2857       198     ERROR

  27. acq says:

    Marvin, you don't need C++ to write asm.js code, at least implementing the "manual indexOfSmallest" demo should not be *that* hard.

  28. Cheong says:

    I have similar experience writing VBA too. In VBA, code you write yourself often runs faster than the built-in methods.

    Say, if you look at using Advanced Filter to copy unique values to another row, you'll find that writing the code yourself runs lightning fast (completes almost instantly), while using that function is noticeably slower.

    On the other hand, if you're writing code that calls Office Automation from outside (like C#), you'll want to call the built-in functions to avoid COM traversal overhead.

  29. Ben says:

    I have to ask this: when was this article written? Firefox 16 is late 2012 IIRC. That's a 1.6 year delay between the article being written and its publication.

  30. Matt says:

    I'm a little bit worried by people jumping to all sorts of conclusions that JavaScript is somehow faster than the C++ code that backs it here. That's not what's going on.

    Having written lots of these engines, the thing that's expensive is the *transition* between JavaScript and C++. The cost of wrapping an inlined float into a VARIANT structure, then calling the intrinsic over an IDispatch(Ex), and then unwrapping it from that variant in order to do the comparison kills the performance benefit of the C++ – especially when we recognise that a single "<" operator is going to be JITted rather efficiently by the Trident engine, and the loop can be unrolled and vectorized by the JIT engine if it doesn't contain anything crazy like an intrinsic or a virtual method call.

  31. Neil says:

    I tried to compare all three benchmarks on both 32-bit and 64-bit Firefox but the results were inconclusive.

  32. acq says:

    Neil, my end conclusion, observing only the "manual" code, is that Firefox doesn't actually care whether a.length appears inside the for condition, but IE 11 definitely prefers to compare against a scalar. So the former obviously knows that specific optimization. Still, both benefit if the current minimum being compared inside the loop is a scalar too. Compared to the "reduce" variant in the same browser, that fastest "manual" is 7 times faster on Firefox and almost 40 times faster on IE 11. Firefox is around a quarter faster than IE 11 on that fastest "manual." The Chromium nightly has the same speed whether or not scalars are used in the loop, but it's therefore two times slower than IE 11 and 2.5 times slower than Firefox. Chromium is also 20 times faster with its "manual" vs its own "reduce."

  33. acq says:

    Still, please note that all this doesn't necessarily translate to your use case: with far fewer loop passes, the JIT may never get to the point of generating native code, so if you know you'll process only a very small array, the reduce approach can be a better choice. The measurements we're talking about only reflect the current state of the JIT engines in browsers once they have tight, long-running loops. The "apply" variant, however, is something that, nice as it looks, shouldn't in its present state be used on any potentially large array.

  34. Brian_EE says:

    @Ben: As Raymond has said in the past: "I'm not the only one with a long posting queue. So remember that the next time you see a strange coincidence between an article that comes out of my queue and the date it appears." blogs.msdn.com/…/10110525.aspx

    Somewhere he once said that the queue was about 18 months, though I can't find the specific blog article in the search.

  35. Simon Lindholm says:

    This blog post discusses some reasons why "reduce" etc. are slow in Firefox: rfrn.org/…/two-reasons-functional-style-is-slow-in-spidermonkey.html

    In this case it seems to come down to implementing all the additional logic needed to comply with the spec, most importantly the requirement of being able to handle sparse arrays. Writing a custom "reduce" function gives me performance on par with the manual loop: http://jsfiddle.net/G83mW/10/
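
    (A stripped-down reduce along those lines might look like the sketch below: dense arrays only, explicit initial value required, no other spec compliance. The name denseReduce is just illustrative.)

    function denseReduce(a, fn, initial) {
     // Sketch only: no sparse-array ("i in a") checks, no argument validation.
     var acc = initial;
     for (var i = 0; i < a.length; i++) {
      acc = fn(acc, a[i], i, a);
     }
     return acc;
    }

    function indexOfSmallest(a) {
     return denseReduce(a, function(lowest, next, index) {
                          return next < a[lowest] ? index : lowest; },
                        0);
    }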

  36. Goran says:

    @the jazzier: came here to see Fox is fastest, wasn't disappointed. Crazy people @Mozilla

  37. acq says:

    Thanks Simon! I understand your result that the main reason the "reduce" is slow in Firefox is that it's *not* implemented in javascript, and therefore JIT can't work on it. For those who don't run the jsfiddle tests: your hand-written "reduce" achieves the speed of a[i] < a[lowest] "manual" version. Such "reduce" in IE 11 is almost 5 times faster than built-in, but still at least 4 times slower than "manual."

    Now that the path to the faster "reduce" is clearer, why in the world must "min.apply" get every array element separately copied as the parameter, and that *in all the engines*? The use case where the existing array is the second argument is documented and expected. Why "flattening"? It's definitely a very concise notation and worth supporting.

  38. Cheong says:

    This makes me remember something last night.

    In 2009, I filed a request on MS Connect to add GREATEST() or LEAST() to T-SQL, like what they have in PL/SQL. connect.microsoft.com/…/maxval-minval-and-avgval

    The SQL team expressed interest at the beginning, but later marked it as "won't fix". Could it be for a similar reason?

  39. voo says:

    @Joshua: "JavaScript is jitted now so it has the same behavior as other jotted code. If you want it fast don't bother to use higher order functions. I thought everybody knew this by now."

    Except that if you write the same code in, say, Java, your higher-order function will be just as fast as the manual loop, because there's absolutely nothing inherent in higher-order functions that tells the JIT "sorry, you're not allowed to optimize this". It's just JavaScript once again being a horrible language beyond comparison, making life that much harder for JITs. So it's certainly not something everybody would or should know – if you're not optimizing JavaScript code, why care about its peculiarities?

  40. David M says:

    My expectations matched your 'naive' self.  But my extra special naivety is showing, as I am still surprised.

  41. carbon twelve says:

    My guess was that the manual version would be fastest; followed by the slightly higher level version; followed by the much higher level version.

    This is just because that's how it's worked in pretty much every language I've ever used: for all the guff about higher-level code being theoretically easier to optimise, it never actually seems to work out that way in practice!

  42. Matt says:

    @carbon twelve " for all the guff about higher-level code being theoretically easier to optimise, it never actually seems to work out that way in practice!"

    JITs have crap optimisers because every millisecond spent optimising the code is a millisecond where the user is waiting before the code has even started running.

    If you write your code in a statically-compiled language with a mature compiler like MSVC C/C++, you'll see simple loops get unrolled and aggressively vectorized, and function calls get inlined.

  43. Silly says:

    Wouldn't you just ask it on the way into the array?

  44. JackTripper says:

    I would think that the `reduce` intrinsic can start to do advanced things, such as dividing up the array into n-chunks (where n is your number of logical processors). Then reduce the array in parallel.

    As we already know, multi-core performance gains don't come from multiple execution units: they come from having more cache (L1, L2, L3).

    Bonus Viewing

    Native Code Performance on Modern CPUs: A Changing Landscape by Eric Brumer

    channel9.msdn.com/…/4-329

  45. Paranoid says:

    Talking about speed. What is the "Post" button doing that makes chrome hang for seconds and activate the hard drive?

    What information are you collecting, mr nsa.. er I mean microsoft?

  46. Drunk and stupid atm says:

    "I've normalized run times so that the results are independent of CPU speed."

    How do you achieve that?

  47. Simon Lindholm says:

    @acq: Firefox actually does implement "reduce" in JavaScript (i.e. self-hosts it), the implementation just happens to be quite slow because (mainly) of the need to support sparse arrays. Instead of using a regular for-loop "for (var i = 0; i < len; i++) { fn(ar[i]); }" it does the equivalent of "for (var i = 0; i < len; i++) { if (i in ar) fn(ar[i]); }" – see the source code at dxr.mozilla.org/…/Array.js. But it's true that a non-self-hosted version would have been slower still.

    As for why "apply" copies everything onto the stack: 1) adding another stack argument representation complicates JIT design, and might require additional branches 2) naively sharing memory doesn't work, since both arguments and array are mutable. And no-one has bothered to optimize "Math.min.apply(Math, ar)" (or ES6's "Math.min(…ar)"!) specifically, since it's an uncommon pattern, with a simple and fast alternative.

  48. acq says:

    Simon, thanks a lot for the insights.

  49. John Doe says:

    @acq, sharing the provided array with the arguments object (e.g. the actual underlying array) is a possible optimization if the compiler can prove that the callee won't change it.  It could as well share until the point it's changed, and then copy it, with a smart implementation of copy-on-write sparse arrays.

    In some lisps, apply is optimized on the caller when providing a (rest of a) list that was itself already on the stack, but as soon as that list is changed or provided as a normal argument to other functions from within the callee (other than with apply), the args list is reified to a fresh list.  IIRC, the args list may or may not be the provided list, in order to allow an implementation to provide a copy of the variable amount of arguments on the stack.

    I guess this is still an inspiration for current JS implementers.  I think at least one or two scheme implementations have shown that native stacks are a bit overrated, but I guess they were talking about ease of capturing (managed) stacks for full continuations and all GC related stuff.

    en.wikipedia.org/…/Spaghetti_stack

    en.wikipedia.org/…/Funarg_problem

    Lisp was perceived as very slow, quite as much as current interpreted languages are relative to natively compiled, manually memory managed languages.  Because of that, some expensive machines were made, dedicated to running lisp.  And yes, the OS was in lisp itself, with GC, runtime function redefinitions, OO, etc.  Much like what you see in a browser's developer tools, but with drivers (e.g. network), peripherals and devices (e.g. mouse, keyboard, printers), native applications, completely inspectable GUIs where most things are sensitive to mouse-over and the IDE (inspector, debugger, editor, etc) was part of the system, distributed services, and whatnot.  Of Course, the chance to shoot yourself in the foot was just a tiny portion of bricking the machine.

    Eventually, native compilers for non-dedicated machines (e.g. Unix, Windows) and GC got good enough that can (almost) match .NET and Java, but no one is seriously trying the OS thing anymore, at least not the kernel-in-lisp thing.

Comments are closed.