Worst Case Scenario for QuickSort


Take a look at the following code:

   1: var sw = new Stopwatch();
   2: sw.Start();
   3: Enumerable.Range(0, 3).SelectMany((i) => Enumerable.Range(0, 50000)).OrderBy(i => i).ToList();
   4: Console.WriteLine(sw.ElapsedMilliseconds);
   5:  
   6: sw.Reset();
   7: sw.Start();
   8: Enumerable.Range(0, 2).SelectMany((i) => Enumerable.Range(0, 50000)).OrderBy(i => i).ToList();
   9: Console.WriteLine(sw.ElapsedMilliseconds);
  10: sw.Reset();

 

Line 3 is saying:

Generate a list which looks like 0,1,2..49999,0,1,2..49999,0,1,2..49999

Then Sort it.

While Line 8 is saying:

Generate a list which looks like 0,1,2..49999,0,1,2..49999

Then Sort it.

 

If you run this program you will see that line 3 executes ( depending on your machine) in less than a second while line 8 executes in around 30 seconds. This may jump out as really strange at first since you are sorting less numbers in line 3.  But that isn’t the issue in this scenario. The issue is the sorting algorithm.

Underneath the covers the OrderBy function is using a version of QuickSort.  The version of quicksort used always chooses the middle element of the list as the pivot position.  Since both left and right lists will always be sorted we hit the worst case of this quicksort algorithm which is O(n^2). 

 

 

 


Comments (2)

  1. JamesCurran says:

    Um…. That doesn’t explain the problem.  A sorted list produces a worse case only if you use the first item as the pivot.  That would cause every item to always fall into the "higher" partition, with every "lower" partition empty.

    Choosing the middle item on a sorted collection will mean the you will always have half the collection in the lower partition, with the other half in the "upper" partition.  This is actually the best case, and is actually a way of getting around that problem (there is still a O(n*n) worse case arrangement, but it’s a obscure pathological case instead of a very common case).

  2. MattManela says:

    As always it depends on exactly how you code your parition function.  But let say your parition function was something like this (which is along the same idea as the one used by the OrderBy method):

    void Parition(int[] arr, int begin, int end)

    {

      int low = begin;

      int high = end;

      int pivot = arr[(end-begin)/2];

      while(low <= high)

      {

          while(pivot > arr[high])

             high++;

          while(pivot < arr[low])

            low–;

            if(low < high)

            {

               int temp = arr[high];

               arr[high] = arr[low];

               arr[low] = arr[high];

               low++;

               high–;

            }

            else

            {

               low++;

            }

      }

      if (first < high)

      {

         Sort( arr, begin, high);

      }

      if (high+1 < last)

      {

         Sort( arr, high+1, end);

       }

    }

    In this case the sort of an array such as 0,1,2,0,1,2 will happen in the following steps (I am showing the sub array partition is working with before and after its called each time):

    0,1,2,0,1,2 becomes 0,1,2,0,1,2

    0,1,2,0,1 becomes 0,1,1,0,2

    0,1,1,0 becomes 0,0,1,1

    0,0,1 becomes 0,0,1

    0,1 becomes 0,1

    As you can see with a middle element parition element such as this the quick sort is performing at worst case O(n^2).  I probably should have been more clear on the type of parition algorithm being used.

    Thanks for pointing this out,

    -Matt