My analysis and explanation of quick sort
Basics & introduction to quick sort
Quick sort is an efficient sorting algorithm (and my personal favourite) that makes use of the divide and conquer approach.
It works by selecting a pivot and partitioning the other elements into sub-arrays according to whether they are greater or less than the pivot.
Note: In this post I do not go into great depth on the variations of quick sort (and there are many).
How does it work?
First you need to understand what the pivot is.
The pivot is a value chosen from the list to compare the rest to.
The pivot is often the first or last index of the array; other methods include randomly picking an index and the "median of three" (which is my personal favourite).
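As a rough sketch of the simpler strategies (the function names here are my own, not standard; the median of three is covered later in the post):

```python
import random

def pivot_first_index(arr):
    return 0                            # first element as pivot

def pivot_last_index(arr):
    return len(arr) - 1                 # last element as pivot

def pivot_random_index(arr):
    return random.randrange(len(arr))   # uniformly random element as pivot
```

Each function returns the index of the chosen pivot rather than its value, so the caller can remove or swap the pivot as needed.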
Quick sort explained with an example:
For the sake of simplicity, I will use the last-index pivot for this example:
- say we have a list arr = [3,5,7,1,2,6,4]
- the pivot 4 is taken out:
[3,5,7,1,2,6] 4
- now we iterate through the list; for each value, if the value is smaller than the pivot, we place it on the left of our pivot, and if it's bigger we place it on the right.
- start with:
[] 4 []
[3] 4 [] (remove 3 and place it next to the pivot, since 3 < 4)
[3] 4 [5]
[3] 4 [5,7]
[3,1] 4 [5,7]
[3,1,2] 4 [5,7]
[3,1,2] 4 [5,7,6]
- after this process we will be left with
[3,1,2] 4 [5,7,6]
Now our pivot is in the correct index of the array!
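The walk-through above can be sketched as a small list-building partition helper (the function name is my own; the input list is reconstructed from the steps shown):

```python
def partition_around_last(arr):
    """Split arr into (smaller, pivot, larger) using the last element as the pivot."""
    pivot = arr[-1]
    left = [x for x in arr[:-1] if x < pivot]    # values smaller than the pivot
    right = [x for x in arr[:-1] if x >= pivot]  # values at least as big as the pivot
    return left, pivot, right

left, pivot, right = partition_around_last([3, 5, 7, 1, 2, 6, 4])
# left == [3, 1, 2], pivot == 4, right == [5, 7, 6]
```

Note that the list comprehensions visit the values left to right, exactly as the step-by-step example does.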
But we're not done yet; we still have the two lists beside our pivot to sort.
All we need to do is apply the same steps (selecting a pivot and comparing) to the lists to the left and right of our pivot.
This is where recursion comes into play.
For those of you who don't already know, recursion is the process of defining a problem (or the solution to a problem) in terms of (a simpler version of) itself.
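A classic stand-alone example of recursion (unrelated to sorting, just to show the idea):

```python
def factorial(n):
    # base case: 0! = 1, which stops the recursion
    if n == 0:
        return 1
    # recursive case: n! is defined in terms of the simpler (n - 1)!
    return n * factorial(n - 1)

factorial(5)  # → 120
```

Quick sort follows the same pattern: the base case is a list with fewer than two elements, and the recursive case sorts the two smaller lists on either side of the pivot.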
Here is my implementation of quicksort (last index pivot):
```python
## last element as pivot
def quick_sort_last(arr, low=0, high=None):
    if high is None:
        high = len(arr)
    ## define pivot as the last value in arr
    pivot = arr[high - 1]
    ## l1 is the list of numbers less than the pivot, l2 the rest
    l1, l2 = [], []
    for num in range(low, high - 1):  # range stops before high-1: the pivot is the last value
        ## iterate through the array and place into l1 and l2 appropriately
        if arr[num] < pivot:
            l1.append(arr[num])
        else:  # values equal to the pivot go right, so no elements are lost
            l2.append(arr[num])
    ## quicksort on l1 until l1 has fewer than 2 values
    if len(l1) > 1:
        l1 = quick_sort_last(l1, 0, len(l1))
    ## quicksort on l2 until l2 has fewer than 2 values
    if len(l2) > 1:
        l2 = quick_sort_last(l2, 0, len(l2))
    return [*l1, pivot, *l2]  # return list with l1, then pivot, then l2, in that order
```
More on the pivot:
The MOST important thing to know: when choosing a pivot, the goal is to get the middle (median) value of the list. How you decide to pick the pivot greatly affects the efficiency of your algorithm! (see more on this in the computational complexity section)
There are many different ways to choose a pivot, but when it comes down to it, the most reliable way to choose the right pivot is trying multiple and seeing which performs best for you!
I tested my 4 variations (last index pivot, first index, median of three and random pivot) and found that the last and first index pivots were (equally) the most efficient for random lists with lengths ranging from 10 to 10,000.
As a general rule of thumb, I would recommend using the last or first index variations when working with smaller lists and the median of three when working with large lists (thousands of items+).
The median of three pivot:
There are variations in the median of three and in the partition scheme, although the one I will be focusing on is the median of the first, last and middle elements in the list.
Since the goal is to get the middle value of the list as the pivot, the median of three is far more likely to perform better than the first/last index pivots.
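A minimal sketch of this pivot choice (assuming a non-empty list; the function name is my own):

```python
def median_of_three(arr):
    """Pick the median of the first, middle and last values as the pivot."""
    first = arr[0]
    middle = arr[len(arr) // 2]
    last = arr[-1]
    return sorted([first, middle, last])[1]  # the middle of the three candidates

median_of_three([9, 1, 5, 8, 2])  # candidates are 9, 5, 2 → pivot is 5
```

Sampling three positions makes it much harder for an already-sorted input to hand you the smallest or largest value as the pivot, which is exactly the degenerate case for the first/last-index strategies.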
Quick sort is an unstable sorting algorithm because elements are swapped according to the pivot's position, without considering their original relative order.
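You can see the instability with a swap-based (Lomuto) partition on records that compare equal, here (key, label) pairs compared by key. Note this is a different partition scheme from the list-building implementation above, and the code is a sketch of my own:

```python
def lomuto_partition(pairs, low, high):
    """In-place partition using the last element's key as the pivot."""
    pivot_key = pairs[high][0]
    i = low - 1
    for j in range(low, high):
        if pairs[j][0] < pivot_key:
            i += 1
            pairs[i], pairs[j] = pairs[j], pairs[i]  # swap, ignoring original order
    pairs[i + 1], pairs[high] = pairs[high], pairs[i + 1]  # move pivot into place
    return i + 1

records = [(3, 'a'), (3, 'b'), (1, 'c')]
lomuto_partition(records, 0, len(records) - 1)
# records is now [(1, 'c'), (3, 'b'), (3, 'a')]: the two key-3 records swapped order
```

One partition call is enough to reverse the relative order of the two equal keys, which is all it takes to break stability.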
Quick sort has a big O of O(n²) - n squared. At first this seems like a bad sorting algorithm, however we must consider that big O is the absolute worst-case scenario.
So we will also use quick sort's big Θ, which is Θ(n log n). Now that is a huge difference! - big Θ (theta) is the average-case running time.
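To see the gap concretely, here is a sketch that counts comparisons for a last-index-pivot quick sort (a counting rewrite of the list-building approach above; names are my own). An already-sorted list triggers the worst case for this pivot choice, because the pivot is always the largest value:

```python
import random

def quick_sort_count(arr):
    """Quick sort (last-index pivot) that also counts element-vs-pivot comparisons."""
    if len(arr) < 2:
        return arr, 0
    pivot = arr[-1]
    left = [x for x in arr[:-1] if x < pivot]
    right = [x for x in arr[:-1] if x >= pivot]
    comparisons = len(arr) - 1  # every other element is compared to the pivot once
    sorted_left, c1 = quick_sort_count(left)
    sorted_right, c2 = quick_sort_count(right)
    return [*sorted_left, pivot, *sorted_right], comparisons + c1 + c2

n = 200
_, worst = quick_sort_count(list(range(n)))  # already sorted: worst case, n(n-1)/2
data = list(range(n))
random.shuffle(data)
_, average = quick_sort_count(data)          # random order: roughly n log n
# worst == 19900; average is typically an order of magnitude smaller
```

For n = 200 the sorted input costs exactly n(n-1)/2 = 19,900 comparisons, while a shuffled input usually lands near 2n ln n ≈ 2,100, which is the Θ(n log n) average case in action.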
And, finally... Quick sort vs merge sort:
Although there is no definitive answer to which algorithm is better, we can consider the benefits and drawbacks of each and come up with our own conclusion.
Merge sort:
- faster than quick sort on larger data sets
- worst-case running time of O(n log n)
- preferred data structure is linked lists
Quick sort:
- faster than merge sort on smaller data sets
- preferred data structure is arrays
- in-place sort! (i.e. it doesn't require any extra storage) whereas merge sort requires O(n) extra storage