Merge Sort
Let's now revisit the topic of sorting. Previously, we talked about selection sort, insertion sort, and radix sort, and compared and contrasted their efficiencies.
Now that we know recursion, let's learn an algorithm that recursively sorts a list. One such algorithm is merge sort, which uses a divide-and-consquer technique to sort a list in time O(nlogn)
.
Watch the following video to learn how merge sort works:
To summarize, merge sort has two phases:
-
Start with a list of
n
elements. Repeatedly split the list in half using recursion until you are left with sublists of size 1. This is the base case, since sublists of size 1 are, naturally, sorted. -
Merge the sublists of size 1 pairwise, into sorted sublists of size 2. Repeat this process, merging the sublists of size 2 pairwise into sublists of size 4, so on and so forth. Repeat until at last you merge the two sublists of size
n / 2
into one fully sorted list of sizen
.
Merge sort in Python
Watch the following video to step through the implementation of merge sort:
Here's the code written in the video:
def merge_sort(lst):
if len(lst) > 1:
left_lst = lst[:len(lst) // 2]
right_lst = lst[len(lst) // 2:]
# Recursion
merge_sort(left_lst)
merge_sort(right_lst)
# Merge left_lst and right_lst
i = 0 # left_lst index
j = 0 # right_lst index
k = 0 # merged list index
while i < len(left_lst) and j < len(right_lst):
if left_lst[i] < right_lst[i]:
lst[k] = left_lst[i]
i += 1
else:
lst[k] = right_lst[j]
j += 1
k += 1
# Fill in remaining elements from left_lst
while i < len(left_lst):
lst[k] = left_lst[i]
i += 1
k += 1
# Fill in remaining elements from right_lst
while j < len(right_lst):
lst[k] = right_lst[j]
j += 1
k += 1
A few notes:
- After the recursive calls to
merge_sort()
return, bothleft_lst
andright_lst
are sorted, and can therefore be merged into a single sorted list using the merging algorithm. - Only one of bodies of the last two
while
loops will be executed, depending on which half (left or right) still has elements. - The base is when the condition
if len(lst) > 1
is false. Lists that have a length of <= 1 are already sorted, so no further recursion is needed and the function doesn't need to do anything -- it can just return.
Efficiency
Let's now talk about the efficiency of merge sort, both in terms of space and time:
To summarize, merge sort runs in time O(nlogn)
in all cases, which is much better than selection sort and insertion sort, which typically run in time O(n2)
. However, merge sort requires O(n)
extra space, as it is not an in-place sorting algorithm.