Merge sort in C# is one of the algorithms that we can use to sort elements. Merge Sort is known to be efficient as it uses the “divide and conquer” strategy that we are going to discuss as we implement it.
Let’s start.
What is Merge Sort?
Merge Sort achieves its purpose using a two-step process:
Divide: merge sort starts by dividing the input array into two halves. The algorithm recursively calls itself for each of those halves until there are no half-arrays to divide during the sorting process.
Conquer: the algorithm sorts and merges the sub-arrays in this step to return an array whose values are sorted.
Generally, we use these high-level steps when sorting an array or a list with a merge sort:
Step 1: Check if the array has one element. If it does, it means all the elements are sorted. Step 2: Use recursion to divide the array into two halves until we can't divide it anymore. Step 3: Merge the arrays into a new array whose values are sorted.
How Does Merge Sort Algorithm Work?
Let’s say we want to sort an array that has eight elements:
int[] array = { 73, 57, 49, 99, 133, 20, 1, 34 };
Dividing Arrays
Using the merge sort algorithm, let’s start subdividing the array into equal halves. We can see here that in the first iteration, the array is split into two equal arrays of half the size (4):
73, 57, 49, 99
133, 20, 1, 34
Now, we divide these arrays into halves without changing the sequence of the elements in the original array:
73, 57
49, 99
133, 20
1, 34
We are going to subdivide the arrays further into arrays that have one element each (we achieve the atomic value in this step):
73
57
49
99
133
20
1
34
After subdividing the arrays, we are going to now start merging them while comparing the elements and placing them in the right order.
Merging and Sorting
This algorithm requires an extra array to store the sorted elements. The algorithm starts by traversing the subarrays that hold 73 and 57. 57 is less than 73, so we swap their positions while merging them into a new array that has two elements:
57, 73
Next, we traverse the subarrays that hold 49 and 99 while comparing the elements at each position. Since 49 is less than 99, the merged array of size 2 becomes:
57, 73
49, 99
The same process applies here as the algorithm starts traversing the subarrays while comparing the values they hold in each position. 20 is less than 133, so the algorithm swaps their positions when merging them and the array becomes:
57, 73
49, 99
20, 133
1 and 34 are already in the correct order so we don’t swap their positions when combining them back into an array:
57, 73
49, 99
20, 133
1, 34
Next Merge Iteration
In the next iteration of the merging process, the arrays should have four elements each, sorted in the correct order.
The algorithm starts traversing the two subarrays of size two while comparing the values at each position. In the first position, 49 is less than 57. We move 49 to the first position followed by 57. Now we move on to 73 and since it is less than 99 they don’t need to swap positions.
So, the new merged array of size 4 is:
49, 57, 73, 99
We repeat the same process for the right half of the array.
The algorithm compares the elements on each subarray while merging them into a new array. 1 is less than 20 so it becomes the first element in the merged array followed by 20.
The algorithm compares 133 with 34 and adds 34 first into the new array before adding 133. So, the new array after the merging process is:
1, 20, 34, 133
In the next iteration of the merging process, the array should have eight sorted elements (same as the original array).
The algorithm starts traversing the subarrays while comparing the values at each position while adding them to a new array that has sorted elements. For example, at position 1 in the subarrays, 1 is less than 49, so we store 1 as the first element in the merged array:
49, 57, 73, 99
1, 20, 34, 133
The algorithm iterates through the arrays while placing them in the correct order in the merged array which becomes:
1, 20, 34, 49, 57, 73, 99, 133
Let’s implement the merge sort algorithm in C#.
How to Implement Merge Sort in C#?
We are going to define a method SortArray()
as our entry point into the sorting algorithm. The method takes three parameters int[] array
, int left
and int right
:
public int[] SortArray(int[] array, int left, int right) { if (left < right) { int middle = left + (right - left) / 2; SortArray(array, left, middle); SortArray(array, middle + 1, right); MergeArray(array, left, middle, right); } return array; }
First, SortArray()
uses the left
and right
integer values to define the index of the element in the middle of the array.
The method recursively calls itself to subdivide the right and left subarrays. The merging process commences after each array has one element. Let’s write a method that implements the merge sort process:
public void MergeArray(int[] array, int left, int middle, int right) { var leftArrayLength = middle - left + 1; var rightArrayLength = right - middle; var leftTempArray = new int[leftArrayLength]; var rightTempArray = new int[rightArrayLength]; int i, j; for (i = 0; i < leftArrayLength; ++i) leftTempArray[i] = array[left + i]; for (j = 0; j < rightArrayLength; ++j) rightTempArray[j] = array[middle + 1 + j]; i = 0; j = 0; int k = left; while (i < leftArrayLength && j < rightArrayLength) { if (leftTempArray[i] <= rightTempArray[j]) { array[k++] = leftTempArray[i++]; } else { array[k++] = rightTempArray[j++]; } } while (i < leftArrayLength) { array[k++] = leftTempArray[i++]; } while (j < rightArrayLength) { array[k++] = rightTempArray[j++]; } }
The first thing to note is that the MergeArray()
method takes four parameters. The leftArrayLength
and rightArrayLength
variables help us define temporary arrays to hold values during the sorting process:
var leftArrayLength = middle - left + 1; var rightArrayLength = right - middle; var leftTempArray = new int[leftArrayLength]; var rightTempArray = new int[rightArrayLength];
We copy data into those temporary arrays using two loops as the next step:
for (i = 0; i < leftArrayLength; ++i) leftTempArray[i] = array[left + i]; for (j = 0; j < rightArrayLength; ++j) rightTempArray[j] = array[middle + 1 + j];
We then proceed to compare the elements in the leftTempArray[i]
and rightTempArray[j]
objects and swap their positions if the element in the leftTempArray[i]
is less than or equal to the element in the rightTempArray[j]
object while storing them in the array[k]
position in the merged array:
while (i < leftArrayLength && j < rightArrayLength) { if (leftTempArray[i] <= rightTempArray[j]) { array[k++] = leftTempArray[i++]; } else { array[k++] = rightTempArray[j++]; } }
The process completes by copying any remaining elements from the leftTempArray[i]
and the rightTempArray[j]
objects into the merged array:
while (i < leftArrayLength) { array[k++] = leftTempArray[i++]; } while (j < rightArrayLength) { array[k++] = rightTempArray[j++]; }
Finally, we can verify that the method sorts an array accurately:
var array = new int[] { 73, 57, 49, 99, 133, 20, 1 }; var expected = new int[] { 1, 20, 49, 57, 73, 99, 133 }; var sortFunction = new Merge(); var sortedArray = sortFunction.SortArray(array, 0, array.Length - 1); Assert.IsNotNull(sortedArray); CollectionAssert.AreEqual(sortedArray, expected);
Time and Space Complexity of Merge Sort
The space complexity of the merge sort algorithm is O(N) because the merging process results in the creation of temporary arrays as we can see in the implementation step.
The time complexity of the merge sort algorithm is O(N*log N).
Let’s see why.
Assuming the length of the array is N, every time we divide it into equal halves, we can represent that process as a logarithmic function log N. In each step, we find the middle of any subarray, which takes O(1) time.
As we merge the subarrays, the algorithm takes O(N) which we determine through the length of the array we are sorting.
Therefore, the total time that the merge sort algorithm takes is N(log N + 1), which we can convert to O(N*log N).
The best, average, and worst-case complexity of the merge sort algorithm is O(N*log N).
The best-case complexity scenario occurs when the algorithm sorts an array whose values are sorted. In this case, the number of comparisons the algorithm makes is minimal.
In a scenario where we have a mix of values that are in the correct order and those that are not, the algorithm encounters an average-case complexity scenario.
On the other hand, the worst-case complexity scenario occurs when assuming we have a list of numbers that are in descending order and we need to sort them in ascending order. The algorithm would encounter a worst-case complexity as the length of that array grows to N.
Advantages and Disadvantages of Merge Sort Algorithm
The merge sort algorithm is faster when sorting large arrays than other sorting algorithms such as bubble sort.
It has consistent execution times as all the cases take O(N*log N) time.
However, merge sort requires O(N) space to run, which makes it less efficient than bubble sort and selection sort which use constant space O(1). Besides that, it does not perform well as bubble sort for smaller arrays and lists.
Merge Sort in C# Performance
Let’s verify the merge sort algorithm has a time complexity of O(N*log N) by measuring the time it takes for the algorithm to sort an array.
First, let’s write a method to generate a set of random numbers that are going to be added into the array:
public static int[] GenerateRandomNumber(int size) { var array = new int[size]; var rand = new Random(); var maxNum = 10000; for (int i = 0; i < size; i++) array[i] = rand.Next(maxNum + 1); return array; }
The GenerateRandomNumber
method takes an integer as its sole input. Using the inbuilt Random
class, we generate integer values (less than 10000) that we’re going to put into the array.
Let’s write a method that generates a sorted array:
public static int[] GenerateSortedNumber(int size) { var array = new int[size]; for (int i = 0; i < size; i++) array[i] = i; return array; }
Next, we are going to create a collection that holds different arrays that have random and sorted values:
public IEnumerable<object[]> ArrayData() { yield return new object[] { GenerateRandomNumber(200), 0, 199, "Small Unsorted" }; yield return new object[] { GenerateRandomNumber(2000), 0, 1999, "Medium Unsorted" }; yield return new object[] { GenerateRandomNumber(20000), 0, 19999, "Large Unsorted" }; yield return new object[] { GenerateSortedNumber(200), 0, 199, "Small Sorted" }; yield return new object[] { GenerateSortedNumber(2000), 0, 1999, "Medium Sorted" }; yield return new object[] { GenerateSortedNumber(20000), 0, 19999, "Large Sorted" }; }
Each object entry has four values. An integer array such as GenerateRandomNumber(200)
, the index of the first value in the array (0), the index of the last element in the array (199), and a string object storing the name of that array (“Small Unsorted”).
The array objects have different sizes (to simulate time complexity scenarios) and hold random numbers that are added by the GenerateRandomNumber
method. The GenerateSortedNumber
method creates arrays that have values that are sorted.
Let’s assess the sample best, average, and worst-case complexity performance results of the algorithm:
| Method | array | left | right | arrayName | Mean | Error | StdDev | |---------- |------------- |----- |------ |---------------- |------------:|-----------:|-----------:| | SortArray | Int32[200] | 0 | 199 | Small Sorted | 16.95 μs | 0.577 μs | 1.702 μs | | SortArray | Int32[200] | 0 | 199 | Small Unsorted | 17.02 μs | 0.653 μs | 1.925 μs | | SortArray | Int32[2000] | 0 | 1999 | Medium Sorted | 204.07 μs | 8.722 μs | 25.717 μs | | SortArray | Int32[2000] | 0 | 1999 | Medium Unsorted | 203.08 μs | 8.809 μs | 25.972 μs | | SortArray | Int32[20000] | 0 | 19999 | Large Sorted | 2,352.66 μs | 102.065 μs | 300.940 μs | | SortArray | Int32[20000] | 0 | 19999 | Large Unsorted | 2,530.11 μs | 111.928 μs | 330.023 μs |
We can see that the time merge sort takes to sort an array increases as the size of the array grows.
The algorithm sorts a large array in 2,530.11 μs (~0.00253 seconds) but sorts a small array in 17.02 μs (1.702e-5 seconds).
Besides that, we can see that the algorithm performs slightly better when sorting ordered arrays than when sorting random elements. The difference can be seen when sorting larger arrays.
Conclusion
Merge sort in C# is efficient when sorting large lists and arrays but needs extra space O(N) to achieve its goals. It uses the “divide and conquer” strategy, which makes it have a time complexity of O(N*log N). If you want to learn how to implement other sorting algorithms, check out these bubble sort and selection sort articles.