Sorting is one of the most common programming problems that we have to solve. The good news is, that there are many sorting algorithms available to address such problems. However, each algorithm has its strengths and weaknesses, which may influence the choice of algorithm to use in different scenarios.
In this article, we’re going to learn about bucket sort, see how it works, implement it in C# and analyze its time and space complexity.
Let’s start.
What Is Bucket Sort?
Bucket/bin sort is a sorting algorithm that works by partitioning an array into several buckets. The bucket sort algorithm then sorts the contents of each bucket recursively or by applying different sorting algorithms.
Bucket sort uses this principle to sort a list of elements:
- The algorithm sets up an array of empty buckets
- Each object from the unsorted array is scattered into its corresponding buckets
- Each bucket gets sorted individually
- Sorted items are gathered from each bucket in their correct order
Let’s take a deep dive and learn how bucket sort works.
How Does Bucket Sort Algorithm Work?
To illustrate how the bucket sort algorithm works, let’s assume we intend to sort this array:
int[] array = {42, 6, 16, 4, 12, 24, 46, 17};
We need to divide these elements into buckets using an identifier. To make our example simple, let’s create buckets based on a range of ten values for each bucket: 0 – 9, 10 – 19, 20 – 29, 30 – 39, and 40 – 49.
Next, we are going to “scatter” the elements into those buckets based on their range:
- 0 – 9: 6, 4
- 10 – 19: 16, 12, 17
- 20 – 29: 24
- 30 – 39: null
- 40 – 49: 42, 56
Now, we sort the items in each bucket using a different algorithm such as bubble sort (to keep our illustration simple):
- 0 – 9: 4, 6
- 10 – 19: 12, 16, 17
- 20 – 29: 24
- 30 – 39: null
- 40 – 49: 42, 56
Finally, we gather the items from each bucket in the correct order, which completes the sorting process:
4, 6, 12, 16, 17, 24, 42, 56
Let’s learn how to implement the bucket sort algorithm in C#.
How to Implement Bucket Sort in C#?
We can implement the bucket sort algorithm in different ways. We are going to implement the algorithm in C# as we have discussed so far and try to improve its shortcomings by implementing an optimized solution.
Non-Optimized Bucket Sort Implementation in C#
First, we are going to write a function int[] BubbleSort
that takes a List<int>
object and sorts its values using the bubble sort algorithm:
public static int[] BubbleSort(List<int> listInput) { for (int i = 0; i < listInput.Count; i++) { for (int j = 0; j < listInput.Count; j++) { if (listInput[i] < listInput[j]) { var temp = listInput[i]; listInput[i] = listInput[j]; listInput[j] = temp; } } } return listInput.ToArray(); }
Next, we are going to define a method called SortArray()
as our entry point into the sorting algorithm. The method takes int[] array
as its sole input and returns a sorted array:
public int[] SortArray(int[] array) { List<int> sortedList = new List<int>(); var minValue = array[0]; var maxValue = array[0]; if (array == null || array.Length <= 1) { return array; } for (int i = 1; i < array.Length; i++) { if (array[i] > maxValue) maxValue = array[i]; if (array[i] < minValue) minValue = array[i]; } var numberOfBuckets = maxValue - minValue + 1; List<int>[] bucket = new List<int>[numberOfBuckets]; for (int i = 0; i < numberOfBuckets; i++) { bucket[i] = new List<int>(); } for (int i = 0; i < array.Length; i++) { var selectedBucket = (array[i] / numberOfBuckets); bucket[selectedBucket].Add(array[i]); } for (int i = 0; i < numberOfBuckets; i++) { int[] temp = BubbleSort(bucket[i]); sortedList.AddRange(temp); } return sortedList.ToArray(); }
Let’s understand how this method works.
First, we check whether int[] array
is null or has a single element as our base case and skips the sorting process:
if (array == null || array.Length <= 1) { return array; }
Next, we are going to find the maximum and minimum elements in the array:
for (int i = 1; i < array.Length; i++) { if (array[i] > maxValue) { maxValue = array[i]; } if (array[i] < minValue) { minValue = array[i]; } }
Here, we loop through the array
as we find the maxValue
and minValue
, which are going to come in handy when we create the buckets, which in this case, are List<int>
objects:
var numberOfBuckets = maxValue - minValue + 1; List<int>[] bucket = new List<int>[numberOfBuckets]; for (int i = 0; i < numberOfBuckets; i++) { bucket[i] = new List<int>(); }
Now that we have the buckets ready, we are going to scatter the array values into the buckets we’ve created:
for (int i = 0; i < array.Length; i++) { var selectedBucket = (array[i] / numberOfBuckets); bucket[selectedBucket].Add(array[i]); }
After initializing the buckets, we are going to invoke the BubbleSort()
method we created earlier and pass each bucket. The method returns a sorted array, which we add at the end of the sorted list:
for (int i = 0; i < numberOfBuckets; i++) { int[] temp = BubbleSort(bucket[i]); sortedList.AddRange(temp); } return sortedList.ToArray();
Finally, we can verify that the SortArray()
method sorts a given unsorted array accurately:
var array = new int[] { 73, 57, 49, 99, 133, 20, 1 }; var expected = new int[] { 1, 20, 49, 57, 73, 99, 133 }; var sortFunction = new InsertionSortMethods(); var sortedArray = sortFunction.SortArray(array, array.Length); Assert.IsNotNull(sortedArray); CollectionAssert.AreEqual(sortedArray, expected);
Optimized Bucket Sort Implementation in C#
The bucket sort algorithm has a few weaknesses, which were are going to discuss soon. We can improve the algorithm by initializing the buckets only when we need them, which saves us some time initializing them. Using the LinkedList<T>
objects instead of List<T>
would come in handy when accessing sequential data, which can give us a slight performance edge.
Let’s learn how to optimize our current implementation:
public int[] SortArrayOptimized(int[] array) { if (array == null || array.Length <= 1) { return array; } int maxValue = array[0]; int minValue = array[0]; for (int i = 1; i < array.Length; i++) { if (array[i] > maxValue) { maxValue = array[i]; } if (array[i] < minValue) { minValue = array[i]; } } LinkedList<int>[] bucket = new LinkedList<int>[maxValue - minValue + 1]; for (int i = 0; i < array.Length; i++) { if (bucket[array[i] - minValue] == null) { bucket[array[i] - minValue] = new LinkedList<int>(); } bucket[array[i] - minValue].AddLast(array[i]); } var index = 0; for (int i = 0; i < bucket.Length; i++) { if (bucket[i] != null) { LinkedListNode<int> node = bucket[i].First; while (node != null) { array[index] = node.Value; node = node.Next; index++; } } } return array; }
SortArrayOptimized()
is similar to SortArray()
but has a few notable differences such as the use of LinkedList<int>
objects instead of List<int>
objects:
LinkedList<int>[] bucket = new LinkedList<int>[maxValue - minValue + 1];
The goal of this process is to store each array element in its corresponding index and move everything to the left as possible (minValue
) during the sorting process:
for (int i = 0; i < array.Length; i++) { if (bucket[array[i] - minValue] == null) { bucket[array[i] - minValue] = new LinkedList<int>(); } bucket[array[i] - minValue].AddLast(array[i]); }
We can see that a new LinkedList<int>
object is created when the loop encounters minValue
in the array and adds the rest of the array elements in order at the end of the bucket when calling bucket[array[i] - minValue].AddLast(array[i])
.
Next, we start the process of moving the elements from the bucket back into int[] array
.
We achieve this by setting the index of the array to zero and iterating through the bucket as we add the sorted elements back into int[] array
:
var index = 0; for (int i = 0; i < bucket.Length; i++) { if (bucket[i] != null) { LinkedListNode<int> node = bucket[i].First; while (node != null) { array[index] = node.Value; node = node.Next; index++; } } } return array;
We start iterating from the head of the LinkedList<int>
as we move to the next nodes as we add the values to the int[] array
object. The loop terminates when the next node is null and returns the sorted array.
Next, we can verify that the SortArray()
and SortArrayOptimized()
methods return the same result when sorting an array:
public void GivenUnsortedArray_WhenArrayIsNotEmpty_ThenReturnSortedArray() { var array = new int[] { 73, 57, 49, 99, 133, 20, 1 }; var expected = new int[] { 1, 20, 49, 57, 73, 99, 133 }; var sortFunction = new BucketSortMethods(); var sortOptimized = sortFunction.SortArrayOptimized(array, string.Empty); var sortNormal = sortFunction.SortArray(array, string.Empty); CollectionAssert.AreEqual(sortNormal, expected); CollectionAssert.AreEqual(sortOptimized, expected); }
Space and Time Complexity of Bucket Sort Algorithm
As we’ve seen in the implementation section, the bucket sort algorithm requires the creation of buckets to hold the array elements during the sorting process. Therefore, the space complexity of bucket sort is O(n + k), with n being the number of elements in the array and k being the number of buckets.
Best-Case Time Complexity
Bucket sort encounters the best-case time complexity scenario when we scatter elements uniformly across the buckets. The time it would take to sort the array is O(n + k) with n being the number of elements in the array and k being the number of buckets.
Average-Case Time Complexity
When bucket sort encounters random array elements it encounters an average-case time complexity scenario. It would take O(n + k) time to sort the array with n being the number of elements in the array and k being the number of buckets.
Worst-Case Time Complexity
This scenario occurs when bucket sort encounters a reversed list. Depending on the “inner algorithm” used such as the bubble sort algorithm, the algorithm may need to carry out N2 comparisons. Therefore, the bucket sort algorithm has a worst-case complexity of O(N2).
Advantages of Bucket Sort Algorithm
Bucket sort reduces the number of comparisons we make when sorting elements as we distribute them into different buckets and sort each bucket separately.
The algorithm can be fast when we scatter elements uniformly across all the buckets.
Disadvantages of Bucket Sort Algorithm
It does not sort elements in place like other algorithms such as merge sort, as it needs O(n + k) space.
The algorithm has a worst-case complexity of O(N2).
Bucket sort may or may not be stable. A stable sorting algorithm maintains the relative order of the array elements in case it encounters two similar values, unlike quicksort, which is unstable.
Performance Tests
Let’s assess how bucket sort performs by measuring the time it takes for it to sort an array.
First, let’s write a method to generate a set of random array elements:
public static int[] CreateRandomArray(int size) { var array = new int[size]; var rand = new Random(); for (int i = 0; i < size; i++) array[i] = rand.Next(1, size); return array; }
The CreateRandomArray()
the method takes an integer as its sole input. Using the inbuilt Random
class, we generate integer values that we’re going to put into the array.
Next, we are going to define a method that generates a sequence of elements. This method simulates a scenario where we have a sorted array:
public static int[] CreateSortedArray(int size) { var array = new int[size]; for (int i = 0; i < size; i++) array[i] = i; return array; }
Next, we are going to create an object that holds different arrays that have random and sorted values:
public IEnumerable<object[]> SampleArrays() { yield return new object[] { CreateRandomArray(200), "Small Unsorted" }; yield return new object[] { CreateRandomArray(2000), "Medium Unsorted" }; yield return new object[] { CreateRandomArray(20000), "Large Unsorted" }; yield return new object[] { CreateSortedArray(200), "Small Sorted" }; yield return new object[] { CreateSortedArray(2000), "Medium Sorted" }; yield return new object[] { CreateSortedArray(20000), "Large Sorted" }; }
Each object entry has three values: an integer array e.g. CreateRandomArray(200)
, its length (200), and a string object storing the name of that array (“Small Unsorted”).
The array objects have different sizes (to simulate time complexity scenarios) and hold random numbers that are added by the CreateRandomArray()
method. On the other hand, the CreateSortedArray()
method creates arrays that have values that are sorted.
Let’s assess the sample best, average, and worst-case complexity performance results of the algorithm:
| Method | array | arrayName | Mean | Error | StdDev | |------------------- |------------- |---------------- |-----------------:|---------------:|----------------:| | SortArray | Int32[20000] | Large Sorted | 942,629.686 μs | 17,950.5497 μs | 21,368.8525 μs | | SortArrayOptimized | Int32[20000] | Large Sorted | 5,730.690 μs | 176.0321 μs | 519.0345 μs | | SortArray | Int32[20000] | Large Unsorted | 1,538,116.280 μs | 37,973.6267 μs | 111,966.0951 μs | | SortArrayOptimized | Int32[20000] | Large Unsorted | 5,631.236 μs | 111.7425 μs | 220.5689 μs | | SortArray | Int32[2000] | Medium Sorted | 12,017.725 μs | 170.9306 μs | 159.8886 μs | | SortArrayOptimized | Int32[2000] | Medium Sorted | 174.385 μs | 3.0591 μs | 3.5229 μs | | SortArray | Int32[2000] | Medium Unsorted | 11,875.260 μs | 215.5740 μs | 435.4702 μs | | SortArrayOptimized | Int32[2000] | Medium Unsorted | 168.497 μs | 5.7366 μs | 16.6430 μs | | SortArray | Int32[200] | Small Sorted | 106.994 μs | 2.0930 μs | 4.2281 μs | | SortArrayOptimized | Int32[200] | Small Sorted | 8.926 μs | 0.1784 μs | 0.2615 μs | | SortArray | Int32[200] | Small Unsorted | 153.492 μs | 5.2994 μs | 15.6255 μs | | SortArrayOptimized | Int32[200] | Small Unsorted | 10.968 μs | 0.3958 μs | 1.1670 μs |
The SortArrayOptimized()
method performs at least ten times better than the SortArray()
method when sorting arrays of different sizes because the latter uses an inefficient “inner algorithm.” The bubble sort algorithm we implement has to perform N2 comparisons regardless of the arrays’ sizes, which explains why the non-optimized implementation is always slower than the optimized one.
Conclusion
In this article, we have learned how bucket sort works in C#. We can use it in scenarios where we have control over the range of values that we intend to sort. However, other algorithms such as merge sort and quick sort perform much better than bucket sort.