Parallel Bucket Sort in C#

Recently I had the problem that sorting data based on a floating-point key was the bottleneck. I was able to speed things up using Bucketsort in C#.

Bucket sort is mainly useful when input is uniformly distributed over a range for example [0.0 - 1.0]. The input data is first sorted into a certain number of buckets in linear time and afterwards each bucket is sorted using a comparison based sorting algorithm. The sorting of each individual bucket can be done in parallel which can also improve performance compared to a sequential sorting algorithm. At the end the sorted buckets are merged to form the final sorted solution.

The Wikipedia article on BucketSort gives a more detailed overview of the algorithm and a complexity analysis.

Here is the implementation that I used:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

namespace ConsoleApp
{
  public class Program
  {
    public static void BucketSort(double[] data, int bucketCount)
    {
      var buckets = new List<double>[bucketCount];
      for (int i = 0; i < bucketCount; i++)
        buckets[i] = new List<double>(data.Length / bucketCount);

      var min = double.MaxValue;
      var max = -double.MaxValue;

      for (int i = 0; i < data.Length; i++)
      {
        min = Math.Min(min, data[i]);
        max = Math.Max(max, data[i]);
      }

      for (int i = 0; i < data.Length; i++)
      {
        var idx = Math.Min(bucketCount - 1, (int)(bucketCount * (data[i] - min) / (max - min)));
        buckets[idx].Add(data[i]);
      }

      Parallel.For(0, bucketCount, i => buckets[i].Sort());

      var index = 0;
      for (var i = 0; i < bucketCount; i++)
      for (var j = 0; j < buckets[i].Count; j++)
        data[index++] = buckets[i][j];
    }

    public static void Main(string[] args)
    {
      for (int algo = 0; algo < 2; algo++)
      {
        var random = new Random(0);
        var sw = Stopwatch.StartNew();

        for (int i = 0; i < 1000; i++)
        {
          var n = random.Next(20000, 200000);
          var data = new double[n];
          for (int j = 0; j < data.Length; j++)
            data[j] = 2 * random.NextDouble() - 1;

          switch (algo)
          {
            case 0:
              Array.Sort(data);
              break;

            case 1:
              BucketSort(data, n / 200);
              break;
          }
        }

        Console.WriteLine($"{(algo == 0 ? "Array.Sort" : "BucketSort")}; {sw.ElapsedMilliseconds}");
      }
    }
  }
}

On my system i get the following output:

Array.Sort; 11482
BucketSort; 4928

Here Bucketsort is considerably faster.