Recently I had the problem that sorting data based on a floating-point key was
the bottleneck. I was able to speed things up using Bucketsort in C#.

Bucket sort is mainly useful when input is uniformly distributed over a range
for example [0.0 - 1.0]. The input data is first sorted into a certain number of
buckets in linear time and afterwards each bucket is sorted using a comparison
based sorting algorithm. The sorting of each individual bucket can be done in
parallel which can also improve performance compared to a sequential sorting
algorithm. At the end the sorted buckets are merged to form the final sorted
solution.

The Wikipedia article on BucketSort
gives a more detailed overview of the algorithm and a complexity analysis.