The bound on string sorting is typically written as O(n log n + D), where D is the sum of the distinguishing prefix lengths, i.e. the input length minus some fluff. Since D >= n log n (roughly that many characters are needed just to tell n distinct strings apart), the whole bound collapses to O(D), so we already have linearity in the input length.
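
For concreteness, a minimal sketch of how D can be computed (the helper names are mine, and it assumes distinct strings): the distinguishing prefix of a string is its shortest prefix not shared with any other input string, and in sorted order only the two neighbours matter.

    def lcp(a, b):
        # length of the longest common prefix of a and b
        n = 0
        for x, y in zip(a, b):
            if x != y:
                break
            n += 1
        return n

    def distinguishing_prefix_total(strings):
        # D = sum over all (distinct) strings of the length of the shortest
        # prefix separating the string from every other input; after sorting,
        # only the immediate neighbours can share the longest prefix
        s = sorted(strings)
        total = 0
        for i, cur in enumerate(s):
            m = 0
            if i > 0:
                m = max(m, lcp(cur, s[i - 1]))
            if i + 1 < len(s):
                m = max(m, lcp(cur, s[i + 1]))
            total += min(len(cur), m + 1)
        return total

    print(distinguishing_prefix_total(["car", "cart", "cat", "dog"]))  # 11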


Are you saying that you can sort strings in O(n log n + D) using a conventional sorting algorithm such as merge sort? If so, I don't understand how D could be an additive term -- that would seem to imply that each string is only involved in a constant number of comparisons.

(I wasn't, by the way, only considering strings when discussing variable-sized keys -- the beauty of discrimination is that we essentially reduce all sorting problems to string sorting.)
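
To make that reduction concrete, here is a small, hypothetical sketch (the encoding and the name key_bytes are mine, not from the thread): map each key to a byte string whose lexicographic order agrees with the key order, after which any string-sorting technique applies unchanged.

    import struct

    def key_bytes(k):
        # order-preserving encoding into a byte string (sketch, not exhaustive):
        # non-negative ints -> fixed-width big-endian bytes, so byte order = numeric order;
        # tuples -> concatenation of the component encodings;
        # str -> UTF-8 plus a terminator so a proper prefix sorts first
        if isinstance(k, int):
            return struct.pack(">Q", k)
        if isinstance(k, tuple):
            return b"".join(key_bytes(p) for p in k)
        if isinstance(k, str):
            return k.encode("utf-8") + b"\x00"
        raise TypeError(k)

    pairs = [(3, 1), (2, 10), (3, 0), (2, 2)]
    print(sorted(pairs, key=key_bytes) == sorted(pairs))  # True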


A conventional sorting algorithm such as Multikey Quicksort.

http://people.mpi-inf.mpg.de/~sanders/courses/algdat03/salgd...
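
In case the link rots, a compact, out-of-place sketch of the idea (function names mine, random pivots assumed): partition on a single character position, and only the "equal" bucket advances to the next position, which is how the O(n log n + D) expected bound is met. The real Bentley-Sedgewick algorithm does this in place, but the structure is the same.

    import random

    def char_at(s, d):
        # code of the character at position d, or -1 once past the end of s
        return ord(s[d]) if d < len(s) else -1

    def multikey_quicksort(strs, d=0):
        # three-way partition on the character at position d; only the
        # "equal" bucket moves on to position d + 1, so a string's characters
        # are examined only along its distinguishing prefix
        if len(strs) <= 1:
            return strs
        pivot = char_at(random.choice(strs), d)
        lt = [s for s in strs if char_at(s, d) < pivot]
        eq = [s for s in strs if char_at(s, d) == pivot]
        gt = [s for s in strs if char_at(s, d) > pivot]
        if pivot == -1:
            # strings in eq are exhausted at d and share their first d
            # characters, so they are all identical
            return multikey_quicksort(lt, d) + eq + multikey_quicksort(gt, d)
        return (multikey_quicksort(lt, d)
                + multikey_quicksort(eq, d + 1)
                + multikey_quicksort(gt, d))

    print(multikey_quicksort(["cat", "car", "cart", "dog", "ca"]))
    # ['ca', 'car', 'cart', 'cat', 'dog']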


When people say "conventional" sorting algorithms, they're usually talking about algorithms built around a pairwise comparison function.

I note that on slide 14 of this presentation, it looks like something resembling a discriminator is used to select a better partitioning scheme. So it looks to me like this actually leverages a similar principle?

As we've seen in this thread and others, there are other ways to measure and/or process keys, each with different characteristics. Surely all of these deserve attention! So, thanks very much for sharing this.


Sorry, I was being imprecise. By conventional I meant comparison-based sorting functions that are polymorphic in the element type and thus not allowed to examine individual bytes.

Multikey Quicksort indeed looks like a special case of discrimination, exploiting some of the same principles.
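
For anyone curious what that looks like, a minimal sketch of a string discriminator (names and structure are mine, not any particular library): given (key, value) pairs, it emits the values grouped by equal keys, with the groups in ascending key order, consuming keys one character position at a time instead of comparing whole keys.

    def sdisc(pairs, d=0):
        # string discriminator (sketch): group the values of equal keys,
        # groups emitted in ascending key order, by consuming the keys one
        # character position at a time -- no key-to-key comparisons
        if not pairs:
            return []
        if len(pairs) == 1:
            return [[pairs[0][1]]]
        done = [v for k, v in pairs if len(k) == d]   # keys exhausted at depth d
        buckets = {}
        for k, v in pairs:
            if len(k) > d:
                buckets.setdefault(k[d], []).append((k, v))
        groups = [done] if done else []
        for c in sorted(buckets):   # a real discriminator would index a table by character code
            groups.extend(sdisc(buckets[c], d + 1))
        return groups

    print(sdisc([("ab", 1), ("a", 2), ("ab", 3), ("b", 4)]))  # [[2], [1, 3], [4]]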



