Analysis of Pivot Sampling in Dual-Pivot Quicksort: A Holistic Analysis of Yaroslavskiy’s Partitioning Scheme

Markus E. Nebel, Sebastian Wild, Conrado Martinez

Publikation: Bidrag til tidsskriftTidsskriftartikelForskningpeer review

Abstrakt

The new dual-pivot Quicksort by Vladimir Yaroslavskiy—used in Oracle’s Java runtime library since version 7—features intriguing asymmetries. They make a basic variant of this algorithm use less comparisons than classic single-pivot Quicksort. In this paper, we extend the analysis to the case where the two pivots are chosen as fixed order statistics of a random sample. Surprisingly, dual-pivot Quicksort then needs more comparisons than a corresponding version of classic Quicksort, so it is clear that counting comparisons is not sufficient to explain the running time advantages observed for Yaroslavskiy’s algorithm in practice. Consequently, we take a more holistic approach and give also the precise leading term of the average number of swaps, the number of executed Java Bytecode instructions and the number of scanned elements, a new simple cost measure that approximates I/O costs in the memory hierarchy. We determine optimal order statistics for each of the cost measures. It turns out that the asymmetries in Yaroslavskiy’s algorithm render pivots with a systematic skew more efficient than the symmetric choice. Moreover, we finally have a convincing explanation for the success of Yaroslavskiy’s algorithm in practice: compared with corresponding versions of classic single-pivot Quicksort, dual-pivot Quicksort needs significantly less I/Os, both with and without pivot sampling. © 2015, Springer Science+Business Media New York.
OriginalsprogDansk
TidsskriftAlgorithmica
Vol/bind75
Udgave nummer4
Sider (fra-til)632-683
ISSN0178-4617
DOI
StatusUdgivet - 2016

Citationsformater