TY - GEN
T1 - Applying Sorting Networks to Synthesize Optimized Sorting Libraries
AU - Codish, Michael
AU - Cruz-Filipe, Luís
AU - Nebel, Markus
AU - Schneider-Kamp, Peter
PY - 2015/12
Y1 - 2015/12
N2 - This paper presents an application of the theory of sorting networks to facilitate the synthesis of optimized general-purpose sorting libraries. Standard sorting libraries are often based on combinations of the classic Quicksort algorithm with insertion sort applied as base case for small, fixed, numbers of inputs. Unrolling the code for the base case by ignoring loop conditions eliminates branching, resulting in code equivalent to a sorting network. This enables further program transformations based on sorting network optimizations, and eventually the synthesis of code from sorting networks. We show that, if considering the number of comparisons and swaps, the theory predicts no real advantage of this approach. However, significant speed-ups are obtained when taking advantage of instruction level parallelism and non-branching conditional assignment instructions, both of which are common in modern CPU architectures. We provide empirical evidence that using code synthesized from efficient sorting networks as the base case for Quicksort libraries results in significant real-world speed-ups.
AB - This paper presents an application of the theory of sorting networks to facilitate the synthesis of optimized general-purpose sorting libraries. Standard sorting libraries are often based on combinations of the classic Quicksort algorithm with insertion sort applied as base case for small, fixed, numbers of inputs. Unrolling the code for the base case by ignoring loop conditions eliminates branching, resulting in code equivalent to a sorting network. This enables further program transformations based on sorting network optimizations, and eventually the synthesis of code from sorting networks. We show that, if considering the number of comparisons and swaps, the theory predicts no real advantage of this approach. However, significant speed-ups are obtained when taking advantage of instruction level parallelism and non-branching conditional assignment instructions, both of which are common in modern CPU architectures. We provide empirical evidence that using code synthesized from efficient sorting networks as the base case for Quicksort libraries results in significant real-world speed-ups.
U2 - 10.1007/978-3-319-27436-2_8
DO - 10.1007/978-3-319-27436-2_8
M3 - Article in proceedings
SN - 978-3-319-27435-5
T3 - Lecture Notes in Computer Science
SP - 127
EP - 142
BT - Logic-Based Program Synthesis and Transformation
A2 - Falaschi, Moreno
PB - Springer
T2 - 25th International Symposium on Logic-based Program Synthesis and Transformation
Y2 - 13 July 2015 through 15 July 2015
ER -