### Resumé

Originalsprog | Engelsk |
---|---|

Tidsskrift | Formal Aspects of Computing |

Vol/bind | 29 |

Udgave nummer | 3 |

Sider (fra-til) | 559-579 |

ISSN | 0934-5043 |

DOI | |

Status | Udgivet - 2017 |

### Fingeraftryk

### Citer dette

}

*Formal Aspects of Computing*, bind 29, nr. 3, s. 559-579. https://doi.org/10.1007/s00165-016-0401-3

**Optimizing Sorting Algorithms by Using Sorting Networks.** / Codish, Michael; Cruz-Filipe, Luis; Nebel, Markus; Schneider-Kamp, Peter.

Publikation: Bidrag til tidsskrift › Tidsskriftartikel › Forskning › peer review

TY - JOUR

T1 - Optimizing Sorting Algorithms by Using Sorting Networks

AU - Codish, Michael

AU - Cruz-Filipe, Luis

AU - Nebel, Markus

AU - Schneider-Kamp, Peter

PY - 2017

Y1 - 2017

N2 - In this paper, we show how the theory of sorting networks can be applied to synthesize optimized general-purpose sorting libraries. Standard sorting libraries are often based on combinations of the classic Quicksort algorithm, with insertion sort applied as base case for small, fixed, numbers of inputs. Unrolling the code for the base case by ignoring loop conditions eliminates branching, resulting in code equivalent to a sorting network. By replacing it with faster sorting networks, we can improve the performance of these algorithms. We show that by considering the number of comparisons and swaps alone we are not able to predict any real advantage of this approach. However, significant speed-ups are obtained when taking advantage of instruction level parallelism and non-branching conditional assignment instructions, both of which are common in modern CPU architectures. Furthermore, a close control of how often registers have to be spilled to memory gives us a complete explanation of the performance of different sorting networks, allowing us to choose an optimal one for each particular architecture. Our experimental results show that using code synthesized from these efficient sorting networks as the base case for Quicksort libraries results in significant real-world speed-ups.

AB - In this paper, we show how the theory of sorting networks can be applied to synthesize optimized general-purpose sorting libraries. Standard sorting libraries are often based on combinations of the classic Quicksort algorithm, with insertion sort applied as base case for small, fixed, numbers of inputs. Unrolling the code for the base case by ignoring loop conditions eliminates branching, resulting in code equivalent to a sorting network. By replacing it with faster sorting networks, we can improve the performance of these algorithms. We show that by considering the number of comparisons and swaps alone we are not able to predict any real advantage of this approach. However, significant speed-ups are obtained when taking advantage of instruction level parallelism and non-branching conditional assignment instructions, both of which are common in modern CPU architectures. Furthermore, a close control of how often registers have to be spilled to memory gives us a complete explanation of the performance of different sorting networks, allowing us to choose an optimal one for each particular architecture. Our experimental results show that using code synthesized from these efficient sorting networks as the base case for Quicksort libraries results in significant real-world speed-ups.

U2 - 10.1007/s00165-016-0401-3

DO - 10.1007/s00165-016-0401-3

M3 - Journal article

VL - 29

SP - 559

EP - 579

JO - Formal Aspects of Computing

JF - Formal Aspects of Computing

SN - 0934-5043

IS - 3

ER -