CS Lectures

Lecturer

Sebastian Wild / Quicksort, Timsort, Powersort: Algorithmic ideas, engineering tricks, and trivia behind CPython’s new sorting algorithm

University of Liverpool
Nov
8
13:00
Aud 1

Abstract

In 2002, Tim Peters invented Timsort, a clever Mergesort variant, for the CPython reference implementation of the Python programming language. Timsort is both effective in Python and a popular export product: it is used in many languages and frameworks, notably OpenJDK, the Android runtime, and the V8 JavaScript engine. Despite its widespread use, algorithms researchers eventually discovered two flaws in Timsort's underlying algorithm: The first could lead to a stack overflow in CPython and Java; although meanwhile fixed, it exposed conceptual intricacies of the algorithm. The second flaw concerns its performance: the order in which detected sorted segments, the “runs” of the input, are merged, can be 50% more costly than necessary. Based on ideas from optimal alphabetic trees, our Powersort merge policy finds nearly optimal merging orders with negligible overhead and using a much more transparent rule. Since Python 3.11, it is part of the CPython reference implementation.

About the Speaker

Sebastian Wild is a Lecturer (Assistant Professor) in the Department of Computer Science of the University of Liverpool where he works on space-efficient data structures, analysis of algorithms, and engineering of foundational algorithms. Before joining Liverpool, he was a Postdoctoral Fellow at the University of Waterloo. His PhD on the analysis of multiway quicksort, obtained from the University of Kaiserslautern, received the GI Dissertationspreis 2016.

ITU Hosts

Martin Aumüller

Riko Jacob