Python 3.12 Preview: Support For the Linux perf Profiler
The final release of Python 3.12 is scheduled for October 2023, which is growing closer. In the meantime, you can download and install its preview version to get a sneak peek at the upcoming features. One of the biggest changes announced is support for the Linux perf profiler, which is a powerful performance profiling tool.
In this tutorial, you’ll:
- Install and use the Linux perf profiler with Python 3.12
- Get a holistic view of your application’s performance
- Explore a case study of profiling Python code with perf
- Visualize the collected metrics with flame graphs and more
To fully benefit from using the perf profiler in Python 3.12, you should have a fairly good understanding of how the underlying hardware and the operating system work. In addition to that, you need to be comfortable using a Linux distribution and the build tools for compiling Python from source code.
There are many other new features and improvements coming in Python 3.12. The highlights include the following:
- Ever better error messages
- More powerful f-strings
- Better support for subinterpreters
- Improved static typing features
Check out what’s new in the changelog for more details on these features.
Seeing the Big Picture Through the Lens of perf
The Linux perf profiler is a truly versatile performance analysis tool. At the very least, you can use it as a statistical profiler to find hot spots in your own code, library code, and even the operating system’s code. In fact, you can hook it up to any running process, such as your web browser, and obtain its live profile, as long as you have sufficient permissions and the program was compiled with debug symbols.
The tool can also work as an event counter by measuring the exact number of low-level events occurring in both hardware and software. For example, it’s capable of counting the number of CPU cycles, instructions executed by the processor, or context switches during a program’s execution. The specific types of events may vary depending on your hardware architecture and the Linux kernel version.
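As a rough illustration of the event-counting mode, the following Python sketch assembles a perf stat command line that counts a few events while another program runs. The helper name build_perf_stat_command and the default event names are assumptions made for this example. The events available on your machine may differ, and you can check them with perf list.

```python
import shlex
import sys


def build_perf_stat_command(target, events=("cycles", "instructions", "context-switches")):
    """Return an argument list that counts the given events while target runs."""
    # -e takes a comma-separated list of event names, and "--" separates
    # perf's own options from the command being measured.
    return ["perf", "stat", "-e", ",".join(events), "--", *target]


# Hypothetical workload: a short CPU-bound Python one-liner.
command = build_perf_stat_command([sys.executable, "-c", "sum(range(10_000_000))"])
print(shlex.join(command))
```

The sketch only prints the command. If perf is installed, then you can paste the printed line into your terminal or pass the list to subprocess.run() to see the counters.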
Another useful feature of perf is the ability to retain the call graph of functions, which can help you understand which of potentially many calls to the same function is an actual bottleneck. With a bit of effort, you can even visualize your code paths in the form of a mathematical graph consisting of nodes and edges.
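To make the sampling and call graph features more concrete, here’s a minimal sketch that builds a perf record command for attaching to a process that’s already running. The function name build_perf_record_command, the default 99 Hz sampling frequency, and the ten-second duration are illustrative assumptions rather than recommendations from this tutorial.

```python
import os


def build_perf_record_command(pid, seconds=10, frequency=99, output="perf.data"):
    """Return an argument list that samples a running process with call graphs."""
    return [
        "perf", "record",
        "-F", str(frequency),  # sampling frequency in Hz
        "-g",                  # record call graphs (stack traces)
        "-p", str(pid),        # attach to an existing process
        "-o", output,          # file to store the samples in
        "--", "sleep", str(seconds),  # stop sampling when sleep exits
    ]


# The current process ID serves only as a stand-in for a real target.
print(" ".join(build_perf_record_command(os.getpid())))
```

After a recording finishes, running perf report in the same directory lets you browse the samples stored in perf.data. Attaching to a process that you don’t own usually requires elevated permissions.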
In this short section, you’ll get a basic understanding of the Linux perf profiler and its advantages over other profiling tools. If you’re already familiar with this tool, then you can jump ahead to a later section to get your hands dirty and see it in action.
What’s the Linux perf Profiler?
If you search online for information about the Linux perf profiler, then you may get confused by the plethora of names that people use to talk about it. To make matters worse, the profiler seems to be poorly documented. You really need to dig deep, as the corresponding article on Wikipedia and the official Wiki page don’t provide a lot of help.
The Linux perf profiler actually consists of two high-level components:
- perf_events, or performance counters for Linux (PCL): A subsystem in the Linux kernel with an API that provides an abstraction layer over a hardware-specific performance monitoring unit (PMU) present in modern architectures. The PMU consists of special CPU registers or counters for collecting metrics about events such as the number of cache misses, CPU cycles, or executed instructions. This information is invaluable during performance analysis.
- perf command-line tool: A user-space utility tool built on top of the perf_events API from the Linux kernel, which can help you collect and make sense of the data. It follows the same philosophy as Git by offering several specialized subcommands. You can list them all by typing perf at your command prompt.
The first component is baked right into the Linux kernel, meaning that recent versions of mainstream Linux distributions will usually have it shipped and enabled. On the other hand, you’ll most likely need to install an additional package to start using the perf command, as it’s not essential for regular users.
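If you’d like to check both components from Python, then the sketch below is one way to do it. It looks for the perf executable on your PATH and reads the kernel’s /proc/sys/kernel/perf_event_paranoid setting, which exists when perf_events is enabled and controls how much unprivileged users may measure. The function name perf_status and the shape of the returned dictionary are assumptions made for this example.

```python
import shutil
from pathlib import Path

PARANOID_FILE = Path("/proc/sys/kernel/perf_event_paranoid")


def perf_status():
    """Report whether the perf tool and the kernel's perf_events look available."""
    try:
        paranoid = int(PARANOID_FILE.read_text().strip())
    except (OSError, ValueError):
        paranoid = None  # file is missing or unreadable, for example off Linux
    # shutil.which() returns the full path to perf, or None if it isn't installed.
    return {"perf_tool": shutil.which("perf"), "perf_event_paranoid": paranoid}


print(perf_status())
```

A value of None for perf_tool suggests that you still need to install the command-line tool, while lower perf_event_paranoid values mean fewer restrictions for regular users.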
Note: Each processor and operating system combination supports a different set of event types. Therefore, you may sometimes need to reinstall perf to match your current kernel version after an upgrade.
While the Linux perf profiler came into existence primarily as an interface to hardware-level performance counters, it also covers numerous software performance counters. For example, you can use it to track the number of context switches made by your operating system’s task scheduler or the number of times a running thread has migrated from one CPU core to another.
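To get a feel for what a software counter measures, you can peek at the context switches that the kernel has recorded for your own process. The following sketch doesn’t use perf at all. It relies on the standard library’s resource module, which is only available on Unix-like systems, and the helper name context_switches is an assumption made for this example.

```python
import resource
import time


def context_switches():
    """Return (voluntary, involuntary) context switches of the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


before = context_switches()
for _ in range(5):
    time.sleep(0.01)  # sleeping gives up the CPU, which is a voluntary switch
after = context_switches()

print("voluntary:", after[0] - before[0])
print("involuntary:", after[1] - before[1])
```

The exact numbers depend on your system’s load and scheduler, so expect them to change between runs.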
Okay, but are there any other benefits of using the Linux perf profiler aside from its hardware capabilities?
How Can Python Benefit From perf?
For several reasons, adding support for the Linux perf profiler will greatly impact your ability to profile Python applications. There’s no better source to refer to than Pablo Galindo Salgado, who’s the primary contributor behind this new feature. In a Twitter announcement, he thoroughly explains the benefits of the Python and perf integration:
Python 3.12 will add support for the Linux perf profiler! 🔥🔥 Perf is one of the most powerful and performant profilers for Linux that allows getting a ridiculous amount of information such as CPU counters, cache misses, context switching, and much more. (Source)
The lengthy tweet thread gives examples of several new capabilities:
Read the full article at https://realpython.com/python312-perf-profiler/ »