5 great libraries for profiling Python code

Maria J. Danford

Every programming language has two kinds of speed: speed of development and speed of execution. Python has always favored writing fast over running fast. Although Python code is almost always fast enough for the task, sometimes it isn’t. In those cases, you need to find out where and why it lags, and do something about it.

A well-respected adage of software development, and of engineering generally, is “Measure, don’t guess.” With software, it’s easy to assume what’s wrong, but never a good idea to do so. Statistics about actual program performance are always your best first tool for making applications faster.

The good news is that Python offers a whole slew of packages you can use to profile your applications and learn where they are slowest. These tools range from simple one-liners included with the standard library to sophisticated frameworks for gathering stats from running applications. Here I cover five of the most significant, all of which run cross-platform and are readily available either in PyPI or in Python’s standard library.

Time and Timeit

Sometimes all you need is a stopwatch. If all you’re doing is profiling the time between two snippets of code that take seconds or minutes on end to run, then a stopwatch will more than suffice.

The Python standard library comes with two functions that work as stopwatches. The time module has the perf_counter function, which calls on the operating system’s high-resolution timer to obtain an arbitrary timestamp. Call time.perf_counter once before an action and once after, then take the difference between the two. This gives you an unobtrusive, low-overhead (if also unsophisticated) way to time code.
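
The stopwatch pattern described above might look like the following minimal sketch. The slow_operation function is a hypothetical stand-in for whatever code you want to time; only time.perf_counter comes from the standard library.

```python
import time

def slow_operation():
    # Hypothetical stand-in workload: sum a range of integers.
    return sum(range(1_000_000))

# The timestamp itself is arbitrary; only the difference between
# two readings is meaningful.
start = time.perf_counter()
result = slow_operation()
elapsed = time.perf_counter() - start

print(f"slow_operation took {elapsed:.4f} seconds")
```

Because the reading is taken around the whole call, the figure includes everything that happened in between, which is why this approach suits coarse measurements better than fine-grained ones.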

The timeit module attempts to perform something like actual benchmarking on Python code. The timeit.timeit function takes a code snippet, runs it many times (the default is one million passes), and obtains the total time required to do so. It’s best used to determine how a single operation or function call performs in a tight loop, for instance if you want to determine whether a list comprehension or a conventional list construction will be faster for something done many times over. (List comprehensions usually win.)
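
As a rough sketch of that comparison, the two hypothetical functions below build the same list in both styles, and timeit.timeit runs each one repeatedly. The number argument lowers the pass count from the default so the example finishes quickly; the actual timings will vary by machine and Python version.

```python
import timeit

def build_with_loop():
    # Conventional list construction with append().
    squares = []
    for n in range(100):
        squares.append(n * n)
    return squares

def build_with_comprehension():
    # The same list built with a comprehension.
    return [n * n for n in range(100)]

# number= overrides the default of one million passes.
loop_time = timeit.timeit(build_with_loop, number=20_000)
comp_time = timeit.timeit(build_with_comprehension, number=20_000)

print(f"loop:          {loop_time:.3f}s for 20,000 passes")
print(f"comprehension: {comp_time:.3f}s for 20,000 passes")
```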

The downside of the time module is that it’s nothing more than a stopwatch, and the downside of timeit is that its main use case is microbenchmarks on individual lines or blocks of code. These modules only work if you’re dealing with code in isolation. Neither one suffices for whole-program analysis, that is, finding out where in the thousands of lines of code your program spends most of its time.


cProfile

The Python standard library also comes with a whole-program analysis profiler, cProfile. When run, cProfile traces every function call in your program and generates a list of which ones were called most often and how long the calls took on average.

cProfile has three big strengths. One, it’s included with the standard library, so it’s available even in a stock Python installation. Two, it profiles a number of different statistics about call behavior. For instance, it separates the time spent in a function call’s own instructions from the time spent by all the other calls invoked by the function. This lets you determine whether a function is slow itself or is calling other functions that are slow.
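
The distinction between a function’s own time and the time of the calls it makes shows up in cProfile’s report as the tottime and cumtime columns. The sketch below uses two hypothetical functions, caller and helper, to illustrate it: caller does almost nothing itself, so its tottime should be small while its cumtime includes everything helper does.

```python
import cProfile
import io
import pstats

def helper():
    # Does the real work.
    return sum(i * i for i in range(50_000))

def caller():
    # Does almost no work itself; nearly all of its time is spent in helper().
    return [helper() for _ in range(5)]

profiler = cProfile.Profile()
profiler.runcall(caller)  # profile a single call to caller()

# Write the report to a string buffer, sorted by cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```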

Three, and perhaps best of all, you can constrain cProfile freely. You can sample a whole program’s run, or you can toggle profiling on only when a select function runs, the better to focus on what that function is doing and what it is calling. This approach works best only after you’ve narrowed things down a bit, but it saves you the trouble of wading through the noise of a full profile trace.
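
One way to toggle profiling around a single call is with the enable and disable methods of a cProfile.Profile object, as in this sketch. The load_data and hot_path functions are hypothetical; only hot_path runs while the profiler is switched on, and print_stats(5) trims the report to the top five entries.

```python
import cProfile
import io
import pstats

def load_data():
    # Runs outside the profiled region.
    return list(range(50_000))

def hot_path(data):
    # The function under investigation.
    return sorted(data, key=lambda x: -x)

profiler = cProfile.Profile()

data = load_data()        # not profiled
profiler.enable()         # switch profiling on for the call of interest
result = hot_path(data)
profiler.disable()        # and off again afterwards

# Sort by each function's own time and keep only the top five lines.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("tottime").print_stats(5)
report = buffer.getvalue()
print(report)
```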

Which brings us to the first of cProfile’s drawbacks: It generates a lot of statistics by default. Trying to find the right needle in all that hay can be overwhelming. The other drawback is cProfile’s execution model: It traps every single function call, creating a significant amount of overhead. That makes cProfile unsuitable for profiling apps in production with live data, but perfectly fine for profiling them during development.

For a more detailed rundown of cProfile, see our separate article.


Pyinstrument

Pyinstrument works like cProfile in that it traces your program and generates reports about the code that is occupying most of its time. But Pyinstrument has two major advantages over cProfile that make it worth trying out.

First, Pyinstrument doesn’t attempt to hook every single instance of a function call. It samples the program’s call stack every millisecond, so it’s less obtrusive but still sensitive enough to detect what’s eating most of your program’s runtime.
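
To make the idea of sampling concrete, here is a toy sketch that uses only the standard library. It is not how Pyinstrument is implemented; it simply assumes a background thread that checks, roughly every millisecond, which function the main thread is executing. It relies on sys._current_frames, a CPython implementation detail, and the sample_stack and busy_work names are hypothetical.

```python
import collections
import sys
import threading
import time

def sample_stack(target_thread_id, counts, stop_event, interval=0.001):
    # Roughly every `interval` seconds, note which function the
    # target thread is currently executing.
    while not stop_event.is_set():
        frame = sys._current_frames().get(target_thread_id)
        if frame is not None:
            counts[frame.f_code.co_name] += 1
        time.sleep(interval)

def busy_work(duration=0.3):
    # Hypothetical workload: spin for about `duration` seconds.
    deadline = time.perf_counter() + duration
    total = 0
    while time.perf_counter() < deadline:
        total += 1
    return total

counts = collections.Counter()
stop_event = threading.Event()
sampler = threading.Thread(
    target=sample_stack,
    args=(threading.get_ident(), counts, stop_event),
)
sampler.start()
busy_work()
stop_event.set()
sampler.join()

# Functions that ran longest collect the most samples.
print(counts.most_common(3))
```

The profiled code is never hooked or modified; the cost is that very short calls can slip between samples, which is the trade-off a sampling profiler accepts in exchange for low overhead.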

Second, Pyinstrument’s reporting is far more concise. It shows you the top functions in your program that take up the most time, so you can focus on analyzing the biggest culprits. It also lets you find those results quickly, with little ceremony.

Pyinstrument also has many of cProfile’s conveniences. You can use the profiler as an object in your application and record the behavior of selected functions instead of the whole application. The output can be rendered any number of ways, including as HTML. If you want to see the full timeline of calls, you can request that too.

Two caveats also come to mind. First, some programs that use C-compiled extensions, such as those created with Cython, may not work properly when invoked with Pyinstrument through the command line. But they do work if Pyinstrument is used in the program itself, for example by wrapping a main() function with a Pyinstrument profiler call.

The second caveat: Pyinstrument doesn’t deal well with code that runs in multiple threads. Py-spy, detailed below, might be the better choice there.


Py-spy

Py-spy, like Pyinstrument, works by sampling the state of a program’s call stack at regular intervals, instead of trying to record every single call. Unlike Pyinstrument, Py-spy has core components written in Rust (Pyinstrument uses a C extension) and runs out-of-process with the profiled program, so it can be used safely with code running in production.

This architecture allows Py-spy to easily do something many other profilers can’t: profile multithreaded or subprocessed Python applications. Py-spy can also profile C extensions, but those need to be compiled with symbols to be useful. And in the case of extensions compiled with Cython, the generated C file needs to be present to gather proper trace information.

There are two basic ways to inspect an app with Py-spy. You can run the app using Py-spy’s record command, which generates a flame graph after the run concludes. Or you can run the app using Py-spy’s top command, which brings up a live-updated, interactive display of your Python app’s innards, displayed in the same manner as the Unix top utility. Individual thread stacks can also be dumped out from the command line.

Py-spy has one big drawback: It’s mainly intended to profile an entire program, or some components of it, from the outside. It doesn’t let you decorate and sample only a specific function.


Yappi

Yappi (“Yet Another Python Profiler”) has many of the best features of the other profilers discussed here, and a few not provided by any of them. PyCharm installs Yappi by default as its profiler of choice, so users of that IDE already have built-in access to Yappi.

To use Yappi, you decorate your code with instructions to invoke, start, stop, and generate reporting for the profiling mechanisms. Yappi lets you choose between “wall time” and “CPU time” for measuring the time taken. The former is just a stopwatch; the latter clocks, via system-native APIs, how long the CPU was actually engaged in executing code, omitting pauses for I/O or thread sleeping. CPU time gives you the most precise sense of how long certain operations, such as the execution of numerical code, actually take.
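
The difference between the two clocks can be seen without Yappi at all. The sketch below does not use Yappi’s API; it only illustrates the wall-time versus CPU-time distinction with two standard library timers, time.perf_counter (wall time) and time.process_time (CPU time for the current process). The mostly_waiting function is hypothetical.

```python
import time

def mostly_waiting():
    # Sleeping counts toward wall time but uses almost no CPU.
    time.sleep(0.2)
    # A small amount of real computation.
    return sum(i * i for i in range(100_000))

wall_start = time.perf_counter()
cpu_start = time.process_time()
mostly_waiting()
wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start

print(f"wall time: {wall_elapsed:.3f}s")
print(f"CPU time:  {cpu_elapsed:.3f}s")
```

On a typical run the wall-time figure is dominated by the sleep, while the CPU-time figure reflects only the computation, which is why CPU time is the more telling number for numerical code.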

One very nice advantage to the way Yappi handles retrieving stats from threads is that you don’t have to decorate the threaded code. Yappi provides a function, yappi.get_thread_stats(), that retrieves statistics from any thread activity you record, which you can then parse separately. Stats can be filtered and sorted with high granularity, similar to what you can do with cProfile.

Finally, Yappi can also profile greenlets and coroutines, something many other profilers cannot do easily or at all. Given Python’s growing use of async metaphors, the ability to profile concurrent code is a powerful tool to have.

Copyright © 2020 IDG Communications, Inc.