Blackfire Profiler for Python – Pre-Launch news #1

By Sümer Cip, on Dec 04, 2019

We have been working on Blackfire Profiler for Python for a while now. We are very excited about the outcome and I would like to share some insight on the main features.

How does it work?

As exposed in our early announcement, under the hood it is a deterministic profiler. It:

  1. catches all events occurring in the Python interpreter;
  2. aggregates all profiling information;
  3. sends the data to Blackfire.io servers.

Profiles and metrics can then be viewed from the web UI of Blackfire.

Profiles can be executed via Blackfire CLI tools or a browser extension at any time during execution, with no overhead for end-users.

Supported Platforms

We are working on the support of all Python versions from 2.7.x-3.x. (Including the EOL versions like 3.1, 3.2, 3.3… .)

We already support Linux and MacOSX, and actively work on Windows support.

Output Format

There are various approaches taken by different profilers regarding the output format generated by the profiler. Some profilers give information based on function names, some of them give information per line numbers, per modules, etc.

Blackfire generates a callgraph with edges being caller/callee pairs. So, every unique caller/callee pair will have a valid edge in the output callgraph and will have various metrics specific to this call.

This has some nice benefits while analyzing code performance problems as it will highlight the underlying problem more by adding extra levels of stack information.

Metrics

Blackfire provides several metrics that enable you to understand how your code performs. Let’s look at each of them in more detail:

Call count

As the name suggests, this is the number of times a specific caller calls a specific callee.

Wall Time, CPU Time & I/O Wait

Wall time is the time the difference between the time at which the function was entered and the time at which the function ended.

CPU time is the time the CPU actually spent on executing code.

I/O Wait is the time the CPU has been waiting for an input or an output, while no code was executed (memory access,…).

Read more about these metrics here.

Let’s look at an example:

# needs Python 3.3 and higher

import time

def example1():
    t1 = time.time()
    t2 = time.process_time()
    time.sleep(1.0) # sleep for 1 sec
    print("t1=%0.6f, t2=%0.6f" % (time.time()-t1, time.process_time()-t2))

Now, let’s look at the output of above script:

t1=1.001341, t2=0.000065

What above is telling us is the `example1` function takes ~1 second to execute in wall time and ~65 microseconds in CPU time.

From profiling perspective, this is pretty important distinction as your code performance can either be a I/O bound or CPU bound application. You will need both these metrics in order to identify the root cause of your performance bottlenecks. Blackfire profiler will easily point you to the right direction.

Beyond time metrics, we are also actively working on acquiring memory metrics. More on this soon!

An example Django application

To enable profiling of a Django application, Blackfire provides a Django middleware. When the middleware is enabled, you will be able to profile specific HTTP calls without affecting other requests in the server.

Below is a simple screenshot taken from a Django application.

This image has an empty alt attribute; its file name is django-profile.png

Future work

There is lots of more to come. Most notably, we will be adding:

  • views to show HTTP and SQL queries being made from a profile session;
  • deep integration with frameworks (SQLAlchemy, Celery, Flask,…);
  • timeline view.

We’re working on releasing a first version soon, so stay tuned!

Happy Python profiling!

Sümer Cip

Sümer Cip is currently a Senior Software Engineer at Blackfire. He has been coding for nearly 20 years and contributed to many open source projects in his career. He has a deep love for Python and obsession with code performance and low-level stuff.