Understanding Monitoring Traces, Extended Traces, and Profiles
Let’s dive into the different levels of data collected and used by Blackfire: Monitoring Traces, Extended Traces, and Profiles.
Blackfire’s complete Observability solution lets our customers optimize their applications for the long run. It provides a unique set of tools for users to better understand an app’s real behavior and the services it relies on.
These tools rely on several distinct sets of data. This article explores the different layers of data that Blackfire collects and uses.
Blackfire Monitoring relies on Monitoring Traces, the lightest level of data collected: a minimal amount of data is gathered for all parts of the application, at whatever frequency you choose (the sample rate is configurable).
To get an idea of what this looks like, we can compare Blackfire Monitoring to real-time car traffic information. The overhead for collecting Monitoring Traces is insignificant and only affects a subset of the traffic defined by the sample rate.
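To make the sampling idea concrete, here is a minimal sketch in Python. The function name and logic are invented for illustration; this is not Blackfire's actual API, only the general shape of rate-based sampling:

```python
import random

def should_collect_trace(sample_rate: float) -> bool:
    """Decide whether this request should produce a Monitoring Trace.

    sample_rate is the fraction of traffic to trace (e.g. 0.1 for 10%).
    Only sampled requests pay the (already small) collection overhead;
    the rest of the traffic is untouched.
    """
    return random.random() < sample_rate

# With a 10% sample rate, roughly 1 request in 10 is traced.
traced = sum(should_collect_trace(0.1) for _ in range(10_000))
```

A rate of `1.0` traces everything, `0.0` traces nothing; anything in between trades coverage for overhead.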
Extended Traces and Spans
Extended Traces are a small subset of Monitoring Traces for which Blackfire collects more in-depth metrics, such as Spans. A Span represents a function call over time, just as in a profile timeline.
Collecting Spans lets Blackfire identify the relationships between transactions and the services they use, providing a clear and full picture of their impact on the performance of incoming requests. Collecting progressively larger spans allows us to zoom in on an issue, refine our diagnosis, and locate its origin. We can then answer when and where performance-related events are happening.
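To illustrate what a Span captures, here is a hypothetical Python sketch (not Blackfire's internal representation): a named call with a start time, a duration, and child spans, which is enough to attribute a request's time to the services it calls.

```python
from dataclasses import dataclass, field

@dataclass
class Span:
    """A function or service call observed over time.

    Illustrative only; field names are invented for this example.
    """
    name: str
    start_ms: float
    duration_ms: float
    children: list["Span"] = field(default_factory=list)

    def self_time_ms(self) -> float:
        """Time spent in this span, excluding its children."""
        return self.duration_ms - sum(c.duration_ms for c in self.children)

# A request whose wall time is dominated by two SQL queries:
request = Span("GET /checkout", 0.0, 120.0, [
    Span("sql.query", 5.0, 80.0),
    Span("sql.query", 90.0, 25.0),
])
```

Here the controller's own work is `120 - (80 + 25) = 15 ms`: the span tree makes it obvious that the database, not the application code, dominates this request.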
Once a reliable diagnosis is made, it’s then time to zoom in even further and understand why the issue happened. This is the mission of the Profile.
When a profile is triggered, all available observability metrics are collected. This is currently the most comprehensive dataset: detailed enough to locate the origin of an issue down to the precise function or service call.
The overhead of a Profile is much higher. Yet, apart from the 10 daily automatic Profiles, only the Blackfire user triggering the profile is affected; a Profile has no impact on end users.
These different datasets (Monitoring Traces, Extended Traces, and Profiles) are all collected the same way, with the same tools. The Blackfire Probe, a PHP extension and a package for Python and Go, observes the code as it executes. It collects the raw performance data and sends it to the Agent.
The Agent is a server-side daemon that receives profiles from the Probe, aggregates them, and forwards them to Blackfire. Its sole purpose is to communicate back and forth with Blackfire servers.
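The Probe-to-Agent hand-off can be sketched as a simple producer/consumer pipeline. This is a deliberately simplified Python model with invented class names; the real Probe is a PHP extension or a Python/Go package, and the forwarding step to Blackfire's servers is stubbed out here:

```python
import json
import queue

class Probe:
    """Runs in-process, collects raw performance data, hands it off.

    Illustrative sketch only, not Blackfire's actual implementation.
    """
    def __init__(self, agent_inbox: queue.Queue):
        self.agent_inbox = agent_inbox

    def send(self, raw_data: dict) -> None:
        # Serialize and hand the payload to the local Agent.
        self.agent_inbox.put(json.dumps(raw_data))

class Agent:
    """Server-side daemon: receives payloads from the Probe and
    aggregates them before forwarding to Blackfire (forwarding omitted)."""
    def __init__(self):
        self.inbox = queue.Queue()
        self.aggregated = []

    def drain(self) -> None:
        while not self.inbox.empty():
            self.aggregated.append(json.loads(self.inbox.get()))

agent = Agent()
probe = Probe(agent.inbox)
probe.send({"wall_time_ms": 42, "calls": 120})
agent.drain()
```

The design point this illustrates: the application process only does the cheap part (collect and hand off), while aggregation and network communication live in a separate daemon.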
You can learn more about data privacy and how we store Observability data in our FAQ and this blog post.
We will detail how overhead can be estimated in a future article. Spoiler: the results depend on the script’s complexity.
A little about overhead
First, only an imperceptible overhead is added when Blackfire is not used in a request. Blackfire only activates when a specific header is added to the request; otherwise it stays dormant, which makes it safe to use in production.
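The gating logic can be sketched like this. `X-Blackfire-Query` is the trigger header Blackfire's tooling adds to profiled requests, but treat the exact name and payload below as illustrative rather than a reference:

```python
def probe_should_activate(headers: dict) -> bool:
    """The Probe stays dormant unless the triggering header is present.

    Requests from end users never carry this header, so they never pay
    the profiling cost. (Header name shown for illustration.)
    """
    return "X-Blackfire-Query" in headers

# A normal end-user request: no header, no instrumentation.
normal_request = {"Accept": "text/html"}

# A request triggered by a Blackfire user ("signed-payload" is a
# stand-in for the real signed query string).
profiled_request = {"Accept": "text/html",
                    "X-Blackfire-Query": "signed-payload"}
```

Because the check is a single header lookup, the cost on unprofiled traffic is effectively zero.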
The overhead induced by Blackfire depends on the number of PHP/Python/Go operations being called, not on the time a server spends processing those operations.
Yet, in most cases, external services and calls are the main contributors to wall time: a limited number of calls consume most of the resources.
The time added by collecting data on that small number of calls is insignificant compared to all the I/O delays, especially for applications that are more I/O bound than CPU bound.
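A back-of-the-envelope example makes this concrete. The numbers below are invented for illustration only, not measured Blackfire figures:

```python
# Suppose instrumentation adds ~5 microseconds per observed call,
# and a request makes 200 instrumented calls (illustrative numbers).
per_call_overhead_ms = 0.005
instrumented_calls = 200
overhead_ms = instrumented_calls * per_call_overhead_ms  # 1.0 ms of added work

# The same request spends 300 ms waiting on I/O (database, HTTP APIs).
io_wait_ms = 300.0
relative_overhead = overhead_ms / io_wait_ms  # about 0.33%
```

For an I/O-bound request like this one, the instrumentation cost disappears into the noise of network and database latency; only tight CPU-bound loops with very high call counts would notice it.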
We are confident that the performance gains Blackfire enables far outweigh the minimal overhead of collecting observability data. Blackfire is a game-changer.
Happy Performance Optimization!