Understanding continuous profiling: part 2

By Thomas di Luccio, on Jun 05, 2024

In the first installment of this series, we explored deterministic profiling to illuminate its nature. We discovered that deterministic profiling can be summed up as the collection of data for every function and service call during the execution of user-designated requests or scripts.

At this point in our journey, the state of deterministic observability for an application can be summarized in the following illustration: an application handles a series of requests over a certain period of time, and only some of them are profiled.

Going beyond profiling

The ability to turn the collected multidimensional data into actionable, visual information makes the deterministic profiler one of the most valuable observability tools on the market. Yet it offers far more than the already powerful timeline and call graph views.

Performance tests

The collected metrics can be used to ensure the observed performance matches expectations. Performance tests can, and should, be written to enforce the highest performance standards for your applications. All matching assertions are automatically evaluated every time a deterministic profile is made.
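As a sketch of what such a test can look like, here is a minimal `.blackfire.yaml` excerpt following Blackfire's documented test syntax; the path, thresholds, and test name are illustrative examples to adapt to your own application.

```yaml
# .blackfire.yaml -- a minimal performance test sketch.
tests:
    "Homepage should stay fast":
        path: "/"                            # regex matched against the request path
        assertions:
            - "main.wall_time < 100ms"       # total wall-clock time of the request
            - "metrics.sql.queries.count < 10"  # keep database chatter in check
```

Every time a profile matching the `path` is made, these assertions are evaluated and reported as passing or failing.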

Custom metrics

The underlying metrics system can be extended by users with custom metrics based on their application logic. This makes it possible to keep key functions under close watch. These custom metrics are often the most powerful ones users can rely on.
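A custom metric is also declared in `.blackfire.yaml`. The sketch below follows Blackfire's metric definition format; the class and method names are hypothetical placeholders for your own application logic.

```yaml
# .blackfire.yaml -- sketch of a custom metric bound to an application method.
metrics:
    checkout.compute:
        label: "Checkout total computation"
        matching_calls:
            php:
                - callee: "App\\Service\\Checkout::computeTotal"  # hypothetical method
```

Once defined, the metric can be used in performance test assertions, for instance on its call count or its share of the wall time.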

Synthetic monitoring

Performance tests are the best way to ensure developers are not introducing performance regressions while updating their applications. Yet, so far on our journey, those tests are only evaluated after triggering profiles.

The logic behind synthetic monitoring is the automation of profiles, and therefore of performance tests. Critical user journeys can be described in custom YAML files containing as many scenarios and steps as needed. When triggered, Blackfire synthetic monitoring runs a profile for each step described and gathers the outcomes into a convenient document called a Build Report.
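Scenarios are written in the Blackfire Player DSL, embedded as a literal block in `.blackfire.yaml`. The following is a minimal sketch; the scenario name and URLs are illustrative.

```yaml
# .blackfire.yaml -- sketch of a synthetic-monitoring scenario.
scenarios: |
    #!blackfire-player

    scenario
        name "Critical user journey"

        visit url('/')
            name "Homepage"
            expect status_code() == 200

        visit url('/pricing')
            name "Pricing page"
            expect status_code() == 200
```

Each `visit` step triggers a profile, and the results of all steps are collected into the Build Report.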

Plugging performance tests into CI/CD pipelines

The automation of your performance tests can go even further with their integration into your CI/CD pipelines. Such integrations enable our users to assess the consequences of upcoming changes before they are deployed, which is undoubtedly powerful.
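As an illustration, here is a sketch of a GitHub Actions job running Blackfire Player scenarios against a staging environment on every pull request. The secret names, endpoint, and scenario file are assumptions to adapt to your own setup.

```yaml
# Sketch of a CI job running Blackfire Player scenarios on pull requests.
name: performance-tests
on: [pull_request]

jobs:
  blackfire:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Blackfire Player scenarios
        run: |
          curl -OLs https://get.blackfire.io/blackfire-player.phar
          php blackfire-player.phar run scenarios.bkf \
            --endpoint="${{ secrets.STAGING_URL }}" \
            --blackfire-env="${{ secrets.BLACKFIRE_ENV }}"
```

If any performance assertion fails, the job fails, surfacing the regression directly in the pull request before it is merged.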

This set of powerful features is covered in a dedicated series of blog posts guiding readers from their first test to its automation and integration with a CI/CD platform.

Building a deterministic observability solution 

Blackfire’s deterministic profiler has proven to be a strong asset in our customers’ observability strategy. Let’s continue our journey behind the scenes to explore how far we can push this deterministic approach.

To do so, let’s delve into the mechanics of deterministic data collection. Blackfire’s deterministic stack is composed mostly of a probe and an agent. A Blackfire probe is a PHP extension or a Python package whose responsibility is to observe the code execution and collect data for every function and service call when a specific header is sent with the request. Otherwise, the probe remains idle.

The Blackfire agent is a daemon responsible for communicating between your server and Blackfire’s. It collects the payload from the probe, packages it, and sends it to our servers to be processed and analyzed.

While the probe occasionally collects all possible information for a deterministic profile, what about shifting paradigms and frequently collecting a minimal data set? This is the idea behind Blackfire Monitoring, and it led to the introduction of a new type of data: the monitoring trace. A third, intermediary type is the monitoring extended trace, which fits between the trace and the profile.

Introducing probabilistic observability

Blackfire Monitoring involves the frequent collection of minimal observability data. The question is then how to control the frequency of that data collection. This is done by setting the monitoring sampling rate, which represents the percentage of HTTP and/or CLI traffic that should be traced.

The extended sampling rate completes this apparatus by controlling the frequency of monitoring traces for which more in-depth data is collected. The different layers of collected data increase the depth of the information available to our users so they can quickly identify when and where something specific happened to their application.

The fact that sampling rates control Blackfire Monitoring highlights its nature. It is deterministic in the sense that some requests are selected to be traced, leading to the collection of a little information (trace), a medium amount (extended trace), or a lot (profile).

Blackfire Monitoring is also probabilistic, since the quality of the information available depends on how representative the observed samples are of the entire traffic. This lets our journey move forward by introducing probabilistic observability and seeing how it complements the fully deterministic approach.

Building on top of Blackfire monitoring

Blackfire monitoring transforms all available data into rich and actionable information that is visually displayed in a dashboard designed to explore, investigate, and identify performance improvement opportunities in real time.

Services monitoring

The deterministic side of Blackfire Monitoring observes all function and service calls while collecting extended traces. We are therefore in a position to dive deep into the relationship between our application and the services it relies on. Service monitoring is a powerful feature, since the bottlenecks we are looking for might not be in the code but in the services themselves, or in the way we interact with them.

Alerting

Blackfire’s mission is to empower developers to quickly identify bottlenecks and performance improvement opportunities. We want our users to have the freedom to spend more time doing what they love: adding features and value to the applications they work on.

By setting up alerts, our users can ensure they are the first to be warned of any issues with their applications in production. They will be one click away from beginning their investigation and circumventing any crisis before it escalates.

Health report

While the monitoring dashboard allows for the exploration of observability data over a period of time, alerts focus on instant reactions to certain conditions, and the health report illuminates the trends behind the evolution of your application’s performance.

How do response time, traffic, and the number of errors evolve over time? What are the most impactful parts of your application? What is slowly degrading and might need attention? What are the most impactful yet untested requests that might degrade without you being warned?

When considering all the tools and features described so far in this journey toward continuous profiling, we can illustrate a new state of deterministic observability. While an application handles a series of requests, some of them will lead to monitoring traces, and a subset of those will be extended traces. Some profiles could be triggered manually, as well as some automatic ones, for the critical parts of your application. More importantly, part of the traffic won’t be scrutinized at all. The quality of the information available then depends on how representative the observed activities are.

By using the entirety of Blackfire’s unique observability solution, our users are in a position to proactively identify existing bottlenecks and assess the consequences of upcoming changes. This truly empowers developers to work more efficiently.

As we wrap up the second installment of this series, we might wonder if we really need extra tools to enhance the power and flexibility of the existing observability solution. It might be a spoiler alert and not a surprise, but the answer is yes, we need an extra tool.

The final part of this journey will introduce a key element to every observability initiative: the ratio between the information available and the overhead caused by the data collection. We will see the inherent strength, and limitations, of continuous profiling, and how well it integrates into an existing observability workflow.

To better observability and beyond!


The “Understanding continuous profiling” series:

Thomas di Luccio

Thomas is a Developer Relations Engineer at Platform.sh for Blackfire.io. He likes nothing more than understanding the users' needs and helping them find practical and empowering solutions. He’ll support you as a day-to-day user.