Demystifying observability: deterministic vs. probabilistic approaches

Blackfire for PHP and Python relies on a deterministic approach to observability. Let’s explore what this means, how it compares to a probabilistic approach, and how both could shape the future of observability.

By Thomas di Luccio, on Oct 11, 2023

Blackfire’s continuous observability solution helps our users make long-lasting performance optimizations. Our unique set of tools supports you at every step of your development and deployment workflow.

Our users can monitor their applications in production to identify when and where a performance issue has occurred, react quickly to alerts, and understand precisely why the bottlenecks occurred, down to the function or service-call level. Finally, our extensive testing and synthetic monitoring suite is designed to make their optimizations last by preventing the deployment of code that would degrade their apps’ performance.

Today, we’ll delve into the internals of Blackfire by looking at how we collect observability metrics—starting with deterministic profiling.

Deterministic observability for PHP and Python

Blackfire has a deterministic approach for PHP and Python, meaning it collects an extensive range of metrics for every instrumented request or script.
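To make "deterministic" concrete, here is a minimal sketch built on Python's standard `sys.setprofile` hook. It records every Python-level function call made while one request is handled. It is not how the Blackfire probe is implemented; `DeterministicProfiler`, `square`, and `handle_request` are hypothetical names used only for illustration.

```python
import sys
import time
from collections import defaultdict


class DeterministicProfiler:
    """Illustrative only: records every Python-level call while active."""

    def __init__(self):
        self.call_counts = defaultdict(int)
        self.inclusive_seconds = defaultdict(float)
        self._stack = []  # (function name, start time) for calls in progress

    def _hook(self, frame, event, arg):
        # The interpreter invokes this hook on every function call and return.
        if event == "call":
            self._stack.append((frame.f_code.co_name, time.perf_counter()))
        elif event == "return" and self._stack:
            name, started = self._stack.pop()
            self.call_counts[name] += 1
            self.inclusive_seconds[name] += time.perf_counter() - started

    def run(self, func, *args, **kwargs):
        # Instrumentation starts right before the work and ends right after it.
        sys.setprofile(self._hook)
        try:
            return func(*args, **kwargs)
        finally:
            sys.setprofile(None)


def square(n):
    return n * n


def handle_request():
    # Hypothetical stand-in for an instrumented request or script.
    total = 0
    for i in range(1000):
        total += square(i)
    return total


profiler = DeterministicProfiler()
result = profiler.run(handle_request)
print(dict(profiler.call_counts))
```

Because the hook fires on every call, no invocation of `square` goes unrecorded, however short it is. That completeness is also where the overhead comes from.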

The quantity of data and metrics collected differs for every layer of data Blackfire collects. The way scripts or requests are chosen to be instrumented also depends on the data layer. Profiles are triggered manually by a Blackfire user or automatically by Blackfire builds, and our synthetic monitoring solution evaluates the performance of critical user journeys.

Meanwhile, monitoring traces and extended traces are based on the sample rate, which is the percentage of requests that are monitored. In that respect, Blackfire monitoring offers a mixed approach. Requests are selected for instrumentation probabilistically, but each selected request is monitored deterministically, with the instrumentation starting at the very beginning of the request and ending with it.
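The probabilistic selection step can be sketched in a few lines. This is an assumption-laden illustration rather than Blackfire's actual selection logic: `should_instrument` and `count_instrumented` are hypothetical helpers, and the sample rate is expressed here as a fraction between 0 and 1 instead of a percentage.

```python
import random


def should_instrument(sample_rate, rng):
    """Decide, per request, whether it gets instrumented.

    sample_rate is assumed to be a fraction in [0, 1]; rng.random()
    returns a float in [0, 1), so 0.0 selects nothing and 1.0 selects all.
    """
    return rng.random() < sample_rate


def count_instrumented(total_requests, sample_rate, seed=0):
    """Count how many of total_requests would be selected for monitoring."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return sum(should_instrument(sample_rate, rng) for _ in range(total_requests))


# With a 10% sample rate, roughly one request in ten is selected.
print(count_instrumented(10_000, 0.10))
```

Once a request has been selected, nothing about its instrumentation is left to chance: it is observed from start to finish.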

Earlier this year, we published a blog post providing all the details on the differences and complementarity between monitoring traces, extended traces, and profiles.

What is probabilistic profiling?

Probabilistic profiling involves capturing data intermittently. It collects information at defined intervals, logging the functions or services activated by any ongoing request or script at each of those moments. This approach provides a broad view of your application’s performance over time, but short-lived events may be missed, depending on the sampling frequency.

Comparing deterministic to probabilistic profiling is akin to contrasting medical imaging devices. Asserting that an fMRI is unequivocally better than a PET scan or ultrasound is a misplaced judgment; each tool has its specific diagnostic purpose.

With probabilistic profiling, you might not know all the details about a specific script, but you will have a good overview of everything happening at a specific time. Information on shorter spans that start and end between two consecutive ticks won’t be collected.
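The effect of ticks on short spans can be shown with a small simulation. It does not profile real code. It replays a made-up timeline (the span names and timings in `SPANS` are hypothetical) and reports which spans a sampler would observe at a given interval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    name: str
    start_ms: int
    end_ms: int  # exclusive


# Hypothetical timeline of a single 100 ms request.
SPANS = [
    Span("handle_request", 0, 100),
    Span("db_query", 12, 58),
    Span("cache_lookup", 61, 64),  # starts and ends between two 10 ms ticks
]


def sampled_span_names(spans, interval_ms, duration_ms):
    """Return the names of spans that are active on at least one tick."""
    seen = set()
    for tick in range(0, duration_ms + 1, interval_ms):
        for span in spans:
            if span.start_ms <= tick < span.end_ms:
                seen.add(span.name)
    return seen


print(sorted(sampled_span_names(SPANS, interval_ms=10, duration_ms=100)))
print(sorted(sampled_span_names(SPANS, interval_ms=1, duration_ms=100)))
```

At a 10 ms interval, `cache_lookup` falls entirely between the ticks at 60 ms and 70 ms and is never observed. At a 1 ms interval it is caught, at the cost of ten times as many samples.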

Pros and cons

Both approaches have their strengths and weaknesses. One is not better than the other. It all depends on the instrumented languages and your understanding of the data.

Deterministic profiling: Its strength lies in precision and facilitating meticulous script analysis. But it’s resource-intensive, leading to considerable overhead and potential data overload, making analysis potentially tedious.
Probabilistic profiling: Lightweight and scalable, tailored for holistic application oversight. However, its periodic snapshots might miss rapid function calls, yielding a not-so-perfect application map.
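The overhead trade-off can be approximated with simple arithmetic. The model below is a deliberate simplification, and both helper names are hypothetical. It assumes a deterministic profiler records one event per call and one per return, while a sampling profiler records one snapshot per tick.

```python
def deterministic_record_count(function_calls):
    """Events recorded when every call and every return is captured."""
    return 2 * function_calls


def sampling_record_count(duration_ms, interval_ms):
    """Snapshots taken at fixed ticks, including the one at t = 0."""
    return duration_ms // interval_ms + 1


# A one-second request making one million function calls:
print(deterministic_record_count(1_000_000))
print(sampling_record_count(duration_ms=1_000, interval_ms=10))
```

Under this model, the deterministic cost grows with the amount of work the code does, while the sampling cost depends only on how long the code runs and how often it is sampled.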

Deterministic and probabilistic profiling each hold value within the development process. The former delivers a thorough and detailed view, while the latter offers a wider, more adaptable perspective. Developers may choose one or even combine both approaches based on the project’s specifics and the issues faced.

While the deterministic approach remains the best fit for PHP and Python, we are considering a probabilistic profiler for some other languages we might soon support. Stay tuned for more information!

Join the conversation

We’ve delved into deterministic and probabilistic observability, but this is just the tip of the iceberg. So, we’re keen to hear your experiences, insights, and questions! Are there languages beyond PHP and Python that you’re working with and wish to optimize with Blackfire? How would a deterministic or probabilistic approach influence your understanding of your applications? Let us know.

Join our community on Dev.to, Discord, and Reddit and keep the conversation going—let’s shape the future of web application observability together!

We’re excited to hear your thoughts and learn from your experiences. Until then, and as always, happy performance optimization!

Thomas di Luccio

Thomas is Product Manager at Platform.sh for Blackfire.io. He likes nothing more than understanding the users' needs and helping them find practical and empowering solutions. He’ll support you as a day-to-day user.