Profiling Data

Core Challenges Discussed:

Key Solutions & Concepts Presented:

Optimizing Profiler Data Storage & Querying (Pat Somaru, Meta)

Principled Performance Analytics - Cohorting for Consistency (Narayan Desai, Google)

Highlights

  1. Data Representation & Modeling is Foundational:
    • How observability data is structured largely determines what can be learned from it.
    • Example: Treating profiler call stacks as graphs (Sto) or modeling diverse workloads as distinct cohorts (Two Sigma) unlocks deeper insights.
  2. Prioritize Signal-to-Noise Ratio:
    • A primary goal of advanced analytics is to reduce noise and surface true signals.
    • Example: Cohorting in Two Sigma normalizes for workload mix effects; DAGs in Sto de-duplicate redundant information.
  3. Accessibility for All Engineers:
    • Performance insights shouldn't be confined to experts. Tools should make the performance implications of a change clear and actionable for every developer.
    • Example: IDE feedback based on Sto data, simplified query interfaces.
  4. Historical Context is More Powerful Than Static Thresholds:
    • Comparing a system's current behavior to its own past behavior, especially within a specific context such as a workload cohort, often signals change more reliably than comparing against arbitrary static thresholds.
  5. Efficiency at Scale Matters:
    • At scale, small performance changes or inefficiencies in data storage and processing can translate into large cost and operational impact.
    • Example: Sto's data compression; Two Sigma's ability to process vast event streams.
  6. Look Beyond the Traditional "Three Pillars":
    • While metrics, logs, and traces are essential, deep system understanding often requires richer data types (like detailed profiler data) and more sophisticated analytical approaches that can model complex behaviors.
  7. Understand Your Workloads Deeply:
    • "Workloads matter." The inputs to your system and the different ways it's used are critical context for interpreting performance data. Avoid treating the system as a monolithic black box.
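
Highlights 2 and 4 can be sketched together in a few lines of Python. This is an illustration only and is not taken from either talk: the function names (cohort_baselines, flag_changes), the (cohort, value) sample format, and the z-score threshold of 3.0 are all assumptions made for the example. The sketch builds a baseline per cohort from historical samples, then flags a cohort only when its current mean departs from that cohort's own history.

```python
from collections import defaultdict
from statistics import mean, stdev


def group_by_cohort(samples):
    """Group (cohort, value) pairs into {cohort: [values]}."""
    grouped = defaultdict(list)
    for cohort, value in samples:
        grouped[cohort].append(value)
    return grouped


def cohort_baselines(history):
    """Return {cohort: (mean, stdev)} for cohorts with at least two samples.

    Hypothetical helper: a real system would likely use a richer model
    than a mean and a sample standard deviation.
    """
    return {
        cohort: (mean(values), stdev(values))
        for cohort, values in group_by_cohort(history).items()
        if len(values) >= 2
    }


def flag_changes(baselines, current, z_threshold=3.0):
    """Return {cohort: z} for cohorts whose current mean is far from baseline.

    z is the distance of the current mean from the historical mean, in units
    of the historical standard deviation. The threshold is an assumed value.
    """
    flagged = {}
    for cohort, values in group_by_cohort(current).items():
        if cohort not in baselines:
            continue  # no history for this cohort, so nothing to compare against
        mu, sigma = baselines[cohort]
        if sigma == 0:
            continue  # constant history: z is undefined, skip in this sketch
        z = (mean(values) - mu) / sigma
        if abs(z) > z_threshold:
            flagged[cohort] = z
    return flagged


# Made-up latencies (ms) for two request cohorts.
history = [("small", 10), ("small", 11), ("small", 9), ("small", 10),
           ("large", 100), ("large", 102), ("large", 98), ("large", 100)]

# Same per-cohort latency as before, but a larger share of "large" requests.
mix_shift = [("small", 10), ("large", 101), ("large", 99), ("large", 100)]

# "small" requests have become slower; "large" requests are unchanged.
regression = [("small", 15), ("small", 16), ("large", 100)]

baselines = cohort_baselines(history)
print(flag_changes(baselines, mix_shift))
print(flag_changes(baselines, regression))
```

In this made-up data the aggregate mean latency rises from 55 ms to 77.5 ms under the mix shift even though neither cohort changed, so a static threshold on the aggregate would fire on noise. The per-cohort comparison flags nothing in that case and flags only the "small" cohort in the regression case.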

Resources