Optimizing Profiler Data Storage & Querying
Problem
Profiler data (from tools like perf and YourKit) is crucial for understanding application behavior but is extremely voluminous and highly redundant. This makes traditional storage and querying inefficient and costly.
Sto's ("Store Things Optimized") Solution
- Represent as a Directed Acyclic Graph (DAG): Call stacks naturally form a graph. Sto leverages this by storing unique stack frames and their parent-child relationships, de-duplicating common sub-paths.
- Massive Data Footprint Reduction: This DAG approach can reduce storage needs by orders of magnitude: up to 10,000x, and in some cases 2GB of raw data shrinks to ~5MB of Sto data (roughly 400x).
- "Time Series of Graphs": Performance is viewed as an evolving call graph over time. Sto helps analyze these changes.
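To make the de-duplication idea concrete, here is a minimal in-memory sketch, not Sto's actual implementation. The function name build_dag and the sample stacks are hypothetical; frames are plain strings, and the count is attached to the last frame of each sampled stack, which is an assumption about how samples are aggregated.

```python
def build_dag(stacks):
    """Fold root-first call stacks into a prefix-sharing graph.

    Returns (raw_frames, nodes, sample_counts), where nodes maps
    (parent_id, frame) -> node_id and sample_counts maps node_id -> count.
    """
    nodes = {}
    sample_counts = {}
    raw_frames = 0
    for stack in stacks:
        parent_id = None  # roots have no parent
        for frame in stack:
            raw_frames += 1
            # Reuse the node if this (parent, frame) pair was seen before,
            # otherwise allocate the next integer id.
            parent_id = nodes.setdefault((parent_id, frame), len(nodes))
        if parent_id is not None:
            # Assumption: a sample is counted on the deepest frame of its stack.
            sample_counts[parent_id] = sample_counts.get(parent_id, 0) + 1
    return raw_frames, nodes, sample_counts


# Hypothetical samples: four stacks that share the "main" / "handle" prefix.
samples = [
    ["main", "handle", "doLogging"],
    ["main", "handle", "doLogging"],
    ["main", "handle", "parse"],
    ["main", "idle"],
]
raw_frames, nodes, sample_counts = build_dag(samples)
print(raw_frames, len(nodes))  # 11 raw frames are stored as 5 nodes
```

The saving grows with the number of samples, because a repeated stack adds no new nodes and only increments a counter.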
The DAG
- stack_node_data: Represents a unique code location (symbol, filename, line number). This has relatively low cardinality.
  - Fields: id (pk), line (int), file (text), symbol (text)
- executable: Represents a unique binary build (name, version). Also low cardinality.
  - Fields: id (pk), name (text), version (int), samples (int)
- stack_node: The vertices of the DAG. Each stack_node represents:
  - A specific stack_node_data (code location).
  - Occurring within a specific executable (binary build).
  - Having a specific parent_id (another stack_node, or null for roots).
  - An aggregated sample_count (how many times this exact path was observed).
  - ID is typically a hash of (parent_id, exe_id, data_id).
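The three tables can be sketched with Python's built-in sqlite3 module. This is an illustrative schema under stated assumptions, not Sto's real DDL: the column types, the uniqueness constraints, the choice of SHA-256 truncated to 16 hex characters for the node ID, and the helper names get_or_create and ingest are all hypothetical. It also assumes sample_count is incremented on the deepest frame of each ingested stack.

```python
import hashlib
import sqlite3

SCHEMA = """
CREATE TABLE stack_node_data (
    id INTEGER PRIMARY KEY, line INTEGER, file TEXT, symbol TEXT,
    UNIQUE (symbol, file, line)
);
CREATE TABLE executable (
    id INTEGER PRIMARY KEY, name TEXT, version INTEGER,
    samples INTEGER DEFAULT 0,
    UNIQUE (name, version)
);
CREATE TABLE stack_node (
    id TEXT PRIMARY KEY,
    parent_id TEXT REFERENCES stack_node(id),
    exe_id INTEGER REFERENCES executable(id),
    data_id INTEGER REFERENCES stack_node_data(id),
    sample_count INTEGER DEFAULT 0
);
"""


def node_id(parent_id, exe_id, data_id):
    # Assumed hashing scheme: any stable hash of the triple would do.
    raw = f"{parent_id}|{exe_id}|{data_id}".encode()
    return hashlib.sha256(raw).hexdigest()[:16]


def get_or_create(conn, table, cols, values):
    # Table and column names are hard-coded by the callers below.
    where = " AND ".join(f"{c} = ?" for c in cols)
    row = conn.execute(f"SELECT id FROM {table} WHERE {where}", values).fetchone()
    if row:
        return row[0]
    marks = ", ".join("?" for _ in cols)
    cur = conn.execute(
        f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({marks})", values
    )
    return cur.lastrowid


def ingest(conn, exe_name, exe_version, stack, count=1):
    """Store one root-first stack of (symbol, file, line) tuples."""
    exe_id = get_or_create(conn, "executable", ("name", "version"),
                           (exe_name, exe_version))
    conn.execute("UPDATE executable SET samples = samples + ? WHERE id = ?",
                 (count, exe_id))
    parent = None
    for symbol, file, line in stack:
        data_id = get_or_create(conn, "stack_node_data",
                                ("symbol", "file", "line"), (symbol, file, line))
        nid = node_id(parent, exe_id, data_id)
        # An already-known path prefix hashes to the same id and is skipped.
        conn.execute(
            "INSERT OR IGNORE INTO stack_node (id, parent_id, exe_id, data_id) "
            "VALUES (?, ?, ?, ?)", (nid, parent, exe_id, data_id))
        parent = nid
    if parent is not None:
        conn.execute(
            "UPDATE stack_node SET sample_count = sample_count + ? WHERE id = ?",
            (count, parent))
```

Because the ID is derived from (parent_id, exe_id, data_id), ingesting the same stack twice touches no new rows; it only bumps counters.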
Benefits
- Dramatically Reduced Storage: Less cost, easier data retention.
- Efficient Querying: Enables SQL-based queries across numerous profiles and binaries.
- Regression Detection: Facilitates comparing sample_counts for the same stack_node_data across different executable versions to pinpoint performance changes.
- In-Database Call Graph Reconstruction: Allows for generating flame graph-like views directly from the database.
- Accessibility: Aims to make performance insights available to all developers, e.g., through IDE integrations indicating costly code.
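As an illustration of in-database call graph reconstruction, the sketch below walks parent_id links with a recursive common table expression in SQLite and emits semicolon-joined "folded" stacks, the text format commonly fed to flame graph renderers. The query, the function name folded_stacks, the integer node IDs, and the demo rows are assumptions for this example, not Sto's own query; it also assumes sample_count holds samples attributed to the deepest frame of each path.

```python
import sqlite3

FOLDED_SQL = """
WITH RECURSIVE path(id, exe_id, stack, sample_count) AS (
    -- Anchor: root nodes have no parent.
    SELECT n.id, n.exe_id, d.symbol, n.sample_count
    FROM stack_node n
    JOIN stack_node_data d ON d.id = n.data_id
    WHERE n.parent_id IS NULL
    UNION ALL
    -- Step: extend each known path by one child frame.
    SELECT n.id, n.exe_id, p.stack || ';' || d.symbol, n.sample_count
    FROM stack_node n
    JOIN path p ON n.parent_id = p.id
    JOIN stack_node_data d ON d.id = n.data_id
)
SELECT stack, sample_count
FROM path
WHERE exe_id = ? AND sample_count > 0
ORDER BY stack
"""


def folded_stacks(conn, exe_id):
    """Return 'frame;frame;frame count' lines for one executable."""
    rows = conn.execute(FOLDED_SQL, (exe_id,)).fetchall()
    return [f"{stack} {count}" for stack, count in rows]


# Hypothetical demo data: main calls doLogging (2 samples) and work (1 sample).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stack_node_data (id INTEGER PRIMARY KEY, line INTEGER,
                              file TEXT, symbol TEXT);
CREATE TABLE stack_node (id INTEGER PRIMARY KEY, parent_id INTEGER,
                         exe_id INTEGER, data_id INTEGER, sample_count INTEGER);
INSERT INTO stack_node_data VALUES
    (1, 10, 'demo.c', 'main'),
    (2, 42, 'demo.c', 'doLogging'),
    (3, 77, 'demo.c', 'work');
INSERT INTO stack_node VALUES
    (1, NULL, 1, 1, 0),
    (2, 1, 1, 2, 2),
    (3, 1, 1, 3, 1);
""")
print("\n".join(folded_stacks(conn, 1)))
# main;doLogging 2
# main;work 1
```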
Example
If a function doLogging in demo.c becomes 10x more expensive in a new version (version two) compared to an old one (version one), Sto's CLI can ingest profiles for both. A generic SQL query (e.g., findRegressions) can then compare sample_counts for the doLogging stack_node_data across the two executable.build_id values, clearly highlighting the regression.
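A query of this kind might look like the sketch below. It is a hypothetical stand-in for findRegressions, not the query Sto ships: the function find_regressions, the factor threshold, and the demo rows are invented for illustration, and builds are identified by the executable table's id column from the field list above rather than by a build_id column.

```python
import sqlite3

REGRESSION_SQL = """
WITH totals AS (
    -- Sum samples per code location, separately for the old and new build.
    SELECT data_id,
           SUM(CASE WHEN exe_id = :old THEN sample_count ELSE 0 END) AS old_samples,
           SUM(CASE WHEN exe_id = :new THEN sample_count ELSE 0 END) AS new_samples
    FROM stack_node
    WHERE exe_id IN (:old, :new)
    GROUP BY data_id
)
SELECT d.symbol, d.file, t.old_samples, t.new_samples
FROM totals t
JOIN stack_node_data d ON d.id = t.data_id
WHERE t.old_samples > 0
  AND t.new_samples >= :factor * t.old_samples
ORDER BY t.new_samples * 1.0 / t.old_samples DESC
"""


def find_regressions(conn, old_exe, new_exe, factor=2.0):
    """List code locations whose sample count grew by at least `factor`."""
    params = {"old": old_exe, "new": new_exe, "factor": factor}
    return conn.execute(REGRESSION_SQL, params).fetchall()


# Hypothetical demo data: doLogging goes from 10 to 100 samples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stack_node_data (id INTEGER PRIMARY KEY, line INTEGER,
                              file TEXT, symbol TEXT);
CREATE TABLE executable (id INTEGER PRIMARY KEY, name TEXT,
                         version INTEGER, samples INTEGER);
CREATE TABLE stack_node (id INTEGER PRIMARY KEY, parent_id INTEGER,
                         exe_id INTEGER, data_id INTEGER, sample_count INTEGER);
INSERT INTO stack_node_data VALUES
    (1, 10, 'demo.c', 'main'),
    (2, 42, 'demo.c', 'doLogging'),
    (3, 77, 'demo.c', 'work');
INSERT INTO executable VALUES (1, 'demo', 1, 100), (2, 'demo', 2, 190);
INSERT INTO stack_node VALUES
    (1, NULL, 1, 1, 0), (2, 1, 1, 2, 10),  (3, 1, 1, 3, 90),
    (4, NULL, 2, 1, 0), (5, 4, 2, 2, 100), (6, 4, 2, 3, 90);
""")
print(find_regressions(conn, old_exe=1, new_exe=2, factor=5))
# [('doLogging', 'demo.c', 10, 100)]
```

This compares raw counts, which is only meaningful when both profiles cover a similar amount of work; dividing by executable.samples before comparing is one way to normalize when they do not.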