v26.1.1
web: allow kernel functions to be viewed in top functions page
web: add CEL array type support
web: add ability to group by deployment name
web: improve metrics performance using pre-filtering by meta_id
ui: fix an issue that could cause the sandwich graph on the top-function page to not show up correctly after re-loading the data
ui: fix handling of non-ascii chars in metric attributes
ui: fix querybar autocomplete suggestions not showing up correctly in Firefox
ui: fix an issue that caused the middle truncation of text to not work correctly in Firefox
profiler/symdb: improve handling of file size limit if -dwarf is passed
profiler: add allocation profiling for Java (Zing + OpenJDK)
enabled via -alloc-profile <size> flag, instructing the profiler to profile allocations every <size> bytes
for example, -alloc-profile 8mib
profiler: reduce likelihood of incorrect executable paths (e.g. just /) showing up
this could happen due to the kernel reporting incorrect paths directly after execve
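The -alloc-profile entry above describes sampling allocations every <size> bytes. A minimal sketch of byte-interval sampling follows, to illustrate the idea only. The function name and the carry-over rule are assumptions for illustration, not the profiler's actual implementation, which may for instance randomize the interval.

```python
MIB = 1024 * 1024


def sampled_allocations(sizes, interval_bytes):
    """Return the indices of allocations that trigger a sample.

    Hypothetical model: one sample is taken each time the running total of
    allocated bytes crosses another multiple of `interval_bytes`.
    """
    sampled = []
    remaining = interval_bytes  # bytes left until the next sample fires
    for index, size in enumerate(sizes):
        remaining -= size
        if remaining <= 0:
            sampled.append(index)
            # A large allocation may span several intervals: record it once
            # and carry the overshoot into the next interval.
            remaining = interval_bytes - ((-remaining) % interval_bytes)
    return sampled


# With an 8 MiB interval, the 2nd and 4th allocations cross a boundary.
allocations = [4 * MIB, 4 * MIB, 1 * MIB, 16 * MIB, 1 * MIB]
print(sampled_allocations(allocations, 8 * MIB))
```

A smaller interval yields more samples and more overhead; a larger one does the opposite.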
v26.1.0
assistant: add support for PDF uploads
assistant: show consumed tokens at bottom of answer
assistant: add support for image uploads
profiler: fix edge cases in metrics collection
processes that were fully idle for a full collection cycle would sometimes report incorrect metrics data
systems that were launching a lot of processes might've reported incorrect metrics data
both of those bugs are fixed now
profiler: fix GPU metrics collection in the presence of more than one GPU
profiler: automatically detect and extract metrics from supported applications
currently vLLM is supported
enabled (or disabled) via the -enable-vllm-metrics flag
web: add metric activity aggregation endpoint
cudaprofiler: fix several bugs
added better handling of CUDA graphs
fix stack trace correlation when PC offsets are disabled for GPU profiling
fix segfault (within CUDA's code) on old, buggy CUDA driver versions
cudaprofiler: use (and ship) CUPTI 13 if runtime CUDA version supports it
ingest: fix unbalanced metrics sharding
helm: adjust service config to route by request, not connection
ensures that MCP requests are always routed to the same web replica
ui: integrate API explorer into the web app
ui: sort "[other ]" entry to the bottom in metric charts/tooltips
v25.12.6
profiler: add module names to Python <module> frames
profiler: rework stack unwinding mechanism
can now unwind much longer stacks than before
some Python / Java stack traces were cut off in the middle before
profiler: fix an issue where in rare cases the profiler could crash during shutdown when GPU metrics collection is enabled
cudaprofiler: significant performance improvements
cudaprofiler: fix possible segfault on certain CUDA versions
cudaprofiler: print version on load if logging is enabled to ease troubleshooting
web: fix off-by-one error in /events endpoint in the public API (beta)
web: updated top field values endpoint with pagination support
web: fix bug that could cause simplified function names to not appear on the top function page
web: reduce histogram query memory consumption
backend: fix decoding of older license keys with zing-jdk feature
assistant: improve user-facing error messages
assistant: simplify custom LLM configuration
assistant: add call tree and improve default prompts
ui: hide Python runtime frames by default
ui: add fullscreen mode for diff top functions view
ui: add search support for flamegraph on top functions view
ui: initial AI Assistant for CPU flamegraph
ui: add buttons to export CPU / GPU events, and profiler diagnostic data in the Support dialog
these can be sent to the zymtrace team for support
ui: add documentation links to agent installation steps
ui: add service and kernel version to host details page
ui: add Slack support option to Support dialog
v25.12.5
migrate: fix migration checksum checks for manually managed distributed ClickHouse deployments
v25.12.4
profiler: add ARM64 support for cudaprofiler
profiler: don't export PC samples by default
these are still captured and used to disassemble instructions
but as they are high cardinality, exporting the addresses is now turned off by default to reduce the resource costs of ingest and ClickHouse
this can still be enabled by running the workload with env ZYMTRACE_CUDAPROFILER__ENABLE_PC_OFFSETS=true
profiler: fix unwinding for PGO builds of interpreters
the previous fix for this had a small bug which is fixed now
ui: add differential functions view for GPU functions
ui: fix a bug that could cause wrong GPUs to show up on container/pod details pages
ui: fix an issue that could cause the wrong filter to be applied when using the actions inside the chart tooltip
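For the PC sample export opt-in above, a minimal sketch of launching a workload with the variable set. Only the variable name and value come from the entry above; the helper names and the example command are hypothetical.

```python
import os
import subprocess

PC_OFFSETS_VAR = "ZYMTRACE_CUDAPROFILER__ENABLE_PC_OFFSETS"


def pc_offsets_env(base_env=None):
    """Return a copy of the environment with PC offset export enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env[PC_OFFSETS_VAR] = "true"
    return env


def run_workload(command):
    """Run `command` (a list of arguments) with PC offset export enabled."""
    return subprocess.run(command, env=pc_offsets_env(), check=True)


# Hypothetical usage: run_workload(["python", "train.py"])
```

Setting the variable on the workload's own environment, rather than globally, keeps the higher-cardinality export limited to the processes being investigated.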
v25.12.3
ui: fixed an issue that could prevent navigation from the "Top Entities" section on the Efficiency IQ page
profiler: remove special coloring of CUDA launch frames
v25.12.2
profiler: fix bug that could cause CUDA launch function frames to appear on CPU flamegraph
ui: adjust truncation of legend entries in charts to use end truncation instead of middle truncation
ui: apply sorting to tooltip entries in metric charts
ui: add select all and deselect all buttons to language filter dropdown
gateway: update Envoy to v1.36.3
Versions ≥ v1.50.0 support automatically raising ulimits to their hard limit, which is important in K8S clusters using containerd ≥ 2.0, where default limits are now very conservative
storage: don't use EXCHANGE TABLE ClickHouse DDL
allows deploying ClickHouse on filesystems without support for atomic file renames (renameat2)
v25.12.1
ui: add light mode support
ui: show description of SASS instruction on hover for GPU profiles
ui: fixed an issue that could cause chart legend entries to not be truncated correctly in Firefox
ui: enable transport compression for flamegraph WASM blob
backend: replace gRPC health checks with HTTP health checks
symdb: retry auto upload for broken executables after a day
interval can be configured via SYMDB__BROKEN_EXECUTABLE_RETRY_AFTER env variable (in seconds)
profiler: make auto upload size limits configurable
these limits only apply if -dwarf is given
ZYMTRACE_MAX_SYMBFILE_SIZE (for the symbol file) and ZYMTRACE_MAX_INPUT_FILE_SIZE (for the size of the binary file that we extract symbols from) can be used to configure this
both are in bytes
profiler: add support for NVIDIA MIG devices (for metrics)
profiler: fix bug that could cause available compute graph (and idle time of a machine) to drop significantly, if GPU profiling was active and GPU was heavily used
cudaprofiler: fix rare segfaults in CUDA caused by incompatible version of CUPTI already being loaded in some processes
PyTorch-based processes would sometimes load an incompatible libcupti.so, which could lead to segfaults
cudaprofiler: enable sampling of kernel launches by default
cudaprofiler: expose stack trace sampling config via env vars
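The limits and retry interval above are given in bytes and seconds. A small sketch for deriving those values from friendlier units follows. The helper names and the example sizes are hypothetical, and it assumes the variables take plain decimal integers, which the entries above do not spell out.

```python
MIB = 1024 * 1024
SECONDS_PER_HOUR = 3600


def upload_limit_env(max_symbfile_mib, max_input_file_mib):
    """Build the profiler's auto-upload size limits, expressed in bytes."""
    return {
        "ZYMTRACE_MAX_SYMBFILE_SIZE": str(max_symbfile_mib * MIB),
        "ZYMTRACE_MAX_INPUT_FILE_SIZE": str(max_input_file_mib * MIB),
    }


def symdb_retry_env(hours):
    """Build the symdb retry interval for broken executables, in seconds."""
    return {"SYMDB__BROKEN_EXECUTABLE_RETRY_AFTER": str(hours * SECONDS_PER_HOUR)}


# Example values only: 256 MiB symbol files, 1 GiB input binaries, 12 h retry.
for name, value in {**upload_limit_env(256, 1024), **symdb_retry_env(12)}.items():
    print(f"{name}={value}")
```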
v25.11.6
mcp: improve time ranges and limit retries if no data is found
profiler: fix unwinding for LTO/PGO builds of interpreters
.cold parts of a split interpreter loop are now supported
profiler: add source location mapping for GPU profiling with PC sampling
profiler: add arguments to control how NVML is located
-nvml-path allows the NVML path to be specified explicitly
-nvml-auto-scan allows opting into an automatic scan for the NVML library
profiler: fixed an issue where in rare cases the profiler could crash during symbol extraction
cudaprofiler: flush PC samples more frequently, and track hardware buffer being full
cudaprofiler: support disassembling more SASS instructions
web/profiler: allow top GPU stall reasons to be viewed in top functions list
ui: show description of stall reason in tooltip for GPU flamegraph
makes the stall reasons more descriptive, and offers possible solutions for reducing their impact
ui: improve click-to-copy handling for text elements
ui: add script/container name support for GPU consumer metrics
ui: persist aggregation and language settings in both URL and local storage
allows sharing links without overwriting local settings; conflicting settings can be accepted, discarded, or kept diverged
v25.11.5
profiler: fix bug that caused GPU implant instrumentation to not work when the profiler runs in a Docker container
v25.11.4
profiler: place uprobes on GPU implant dynamically (https://github.com/zystem-io/zymtrace/pull/1453)
implant no longer needs to be mapped into the Kubernetes container (for the profiler)
ui: fix rendering bug in flamegraph where child node could be wider than its parent
cudaprofiler: several improvements (https://github.com/zystem-io/zymtrace/pull/1448)
fix bug with force flushing incomplete kernels
don't delay process synchronizations during startup for CUDA processes, leading to better CPU stack traces for the first few frames
support building with CUDA 13
allow bypassing GPU presence checks in profiler
profiler: support zing 25.01 with JDK 11 and 17 (21 was already supported) (https://github.com/zystem-io/zymtrace/pull/1441)
symblib: fix Rust function name demangling which failed in some cases
mcp: suggest using local time instead of UTC
profiler: reduce allocations in parseFDE()
brings parseFDE down from 60% of all allocs to 0.3%
mcp: enable collapse_go_system_frames, collapse_jvm_threads, filter_error_events, filter_unreported to reduce number of tokens in the flamegraph response
ui: add new display option collapse_go_system_frames
enabled by default and shown inverted in the UI as "Show Go system frames"
aggregates GC frames into "Garbage Collector"
aggregates scheduler frames into "Scheduler"
ui: fix frame filter for Java standard library functions
it was previously filtering out functions too aggressively
profiler: improve zing symbolization
adds support for the GC in zing, leading to more debug information being resolved, and thus deeper / longer stack traces
profiler: place uprobes on GPU implant dynamically
implant is now detected and instrumented regardless of its path
cudaprofiler: several improvements
improve flushing of CUPTI activity records
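The collapse_go_system_frames option above aggregates GC frames into "Garbage Collector" and scheduler frames into "Scheduler". A rough sketch of that kind of collapsing on a single stack follows. The classification rules are an assumption for illustration: the function names are real Go runtime symbols, but which frames zymtrace actually treats as GC or scheduler frames is not stated here.

```python
GC_LABEL = "Garbage Collector"
SCHEDULER_LABEL = "Scheduler"

# Assumed classification rules, for illustration only.
GC_PREFIXES = ("runtime.gc", "runtime.scanobject", "runtime.bgsweep")
SCHEDULER_FUNCS = {
    "runtime.schedule",
    "runtime.findRunnable",
    "runtime.park_m",
    "runtime.mcall",
}


def collapse_go_system_frames(stack):
    """Replace runs of GC or scheduler frames with one aggregate frame."""
    collapsed = []
    for frame in stack:
        if frame.startswith(GC_PREFIXES):
            label = GC_LABEL
        elif frame in SCHEDULER_FUNCS:
            label = SCHEDULER_LABEL
        else:
            label = frame
        # Merge consecutive frames that map to the same aggregate label.
        is_aggregate = label in (GC_LABEL, SCHEDULER_LABEL)
        if is_aggregate and collapsed and collapsed[-1] == label:
            continue
        collapsed.append(label)
    return collapsed


stack = ["main.main", "runtime.mcall", "runtime.park_m", "runtime.schedule"]
print(collapse_go_system_frames(stack))
```

Fewer distinct frames also means a smaller flamegraph response, which matches the token-reduction goal mentioned for the MCP options.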
v25.11.3
mcp: change flamegraph culling to root-based culling as in the UI
ui: new display option 'filter_error_frames'
filters error frames by default, which allows the profiler to send error frames by default and so improves CPU usage accuracy
all: add OIDC/local auth support along with service tokens
v25.11.2
profiler: increase the maximum number of unwound frames from 128 to 256
this avoids unwind errors with long stack traces and thus improves CPU attribution
profiler: improve log message when falling back from BTF to binary analysis
profiler: print out system info if attaching to tracepoints fails
mcp: add topentities as tool, resource and resource template
v25.11.0
profiler: reworked PID reporting mechanism, significantly reducing CPU usage
Especially high impact on systems that spawn many short-lived processes
profiler: prefer DWARF symbols over Go symbols during automatic symbol upload
Improves symbol quality for Go executables with DWARF debug info when running the profiler with the -dwarf argument
profiler: more efficient stack delta extraction for native executables
Significantly reduces the peak memory usage of the profiler in the presence of large native executables
ui: add new display option collapse_jvm_threads
aggregates GC threads into "Garbage Collector" (visible when grouping by "Thread Name" in the flamegraph)
aggregates all GC frames into a single one called "Garbage Collector"
also aggregates JVM JIT frames and threads in the same fashion
turned on by default
v25.10.12
profiler: optimize performance
v25.10.11
ui: switch to relative mode if matches is used on a column incompatible with absolute mode
v25.10.10
profiler: fixed an issue that prevented the script name attribute from being set for GPU traces
ui: support matches for regex matching in advanced query mode
v25.10.9
web: better errors if regex syntax is invalid in matches CEL query
backend: add local login and RBAC support with CRUD operations
profiler: fix version matching logic for zing offsets
profiler: add support for more zing versions
we now additionally support these versions
JDK 1.8.X with zing 24.02.X
JDK 1.8.X with zing 24.08.X
JDK 1.8.X with zing 25.02.X
JDK 11.0.X with zing 23.02.X
JDK 11.0.X with zing 23.08.X
JDK 11.0.X with zing 24.02.X
JDK 11.0.X with zing 24.08.X
JDK 11.0.X with zing 25.02.X
JDK 17.0.X with zing 24.02.X
JDK 21.0.X with zing 24.01.X
JDK 21.0.X with zing 24.03.X
JDK 21.0.X with zing 24.04.X
JDK 21.0.X with zing 24.05.X
JDK 21.0.X with zing 24.07.X
JDK 21.0.X with zing 24.09.X
JDK 21.0.X with zing 24.10.X
JDK 21.0.X with zing 24.12.X
JDK 21.0.X with zing 25.01.X
v25.10.8
ui: fix possible crash during flamegraph rendering
v25.10.7
profiler: add support for OpenJDK 25
profiler: fix retry logic for fetching container name(s)
gpu: support CUkernels properly in addition to CUfuncs
ui: add detail pages for namespace, pod and deployment
profiler: fix aggregations for metrics (script names were wrongly merged)