profiler: fix bug that could cause CUDA launch function frames to appear on CPU flamegraph
ui: adjust truncation of legend entries in charts to use end truncation instead of middle truncation
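For illustration, the difference between the two truncation modes can be sketched as follows. This is a minimal sketch, not the UI code: the helper names, the ellipsis character, and the `max_len` parameter (assumed to be at least 1) are hypothetical.

```python
ELLIPSIS = "…"


def truncate_end(label: str, max_len: int) -> str:
    """End truncation: keep the start of the label and cut the tail."""
    if len(label) <= max_len:
        return label
    return label[: max_len - 1] + ELLIPSIS


def truncate_middle(label: str, max_len: int) -> str:
    """Middle truncation: keep both ends of the label and cut the middle."""
    if len(label) <= max_len:
        return label
    keep = max_len - 1          # characters left after reserving the ellipsis
    head = (keep + 1) // 2      # the head gets the extra character if keep is odd
    tail = keep // 2
    return label[:head] + ELLIPSIS + (label[-tail:] if tail else "")
```

With end truncation, entries that share a common prefix stay aligned and readable from the left; with middle truncation, the distinguishing suffix survives instead.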
ui: apply sorting to tooltip entries in metric charts
ui: add select all and deselect all buttons to language filter dropdown
gateway: update Envoy to v1.36.3
Versions ≥ v1.50.0 support automatically raising ulimits to their hard limit, which is important in K8s clusters using containerd ≥ 2.0, where the default limits are now very conservative
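As a rough sketch of what "raising ulimits to their hard limit" means at the process level, the snippet below lifts the soft limit of one resource to its hard limit using Python's standard `resource` module. The function name and the choice of `RLIMIT_NOFILE` as the default are assumptions made for illustration; the entry above does not say which limits are raised or how.

```python
import resource


def raise_soft_limit_to_hard(limit: int = resource.RLIMIT_NOFILE):
    """Raise the soft limit of `limit` to its hard limit and return the new (soft, hard) pair."""
    soft, hard = resource.getrlimit(limit)
    if soft != hard:
        try:
            # An unprivileged process may raise its soft limit up to the hard limit.
            resource.setrlimit(limit, (hard, hard))
        except (ValueError, OSError):
            # Some platforms reject the request (for example an unlimited hard
            # limit that exceeds a kernel-wide cap); keep the current limits.
            pass
    return resource.getrlimit(limit)
```

The hard limit itself is left untouched, so no extra privileges are needed.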
storage: don't use EXCHANGE TABLE ClickHouse DDL
allows deploying ClickHouse on filesystems without support for atomic file renames (renameat2)
ui: show description of SASS instruction on hover for GPU profiles
ui: fix an issue that could prevent chart legend entries from being truncated correctly in Firefox
ui: enable transport compression for flamegraph WASM blob
backend: replace gRPC health checks with HTTP health checks
symdb: retry auto upload for broken executables after a day
interval can be configured via SYMDB__BROKEN_EXECUTABLE_RETRY_AFTER env variable (in seconds)
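The retry behaviour can be pictured with the following sketch. Only the variable name and its unit (seconds) come from the entry above; the helper names, the exact default of 86400 seconds (inferred from "after a day"), and the parsing logic are assumptions, not the actual symdb implementation.

```python
import os

ENV_VAR = "SYMDB__BROKEN_EXECUTABLE_RETRY_AFTER"
ONE_DAY_SECONDS = 24 * 60 * 60  # assumed default, inferred from "after a day"


def retry_interval_seconds(environ=os.environ) -> int:
    """Read the retry interval in seconds, falling back to one day if unset."""
    raw = environ.get(ENV_VAR)
    if raw is None:
        return ONE_DAY_SECONDS
    return int(raw)


def retry_is_due(last_attempt: float, now: float, interval: int) -> bool:
    """Return True once at least `interval` seconds have passed since the last attempt."""
    return now - last_attempt >= interval
```

For example, setting the variable to `3600` would make a broken executable eligible for another upload attempt after one hour under these assumptions.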
profiler: make auto upload size limits configurable
ZYMTRACE_MAX_SYMBFILE_SIZE (limit for the symbol file) and ZYMTRACE_MAX_INPUT_FILE_SIZE (limit for the binary file that symbols are extracted from) configure this
both values are in bytes
these limits only apply if -dwarf is given
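Because both variables take raw byte counts, a small helper avoids arithmetic slips when setting them. This is only a sketch: the helper name and the MiB-based inputs are hypothetical, and the values shown in the usage note are examples, not the profiler's defaults.

```python
MIB = 1024 * 1024


def size_limit_env(max_symbfile_mib: int, max_input_file_mib: int) -> dict:
    """Build the two size-limit variables, converting MiB to the byte values they expect."""
    return {
        "ZYMTRACE_MAX_SYMBFILE_SIZE": str(max_symbfile_mib * MIB),
        "ZYMTRACE_MAX_INPUT_FILE_SIZE": str(max_input_file_mib * MIB),
    }
```

Calling `size_limit_env(256, 1024)` yields byte values for 256 MiB and 1 GiB; the resulting mapping can be merged into the environment of the profiler process.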
profiler: add support for NVIDIA MIG devices (for metrics)
profiler: fix bug that could cause the available compute graph (and the idle time of a machine) to drop significantly when GPU profiling was active and the GPU was heavily used
cudaprofiler: fix rare segfaults in CUDA caused by an incompatible version of CUPTI already being loaded in some processes
PyTorch-based processes would sometimes load an incompatible libcupti.so, which could lead to segfaults
cudaprofiler: enable sampling of kernel launches by default
cudaprofiler: expose stack trace sampling config via env vars
symblib: fix Rust function name demangling, which failed in some cases
mcp: suggest using local time instead of UTC
profiler: reduce allocations in parseFDE()
brings parseFDE down from 60% of all allocs to 0.3%
mcp: enable collapse_go_system_frames, collapse_jvm_threads, filter_error_events, and filter_unreported to reduce the number of tokens in the flamegraph response
ui: add new display option collapse_go_system_frames
enabled by default; shown inverted in the UI as "Show Go system frames"
aggregates GC frames into "Garbage Collector"
aggregates scheduler frames into "Scheduler"
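The aggregation described above can be sketched as follows. The Go runtime function names used for classification are an illustrative guess at what counts as a GC or scheduler frame, and collapsing consecutive matches into a single node is likewise an assumption; the actual rules behind collapse_go_system_frames are not stated here.

```python
from typing import Optional

# Hypothetical classification rules, for illustration only.
GC_PREFIXES = ("runtime.gc", "runtime.scanobject", "runtime.greyobject")
SCHEDULER_FUNCS = frozenset(
    {"runtime.schedule", "runtime.findRunnable", "runtime.park_m", "runtime.mcall"}
)


def classify(frame: str) -> Optional[str]:
    """Return the synthetic label for a Go system frame, or None for other frames."""
    if frame.startswith(GC_PREFIXES):
        return "Garbage Collector"
    if frame in SCHEDULER_FUNCS:
        return "Scheduler"
    return None


def collapse_go_system_frames(stack: list) -> list:
    """Replace each run of GC or scheduler frames with a single synthetic frame."""
    collapsed = []
    for frame in stack:
        label = classify(frame)
        if label is None:
            collapsed.append(frame)       # ordinary frame: keep as-is
        elif not collapsed or collapsed[-1] != label:
            collapsed.append(label)       # first frame of a new system run
        # otherwise the frame extends the current run and is dropped
    return collapsed
```

Collapsing runs this way shortens deep runtime stacks to one node each, which is also why the same option helps reduce the size of flamegraph responses.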
ui: fix frame filter for Java standard library functions
the filter previously removed functions too aggressively
profiler: improve Zing symbolization
adds support for the GC in Zing, so more debug information is resolved, resulting in deeper / longer stack traces
profiler: place uprobes on GPU implant dynamically
implant is now detected and instrumented regardless of its path
cudaprofiler: several improvements
improve flushing of CUPTI activity records
don't delay process synchronizations during startup for CUDA processes, leading to better CPU stack traces for the first few frames