Install zymtrace profiler

The zymtrace profiler is a lightweight, OpenTelemetry-compliant agent that collects performance profiles from your applications and systems with minimal overhead (<1% CPU and ~256MB RAM). It can be deployed in various ways to suit your infrastructure needs. This guide covers installation using Kubernetes manifests, Helm charts, Docker containers, or direct binary installation.

tip

Please review the prerequisites before beginning, particularly if you are working in an airgapped environment.

Installation methods​

Choose the installation method that best suits your environment. Each method provides the same functionality with different deployment characteristics.

Install with Kubectl​

The profiler agent is deployed as a DaemonSet.

  1. Create a namespace

    kubectl create namespace zymtrace

  2. Deploy

    kubectl apply -n zymtrace -f https://helm.zystem.io/k8s-manifests/profiler/deploy.yaml
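Once the manifest is applied, you can verify that the DaemonSet was created and that a profiler pod is running on each node (the label selector below matches the one used in the Management section of this guide):

```shell
# Verify the DaemonSet was created and pods are scheduled on each node.
kubectl get daemonset -n zymtrace

# Check that the profiler pods reach the Running state.
kubectl get pods -n zymtrace -l app=zymtrace,component=profiler
```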

important

If you need GPU profiling capabilities, we recommend the Helm installation method instead, which provides more configuration options for enabling GPU profiling.

tip

Collection Agent Configuration

By default, the collection agent is set to zymtrace-gateway.zymtrace.svc.cluster.local:80.

Remember that the collection agent should point to the zymtrace gateway service. If you're installing the profiling agent on a different cluster than the one hosting the backend services, you'll need to modify this setting. In this case, we recommend downloading the configuration file first.

curl -O https://helm.zystem.io/k8s-manifests/profiler/deploy.yaml
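With the manifest downloaded, update the `--collection-agent` argument to point at a gateway address that is reachable from the profiling cluster. One way to do this is a search-and-replace on the file; `gateway.example.com:443` below is a placeholder, not a real endpoint — substitute the externally reachable address of your zymtrace gateway:

```shell
# Replace the default in-cluster gateway address with an externally
# reachable endpoint (gateway.example.com:443 is a placeholder).
sed -i 's|zymtrace-gateway.zymtrace.svc.cluster.local:80|gateway.example.com:443|' deploy.yaml
```

Then apply the edited manifest with `kubectl apply -n zymtrace -f deploy.yaml` as before.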

Install with Helm

info

Helm Chart Source

The Helm chart source code is available on GitHub: https://github.com/zystem-io/zymtrace-charts/tree/main/charts/

# Add the zystem repository
helm repo add zymtrace https://helm.zystem.io

# List available charts and versions
helm search repo zymtrace --versions

# Install
helm install profiler zymtrace/profiler \
--create-namespace \
--namespace zymtrace \
--set profiler.args[0]="--collection-agent=zymtrace-gateway.zymtrace.svc.cluster.local:80" \
--set profiler.args[1]="--disable-tls" \
--set profiler.args[2]="--project=colossus" \
--set profiler.args[3]="--tags=prod;us-east" \
--set "profiler.env.HTTPS_PROXY=http://username:password@proxy:port"

Enabling GPU Profiling​

important

GPU profiling is only available on AMD64/x86_64 architecture.

To enable GPU profiling with Helm, set profiler.cudaProfiler.enabled=true.

Optionally, include the --enable-gpu-metrics flag to collect GPU metrics as well, as shown below:

# Install with GPU profiling and metrics enabled
helm install profiler zymtrace/profiler \
--create-namespace \
--namespace zymtrace \
--set profiler.cudaProfiler.enabled=true \
--set profiler.args[0]="--collection-agent=zymtrace-gateway.zymtrace.svc.cluster.local:80" \
--set profiler.args[1]="--disable-tls" \
--set profiler.args[2]="--enable-gpu-metrics" \
--set profiler.args[3]="--enable-vllm-metrics" \
--set profiler.args[4]="--nvml-auto-scan" # Optional: use if NVML library path is unknown

This will automatically:

  • Deploy the necessary GPU profiling libraries
  • Configure the volume mounts for sharing libraries between containers
  • Extract the profiling libraries to the shared host path
  • Make GPU profiling available to all containers that mount the shared volume
  • If --enable-gpu-metrics is set, enable GPU metrics collection (power usage, memory utilization, temperature, and performance metrics)

tip

The --enable-gpu-metrics flag is recommended for comprehensive GPU monitoring, but it is optional: you can profile CUDA applications without collecting GPU metrics, or collect only GPU metrics without profiling.

The --enable-vllm-metrics flag enables automatic metrics collection for vLLM-based LLM inference applications. These metrics are correlated with GPU profiles.

NVML Library Path: Use --nvml-auto-scan to automatically detect the NVML library path. If you know the exact path, use --nvml-path=/path/to/libnvidia-ml.so instead. This is only required for GPU metrics collection.
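If you prefer to pass an explicit path, you can first locate the NVML shared library on the host. The directories below are typical locations on common Linux distributions, not guaranteed ones:

```shell
# Search common library directories for the NVML shared object.
# The resulting path can be passed via --nvml-path.
find /usr/lib /usr/lib64 /usr/lib/x86_64-linux-gnu -name 'libnvidia-ml.so*' 2>/dev/null
```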

Using custom-values.yaml​


You can create a custom values file with your configurations:

profiler:
  args:
    - "--collection-agent=zymtrace-gateway.zymtrace.svc.cluster.local:80" # Point to your gateway service
    - "--disable-tls"
    - "--enable-gpu-metrics"
    - "--enable-vllm-metrics"
    - "--nvml-auto-scan" # Optional: use if NVML library path is unknown
    - "--project=my-project-123"
    - "--tags=prod;ha100;fp16"

  # Enable GPU profiling
  cudaProfiler:
    enabled: true
    hostMountPath: "/var/lib/zymtrace/profiler" # Default path

  env:
    HTTPS_PROXY: "http://user:pass@proxy.company.com:8080"

Then install using:

helm install profiler zymtrace/profiler \
--create-namespace \
--namespace zymtrace \
-f custom-values.yaml
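To apply configuration changes later, the same values file can be reused with helm upgrade (a sketch; the release and namespace names match the install command above):

```shell
# Apply updated values to the existing release.
helm upgrade profiler zymtrace/profiler \
  --namespace zymtrace \
  -f custom-values.yaml
```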

Next Step: Hook up your CUDA application​

After enabling the GPU profiling module in the profiler, connect it to your CUDA application by referring to the GPU Profiler documentation.

Management​

These commands help you monitor and maintain your zymtrace profiler installation. Use them to check the agent's status and logs.

Kubectl Management​

# Check agent status
kubectl get pods -n zymtrace -l app=zymtrace,component=profiler

# View agent logs
kubectl logs -f -n zymtrace -l app=zymtrace,component=profiler
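If you need to restart the agents after a configuration change, or remove the installation entirely, the commands below are one way to do it. The DaemonSet name is a placeholder — check `kubectl get daemonset -n zymtrace` for the actual name, and use `helm uninstall` only if you installed via Helm:

```shell
# Restart all profiler pods (e.g. after editing the DaemonSet spec).
kubectl rollout restart daemonset <daemonset-name> -n zymtrace

# Uninstall the Helm release (Helm installs only).
helm uninstall profiler -n zymtrace
```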