Prometheus + Grafana stack up on Kind, scraping pine-gate metrics and showing basic dashboards.

amanycodes/chashma

chashma - Observability for AI Gateways (Prometheus + Grafana)

Chashma is a production‑ready Prometheus + Grafana bundle with prebuilt dashboards and alerts for AI gateway workloads (like pine‑gate). Install it, point it at your gateway’s /metrics, and get useful graphs and SLO alerts within minutes.

Features

  • Gateway dashboards: requests/sec, latency p50/p95/p99, errors and 429s
  • Backend views: per‑backend traffic share, latency, error rate
  • Quota/Errors view: error codes, error%, 429 pressure; tokens/bytes placeholders
  • SLO alerts: p95 > 2s, error rate > 1%, 429 rate > 5% (tunable)
  • Optional: GPU (DCGM) and tracing (OTel Collector) overview panels
  • Helm chart + small CLI for install, connect, and port‑forward

Compatibility

  • Kubernetes: 1.24+
  • kube‑prometheus‑stack: 55.x+ (bundled by default)
  • pine‑gate: emits gateway_* metrics as described below

Quickstart

  1. Install the chart (bundles kube‑prometheus‑stack by default)
helm upgrade --install chashma charts/chashma -n monitoring --create-namespace
  2. Port‑forward Prometheus and Grafana
# Prometheus
kubectl -n monitoring port-forward svc/prometheus-operated 9090:9090

# Grafana
kubectl -n monitoring port-forward svc/chashma-grafana 3000:80

# Grafana password (user: admin)
kubectl -n monitoring get secret chashma-grafana -o jsonpath='{.data.admin-password}' | base64 -d; echo
  3. Add a ServiceMonitor for your gateway (see next section)

CLI alternative

# Build
make cli-build

# Validate environment
./bin/chashma validate

# Install chart
./bin/chashma install --namespace monitoring

# Connect pine‑gate (replace placeholders)
./bin/chashma connect pine-gate \
  --namespace monitoring \
  --pine-namespace <NS> \
  --pine-selector app=<APP_LABEL> \
  --pine-port-name http

# Port‑forward helpers
./bin/chashma port-forward grafana
./bin/chashma port-forward prometheus

Connect pine‑gate (ServiceMonitor)

Chashma discovers scrape targets via a ServiceMonitor that must match your gateway Service.

Required shape of your Service

  • Namespace: your pine‑gate namespace (for example, default)
  • Label: app=<APP_LABEL> (for pine‑gate Helm, this is usually <release>-pine-gate)
  • Port: named http, serving /metrics

Create a matching ServiceMonitor (pick one)

  • CLI (recommended):
./bin/chashma connect pine-gate \
  --namespace monitoring \
  --pine-namespace <NS> \
  --pine-selector app=<APP_LABEL> \
  --pine-port-name http
  • Helm values (chart‑only):
helm upgrade --install chashma charts/chashma -n monitoring --reuse-values \
  --set pineGate.namespace=<NS> \
  --set pineGate.selector.app=<APP_LABEL> \
  --set pineGate.portName=http
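
If neither the CLI nor the chart manages the ServiceMonitor in your cluster, an equivalent object can be applied by hand. The following is a minimal sketch, not the manifest the chart renders: the helper name `render_servicemonitor`, the object name `chashma-pine-gate`, the `/metrics` path, and the 30s interval are assumptions, while the `release: chashma` label, `namespaceSelector`, and `app` selector follow the Troubleshooting checklist in this README.

```shell
# Sketch: print a ServiceMonitor that matches a pine-gate Service.
# Assumptions (not taken from the chart): object name, path, scrape interval.
# Args: <pine-gate namespace> <app label value> [port name, default http]
render_servicemonitor() {
  ns="$1"; app="$2"; port="${3:-http}"
  cat <<EOF
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: chashma-pine-gate
  namespace: monitoring
  labels:
    release: chashma
spec:
  namespaceSelector:
    matchNames:
      - ${ns}
  selector:
    matchLabels:
      app: ${app}
  endpoints:
    - port: ${port}
      path: /metrics
      interval: 30s
EOF
}

# Usage (requires cluster access):
#   render_servicemonitor default my-release-pine-gate | kubectl apply -f -
```

Printing the manifest instead of applying it directly makes it easy to review or commit before it reaches the cluster.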

Verify it’s working

  • Prometheus targets: open http://localhost:9090/targets → a pine‑gate job should be UP. To discover the job label Prometheus uses:
# In Prometheus "Graph" (PromQL)
count by (job) (gateway_requests_total)
# As a Grafana dashboard variable query
label_values(gateway_requests_total, job)
  • Grafana dashboards:
    • Open Dashboards → Browse (if the dashboards are not organized into folders yet, look under “k8s‑sidecar‑target‑directory”)
    • In “Gateway — pine‑gate” (and other dashboards), set the job dashboard variable to your job value (for example, chashma-pine-gate)
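
The same checks can be run from a terminal through the Prometheus HTTP API. This is a small sketch assuming the Prometheus port-forward from the Quickstart is running; `PROM_URL` and `prom_query` are illustrative names, not part of chashma, and the job value `chashma-pine-gate` is only the example used above.

```shell
# Sketch: run an instant PromQL query against the port-forwarded Prometheus.
# PROM_URL and prom_query are illustrative names, not part of chashma.
PROM_URL="${PROM_URL:-http://localhost:9090}"

prom_query() {
  # -G sends the url-encoded data as a GET query string.
  curl -sS -G "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Usage (with the Prometheus port-forward running):
#   prom_query 'count by (job) (gateway_requests_total)'
#   prom_query 'up{job="chashma-pine-gate"}'
```

A response with `"status":"success"` and a non-empty `result` array indicates that the series exist; an empty array usually means the target is not being scraped yet.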

Send demo traffic

# Port‑forward pine‑gate
kubectl -n <NS> port-forward svc/<pine-gate-svc> 8080:80

# Requests
curl -sS -H 'x-api-key: dev-key' -H 'Content-Type: application/json' \
  -X POST localhost:8080/v1/completions -d '{"model":"echo","prompt":"hi"}'

curl -N -H 'x-api-key: dev-key' \
  'http://localhost:8080/v1/stream?model=echo&prompt=hi'

# Optional load burst
for i in {1..50}; do \
  curl -s -o /dev/null -H 'x-api-key: dev-key' -H 'Content-Type: application/json' \
    -X POST localhost:8080/v1/completions -d '{"model":"echo","prompt":"load"}'; \
done
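
To confirm that the demo traffic moved the counters before looking at Grafana, the gateway's own metrics output can be summed directly. A minimal sketch, assuming pine-gate serves `/metrics` on the same port-forwarded port and that samples carry no trailing timestamp; `sum_gateway_requests` is a hypothetical helper.

```shell
# Sketch: sum every gateway_requests_total sample in Prometheus text format.
# Assumes samples carry no trailing timestamp, so the value is the last field.
sum_gateway_requests() {
  awk '/^gateway_requests_total[{ ]/ { sum += $NF } END { printf "%d\n", sum }'
}

# Usage (with the pine-gate port-forward running; /metrics on the same port
# is an assumption based on the Service shape described earlier):
#   curl -s localhost:8080/metrics | sum_gateway_requests
```

Running it before and after the load burst should show the total increase by roughly the number of requests sent.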

What metrics chashma expects

  • gateway_requests_total{route,method,backend}
  • gateway_request_latency_seconds_bucket{route,method,backend} (histogram)
  • gateway_request_errors_total{route,method,code,backend}

These are emitted by pine‑gate. Mock exporters that do not emit gateway_* will not populate these dashboards.
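
To see how these three metrics combine into panel values, the helper below prints example PromQL for requests/sec, p95 latency, and error ratio. It is a sketch only: `gateway_promql` is a hypothetical name, the 5m rate window and the grouping by `backend` are assumptions, and the bundled dashboards may use different expressions.

```shell
# Sketch: print PromQL for common panels built from the gateway_* metrics.
# The expressions are illustrative; the bundled dashboards may differ.
# Args: <panel: rps|p95|error_ratio> <job label value>
gateway_promql() {
  panel="$1"; job="$2"
  case "$panel" in
    rps)
      printf '%s\n' "sum by (backend) (rate(gateway_requests_total{job=\"${job}\"}[5m]))" ;;
    p95)
      printf '%s\n' "histogram_quantile(0.95, sum by (le, backend) (rate(gateway_request_latency_seconds_bucket{job=\"${job}\"}[5m])))" ;;
    error_ratio)
      printf '%s\n' "sum(rate(gateway_request_errors_total{job=\"${job}\"}[5m])) / sum(rate(gateway_requests_total{job=\"${job}\"}[5m]))" ;;
    *)
      printf '%s\n' "unknown panel: ${panel}" >&2; return 1 ;;
  esac
}

# Usage: paste the output into Prometheus "Graph" or Grafana Explore.
#   gateway_promql p95 chashma-pine-gate
```

The `le` label must be kept in the `sum by` for `histogram_quantile` to work, which is why the latency metric has to be a histogram rather than a plain gauge.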

Configuration (Helm values)

  • kps.enabled (bool): install kube‑prometheus‑stack (default true)
  • kps.releaseLabel (string): label to match an existing KPS release if not installing
  • pineGate.serviceMonitor.enabled (bool): create the ServiceMonitor (default true)
  • pineGate.namespace (string): namespace of pine‑gate Service
  • pineGate.selector.* (map): labels to match pine‑gate Service (e.g., app)
  • pineGate.portName (string): named port serving /metrics (default http)
  • dashboards.gateway|backends|quotaErrors|gpu|tracing (bools): enable dashboards
  • alerts.enabled (bool): install PrometheusRule for gateway SLOs
  • Optional Grafana folders: set kube-prometheus-stack.grafana.sidecar.dashboards.folderAnnotation=grafana_folder
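
For repeatable installs, the keys above can be collected into a values file instead of a series of `--set` flags. The sketch below assumes the dotted key names map to nested YAML in the usual Helm way; the helper name `render_values`, the file name, and the chosen dashboard toggles (GPU and tracing off) are illustrative, not chart defaults confirmed by this README.

```shell
# Sketch: print a values file using the keys listed above.
# The dashboard toggles are illustrative choices, not confirmed chart defaults.
# Args: <pine-gate namespace> <app label value>
render_values() {
  cat <<EOF
kps:
  enabled: true
pineGate:
  serviceMonitor:
    enabled: true
  namespace: $1
  selector:
    app: $2
  portName: http
dashboards:
  gateway: true
  backends: true
  quotaErrors: true
  gpu: false
  tracing: false
alerts:
  enabled: true
EOF
}

# Usage:
#   render_values default my-release-pine-gate > chashma-values.yaml
#   helm upgrade --install chashma charts/chashma -n monitoring -f chashma-values.yaml
```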

CLI quick reference

# install/upgrade the chart
chashma install --namespace monitoring

# create/patch ServiceMonitor
chashma connect pine-gate --pine-namespace <NS> --pine-selector app=<APP_LABEL> --pine-port-name http

# list pods, ServiceMonitors, services
chashma status

# port‑forward helpers
chashma port-forward grafana
chashma port-forward prometheus

# preflight check
chashma validate

# uninstall
chashma uninstall

Alerts included

  • p95 latency > 2s for 5m (by backend/route)
  • Error rate > 1% for 5m (by backend/route)
  • 429 rate > 5% for 5m (by backend/route)

Tune thresholds by editing the PrometheusRule or templating them via values in your fork.
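
As an illustration of a tunable threshold, the helper below prints a PrometheusRule for the error-rate alert with the threshold and hold duration as arguments. This is a sketch under stated assumptions, not the rule shipped in the chart: the helper name `render_error_rate_rule`, the rule and alert names, the `severity` label, the `release: chashma` label, and the exact expression are all hypothetical.

```shell
# Sketch: print a PrometheusRule for the error-rate SLO with a tunable threshold.
# Assumptions: rule/alert names, labels, and the exact expression.
# Args: [threshold as a ratio, default 0.01] [for duration, default 5m]
render_error_rate_rule() {
  threshold="${1:-0.01}"; hold="${2:-5m}"
  cat <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: gateway-error-rate-example
  namespace: monitoring
  labels:
    release: chashma
spec:
  groups:
    - name: gateway-slo-example
      rules:
        - alert: GatewayHighErrorRate
          expr: |
            sum by (backend, route) (rate(gateway_request_errors_total[5m]))
              /
            sum by (backend, route) (rate(gateway_requests_total[5m]))
              > ${threshold}
          for: ${hold}
          labels:
            severity: warning
EOF
}

# Usage (2% threshold held for 10 minutes):
#   render_error_rate_rule 0.02 10m | kubectl apply -f -
```

Both sides of the division are grouped by the same labels, so each backend/route pair is compared against the threshold independently.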

Troubleshooting

  • No targets in Prometheus:
    • Ensure a Service exists in <NS> with app=<APP_LABEL> and a port named http
    • Ensure endpoints are ready: kubectl -n <NS> get endpoints <SVC>
    • Ensure the ServiceMonitor has release: chashma, namespaceSelector.matchNames: [<NS>], and selector.matchLabels.app: <APP_LABEL>
    • One‑liners:
      • Patch SM release: kubectl -n monitoring patch servicemonitor chashma -p '{"metadata":{"labels":{"release":"chashma"}}}' --type=merge
      • Patch SM selector/namespace: use ./bin/chashma connect … or helm upgrade --reuse-values --set pineGate.*
  • Dashboards empty:
    • Set the dashboard job variable to your actual job value (see Verify)
    • In Grafana Explore, check gateway_requests_total returns series
  • Dashboards not visible:
    • Check imports: kubectl -n monitoring logs deploy/chashma-grafana -c grafana-sc-dashboard | tail -n 100
    • Ensure ConfigMaps are labeled grafana_dashboard: "1" (created by the chart)
    • Optional: set folderAnnotation (see Configuration)
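
The "No targets in Prometheus" checks above can be run in one pass. The helper below is a sketch: `check_scrape_target` is a hypothetical shell function, not a chashma subcommand, and it only lists resources for manual inspection; it does not decide pass or fail.

```shell
# Sketch: run the "No targets" checks in order. check_scrape_target is a
# hypothetical helper, not a chashma subcommand.
# Args: <pine-gate namespace> <app label value> <service name>
check_scrape_target() {
  ns="$1"; app="$2"; svc="$3"
  # 1. A Service with the expected label exists (check its port is named http).
  kubectl -n "$ns" get svc -l "app=${app}" -o wide
  # 2. The Service has ready endpoints.
  kubectl -n "$ns" get endpoints "$svc"
  # 3. ServiceMonitors carrying the release label exist in monitoring.
  kubectl -n monitoring get servicemonitor -l release=chashma
}

# Usage:
#   check_scrape_target default my-release-pine-gate my-release-pine-gate
```

An empty result from any of the three commands points at the corresponding bullet in the checklist.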

Security & operations

  • Set a non‑default Grafana admin password:
    • --set kube-prometheus-stack.grafana.adminPassword=<strong-password>
  • Restrict /metrics exposure with a NetworkPolicy; do not expose pods externally
  • Prefer Ingress + auth (basic/OIDC) over port‑forwards in shared clusters
  • Prometheus retention and resources: adjust in kube‑prometheus‑stack values to control cost
  • Upgrades: use helm upgrade --install chashma … --reuse-values

Release & packaging

# Fetch and build chart dependencies
make chart-deps

# Package chart to dist/
make chart-package

# GitHub Actions on tags build CLI binaries and attach a packaged chart
# (push a tag like v0.1.0 to trigger)
