Benchmark tools (rev2) [kernel-images] #97

raiden-staging · 2025-11-20T05:57:55Z

Benchmark tools (revision 2), with a new approach for CDP.

curl "http://localhost:444/dev/benchmark?components=all"
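
The same request can be issued from Go. The sketch below is a minimal client, not code from this PR: `benchmarkURL`, `fetchSummary`, and the `summary` type are hypothetical names, and only `components=all` is known to be accepted. Because the sample runs take roughly 90 to 110 seconds, the request is given a generous timeout.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// benchmarkURL builds the request URL for the benchmark endpoint.
// Only components=all appears in the PR; other values are assumptions.
func benchmarkURL(base, components string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/dev/benchmark"
	q := url.Values{}
	q.Set("components", components)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

// summary holds the two top-level fields this sketch reads.
// The element type of "errors" is not shown in the PR, so it stays raw.
type summary struct {
	ElapsedSeconds float64           `json:"elapsed_seconds"`
	Errors         []json.RawMessage `json:"errors"`
}

// fetchSummary performs the GET and decodes the top-level fields.
func fetchSummary(ctx context.Context, target string) (summary, error) {
	var s summary
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, target, nil)
	if err != nil {
		return s, err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return s, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return s, fmt.Errorf("unexpected status %s", resp.Status)
	}
	err = json.NewDecoder(resp.Body).Decode(&s)
	return s, err
}

func main() {
	target, err := benchmarkURL("http://localhost:444", "all")
	if err != nil {
		fmt.Println("bad base URL:", err)
		return
	}
	// The benchmark itself runs for well over a minute, so allow three.
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute)
	defer cancel()
	s, err := fetchSummary(ctx, target)
	if err != nil {
		fmt.Println("benchmark request failed:", err)
		return
	}
	fmt.Printf("elapsed %.1fs, %d errors\n", s.ElapsedSeconds, len(s.Errors))
}
```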

Example output (from a local run) [updated]:

{
	"elapsed_seconds": 90.99592,
	"errors": [],
	"results": {
		"cdp": {
			"concurrent_connections": 1,
			"direct_endpoint": {
				"endpoint_url": "http://localhost:9223",
				"scenarios": [
					{
						"attempt_count": 9194,
						"category": "Runtime",
						"description": "Evaluate a simple arithmetic expression",
						"duration_seconds": 5.0825887,
						"latency_ms": {"p50": 0.501, "p95": 0.831, "p99": 1.023},
						"name": "Runtime.evaluate-basic",
						"operation_count": 9194,
						"success_rate": 100,
						"throughput_ops_per_sec": 1808.9208,
						"type": "micro"
					},
					{
						"attempt_count": 8592,
						"category": "Runtime",
						"description": "Evaluate JavaScript that reads DOM content",
						"duration_seconds": 5.3275995,
						"latency_ms": {"p50": 0.554, "p95": 0.863, "p99": 1.023},
						"name": "Runtime.evaluate-dom",
						"operation_count": 8592,
						"success_rate": 100,
						"throughput_ops_per_sec": 1612.7339,
						"type": "micro"
					},
					{
						"attempt_count": 9121,
						"category": "DOM",
						"description": "Query DOM for benchmark node",
						"duration_seconds": 5.0924935,
						"event_count": 2,
						"event_throughput_sec": 0.39273492,
						"latency_ms": {"p50": 0.513, "p95": 0.833, "p99": 0.995},
						"name": "DOM.querySelector",
						"operation_count": 9121,
						"success_rate": 100,
						"throughput_ops_per_sec": 1791.0675,
						"type": "dom"
					},
					{
						"attempt_count": 4312,
						"category": "DOM",
						"description": "Fetch layout information for benchmark node",
						"duration_seconds": 5.0856986,
						"event_count": 2,
						"event_throughput_sec": 0.39325964,
						"latency_ms": {"p50": 1.12, "p95": 1.646, "p99": 1.906},
						"name": "DOM.getBoxModel",
						"operation_count": 4312,
						"success_rate": 100,
						"throughput_ops_per_sec": 847.8678,
						"type": "dom"
					},
					{
						"attempt_count": 5685,
						"category": "Performance",
						"description": "Collect performance metrics from the page",
						"duration_seconds": 5.3257976,
						"latency_ms": {"p50": 0.855, "p95": 1.215, "p99": 1.397},
						"name": "Performance.getMetrics",
						"operation_count": 5685,
						"success_rate": 100,
						"throughput_ops_per_sec": 1067.4457,
						"type": "perf"
					},
					{
						"attempt_count": 6991,
						"category": "Runtime",
						"description": "Mutate page state deterministically",
						"duration_seconds": 5.0924788,
						"latency_ms": {"p50": 0.692, "p95": 1.02, "p99": 1.169},
						"name": "Runtime.increment-counter",
						"operation_count": 6991,
						"success_rate": 100,
						"throughput_ops_per_sec": 1372.809,
						"type": "micro"
					},
					{
						"attempt_count": 2,
						"category": "Navigation",
						"description": "Navigate to Hacker News and count headlines",
						"duration_seconds": 0.7890788,
						"event_count": 22,
						"event_throughput_sec": 27.880611,
						"latency_ms": {"p50": 359.158, "p95": 359.158, "p99": 359.158},
						"name": "Navigation.hackernews",
						"operation_count": 2,
						"success_rate": 100,
						"throughput_ops_per_sec": 2.5346012,
						"type": "navigation"
					},
					{
						"attempt_count": 2,
						"category": "Navigation",
						"description": "Navigate to GitHub trending and inspect repository list",
						"duration_seconds": 2.7033806,
						"event_count": 26,
						"event_throughput_sec": 9.617588,
						"latency_ms": {"p50": 1763.612, "p95": 1763.612, "p99": 1763.612},
						"name": "Navigation.github-trending",
						"operation_count": 2,
						"success_rate": 100,
						"throughput_ops_per_sec": 0.73981446,
						"type": "navigation"
					},
					{
						"attempt_count": 1682,
						"category": "Network",
						"description": "Generate network traffic via fetch burst against data URLs",
						"duration_seconds": 5.0762534,
						"event_count": 33640,
						"event_throughput_sec": 6626.935,
						"latency_ms": {"p50": 2.866, "p95": 3.791, "p99": 4.797},
						"name": "Network.fetch-burst",
						"operation_count": 1682,
						"success_rate": 100,
						"throughput_ops_per_sec": 331.34674,
						"type": "network"
					}
				],
				"sessions_started": 9,
				"total_throughput_ops_per_sec": 1149.3956
			},
			"memory_mb": {
				"baseline": 9.5234375,
				"per_connection": 9.53125
			},
			"proxied_endpoint": {
				"endpoint_url": "http://localhost:9222",
				"scenarios": [
					{
						"attempt_count": 7035,
						"category": "Runtime",
						"description": "Evaluate a simple arithmetic expression",
						"duration_seconds": 5.094317,
						"latency_ms": {"p50": 0.674, "p95": 1.046, "p99": 1.256},
						"name": "Runtime.evaluate-basic",
						"operation_count": 7035,
						"success_rate": 100,
						"throughput_ops_per_sec": 1380.9506,
						"type": "micro"
					},
					{
						"attempt_count": 7507,
						"category": "Runtime",
						"description": "Evaluate JavaScript that reads DOM content",
						"duration_seconds": 5.079521,
						"latency_ms": {"p50": 0.631, "p95": 0.971, "p99": 1.159},
						"name": "Runtime.evaluate-dom",
						"operation_count": 7507,
						"success_rate": 100,
						"throughput_ops_per_sec": 1477.8953,
						"type": "micro"
					},
					{
						"attempt_count": 7623,
						"category": "DOM",
						"description": "Query DOM for benchmark node",
						"duration_seconds": 5.0836434,
						"event_count": 2,
						"event_throughput_sec": 0.39341864,
						"latency_ms": {"p50": 0.623, "p95": 0.967, "p99": 1.158},
						"name": "DOM.querySelector",
						"operation_count": 7623,
						"success_rate": 100,
						"throughput_ops_per_sec": 1499.5151,
						"type": "dom"
					},
					{
						"attempt_count": 4134,
						"category": "DOM",
						"description": "Fetch layout information for benchmark node",
						"duration_seconds": 5.083375,
						"event_count": 2,
						"event_throughput_sec": 0.3934394,
						"latency_ms": {"p50": 1.143, "p95": 1.759, "p99": 2.045},
						"name": "DOM.getBoxModel",
						"operation_count": 4134,
						"success_rate": 100,
						"throughput_ops_per_sec": 813.23926,
						"type": "dom"
					},
					{
						"attempt_count": 5805,
						"category": "Performance",
						"description": "Collect performance metrics from the page",
						"duration_seconds": 5.0773044,
						"latency_ms": {"p50": 0.821, "p95": 1.207, "p99": 1.474},
						"name": "Performance.getMetrics",
						"operation_count": 5805,
						"success_rate": 100,
						"throughput_ops_per_sec": 1143.3232,
						"type": "perf"
					},
					{
						"attempt_count": 7199,
						"category": "Runtime",
						"description": "Mutate page state deterministically",
						"duration_seconds": 5.081753,
						"latency_ms": {"p50": 0.659, "p95": 1.003, "p99": 1.214},
						"name": "Runtime.increment-counter",
						"operation_count": 7199,
						"success_rate": 100,
						"throughput_ops_per_sec": 1416.6372,
						"type": "micro"
					},
					{
						"attempt_count": 2,
						"category": "Navigation",
						"description": "Navigate to Hacker News and count headlines",
						"duration_seconds": 0.8605688,
						"event_count": 22,
						"event_throughput_sec": 25.564487,
						"latency_ms": {"p50": 439.364, "p95": 439.364, "p99": 439.364},
						"name": "Navigation.hackernews",
						"operation_count": 2,
						"success_rate": 100,
						"throughput_ops_per_sec": 2.3240442,
						"type": "navigation"
					},
					{
						"attempt_count": 2,
						"category": "Navigation",
						"description": "Navigate to GitHub trending and inspect repository list",
						"duration_seconds": 2.6448922,
						"event_count": 26,
						"event_throughput_sec": 9.830268,
						"latency_ms": {"p50": 1943.432, "p95": 1943.432, "p99": 1943.432},
						"name": "Navigation.github-trending",
						"operation_count": 2,
						"success_rate": 100,
						"throughput_ops_per_sec": 0.7561745,
						"type": "navigation"
					},
					{
						"attempt_count": 1558,
						"category": "Network",
						"description": "Generate network traffic via fetch burst against data URLs",
						"duration_seconds": 5.3473654,
						"event_count": 31160,
						"event_throughput_sec": 5827.1685,
						"latency_ms": {"p50": 3.135, "p95": 4.099, "p99": 4.506},
						"name": "Network.fetch-burst",
						"operation_count": 1558,
						"success_rate": 100,
						"throughput_ops_per_sec": 291.35843,
						"type": "network"
					}
				],
				"sessions_started": 9,
				"total_throughput_ops_per_sec": 1036.4667
			},
			"proxy_overhead_percent": 9.825072
		},
		"recording": {
			"avg_encoding_lag_ms": 67.676765,
			"concurrent_recordings": 1,
			"cpu_overhead_percent": 0.09299688,
			"disk_write_mbps": 0.0025146485,
			"frame_rate_impact": { "before_recording_fps": 24.418766, "during_recording_fps": 24.489937, "impact_percent": -0.29146284 },
			"frames_captured": 111,
			"frames_dropped": 0,
			"memory_overhead_mb": 0
		},
		"webrtc_live_view": {
			"bitrate_kbps": { "audio": 0, "total": 578.4521, "video": 578.4521 },
			"codecs": { "audio": "unknown", "video": "video/VP8" },
			"concurrent_viewers": 1,
			"connection_state": "connected",
			"cpu_usage_percent": 0,
			"frame_latency_ms": { "p50": 40, "p95": 40.818367, "p99": 1997.5 },
			"frame_rate_fps": { "achieved": 24.418766, "max": 25.50255, "min": 0.5006258, "target": 30 },
			"frames": { "corrupted": 0, "decoded": 2100, "dropped": 20, "key_frames_decoded": 84, "received": 2100 },
			"ice_connection_state": "connected",
			"jitter_ms": { "audio": 0, "video": 0 },
			"memory_mb": { "baseline": 19.054688, "per_viewer": 19.054688 },
			"network": { "available_outgoing_bitrate_kbps": 300, "bytes_received": 6387147, "bytes_sent": 34137, "rtt_ms": 1 },
			"packets": { "audio_lost": 0, "audio_received": 0, "loss_percent": 0, "video_lost": 0, "video_received": 6026 },
			"resolution": { "height": 1080, "width": 1920 }
		}
	},
	"startup_timing": {
		"phase_summary": { "fastest_ms": 1, "fastest_phase": "shm_setup", "slowest_ms": 5974, "slowest_phase": "chromium_start" },
		"phases": [
			{"duration_ms": 1, "name": "shm_setup", "percentage": 0.005362218},
			{"duration_ms": 1, "name": "scale_to_zero_disable", "percentage": 0.005362218},
			{"duration_ms": 19, "name": "user_dirs_setup", "percentage": 0.10188214},
			{"duration_ms": 1, "name": "log_aggregator_start", "percentage": 0.005362218},
			{"duration_ms": 149, "name": "supervisord_start", "percentage": 0.79897046},
			{"duration_ms": 2198, "name": "xorg_start", "percentage": 11.786155},
			{"duration_ms": 2387, "name": "mutter_start", "percentage": 12.799614},
			{"duration_ms": 2169, "name": "dbus_start", "percentage": 11.6306505},
			{"duration_ms": 5974, "name": "chromium_start", "percentage": 32.03389},
			{"duration_ms": 2599, "name": "neko_start", "percentage": 13.936404},
			{"duration_ms": 2982, "name": "kernel_api_start", "percentage": 15.990133},
			{"duration_ms": 162, "name": "pulseaudio_start", "percentage": 0.8686793}
		],
		"total_startup_time_ms": 18649
	},
	"system": { "arch": "amd64", "cpu_count": 16, "memory_total_mb": 15999, "os": "linux" },
	"timestamp": "2025-11-21T07:41:36.218198196-08:00"
}
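
Several derived fields in the output above can be recomputed from the raw counts. The sketch below is inferred from the sample numbers only, not from the implementation: it assumes throughput is `operation_count / duration_seconds`, success rate is `operation_count / attempt_count`, and proxy overhead is the relative drop in total throughput from the direct to the proxied endpoint. The type and function names are hypothetical.

```go
package main

import "fmt"

// scenario carries the raw counters reported for one CDP scenario.
type scenario struct {
	AttemptCount   int
	OperationCount int
	DurationSecs   float64
}

// throughput returns successful operations per second.
func throughput(s scenario) float64 {
	if s.DurationSecs <= 0 {
		return 0
	}
	return float64(s.OperationCount) / s.DurationSecs
}

// successRate returns the percentage of attempts that succeeded.
func successRate(s scenario) float64 {
	if s.AttemptCount == 0 {
		return 0
	}
	return 100 * float64(s.OperationCount) / float64(s.AttemptCount)
}

// proxyOverheadPercent returns how much total throughput is lost
// when going through the proxy instead of the direct endpoint.
func proxyOverheadPercent(direct, proxied float64) float64 {
	if direct <= 0 {
		return 0
	}
	return 100 * (direct - proxied) / direct
}

func main() {
	// Values taken from the Runtime.evaluate-basic direct scenario above.
	basic := scenario{AttemptCount: 9194, OperationCount: 9194, DurationSecs: 5.0825887}
	fmt.Printf("throughput: %.1f ops/s\n", throughput(basic))
	fmt.Printf("success rate: %.0f%%\n", successRate(basic))
	// total_throughput_ops_per_sec of the direct and proxied endpoints.
	fmt.Printf("proxy overhead: %.2f%%\n", proxyOverheadPercent(1149.3956, 1036.4667))
}
```

With the sample values this gives roughly 1808.9 ops/s and a proxy overhead of about 9.83%, in line with the reported `throughput_ops_per_sec` and `proxy_overhead_percent`.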

Next steps:

onkernel/neko
- Merge PR (Benchmark tools [neko])
- Release tag v3.0.8-v1.3.1
onkernel/kernel-images
- Merge this PR (Benchmark tools (rev2) [kernel-images]). Notes:
  - The Dockerfile is already set to ghcr.io/onkernel/neko/base:3.0.8-v1.3.1.
  - Once the onkernel/neko v3.0.8-v1.3.1 tag is released and its build completes, this image will pick up the new neko build without further changes.

[ @rgarcia ]

Note

Introduces a new benchmarking suite (CDP/WebRTC/recording) exposed at GET /dev/benchmark, wires up client-side WebRTC stats reporting, and records and exports container startup timing.

API/Server:
- Add GET /dev/benchmark endpoint returning comprehensive results (CDP, WebRTC live view, Recording, system, startup timing).
- Implement CDP runtime benchmark (direct vs proxied), recording profiler (parses ffmpeg stderr), WebRTC collector (reads neko export), screenshot latency probe, CPU/mem utils, and startup timing tracker; expose DevTools /json/version proxy.
- Update OpenAPI schema and generated client/server code for new types and endpoint.
Client (WebRTC):
- Add WebRTCStatsCollector and emit benchmark/webrtc_stats events; extend events/messages typings; start/stop collection on ICE state changes.
Runtime/Wrapper:
- Instrument startup phases, export JSON timing to /tmp/kernel_startup_timing.json; wait for API readiness.
Recorder:
- Capture ffmpeg stderr during recording and expose via GetStderr() for benchmarks.
Image/Dockerfile:
- Bump neko base to 3.0.8-v1.3.1; integrate built API/launcher and client assets.
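
The startup timing mentioned under Runtime/Wrapper ends up as a `phase_summary` plus a per-phase `percentage` in the benchmark response. A minimal sketch of how such a summary could be derived is shown below. The types and function names are hypothetical, and the percentage is taken against the reported `total_startup_time_ms`, which is what the sample numbers suggest.

```go
package main

import "fmt"

// phase is one instrumented startup step.
type phase struct {
	Name       string
	DurationMs int64
}

// summarize returns the fastest and slowest phase.
// On ties the earliest phase wins, and ok is false for an empty input.
func summarize(phases []phase) (fastest, slowest phase, ok bool) {
	if len(phases) == 0 {
		return phase{}, phase{}, false
	}
	fastest, slowest = phases[0], phases[0]
	for _, p := range phases[1:] {
		if p.DurationMs < fastest.DurationMs {
			fastest = p
		}
		if p.DurationMs > slowest.DurationMs {
			slowest = p
		}
	}
	return fastest, slowest, true
}

// percentage returns the share of total startup time spent in p.
func percentage(p phase, totalMs int64) float64 {
	if totalMs <= 0 {
		return 0
	}
	return 100 * float64(p.DurationMs) / float64(totalMs)
}

func main() {
	// A subset of the phases from the updated example output.
	phases := []phase{
		{"shm_setup", 1},
		{"xorg_start", 2198},
		{"chromium_start", 5974},
		{"pulseaudio_start", 162},
	}
	fastest, slowest, _ := summarize(phases)
	fmt.Println("fastest:", fastest.Name, "slowest:", slowest.Name)
	fmt.Printf("chromium_start share: %.2f%%\n", percentage(slowest, 18649))
}
```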

^{Written by Cursor Bugbot for commit 78fbc17. This will update automatically on new commits.}

…tation use npm in server Makefile to install @apiture/openapi-down-convert

…p timing and phases expose startup timing and add screenshot/cdp/recording result conversions in api benchmarks

…eport actual elapsed time remove duration param from api/runbenchmark and per-component runners; compute elapsed_seconds from start time and convert to oapi (cdp 5s, recording 10s, screenshot fixed)

…erfile

…s/types/api simplify webrtc neko stats collection by removing trigger call and shortening wait

…nchmarks add fps measurement to recording profiler and convert/read neko stats; tidy cdp runtime result comments

remove erofs -b flag and drop --image arg from kraft volume import; simplify benchmark endpoint/schema descriptions

… continuous export; remove neko_client add freshness/wait retry for neko stats, small fps sampling delay and debug logs; drop heavy/unsupported cdp methods. tweak chromium unikernel build/run: erofs -b 4096, specify volimport image, increase vcpus to 4

…via websocket start/stop collection in base client; add benchmark event and payload types

…ng over websocket

…erated commands initialize cdp session per worker, use sendCDPCommand with unique ids, improve error handling and stats collection

…ividually and simplify results use total throughput ops per sec, embed scenarios in endpoint results, add benchmarkEndpoint and update api conversions

…ad runner introduce scenarioDef and scenarioStats, add runMixedWorkload; make benchmarkEndpoint accept duration and use 5s default; improve attachToPageTarget logging and domain enabling

…age target per endpoint, inline target lifecycle and cleanup (detach/close) add scenario success counter, remove requiresPage/attachToPageTarget flow, and fix worker msgID calculation

…rbose diagnostics log full attachToTarget response, report missing/wrong sessionId and add getMapKeys helper

… and session handling add sendCDPCommandSimple (sessionless) and update workloads/tests accordingly

select the first page target's webSocketDebuggerUrl and return clear errors on failure

move domain enabling into each worker, use /json/version to fetch debugger url, and make sendCDPCommandSimple loop until it receives the matching id (with safety limit)
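
The "loop until it receives the matching id (with safety limit)" idea can be sketched as follows. This is an illustration, not the PR's `sendCDPCommandSimple`: `awaitResponse`, `cdpMessage`, and the `next` callback (standing in for a WebSocket read) are hypothetical. It relies on the CDP convention that command responses carry the request's `id`, while events carry a `method` and no `id`.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// cdpError is the error object of a failed CDP command.
type cdpError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
}

// cdpMessage covers both command responses (id set) and events (method set).
type cdpMessage struct {
	ID     *int            `json:"id"`
	Method string          `json:"method"`
	Result json.RawMessage `json:"result"`
	Error  *cdpError       `json:"error"`
}

var errTooManyReads = errors.New("no matching response within read limit")

// awaitResponse reads messages until one answers the command with the
// given id, skipping events and replies to other commands. maxReads
// bounds the loop so a lost response cannot block forever.
func awaitResponse(next func() ([]byte, error), id, maxReads int) (json.RawMessage, error) {
	for i := 0; i < maxReads; i++ {
		raw, err := next()
		if err != nil {
			return nil, fmt.Errorf("read: %w", err)
		}
		var msg cdpMessage
		if err := json.Unmarshal(raw, &msg); err != nil {
			return nil, fmt.Errorf("decode: %w", err)
		}
		if msg.ID == nil || *msg.ID != id {
			continue // an event, or a reply to a different command
		}
		if msg.Error != nil {
			return nil, fmt.Errorf("cdp error %d: %s", msg.Error.Code, msg.Error.Message)
		}
		return msg.Result, nil
	}
	return nil, errTooManyReads
}

func main() {
	// Simulated socket: one event, then the reply to command 7.
	queue := []string{
		`{"method":"Page.loadEventFired","params":{}}`,
		`{"id":7,"result":{"value":2}}`,
	}
	pos := 0
	next := func() ([]byte, error) {
		if pos >= len(queue) {
			return nil, errors.New("connection closed")
		}
		m := queue[pos]
		pos++
		return []byte(m), nil
	}
	res, err := awaitResponse(next, 7, 10)
	fmt.Println(string(res), err)
}
```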

…terministic scenarios, improved stats and error sampling api: expose optional session counts, scenario failure counts and error samples; add optionalInt helper and extend default benchmark duration

add navigation helpers and two navigation scenarios; increase timeouts and fix throughput calc

…chmark duration to 15s; compute per-scenario duration across rounds with 400ms min improve per-run accounting (attempts/failures/latency), extend run timeout to 4s, and detach session before closing websocket
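
The per-run latency accounting feeds the `p50`/`p95`/`p99` fields in the results. The PR does not show which percentile method is used; the sketch below uses the nearest-rank method as one plausible choice, with hypothetical function names.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the p-th percentile (0 < p <= 100) of an
// ascending-sorted slice using the nearest-rank method.
func percentile(sorted []float64, p float64) float64 {
	if len(sorted) == 0 {
		return 0
	}
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// latencySummary sorts a copy of the samples and returns p50, p95, p99.
func latencySummary(samplesMs []float64) (p50, p95, p99 float64) {
	sorted := append([]float64(nil), samplesMs...)
	sort.Float64s(sorted)
	return percentile(sorted, 50), percentile(sorted, 95), percentile(sorted, 99)
}

func main() {
	// Made-up latencies in milliseconds, for illustration only.
	samples := []float64{0.7, 0.5, 0.6, 1.2, 0.55, 0.52, 0.9, 0.58, 0.61, 0.5}
	p50, p95, p99 := latencySummary(samples)
	fmt.Printf("p50=%.3f p95=%.3f p99=%.3f\n", p50, p95, p99)
}
```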

run scenarios sequentially (concurrency=1), create/close session per scenario, increase timeouts & rounds, close leftover targets on session close and reset rootID on navigation

…per scenario increase benchmark duration and timeouts; replace scenarioRounds with scenarioIterations

…ork iteration loop increase default benchmark duration to 40s and drop Network.enable from enabled domains

… results and api tune cdpruntime: bump timeouts/durations/iterations, add page warmup, extend waitForReady retries/polling, and increase websocket read limit

…etwork fetch-burst; expose event counts/throughput in results

…api output use awaitPromise for Runtime.evaluate and robustly handle numeric result types; add optionalString helper

mesa-dot-dev · 2025-11-20T05:59:47Z

Mesa Description

Benchmark tools (revision 2), with a new approach for CDP.

curl "http://localhost:444/dev/benchmark?components=all"

Example output (from a local run):

{
  "elapsed_seconds": 109.42869,
  "results": {
    "cdp": {
      "concurrent_connections": 1,
      "direct_endpoint": {
        "endpoint_url": "http://localhost:9223",
        "scenarios": [
          {
            "attempt_count": 7808,
            "category": "Runtime",
            "description": "Evaluate a simple arithmetic expression",
            "duration_seconds": 5.3489366,
            "latency_ms": {
              "p50": 0.607,
              "p95": 0.964,
              "p99": 1.152
            },
            "name": "Runtime.evaluate-basic",
            "operation_count": 7808,
            "success_rate": 100,
            "throughput_ops_per_sec": 1459.7295,
            "type": "micro"
          },
          {
            "attempt_count": 7624,
            "category": "Runtime",
            "description": "Evaluate JavaScript that reads DOM content",
            "duration_seconds": 5.338071,
            "latency_ms": {
              "p50": 0.624,
              "p95": 0.97,
              "p99": 1.153
            },
            "name": "Runtime.evaluate-dom",
            "operation_count": 7624,
            "success_rate": 100,
            "throughput_ops_per_sec": 1428.2313,
            "type": "micro"
          },
          {
            "attempt_count": 8553,
            "category": "DOM",
            "description": "Query DOM for benchmark node",
            "duration_seconds": 5.0914407,
            "event_count": 2,
            "event_throughput_sec": 0.39281613,
            "latency_ms": {
              "p50": 0.55,
              "p95": 0.86,
              "p99": 1.039
            },
            "name": "DOM.querySelector",
            "operation_count": 8553,
            "success_rate": 100,
            "throughput_ops_per_sec": 1679.8782,
            "type": "dom"
          },
          {
            "attempt_count": 4117,
            "category": "DOM",
            "description": "Fetch layout information for benchmark node",
            "duration_seconds": 5.0799603,
            "event_count": 2,
            "event_throughput_sec": 0.39370388,
            "latency_ms": {
              "p50": 1.177,
              "p95": 1.746,
              "p99": 2.072
            },
            "name": "DOM.getBoxModel",
            "operation_count": 4117,
            "success_rate": 100,
            "throughput_ops_per_sec": 810.43945,
            "type": "dom"
          },
          {
            "attempt_count": 6211,
            "category": "Performance",
            "description": "Collect performance metrics from the page",
            "duration_seconds": 5.085529,
            "latency_ms": {
              "p50": 0.766,
              "p95": 1.149,
              "p99": 1.417
            },
            "name": "Performance.getMetrics",
            "operation_count": 6211,
            "success_rate": 100,
            "throughput_ops_per_sec": 1221.3086,
            "type": "perf"
          },
          {
            "attempt_count": 8678,
            "category": "Runtime",
            "description": "Mutate page state deterministically",
            "duration_seconds": 5.0850143,
            "latency_ms": {
              "p50": 0.539,
              "p95": 0.87,
              "p99": 1.053
            },
            "name": "Runtime.increment-counter",
            "operation_count": 8678,
            "success_rate": 100,
            "throughput_ops_per_sec": 1706.5831,
            "type": "micro"
          },
          {
            "attempt_count": 2,
            "category": "Navigation",
            "description": "Navigate to Hacker News and count headlines",
            "duration_seconds": 0.62673134,
            "event_count": 22,
            "event_throughput_sec": 35.10276,
            "latency_ms": {
              "p50": 450.444,
              "p95": 450.444,
              "p99": 450.444
            },
            "name": "Navigation.hackernews",
            "operation_count": 2,
            "success_rate": 100,
            "throughput_ops_per_sec": 3.1911602,
            "type": "navigation"
          },
          {
            "attempt_count": 2,
            "category": "Navigation",
            "description": "Navigate to GitHub trending and inspect repository list",
            "duration_seconds": 2.500876,
            "event_count": 26,
            "event_throughput_sec": 10.396357,
            "latency_ms": {
              "p50": 1826.134,
              "p95": 1826.134,
              "p99": 1826.134
            },
            "name": "Navigation.github-trending",
            "operation_count": 2,
            "success_rate": 100,
            "throughput_ops_per_sec": 0.79971975,
            "type": "navigation"
          },
          {
            "attempt_count": 1465,
            "category": "Network",
            "description": "Generate network traffic via fetch burst against data URLs",
            "duration_seconds": 5.3398223,
            "event_count": 29300,
            "event_throughput_sec": 5487.0737,
            "latency_ms": {
              "p50": 3.084,
              "p95": 5.482,
              "p99": 6.424
            },
            "name": "Network.fetch-burst",
            "operation_count": 1465,
            "success_rate": 100,
            "throughput_ops_per_sec": 274.3537,
            "type": "network"
          }
        ],
        "sessions_started": 9,
        "total_throughput_ops_per_sec": 1122.8552
      },
      "memory_mb": {
        "baseline": 1.0941315,
        "per_connection": 0.68310547
      },
      "proxied_endpoint": {
        "endpoint_url": "http://localhost:9222",
        "scenarios": [
          {
            "attempt_count": 7128,
            "category": "Runtime",
            "description": "Evaluate a simple arithmetic expression",
            "duration_seconds": 5.088774,
            "latency_ms": {
              "p50": 0.67,
              "p95": 1.002,
              "p99": 1.187
            },
            "name": "Runtime.evaluate-basic",
            "operation_count": 7128,
            "success_rate": 100,
            "throughput_ops_per_sec": 1400.7302,
            "type": "micro"
          },
          {
            "attempt_count": 7387,
            "category": "Runtime",
            "description": "Evaluate JavaScript that reads DOM content",
            "duration_seconds": 5.3282127,
            "latency_ms": {
              "p50": 0.647,
              "p95": 0.978,
              "p99": 1.149
            },
            "name": "Runtime.evaluate-dom",
            "operation_count": 7387,
            "success_rate": 100,
            "throughput_ops_per_sec": 1386.3936,
            "type": "micro"
          },
          {
            "attempt_count": 4744,
            "category": "DOM",
            "description": "Query DOM for benchmark node",
            "duration_seconds": 5.087707,
            "event_count": 2,
            "event_throughput_sec": 0.39310437,
            "latency_ms": {
              "p50": 1.034,
              "p95": 1.549,
              "p99": 1.836
            },
            "name": "DOM.querySelector",
            "operation_count": 4744,
            "success_rate": 100,
            "throughput_ops_per_sec": 932.4436,
            "type": "dom"
          },
          {
            "attempt_count": 2250,
            "category": "DOM",
            "description": "Fetch layout information for benchmark node",
            "duration_seconds": 5.186467,
            "event_count": 2,
            "event_throughput_sec": 0.38561895,
            "latency_ms": {
              "p50": 2.221,
              "p95": 3.026,
              "p99": 3.496
            },
            "name": "DOM.getBoxModel",
            "operation_count": 2250,
            "success_rate": 100,
            "throughput_ops_per_sec": 433.82132,
            "type": "dom"
          },
          {
            "attempt_count": 3450,
            "category": "Performance",
            "description": "Collect performance metrics from the page",
            "duration_seconds": 5.147936,
            "latency_ms": {
              "p50": 1.417,
              "p95": 2.032,
              "p99": 2.432
            },
            "name": "Performance.getMetrics",
            "operation_count": 3450,
            "success_rate": 100,
            "throughput_ops_per_sec": 670.1715,
            "type": "perf"
          },
          {
            "attempt_count": 6493,
            "category": "Runtime",
            "description": "Mutate page state deterministically",
            "duration_seconds": 5.1565146,
            "latency_ms": {
              "p50": 0.725,
              "p95": 1.188,
              "p99": 1.444
            },
            "name": "Runtime.increment-counter",
            "operation_count": 6493,
            "success_rate": 100,
            "throughput_ops_per_sec": 1259.1838,
            "type": "micro"
          },
          {
            "attempt_count": 2,
            "category": "Navigation",
            "description": "Navigate to Hacker News and count headlines",
            "duration_seconds": 1.1859651,
            "event_count": 22,
            "event_throughput_sec": 18.550295,
            "latency_ms": {
              "p50": 463.941,
              "p95": 463.941,
              "p99": 463.941
            },
            "name": "Navigation.hackernews",
            "operation_count": 2,
            "success_rate": 100,
            "throughput_ops_per_sec": 1.6863904,
            "type": "navigation"
          },
          {
            "attempt_count": 2,
            "category": "Navigation",
            "description": "Navigate to GitHub trending and inspect repository list",
            "duration_seconds": 2.86993,
            "event_count": 26,
            "event_throughput_sec": 9.059455,
            "latency_ms": {
              "p50": 2168.106,
              "p95": 2168.106,
              "p99": 2168.106
            },
            "name": "Navigation.github-trending",
            "operation_count": 2,
            "success_rate": 100,
            "throughput_ops_per_sec": 0.6968811,
            "type": "navigation"
          },
          {
            "attempt_count": 1326,
            "category": "Network",
            "description": "Generate network traffic via fetch burst against data URLs",
            "duration_seconds": 4.844074,
            "error_samples": [
              "read Runtime.evaluate: failed to get reader: context deadline exceeded"
            ],
            "event_count": 26500,
            "event_throughput_sec": 5470.6025,
            "failure_count": 1,
            "latency_ms": {
              "p50": 3.483,
              "p95": 4.757,
              "p99": 5.431
            },
            "name": "Network.fetch-burst",
            "operation_count": 1325,
            "success_rate": 99.92458,
            "throughput_ops_per_sec": 273.53012,
            "type": "network"
          }
        ],
        "sessions_started": 9,
        "total_throughput_ops_per_sec": 819.4982
      },
      "proxy_overhead_percent": 27.01657
    },
    "recording": {
      "avg_encoding_lag_ms": 66.666664,
      "concurrent_recordings": 1,
      "cpu_overhead_percent": 0.01,
      "disk_write_mbps": 0.0022338866,
      "frames_captured": 131,
      "frames_dropped": 0,
      "memory_overhead_mb": 0.057395935
    },
    "webrtc_live_view": {
      "bitrate_kbps": {
        "audio": 0,
        "total": 0,
        "video": 0
      },
      "codecs": {
        "audio": "unknown",
        "video": "unknown"
      },
      "concurrent_viewers": 0,
      "connection_state": "unknown",
      "cpu_usage_percent": 0,
      "frame_latency_ms": {
        "p50": 0,
        "p95": 0,
        "p99": 0
      },
      "frame_rate_fps": {
        "achieved": 0,
        "max": 0,
        "min": 0,
        "target": 30
      },
      "frames": {
        "corrupted": 0,
        "decoded": 0,
        "dropped": 0,
        "key_frames_decoded": 0,
        "received": 0
      },
      "ice_connection_state": "unknown",
      "jitter_ms": {
        "audio": 0,
        "video": 0
      },
      "memory_mb": {
        "baseline": 0
      },
      "network": {
        "available_outgoing_bitrate_kbps": 0,
        "bytes_received": 0,
        "bytes_sent": 0,
        "rtt_ms": 0
      },
      "packets": {
        "audio_lost": 0,
        "audio_received": 0,
        "loss_percent": 0,
        "video_lost": 0,
        "video_received": 0
      },
      "resolution": {
        "height": 0,
        "width": 0
      }
    }
  },
  "startup_timing": {
    "phase_summary": {
      "fastest_ms": 1,
      "fastest_phase": "shm_setup",
      "slowest_ms": 17813,
      "slowest_phase": "pulseaudio_start"
    },
    "phases": [
      {
        "duration_ms": 1,
        "name": "shm_setup",
        "percentage": 0.0056132474
      },
      {
        "duration_ms": 3,
        "name": "scale_to_zero_disable",
        "percentage": 0.016839743
      },
      {
        "duration_ms": 24,
        "name": "user_dirs_setup",
        "percentage": 0.13471794
      },
      {
        "duration_ms": 26,
        "name": "log_aggregator_start",
        "percentage": 0.14594443
      },
      {
        "duration_ms": 185,
        "name": "supervisord_start",
        "percentage": 1.0384507
      },
      {
        "duration_ms": 2366,
        "name": "xorg_start",
        "percentage": 13.280943
      },
      {
        "duration_ms": 4733,
        "name": "mutter_start",
        "percentage": 26.5675
      },
      {
        "duration_ms": 6910,
        "name": "dbus_start",
        "percentage": 38.78754
      },
      {
        "duration_ms": 12901,
        "name": "chromium_start",
        "percentage": 72.416504
      },
      {
        "duration_ms": 15366,
        "name": "neko_start",
        "percentage": 86.25316
      },
      {
        "duration_ms": 17657,
        "name": "kernel_api_start",
        "percentage": 99.113106
      },
      {
        "duration_ms": 17813,
        "name": "pulseaudio_start",
        "percentage": 99.98878
      }
    ],
    "total_startup_time_ms": 17815
  }
}
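
In this sample the `duration_ms` values increase monotonically, the last one (17813) nearly equals `total_startup_time_ms` (17815), and the percentages add up to far more than 100. That pattern suggests the values are cumulative offsets from container start rather than per-phase durations. Assuming that reading is right, per-phase durations can be recovered by differencing consecutive values, as in this sketch (`perPhase` is a hypothetical helper):

```go
package main

import "fmt"

// perPhase converts cumulative end-of-phase offsets into the time
// spent in each phase by subtracting the previous offset.
func perPhase(cumulativeMs []int64) []int64 {
	out := make([]int64, len(cumulativeMs))
	var prev int64
	for i, c := range cumulativeMs {
		out[i] = c - prev
		prev = c
	}
	return out
}

func main() {
	// duration_ms values from the sample above, in phase order.
	cumulative := []int64{1, 3, 24, 26, 185, 2366, 4733, 6910, 12901, 15366, 17657, 17813}
	fmt.Println(perPhase(cumulative))
}
```

For the sample this yields 1, 2, 21, 2, 159, 2181, 2367, 2177, 5991, 2465, 2291, and 156 ms, which sum to 17813 ms.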

This PR introduces a benchmarking suite that measures the performance of the system's core components. The suite is accessible via a new GET /dev/benchmark API endpoint and covers startup timing, CDP proxy performance, WebRTC streaming, and recording overhead.

Key Changes:

New Benchmarking API & Framework:
- Adds a GET /dev/benchmark endpoint to orchestrate tests and return detailed metrics.
- Introduces a new benchmarks library in the server for running tests on CDP, WebRTC, recording, and startup, along with CPU/memory monitoring utilities.
- Updates openapi.yaml to define the new endpoint and its extensive response schemas.
Container Startup Instrumentation:
- The wrapper.sh script now instruments each phase of the container's startup (Xorg, Mutter, Chromium, etc.), exporting detailed timing data to /tmp/kernel_startup_timing.json.
- The API reads this file to include a full startup performance analysis in the benchmark report.
CDP Proxy Performance:
- Implements a new CDP benchmark that measures throughput and latency for various operations, comparing direct-to-Chromium performance against our proxy to quantify overhead.
WebRTC Client-Side Stats:
- The web client now features a WebRTCStatsCollector to gather detailed streaming metrics like frame rate, bitrate, and packet loss.
- These stats are emitted to the backend via a new benchmark/webrtc_stats WebSocket event for real-time analysis.
Recording Profiling:
- The FFmpegRecorder now captures stderr, allowing a new RecordingProfiler to parse this data and measure the performance impact of video recording, including encoding lag and frames dropped.

Next Steps:

onkernel/neko
- Merge PR (Benchmark tools [neko])
- Release tag v3.0.8-v1.3.1
onkernel/kernel-images
- Merge this PR (Benchmark tools (rev2) [kernel-images]). Notes:
  - The Dockerfile is already set to ghcr.io/onkernel/neko/base:3.0.8-v1.3.1.
  - Once the onkernel/neko v3.0.8-v1.3.1 tag is released and its build completes, this image will pick up the new neko build with no further changes.

[ @rgarcia ]

Description generated by Mesa.

images/chromium-headful/wrapper.sh

server/lib/benchmarks/cpu_linux.go

server/lib/benchmarks/recording_profiler.go

cursor · 2025-11-20T05:59:48Z

server/lib/benchmarks/recording_profiler.go

+
+	p.logger.Debug("measured FPS from neko stats", "fps", stats.FrameRateFPS.Achieved)
+	return stats.FrameRateFPS.Achieved
+}


Bug: FPS measurement depends on potentially stale file

The measureCurrentFPS function reads from /tmp/neko_webrtc_benchmark.json without verifying the file's freshness or whether neko is actively writing stats. If neko hasn't started exporting stats yet or the file contains stale data from a previous run, the FPS measurements will be incorrect. The function should verify the timestamp in the JSON matches the current measurement window or wait for neko to initialize stats export.

mesa-dot-dev

Performed full review of 3dfcec2...c1be70f

Analysis

Critical Calculation Errors: The wrapper.sh startup timing incorrectly calculates cumulative rather than phase durations, and the CPU percentage calculation in cpu_linux.go uses an invalid formula, both producing completely misleading benchmark results.
Memory Measurement Flaws: Current implementation using Go's runtime.MemStats.Alloc is unreliable as it only measures Go heap (not process RSS), can report negative values after GC, and misses subprocess memory usage.
Parameter Inconsistencies: Duration parameter defined in API is ignored by all benchmark implementations, creating dead code and misleading API behavior.
Reliability Issues: Short CPU sampling windows (100ms) will produce highly variable results, and the packet loss calculation has an edge case that could incorrectly report 0% loss instead of 100% in certain scenarios.
Resource Management Concerns: CDP sessions create targets without robust cleanup on error paths, and benchmark implementations rely on file-based communication without handling corrupted files.

20 files reviewed | 0 comments

server/lib/benchmarks/startup_timing.go

server/lib/benchmarks/recording_profiler.go

raiden-staging · 2025-11-21T15:49:15Z

(updated the json output in the PR description, as it had the previous version without webrtc metrics)

raiden-staging added 30 commits November 20, 2025 05:46

[benchmark] add benchmark endpoint and cdp runtime benchmark implemen…

edf0455

…tation use npm in server Makefile to install @apiture/openapi-down-convert

[benchmark] instrument wrapper to capture and export container startu…

19d3e61

…p timing and phases expose startup timing and add screenshot/cdp/recording result conversions in api benchmarks

[benchmark] benchmarks: use fixed-duration component benchmarks and r…

b04eaf7

…eport actual elapsed time remove duration param from api/runbenchmark and per-component runners; compute elapsed_seconds from start time and convert to oapi (cdp 5s, recording 10s, screenshot fixed)

[benchmark] remove ffmpeg-downloader stage from chromium-headful dock…

699e65b

…erfile

[benchmark] remove screenshot benchmark support and related converter…

d47c864

…s/types/api simplify webrtc neko stats collection by removing trigger call and shortening wait

[benchmark] integrate neko websocket trigger and client for webrtc be…

dd52dc8

…nchmarks add fps measurement to recording profiler and convert/read neko stats; tidy cdp runtime result comments

[benchmark] adjust unikernel scripts and trim openapi benchmark docs

6687d31

remove erofs -b flag and drop --image arg from kraft volume import; simplify benchmark endpoint/schema descriptions

[benchmark] lower vcpu count to 2 for chromium-headful unikernel

7e510b7

[benchmark] add webrtc stats collector and stream benchmark payloads …

a8f96f1

…via websocket start/stop collection in base client; add benchmark event and payload types

[benchmark] wrap benchmark webrtc stats into payload field when sendi…

0f109e1

…ng over websocket

[benchmark] bench: make cdp scenarios session-aware and switch to gen…

d61b629

…erated commands initialize cdp session per worker, use sendCDPCommand with unique ids, improve error handling and stats collection

[benchmark] refactor cdp runtime benchmark to benchmark endpoints ind…

6de4399

…ividually and simplify results use total throughput ops per sec, embed scenarios in endpoint results, add benchmarkEndpoint and update api conversions

[benchmark] refactor cdp runtime benchmark and implement mixed-worklo…

a3a9f74

…ad runner introduce scenarioDef and scenarioStats, add runMixedWorkload; make benchmarkEndpoint accept duration and use 5s default; improve attachToPageTarget logging and domain enabling

[benchmark] refactor cdp runtime benchmark: create and attach a new p…

8609538

…age target per endpoint, inline target lifecycle and cleanup (detach/close) add scenario success counter, remove requiresPage/attachToPageTarget flow, and fix worker msgID calculation

[benchmark] cdp: remove flatten from Target.attachToTarget and add ve…

97aadb4

…rbose diagnostics log full attachToTarget response, report missing/wrong sessionId and add getMapKeys helper

[benchmark] simplify cdp benchmark by removing target creation/attach…

16e110e

… and session handling add sendCDPCommandSimple (sessionless) and update workloads/tests accordingly

[benchmark] fetch page websocket url from /json instead of /json/version

d667530

select the first page target's webSocketDebuggerUrl and return clear errors on failure

[benchmark] run mixed cdp workload with per-worker websocket connections

20e54f2

move domain enabling into each worker, use /json/version to fetch debugger url, and make sendCDPCommandSimple loop until it receives the matching id (with safety limit)

[benchmark] refactor cdp runtime benchmark: session-based workers, de…

5fc9607

…terministic scenarios, improved stats and error sampling api: expose optional session counts, scenario failure counts and error samples; add optionalInt helper and extend default benchmark duration

[benchmark] timebox scenarios and record per-scenario duration

5867512

add navigation helpers and two navigation scenarios; increase timeouts and fix throughput calc

[benchmark] run scenarios in multiple rounds and increase default ben…

8ad7301

…chmark duration to 15s; compute per-scenario duration across rounds with 400ms min improve per-run accounting (attempts/failures/latency), extend run timeout to 4s, and detach session before closing websocket

[benchmark] simplify cdp runtime benchmark and session lifecycle

a0a51c1

run scenarios sequentially (concurrency=1), create/close session per scenario, increase timeouts & rounds, close leftover targets on session close and reset rootID on navigation

[benchmark] make cdp runtime benchmark sequential and run iterations …

10199f0

…per scenario increase benchmark duration and timeouts; replace scenarioRounds with scenarioIterations

[benchmark] support per-scenario duration/iterations/timeouts and rew…

5682cac

…ork iteration loop increase default benchmark duration to 40s and drop Network.enable from enabled domains

[benchmark] record attempt_count and duration_seconds in cdp scenario…

5a7a7a8

… results and api tune cdpruntime: bump timeouts/durations/iterations, add page warmup, extend waitForReady retries/polling, and increase websocket read limit

[benchmark] benchmarks/cdp: track cdp events, add scenario type and n…

69121c9

…etwork fetch-burst; expose event counts/throughput in results

[benchmark] include scenario type and event metrics in cdp benchmark …

caaf5ce

…api output use awaitPromise for Runtime.evaluate and robustly handle numeric result types; add optionalString helper

[benchmark] sync with latest

3b5cb7d

[benchmark] update neko base version

c1be70f

cursor bot reviewed Nov 20, 2025

mesa-dot-dev bot reviewed Nov 20, 2025

[benchmark] mem tracking , phases deltas , system metrics

9f88acf

cursor bot reviewed Nov 20, 2025

server/lib/benchmarks/startup_timing.go (resolved)

server/lib/benchmarks/recording_profiler.go (resolved)

[benchmark] NaN guards

78fbc17
