Benchmark tools [kernel-images] #89

raiden-staging · 2025-11-03T05:41:25Z

Adds benchmark tools that measure CDP proxy latency and throughput, WebRTC live-view quality, recording overhead, and startup timing.

curl "http://localhost:444/dev/benchmark?components=all"

Example output (from a local run):

{
  "elapsed_seconds":23.927925,
  "errors":[],
  "results":{
    "cdp":{
      "concurrent_connections":5,
      "direct_endpoint":{
        "endpoint_url":"http://localhost:9223",
        "latency_ms":{"p50":0.16,"p95":0.346,"p99":0.586},
        "scenarios":[
          {"category":"Runtime","description":"Evaluate simple JavaScript expression","latency_ms":{"p50":0.159,"p95":0.328,"p99":0.568},"name":"Runtime.evaluate","operation_count":7303,"success_rate":100,"throughput_ops_per_sec":1460.5398}
          // omitted ...
        ],
        "throughput_msgs_per_sec":26280.918
      },
      "latency_ms":{"p50":0.352,"p95":0.576,"p99":0.843},
      "memory_mb":{"baseline":1.1285477,"per_connection":0.7372101},
      "message_size_bytes":{"avg":100,"max":200,"min":50},
      "proxied_endpoint":{
        "endpoint_url":"http://localhost:9222",
        "latency_ms":{"p50":0.352,"p95":0.576,"p99":0.843},
        "scenarios":[
          {"category":"Runtime","description":"Evaluate simple JavaScript expression","latency_ms":{"p50":0.349,"p95":0.573,"p99":0.907},"name":"Runtime.evaluate","operation_count":3602,"success_rate":100,"throughput_ops_per_sec":720.31824}
          // omitted ...
        ],
        "throughput_msgs_per_sec":12958.929
      },
      "proxy_overhead_percent":50.690727,
      "throughput_msgs_per_sec":12958.929
    },
    "recording":{
      "avg_encoding_lag_ms":67.676765,
      "concurrent_recordings":1,
      "cpu_overhead_percent":0,
      "disk_write_mbps":0.015905762,
      "frame_rate_impact":{"before_recording_fps":22.500116,"during_recording_fps":23.468586,"impact_percent":-4.304292},
      "frames_captured":131,
      "frames_dropped":0,
      "memory_overhead_mb":0.050559998
    },
    "webrtc_live_view":{
      "bitrate_kbps":{"audio":0,"total":690.22424,"video":690.22424},
      "codecs":{"audio":"unknown","video":"video/VP8"},
      "concurrent_viewers":1,
      "connection_state":"connected",
      "cpu_usage_percent":0,
      "frame_latency_ms":{"p50":40.002,"p95":1995.6,"p99":1995.6},
      "frame_rate_fps":{"achieved":22.222351,"max":25.005001,"min":0.50110245,"target":30},
      "frames":{"corrupted":0,"decoded":400,"dropped":20,"key_frames_decoded":16,"received":400},
      "ice_connection_state":"connected",
      "jitter_ms":{"audio":0,"video":1},
      "memory_mb":{"baseline":19.667969,"per_viewer":19.667969},
      "network":{"available_outgoing_bitrate_kbps":0,"bytes_received":0,"bytes_sent":0,"rtt_ms":1},
      "packets":{"audio_lost":0,"audio_received":0,"loss_percent":0,"video_lost":0,"video_received":1502},
      "resolution":{"height":1080,"width":1920}
    }
  },
  "startup_timing":{
    "phase_summary":{"fastest_ms":2,"fastest_phase":"shm_setup","slowest_ms":19259,"slowest_phase":"pulseaudio_start"},
    "phases":[
      {"duration_ms":2,"name":"shm_setup","percentage":0.010383677},
      {"duration_ms":5,"name":"scale_to_zero_disable","percentage":0.025959192},
      {"duration_ms":32,"name":"user_dirs_setup","percentage":0.16613883},
      {"duration_ms":35,"name":"log_aggregator_start","percentage":0.18171434},
      {"duration_ms":313,"name":"supervisord_start","percentage":1.6250454},
      {"duration_ms":2620,"name":"xorg_start","percentage":13.602616},
      {"duration_ms":5220,"name":"mutter_start","percentage":27.101397},
      {"duration_ms":7405,"name":"dbus_start","percentage":38.445564},
      {"duration_ms":12909,"name":"chromium_start","percentage":67.02144},
      {"duration_ms":16095,"name":"neko_start","percentage":83.56264},
      {"duration_ms":19069,"name":"kernel_api_start","percentage":99.00317},
      {"duration_ms":19259,"name":"pulseaudio_start","percentage":99.98962}
    ],
    "total_startup_time_ms":19261
  },
  "system":{"arch":"amd64","cpu_count":16,"memory_total_mb":8,"os":"linux"},
  "timestamp":"2025-10-31T18:55:12.601706786-07:00"
}
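As a reading aid for the CDP section of the output, here is a minimal Go sketch that decodes a trimmed copy of the response and recomputes the proxy overhead from the two throughput figures. The struct and function names are made up for illustration, and the formula is an assumption: the relative throughput drop happens to reproduce the reported 50.69, but the PR text does not state how `proxy_overhead_percent` is derived.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// endpointResult mirrors only the endpoint fields this sketch needs.
type endpointResult struct {
	EndpointURL          string  `json:"endpoint_url"`
	ThroughputMsgsPerSec float64 `json:"throughput_msgs_per_sec"`
}

// benchmarkResponse is a hypothetical, partial view of the /dev/benchmark body.
type benchmarkResponse struct {
	Results struct {
		CDP struct {
			Direct               endpointResult `json:"direct_endpoint"`
			Proxied              endpointResult `json:"proxied_endpoint"`
			ProxyOverheadPercent float64        `json:"proxy_overhead_percent"`
		} `json:"cdp"`
	} `json:"results"`
}

// sample is a trimmed copy of the example output, embedded so the sketch runs offline.
const sample = `{"results":{"cdp":{
  "direct_endpoint":{"endpoint_url":"http://localhost:9223","throughput_msgs_per_sec":26280.918},
  "proxied_endpoint":{"endpoint_url":"http://localhost:9222","throughput_msgs_per_sec":12958.929},
  "proxy_overhead_percent":50.690727}}}`

// parseBenchmark decodes the subset of fields declared above and ignores the rest.
func parseBenchmark(data []byte) (benchmarkResponse, error) {
	var r benchmarkResponse
	err := json.Unmarshal(data, &r)
	return r, err
}

// throughputOverheadPercent is the relative throughput drop of the proxied
// endpoint compared with the direct one (assumed formula, see lead-in).
func throughputOverheadPercent(direct, proxied float64) float64 {
	if direct <= 0 {
		return 0
	}
	return (direct - proxied) / direct * 100
}

func main() {
	r, err := parseBenchmark([]byte(sample))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	cdp := r.Results.CDP
	recomputed := throughputOverheadPercent(
		cdp.Direct.ThroughputMsgsPerSec, cdp.Proxied.ThroughputMsgsPerSec)
	fmt.Printf("reported=%.3f recomputed=%.3f\n", cdp.ProxyOverheadPercent, recomputed)
}
```

Against a running instance, the same structs could be filled from the body of an HTTP GET to the endpoint shown in the curl command.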

Next steps:

onkernel/neko
- Merge the PR (Benchmark tools [neko])
- Release tag v3.0.8-v1.3.1
onkernel/kernel-images
- Merge this PR (Benchmark tools [kernel-images]). Notes:
  - The Dockerfile already points to ghcr.io/onkernel/neko/base:3.0.8-v1.3.1
  - Once the onkernel/neko v3.0.8-v1.3.1 tag is released and its build completes, the image picks up the new neko build with no further changes

[ @Sayan- @rgarcia ]

Note

Introduces a full benchmarking suite (CDP, WebRTC, recording) exposed via /dev/benchmark, adds client-side WebRTC stats reporting, container/server startup timing, and regenerates OpenAPI and proxy integrations.

API/Backend:
- New Endpoint: Adds GET /dev/benchmark returning consolidated results (BenchmarkResults) with system info, per-component metrics, and startup timing.
- Benchmarks: Implements CDP proxy runtime benchmark (proxied vs direct, scenarios, throughput/latency, proxy overhead), WebRTC benchmark (reads client-exported stats with fallback), and recording profiler (parses ffmpeg stderr; CPU/mem impact).
- Startup Timing: Tracks server init phases and surfaces container timing from /tmp/kernel_startup_timing.json.
- OpenAPI: Extends spec/types for benchmark models and new endpoint; regenerates client/server code; bumps Playwright exec default timeout to 60s.
- DevTools Proxy: Exposes /json/version and adds bench tests.
- Recorder: Captures ffmpeg stderr and exposes GetStderr() for profiling.
Client (WebRTC):
- Stats Collector: Adds webrtc-stats-collector and sends periodic benchmark/webrtc_stats over WS; integrates start/stop with ICE state; updates events/messages.
Runtime/Infra:
- Wrapper: Adds detailed startup phase logging/export and readiness waits.
- Image: Bumps base to ghcr.io/onkernel/neko/base:3.0.8-v1.3.1 and wires artifacts.
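To make the startup timing fields concrete, the sketch below recomputes each phase's `percentage` as `duration_ms / total_startup_time_ms * 100`, which is consistent with the sample output (for example 2 / 19261 gives about 0.0104). The `phase` type and `withPercentages` helper are hypothetical names, and the formula is inferred from the numbers rather than taken from the PR's code. In the sample the percentages add up to well over 100, so each `duration_ms` looks like elapsed time since startup began rather than a per-phase duration; that reading is also an inference.

```go
package main

import "fmt"

// phase mirrors one entry of startup_timing.phases in the example output.
type phase struct {
	Name       string  `json:"name"`
	DurationMs int64   `json:"duration_ms"`
	Percentage float64 `json:"percentage"`
}

// withPercentages returns a copy of phases with Percentage filled in as
// DurationMs / totalMs * 100. A non-positive total leaves percentages at 0.
func withPercentages(phases []phase, totalMs int64) []phase {
	out := make([]phase, len(phases))
	copy(out, phases)
	if totalMs <= 0 {
		return out
	}
	for i := range out {
		out[i].Percentage = float64(out[i].DurationMs) / float64(totalMs) * 100
	}
	return out
}

func main() {
	// Three phases and the total taken from the example output.
	phases := []phase{
		{Name: "shm_setup", DurationMs: 2},
		{Name: "xorg_start", DurationMs: 2620},
		{Name: "pulseaudio_start", DurationMs: 19259},
	}
	for _, p := range withPercentages(phases, 19261) {
		fmt.Printf("%-18s %6d ms  %.4f%%\n", p.Name, p.DurationMs, p.Percentage)
	}
}
```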

Written by Cursor Bugbot for commit 8cba71d. This will update automatically on new commits.

mesa-dot-dev · 2025-11-03T05:42:26Z

Mesa Description

^{Description generated by Mesa. Update settings}

cursor · 2025-11-03T05:43:39Z

server/lib/benchmarks/cpu_linux.go

+	}
+
+	return (float64(deltaTotal) / 100.0) // Convert clock ticks to percentage
+}


Bug: Incorrect CPU percent calculation in benchmarks

The CalculateCPUPercent function incorrectly calculates CPU usage. It divides the change in total CPU clock ticks (deltaTotal) by an arbitrary 100.0. This doesn't account for elapsed time or total available CPU time, leading to meaningless CPU usage percentages in benchmarks.
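One way to address this is to normalise the tick delta by both the elapsed wall-clock time and the number of CPUs. The following is a minimal sketch, not the PR's code: `cpuPercent` and its parameters are hypothetical, and the tick rate (USER_HZ, commonly 100 on Linux) is passed in explicitly as an assumption rather than hardcoded.

```go
package main

import (
	"fmt"
	"time"
)

// cpuPercent converts a delta of CPU clock ticks into a utilisation percentage.
//
//	deltaTicks     ticks consumed during the sampling window
//	elapsed        wall-clock length of the sampling window
//	ticksPerSecond kernel tick rate (USER_HZ); 100 is common but is an assumption
//	numCPU         number of CPUs the result is normalised against
//
// The result is 100 when all numCPU cores were busy for the whole window.
func cpuPercent(deltaTicks uint64, elapsed time.Duration, ticksPerSecond float64, numCPU int) float64 {
	if elapsed <= 0 || ticksPerSecond <= 0 || numCPU <= 0 {
		return 0
	}
	cpuSeconds := float64(deltaTicks) / ticksPerSecond     // CPU time actually used
	availableSeconds := elapsed.Seconds() * float64(numCPU) // CPU time that was available
	return cpuSeconds / availableSeconds * 100
}

func main() {
	// 250 ticks at 100 Hz is 2.5 CPU-seconds; over a 5 s window that is
	// half of one core, or an eighth of four cores.
	fmt.Printf("1 core:  %.1f%%\n", cpuPercent(250, 5*time.Second, 100, 1))
	fmt.Printf("4 cores: %.1f%%\n", cpuPercent(250, 5*time.Second, 100, 4))
}
```

Whether to normalise by core count is a design choice: dividing by `numCPU` caps the value at 100, while omitting it reports per-core percentages that can exceed 100 on multi-core machines.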

mesa-dot-dev

Performed full review of b51da91...8cba71d

Analysis

Incorrect CPU Percentage Calculation: The CalculateCPUPercent function in server/lib/benchmarks/cpu_linux.go uses a flawed calculation method that will produce entirely inaccurate CPU metrics, invalidating benchmark results.
Misleading Hardcoded Fallback Values: When WebRTC stats collection fails, the system returns hardcoded estimates (28.0 FPS, 35ms latency) instead of errors, masking real failures and potentially providing false confidence in results.
Race Condition in WebSocket Communication: The code uses non-null assertions when sending WebSocket data which could fail if the connection closes between the connection check and the send operation.
Inconsistent API Design and Silent Error Handling: Several functions ignore errors from stat collection and have confusing signatures (like accepting duration parameters that are ignored).
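The point about hardcoded fallback values can be illustrated with a small sketch in which a missing stats report becomes an error instead of an estimate such as 28.0 FPS. All names here (`webrtcStats`, `statsSource`, `collectWebRTC`, `ErrNoWebRTCStats`) are hypothetical and do not come from the PR; the idea is only that a caller could append the returned error to the `errors` array of the benchmark response.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoWebRTCStats signals that no client has reported stats yet.
var ErrNoWebRTCStats = errors.New("no WebRTC stats reported by any client")

// webrtcStats holds the two values the review mentions as being estimated.
type webrtcStats struct {
	FrameRateFPS      float64
	FrameLatencyP50Ms float64
}

// statsSource is a hypothetical abstraction over wherever client stats are read from.
type statsSource func() (*webrtcStats, error)

// collectWebRTC returns real stats or an error; it never substitutes estimates.
func collectWebRTC(src statsSource) (*webrtcStats, error) {
	stats, err := src()
	if err != nil {
		return nil, fmt.Errorf("collect webrtc stats: %w", err)
	}
	if stats == nil {
		return nil, ErrNoWebRTCStats
	}
	return stats, nil
}

func main() {
	// A source that has nothing to report, as when no viewer is connected.
	empty := func() (*webrtcStats, error) { return nil, nil }

	var reportErrors []string
	if _, err := collectWebRTC(empty); err != nil {
		// Surface the failure to the caller instead of hiding it.
		reportErrors = append(reportErrors, err.Error())
	}
	fmt.Println("errors:", reportErrors)
}
```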

^{20 files reviewed | 0 comments | Edit Agent Settings}

rgarcia · 2025-11-04T21:12:12Z

images/chromium-headful/client/src/neko/base.ts

          this.onConnected()
+          // Start WebRTC stats collection
+          if (this._peer) {
+            this._webrtcStatsCollector.start(this._peer)


@raiden-staging just double checking my understanding--the client would always be collecting these stats and sending them to the server, and then when you hit the /dev/benchmark endpoint it observes the incoming metrics from all connected webrtc clients and computes stats?
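If the flow works the way this question describes it, the server side could look roughly like the sketch below: each client's latest report is kept in memory and only aggregated when the benchmark endpoint is hit. This is an illustration of the described design, not the PR's implementation; `statsStore`, `clientStats`, `Report`, and `Snapshot` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// clientStats is a hypothetical subset of what a client might report.
type clientStats struct {
	FPS float64
}

// statsStore keeps only the most recent report per connected client.
type statsStore struct {
	mu     sync.Mutex
	latest map[string]clientStats
}

func newStatsStore() *statsStore {
	return &statsStore{latest: make(map[string]clientStats)}
}

// Report would be called from the WebSocket handler on every periodic stats message.
func (s *statsStore) Report(clientID string, st clientStats) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.latest[clientID] = st
}

// Snapshot would be called by the benchmark handler. It returns the number of
// reporting clients and their mean frame rate (0 when nobody has reported).
func (s *statsStore) Snapshot() (viewers int, avgFPS float64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(s.latest) == 0 {
		return 0, 0
	}
	var sum float64
	for _, st := range s.latest {
		sum += st.FPS
	}
	return len(s.latest), sum / float64(len(s.latest))
}

func main() {
	store := newStatsStore()
	store.Report("viewer-1", clientStats{FPS: 20})
	store.Report("viewer-2", clientStats{FPS: 30})
	store.Report("viewer-1", clientStats{FPS: 22}) // newer report replaces the old one

	viewers, avg := store.Snapshot()
	fmt.Printf("viewers=%d avg_fps=%.1f\n", viewers, avg)
}
```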

raiden-staging added 2 commits November 3, 2025 06:06

benchmark tools

b69e21f

benchmark tools*

8cba71d
