@raiden-staging commented Dec 15, 2025

Aims

Rationale:

  • Media-first projects are difficult to test and automate remotely due to infrastructure limitations and a lack of tooling.

  • Realtime audiovisual AI agents: use cases are highly restricted by automation/API limitations in realtime contexts and by setup difficulty. This constrains the current generation of meeting AI agents to passive, low-level features like transcription and summaries. A remote browser makes for the perfect environment to lift these restrictions.

  • This PR opens the door to a wider range of use cases.

    • GUI-first apps where APIs are limited or unavailable → remote-browser automation, operated by virtual agents.

The new capabilities enable:

  • livestreams (audio | video) and virtual input sources (microphone | ~camera)
  • from different sources (file | stream url | chunks)
  • across different protocols (rtmp | webrtc | websockets) for write/consumption
  • websockets in/out tested extensively for realtime capabilities

Limitations / Current Workarounds

  • v4l2loopback, which enables virtual video inputs, is unusable due to container kernel limitations (see this issue).

    • A source stream or file can be used as a fake camera source in Chrome ({ "type":"stream"|"file","url":"..." }), but this isn't truly realtime: it is a Chrome test feature that replays the source from the start on any load/refresh event.
    • The current workaround is a pipe for the virtual feed that can be consumed from the browser, which e.g. enables screensharing the virtual video input. The live feed is automatically set up at http://localhost:444/input/devices/virtual/feed?fit=cover to simplify consuming it.
  • These limitations do not apply to audio:
    writing to and exposing audio devices works properly without additional workarounds.

Next Steps

  • SDKs to wrap all virtual inputs and output livestreams, managing and syncing media in realtime with simple function calls.
    You can check samples/agent-live-demo for a prototype that demonstrates everything at once.

  • Fork and build Chromium to resolve the virtual input device limitations by extending the --use-fake-device-for-media-stream --use-file-for-fake-video-capture=source.y4m relayer, which was designed for mock playback rather than live input, so that it can be consumed in realtime (or add alternative methods).


Examples

1.1 Virtual Video Input - WebSocket Feed

Real-time video chunks via WebSocket.
Uses MPEG-1 video in MPEG-TS container for JSMpeg playback.

Configure WebSocket Video Input

curl -s http://localhost:444/input/devices/virtual/configure \
  -H "Content-Type: application/json" \
  -d '{
    "video": {
      "type": "socket",
      "format": "mpegts",
      "width": 1280,
      "height": 720,
      "frame_rate": 30
    }
  }' | jq

Expected Response:

{
  "state": "running",
  "video": {
    "type": "socket",
    "format": "mpegts",
    "width": 1280,
    "height": 720,
    "frame_rate": 30
  },
  "ingest": {
    "video": {
      "protocol": "socket",
      "format": "mpegts",
      "url": "ws://localhost:10001/input/devices/virtual/socket/video"
    }
  }
}

Encode Source Video to MPEG-1

# Convert any video to MPEG-1 (required for JSMpeg)
ffmpeg -i input.mp4 -c:v mpeg1video -b:v 1500k -r 25 -f mpegts output.ts

Feed Video Chunks (Node.js)

import { createReadStream } from 'node:fs';
import WebSocket from 'ws';

const ws = new WebSocket('ws://localhost:444/input/devices/virtual/socket/video');
const delay = ms => new Promise(r => setTimeout(r, ms));

ws.on('open', async () => {
  for await (const chunk of createReadStream('video.ts', { highWaterMark: 64*1024 })) {
    ws.send(chunk);
    await delay(35); // ~realtime pacing
  }
  console.log('Streaming... socket left open for more chunks');
});
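
The file-based sender above replays pre-encoded data. For genuinely live content, here is a sketch that pipes an ffmpeg encode straight into the socket instead, assuming ffmpeg is on PATH (the lavfi test pattern stands in for any live source; flags mirror the MPEG-1 encoding command above):

import { spawn } from 'node:child_process';
import WebSocket from 'ws';

const ws = new WebSocket('ws://localhost:444/input/devices/virtual/socket/video');

ws.on('open', () => {
  const ffmpeg = spawn('ffmpeg', [
    '-re',                                                // pace output at native frame rate
    '-f', 'lavfi', '-i', 'testsrc=size=1280x720:rate=30', // synthetic live source
    '-c:v', 'mpeg1video', '-b:v', '1500k',
    '-f', 'mpegts', 'pipe:1',                             // MPEG-TS to stdout
  ]);
  ffmpeg.stdout.on('data', chunk => ws.send(chunk));      // forward chunks as they are encoded
  ffmpeg.on('exit', () => ws.close());
});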

Real-time Behavior

  • Feed page shows video only when chunks arrive
  • Refresh = no cached replay; shows "Loading..." until new chunks
  • Stop sending = black screen; resume = video resumes
  • This is true real-time: no buffering of past data

Preview Feed

Open in browser: http://localhost:444/input/devices/virtual/feed?fit=cover
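
Or verify liveness headlessly: a hedged sketch that asks the documented feed/socket/info endpoint for the feed's WebSocket URL and counts incoming bytes (the url field name in the response is an assumption):

import WebSocket from 'ws';

// Where does the feed socket live? (response field name assumed)
const info = await (await fetch('http://localhost:444/input/devices/virtual/feed/socket/info')).json();
const ws = new WebSocket(info.url);

let bytes = 0;
ws.on('message', chunk => { bytes += chunk.length; });
// A growing count confirms chunks are flowing right now; a flat count means idle.
setInterval(() => console.log(`feed bytes received: ${bytes}`), 1000);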


1.2 Virtual Video Input - WebRTC Feed

Real-time video via WebRTC (VP8/VP9 in IVF format internally).

Configure WebRTC Video Input

curl -s http://localhost:444/input/devices/virtual/configure \
  -H "Content-Type: application/json" \
  -d '{"video": {"type": "webrtc"}}' | jq

Expected Response:

{
  "state": "running",
  "video": {"type": "webrtc"},
  "ingest": {
    "video": {
      "protocol": "webrtc",
      "format": "ivf",
      "url": "http://localhost:10001/input/devices/virtual/webrtc/offer"
    }
  }
}

Send Video via WebRTC (Python)

import asyncio, aiohttp
from aiortc import RTCPeerConnection, RTCSessionDescription
from aiortc.contrib.media import MediaPlayer

async def main():
    pc = RTCPeerConnection()
    player = MediaPlayer("video.mp4")
    if player.video:
        pc.addTrack(player.video)

    offer = await pc.createOffer()
    await pc.setLocalDescription(offer)

    async with aiohttp.ClientSession() as s:
        resp = await s.post(
            "http://localhost:444/input/devices/virtual/webrtc/offer",
            json={"sdp": pc.localDescription.sdp}
        )
        answer = await resp.json()

    await pc.setRemoteDescription(
        RTCSessionDescription(sdp=answer["sdp"], type="answer")
    )
    print("Streaming...")
    await asyncio.Future()

asyncio.run(main())

Real-time Factor

  • WebRTC provides lowest latency (~100-300ms typical)
  • Feed page refreshes show current frame, not cached history
  • Track stops = black screen; track resumes = video resumes

1.3 Virtual Audio Input - WebSocket Feed

Real-time audio chunks via WebSocket (MP3 format).

Configure WebSocket Audio Input (to Virtual Mic)

curl -s http://localhost:444/input/devices/virtual/configure \
  -H "Content-Type: application/json" \
  -d '{
    "audio": {
      "type": "socket",
      "format": "mp3",
      "destination": "microphone"
    }
  }' | jq

Expected Response:

{
  "state": "running",
  "audio": {
    "type": "socket",
    "format": "mp3",
    "destination": "microphone"
  },
  "ingest": {
    "audio": {
      "protocol": "socket",
      "format": "mp3",
      "destination": "microphone",
      "url": "ws://localhost:10001/input/devices/virtual/socket/audio"
    }
  }
}

Visual Examples

Feed Page States

Streaming State - Video chunks actively being received:

[screenshot: feed page in streaming state, annotated]

Shows test pattern video being streamed via WebSocket. The feed displays video frames only when chunks are actively arriving.

Idle State - No video configured or chunks stopped:

[screenshot: feed page in stopped/idle state, annotated]

After stopping or refreshing with no active stream, the feed shows "No virtual video feed configured" message. No cached data is displayed.


Audio Destinations

| Destination | PulseAudio Sink | Use Case |
|---|---|---|
| microphone (default) | audio_input | Virtual mic for apps reading mic input |
| speaker | audio_output | Monitor/playback through container audio |

Feed Audio Chunks (Node.js)

import { createReadStream } from 'node:fs';
import WebSocket from 'ws';

const ws = new WebSocket('ws://localhost:444/input/devices/virtual/socket/audio');
const delay = ms => new Promise(r => setTimeout(r, ms));

ws.on('open', async () => {
  for await (const chunk of createReadStream('audio.mp3', { highWaterMark: 16*1024 })) {
    ws.send(chunk);
    await delay(50);
  }
  console.log('Audio streaming... socket open for more');
});
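
As with video, chunks need not come from a file. A minimal sketch that encodes live MP3 with ffmpeg, a sine tone standing in for TTS or any other live source (assumes ffmpeg with libmp3lame on PATH):

import { spawn } from 'node:child_process';
import WebSocket from 'ws';

const ws = new WebSocket('ws://localhost:444/input/devices/virtual/socket/audio');

ws.on('open', () => {
  const ffmpeg = spawn('ffmpeg', [
    '-re',                                                       // encode at real-time rate
    '-f', 'lavfi', '-i', 'sine=frequency=440:sample_rate=44100', // synthetic live audio
    '-c:a', 'libmp3lame', '-b:a', '128k',
    '-f', 'mp3', 'pipe:1',
  ]);
  ffmpeg.stdout.on('data', chunk => ws.send(chunk));             // heard on the virtual mic as it arrives
  ffmpeg.on('exit', () => ws.close());
});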

Example Logs (Real-time Audio Ingest)

[virtual-input] audio socket connected
[virtual-input] audio chunk received: 16384 bytes
[virtual-input] routing to microphone (audio_input sink)
[virtual-input] audio chunk received: 16384 bytes
[virtual-input] audio chunk received: 8192 bytes
[virtual-input] audio ingest idle, waiting for chunks...
[virtual-input] audio chunk received: 16384 bytes
[virtual-input] routing resumed to microphone

1.4 Virtual Audio Input - WebRTC Feed

Real-time audio via WebRTC (Opus codec).

Configure WebRTC Audio Input

curl -s http://localhost:444/input/devices/virtual/configure \
  -H "Content-Type: application/json" \
  -d '{
    "audio": {"type": "webrtc", "destination": "microphone"}
  }' | jq

Route to Speaker Instead

curl -s http://localhost:444/input/devices/virtual/configure \
  -H "Content-Type: application/json" \
  -d '{
    "audio": {"type": "webrtc", "destination": "speaker"}
  }' | jq

Send Audio via WebRTC (Python)

import asyncio, aiohttp
from aiortc import RTCPeerConnection, RTCSessionDescription
from aiortc.contrib.media import MediaPlayer

async def main():
    pc = RTCPeerConnection()
    player = MediaPlayer("audio.mp3")
    if player.audio:
        pc.addTrack(player.audio)

    offer = await pc.createOffer()
    await pc.setLocalDescription(offer)

    async with aiohttp.ClientSession() as s:
        resp = await s.post(
            "http://localhost:444/input/devices/virtual/webrtc/offer",
            json={"sdp": pc.localDescription.sdp}
        )
        answer = await resp.json()

    await pc.setRemoteDescription(
        RTCSessionDescription(sdp=answer["sdp"], type="answer")
    )
    await asyncio.Future()

asyncio.run(main())

1.5 Combined Virtual Input (Video + Audio)

WebSocket Video + Audio

curl -s http://localhost:444/input/devices/virtual/configure \
  -H "Content-Type: application/json" \
  -d '{
    "video": {"type": "socket", "format": "mpegts", "width": 1280, "height": 720},
    "audio": {"type": "socket", "format": "mp3"}
  }' | jq

Then feed both sockets simultaneously (a combined sender sketch follows):

  • Video: ws://localhost:444/input/devices/virtual/socket/video
  • Audio: ws://localhost:444/input/devices/virtual/socket/audio
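
A hedged sketch of such a combined sender, pacing both sockets from one process (files assumed pre-encoded as above: MPEG-1/TS video, MP3 audio):

import { createReadStream } from 'node:fs';
import WebSocket from 'ws';

const delay = ms => new Promise(r => setTimeout(r, ms));

// Stream one file to one ingest socket with fixed pacing.
async function feed(url, file, highWaterMark, pacingMs) {
  const ws = new WebSocket(url);
  await new Promise(resolve => ws.on('open', resolve));
  for await (const chunk of createReadStream(file, { highWaterMark })) {
    ws.send(chunk);
    await delay(pacingMs);
  }
}

// Video and audio run concurrently; pacing values match the single-stream examples.
await Promise.all([
  feed('ws://localhost:444/input/devices/virtual/socket/video', 'video.ts', 64 * 1024, 35),
  feed('ws://localhost:444/input/devices/virtual/socket/audio', 'audio.mp3', 16 * 1024, 50),
]);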

WebRTC Video + Audio

curl -s http://localhost:444/input/devices/virtual/configure \
  -H "Content-Type: application/json" \
  -d '{
    "video": {"type": "webrtc"},
    "audio": {"type": "webrtc"}
  }' | jq

Both tracks use the same WebRTC peer connection.


1.6 Livestream - WebRTC Playback

Expose container display as WebRTC stream for browser consumption.

Start WebRTC Livestream

curl -s http://localhost:444/stream/start \
  -H "Content-Type: application/json" \
  -d '{"mode": "webrtc", "id": "webrtc-live"}' | jq

Expected Response:

{
  "id": "webrtc-live",
  "mode": "webrtc",
  "ingest_url": "",
  "webrtc_offer_url": "http://localhost:10001/stream/webrtc/offer",
  "is_streaming": true,
  "started_at": "2024-01-15T10:30:00Z"
}

Connect from Browser (JavaScript)

const pc = new RTCPeerConnection();
pc.ontrack = e => {
  document.getElementById('video').srcObject = e.streams[0];
};

const offer = await pc.createOffer({ offerToReceiveVideo: true, offerToReceiveAudio: true });
await pc.setLocalDescription(offer);

const resp = await fetch('http://localhost:444/stream/webrtc/offer', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ id: 'webrtc-live', sdp: pc.localDescription.sdp })
});
const answer = await resp.json();
await pc.setRemoteDescription({ type: 'answer', sdp: answer.sdp });

Real-time Factor

  • Sub-second latency typical
  • No buffering; frame drops on slow connections

1.7 Livestream - WebSocket Audio

Stream container audio output as MP3 chunks over WebSocket.

Start Socket Audio Livestream

curl -s http://localhost:444/stream/start \
  -H "Content-Type: application/json" \
  -d '{"mode": "socket", "id": "audio-live"}' | jq

Expected Response:

{
  "id": "audio-live",
  "mode": "socket",
  "ingest_url": "",
  "websocket_url": "ws://localhost:10001/stream/socket/audio-live",
  "is_streaming": true
}

Consume Audio Stream (Node.js)

import WebSocket from 'ws';
import fs from 'node:fs';

const ws = new WebSocket('ws://localhost:444/stream/socket/audio-live');
const out = fs.createWriteStream('captured_audio.ts');

ws.on('message', chunk => out.write(chunk));
ws.on('close', () => out.end());
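
Instead of capturing to a file, the same stream can be monitored live by piping chunks into ffplay's stdin. A sketch, assuming ffplay is on PATH:

import { spawn } from 'node:child_process';
import WebSocket from 'ws';

const ws = new WebSocket('ws://localhost:444/stream/socket/audio-live');

// ffplay reads from stdin ('-') and plays as chunks arrive.
const ffplay = spawn('ffplay', ['-fflags', 'nobuffer', '-i', '-'], {
  stdio: ['pipe', 'inherit', 'inherit'],
});

ws.on('message', chunk => ffplay.stdin.write(chunk));
ws.on('close', () => ffplay.stdin.end());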

Example Logs (Audio Livestream)

[livestream] starting socket mode stream: audio-live
[livestream] capturing audio from pulse audio_output
[livestream] websocket client connected to audio-live
[livestream] streaming audio chunk: 4096 bytes
[livestream] streaming audio chunk: 4096 bytes
[livestream] client disconnected from audio-live

1.8 Livestream - RTMP (Local & Remote)

Internal RTMP Server

curl -s http://localhost:444/stream/start \
  -H "Content-Type: application/json" \
  -d '{"mode": "internal"}' | jq

Expected Response:

{
  "id": "default",
  "mode": "internal",
  "ingest_url": "rtmp://localhost:1935/live/default",
  "playback_url": "rtmp://localhost:1935/live/default",
  "is_streaming": true
}

Play with ffplay

ffplay -fflags nobuffer -i rtmp://localhost:1935/live/default

Push to Remote RTMP

curl -s http://localhost:444/stream/start \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "remote",
    "target_url": "rtmp://live.example.com/app/stream-key"
  }' | jq

1.9 Control Commands

Pause (Black Frames/Silence)

curl -X POST http://localhost:444/input/devices/virtual/pause

Resume

curl -X POST http://localhost:444/input/devices/virtual/resume

Stop All

curl -X POST http://localhost:444/input/devices/virtual/stop

Stop Livestream

curl -X POST http://localhost:444/stream/stop

API Reference

Virtual Inputs

| Endpoint | Method | Description |
|---|---|---|
| /input/devices/virtual/configure | POST | Configure video/audio virtual inputs |
| /input/devices/virtual/status | GET | Get current virtual input status |
| /input/devices/virtual/pause | POST | Pause with black frames/silence |
| /input/devices/virtual/resume | POST | Resume live media |
| /input/devices/virtual/stop | POST | Stop and release resources |
| /input/devices/virtual/feed | GET | HTML page for live preview |
| /input/devices/virtual/feed/socket/info | GET | WebSocket URL info for feed |
| /input/devices/virtual/webrtc/offer | POST | WebRTC SDP negotiation |
| /input/devices/virtual/socket/video | WS | WebSocket video ingest |
| /input/devices/virtual/socket/audio | WS | WebSocket audio ingest |

Livestream

| Endpoint | Method | Description |
|---|---|---|
| /stream/start | POST | Start livestream (internal/remote/webrtc/socket) |
| /stream/stop | POST | Stop livestream |
| /stream/list | GET | List active streams |
| /stream/webrtc/offer | POST | WebRTC SDP for livestream playback |
| /stream/socket/{id} | WS | WebSocket MPEG-TS stream |

Request/Response Schemas

VirtualInputsRequest

interface VirtualInputsRequest {
  video?: {
    type: "stream" | "file" | "socket" | "webrtc";
    url?: string;           // For stream/file types
    format?: string;        // "mpegts" for socket, "ivf" for webrtc
    width?: number;
    height?: number;
    frame_rate?: number;
  };
  audio?: {
    type: "stream" | "file" | "socket" | "webrtc";
    url?: string;
    format?: string;        // "mp3" for socket
    destination?: "microphone" | "speaker";  // Default: microphone
  };
  start_paused?: boolean;
}
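
For reference, the same configure call driven from code rather than curl; the body is shaped like VirtualInputsRequest and uses only fields documented above:

const res = await fetch('http://localhost:444/input/devices/virtual/configure', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    video: { type: 'socket', format: 'mpegts', width: 1280, height: 720, frame_rate: 30 },
    audio: { type: 'socket', format: 'mp3', destination: 'microphone' },
    start_paused: true, // black frames/silence until /input/devices/virtual/resume
  }),
});
const status = await res.json(); // a VirtualInputsStatus
console.log(status.state, status.ingest);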

VirtualInputsStatus

interface VirtualInputsStatus {
  state: "idle" | "running" | "paused";
  mode: "device" | "virtual-file";
  video_device: string;
  audio_sink: string;
  microphone_source: string;
  video?: VirtualInputVideo;
  audio?: VirtualInputAudio;
  ingest?: {
    video?: { protocol: string; format: string; url: string; };
    audio?: { protocol: string; format: string; destination: string; url: string; };
  };
  started_at?: string;
  last_error?: string;
}

StartStreamRequest

interface StartStreamRequest {
  id?: string;
  mode: "internal" | "remote" | "webrtc" | "socket";
  target_url?: string;      // Required for "remote" mode
  framerate?: number;       // 1-20 fps
}

StreamInfo

interface StreamInfo {
  id: string;
  mode: "internal" | "remote" | "webrtc" | "socket";
  ingest_url: string;
  playback_url?: string;
  websocket_url?: string;
  webrtc_offer_url?: string;
  is_streaming: boolean;
  started_at: string;
}

Video Encoding Notes

The feed page uses JSMpeg for WebSocket video playback, which requires MPEG-1 video codec.

Encoding Command

ffmpeg -i source.mp4 -c:v mpeg1video -b:v 1500k -r 25 -f mpegts output.ts

Parameters

| Parameter | Value | Notes |
|---|---|---|
| -c:v mpeg1video | MPEG-1 | Required for JSMpeg |
| -b:v 1500k | 1.5 Mbps | Adjust for quality/bandwidth |
| -r 25 | 25 fps | Match source or reduce |
| -f mpegts | MPEG-TS | Container format for streaming |

Audio Format Notes

WebSocket Audio Ingest

  • Format: MP3 chunks
  • Chunk size: 16-64 KB typical
  • Pacing: ~50ms between chunks for real-time (see the arithmetic sketch below)
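
Truly real-time pacing depends on the source bitrate; sending faster than that relies on downstream buffering. The arithmetic, as a sketch (the 128 kbps bitrate is an assumption):

// One chunk carries chunkBytes * 8 / bitrate worth of audio.
// kbps equals bits per millisecond, so the division below yields ms.
const bitrateKbps = 128;                                   // assumed encode bitrate
const chunkBytes = 16 * 1024;
const msOfAudioPerChunk = (chunkBytes * 8) / bitrateKbps;  // ~1024 ms at these values
console.log(`${msOfAudioPerChunk.toFixed(0)} ms of audio per chunk`);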

WebRTC Audio

  • Codec: Opus
  • Handled automatically by WebRTC stack

Real-time Behavior Summary

| Feature | Latency | Buffer | Refresh Behavior |
|---|---|---|---|
| WebSocket Video | ~100-500ms | None | Shows "Loading..." until chunks arrive |
| WebRTC Video | ~100-300ms | Minimal | Current frame only |
| WebSocket Audio | ~50-200ms | None | Silence when idle |
| WebRTC Audio | ~50-150ms | Minimal | Silence when idle |
| RTMP Internal | ~1-3s | Some | Standard RTMP behavior |

Key Principle: No caching of past data. When chunks stop, output shows idle state. When chunks resume, output resumes from current data.


Real-time Factor

The virtual inputs and livestream features are designed for true real-time behavior:

  1. No Replay on Refresh: Unlike buffered video players, refreshing the feed page does not replay cached content. Each refresh shows the current state.

  2. Immediate State Reflection:

    • Chunks arriving → video/audio plays
    • Chunks stop → idle state shown
    • Chunks resume → playback resumes from current data
  3. Connection Independence: Opening multiple tabs shows the same real-time stream, not separate cached copies.

WebRTC Real-time Characteristics

┌─────────────┐     WebRTC      ┌─────────────┐
│   Source    │ ──────────────► │    Feed     │
│  (aiortc)   │   ~100-300ms    │   (page)    │
└─────────────┘    latency      └─────────────┘
  • Codec: VP8/VP9 (video), Opus (audio)
  • Latency: 100-300ms typical

WebSocket Real-time Characteristics

┌─────────────┐    WebSocket    ┌─────────────┐
│   Source    │ ──────────────► │   JSMpeg    │
│  (chunks)   │   ~100-500ms    │  (decoder)  │
└─────────────┘    latency      └─────────────┘
  • Codec: MPEG-1 video (JSMpeg), MP3 (audio)
  • Latency: 100-500ms typical

Verified Real-time Behavior

The following behaviors have been tested and verified:

| Scenario | Expected | Actual |
|---|---|---|
| Page refresh while streaming | Shows current frame | ✅ |
| Page refresh after stop | Shows idle message | ✅ |
| Start streaming on idle page | Video appears immediately | ✅ |
| Stop streaming while viewing | Shows idle/blank | ✅ |
| Multiple tabs same stream | All show same real-time content | ✅ |

Audio Real-time Routing

Audio chunks can be routed to two destinations:

                     ┌──────────────────┐
                     │  Virtual Input   │
                     │    (chunks)      │
                     └────────┬─────────┘
                              │
              ┌───────────────┴───────────────┐
              │                               │
              ▼                               ▼
    ┌─────────────────┐             ┌─────────────────┐
    │   audio_input   │             │  audio_output   │
    │  (virtual mic)  │             │   (speaker)     │
    └────────┬────────┘             └────────┬────────┘
             │                               │
             ▼                               ▼
    Apps read from mic              Playback/monitor
  • destination: "microphone" (default): Routes to virtual mic input. Apps reading from microphone receive this audio.
  • destination: "speaker": Routes to audio output. Useful for monitoring/playback.

[ @rgarcia @juecd ]


Note

Introduce real-time livestreaming (RTMP/RTMPS/WebRTC/WebSocket) and virtual audio/video inputs with WebSocket/WebRTC ingest, preview feed, and full server/image plumbing, plus samples and minor UI tweaks.

  • Backend/API:
    • Livestreaming: Implement FFmpeg streamers (RTMP/RTMPS internal server, WebRTC, WebSocket), manager, and endpoints: POST /stream/start, POST /stream/stop, GET /stream/list, POST /stream/webrtc/offer; WS playback at /stream/socket/{id}.
    • Virtual Inputs: Add manager for virtual mic/cam with sources stream|file|socket|webrtc; endpoints: POST /input/devices/virtual/configure|pause|resume|stop, GET /input/devices/virtual/status, preview page GET /input/devices/virtual/feed, feed socket info, and WS ingest at /input/devices/virtual/socket/{video|audio}.
    • Spec/Tests: Expand OpenAPI and add extensive unit tests for stream/virtual-inputs.
  • Image/Runtime:
    • Enable PulseAudio (configs, dbus), add v4l2loopback, rtkit, open perms/groups, start PulseAudio via supervisord; expose ports 1935/1936; set env for PulseAudio.
  • Samples:
    • Add samples/agent-live-demo, samples/livestream, samples/virtual-inputs (scripts, docs, WS/WebRTC senders, feed capture).
  • Frontend/UI:
    • New real-time feed page (socket/WebRTC); minor client theme color change; loader and mute-indicator/auto-unmute UX.
  • Tooling:
    • Update Makefile (use npm down-convert), add deps (ws, etc.).

Written by Cursor Bugbot for commit bb79584.

…al camera and microphone

handle v4l2loopback and pulseaudio devices, manage start/pause/resume/stop and scale-to-zero
refactor virtualinputs: new ctor with resolution/framerate, reset last config on stop, rebuild ffmpeg arg construction to handle video/audio indexing, paused/shared sources and unified input arg builder
…te virtual inputs manager into apiservice lifecycle
…ice; add mock virtualinputs manager and assert stop called on shutdown
install v4l2loopback, set pulseaudio default sink/source, switch to npm for openapi-down-convert
switch makefile to pnpm for openapi-down-convert and update generated code metadata
avoid err shadowing, normalize numeric types/enums, cast status and add virtual input defaults
…ack unavailable

prepare fifo video/audio files, set mode and file paths on manager/status and log fallback warning
…al inputs

reset manager mode/files on stop, handle fake video/audio files in ffmpeg args, add fifo preparation and improved device checks and flag file merging/restart logic
…nput mode to device

expose mode, video_file and audio_file in api status types
… paths/descriptions

increase ffmpeg stop timeout to 7s and remove processAlive check
rename fake-file to virtual-file, update comments and add configure/pause/resume/get/stop methods with json body alias
preserve cmd/state instead of clearing if the process remains alive
…ar manager state when process already exited
use stored pid and processAlive(pid) to avoid referencing m.cmd.Process.Pid when cmd may be nil
add curl examples to exercise endpoints under /input/devices/virtual: status, configure, resume, pause, stop and ffmpeg verification
…earing manager state

revise virtual input quick curl flow docs wording and expectations
… and update docs

ensure lingering ffmpeg processes are terminated during virtual input shutdown
…l neko image

clarify use_example curl flow and tweak expected outputs
add killAllFFmpeg helper using pkill (TERM then KILL) with short delay; change log to use virtual capture files
…dependently

fix chromium fake-device flags, prepare capture dir and safe file cleanup; stop treating webrtc disconnected as failure
clarify virtual input audio/video format hints (socket vs webrtc) in oapi models
…ify ws chunk sender defaults and update readme
store up to 512kb of early stream data and replay to new connections; reset intro on format changes
…nt fifo blocking

close keepalives if ffmpeg fails to start; apply to configure, pause and resume
…et and webrtc sources

prevent use-file-for-fake-video-capture for realtime feeds
…am restarts

add findIvfHeader and update processing to discard stale bytes when a header is found mid-buffer and reset decoder state on stream restarts.
only build ffmpeg inputs/outputs for non-realtime audio/video, avoid mapping errors when paused and simplify pulse routing logic
…tual feed broadcaster, audio to pulseaudio via ffmpeg; remove fifo pipe usage and adjust manager/webrtc
…ing ffmpeg for realtime sources

start ffmpeg only when args are present in pause/resume, adjust webrtc test error message
…ter for client adds, intro sends and broadcasts
… mpeg1 requirement

set sample_video_mpeg1.ts as default ingest video in ws_chunk_ingest.js
…vf preamble, remove intro buffering

openapi: add virtual input audio destination enum microphone,speaker
route ingest audio to microphone or speaker and wire destination through api, webrtc and ffmpeg
…nation handling

handle virtual input audio destination (default microphone), update webrtc Configure calls/tests to new signature
… and docs

add destination field and enum for virtual input audio; update README with examples for microphone vs speaker routing and realtime feed notes
…o shared schema

use $ref for VirtualInputAudio.destination and add description/default
add destination field to openapi schema for ingest audio
…ts and expose destination on ingest endpoint

simplify textevent stream handlers by replacing manual flush loops with io.Copy
…ation

flush text/event-stream responses with buffered reads and http.Flusher, falling back to io.Copy
adjust virtual inputs, webrtc and socket ingest to place pulse as output placeholder
function assertEnv() {
  if (!REMOTE_RTMP_URL) throw new Error('REMOTE_RTMP_URL is required');
  if (!ELEVENLABS_API_KEY) throw new Error('ELEVENLABS_API_KEY is required');
}

Bug: Unused environment variable required by validation

The assertEnv() function requires REMOTE_RTMP_URL to be set, but ENABLE_REMOTE_LIVESTREAM is hardcoded to false on line 14, meaning the remote RTMP feature is disabled and the URL is never actually used. This causes users to receive an unnecessary "REMOTE_RTMP_URL is required" error when running the sample, even though the variable isn't needed. The validation in assertEnv() and sequence.sh should be conditional on whether ENABLE_REMOTE_LIVESTREAM is true.


async run(meetingUrl) {
  this.meetingUrl = meetingUrl;
  this.feedUrl = `${KERNEL_API_BASE}/input/devices/virtual/feed?fit=cover`;
  const feedPage = await this.agent.newPage(); // new tab

Bug: BrowserFlow agent property never initialized

The BrowserFlow class initializes this.agent to null in the constructor (line 613) but never assigns it a value. When run() is called via the /browser/join endpoint, calling this.agent.newPage() will throw a null reference error ("Cannot read properties of null"). The browser automation agent is expected to exist but is never created or passed into the BrowserFlow instance.

